Tracing & Eval

    You need to trace and eval; otherwise, you're not in production.

    LLM for Devs

    All lessons

    Lesson 1: Evals are surprisingly all you need (why?)

    Why are evals your next superpower?

    7 min · Nov 17, 2024 · Member
    Lesson 2: Use LLM as a Judge and create your first dataset from LangSmith runs

    A real eval task! Set up a dataset, then an evaluator (LLM as a Judge), and run an experiment in LangSmith.

    20 min · Nov 19, 2024 · Member
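
    In a nutshell, the lesson's workflow looks like the minimal sketch below (assuming the langsmith and openai Python SDKs with LANGSMITH_API_KEY and OPENAI_API_KEY set; the dataset name "qa-smoke-test", the sample Q&A pair, and the stand-in target function are illustrative, not the lesson's actual code):

        from langsmith import Client
        from langsmith.evaluation import evaluate
        from openai import OpenAI

        client = Client()
        judge = OpenAI()

        # 1. Build a small dataset (the lesson creates one from existing LangSmith runs).
        dataset = client.create_dataset("qa-smoke-test")
        client.create_examples(
            inputs=[{"question": "What is LangSmith used for?"}],
            outputs=[{"answer": "Tracing and evaluating LLM applications."}],
            dataset_id=dataset.id,
        )

        # 2. LLM-as-a-Judge evaluator: a model grades each output against the reference.
        def correctness(run, example):
            verdict = judge.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{
                    "role": "user",
                    "content": (
                        f"Reference: {example.outputs['answer']}\n"
                        f"Candidate: {run.outputs['answer']}\n"
                        "Reply with exactly 1 if the candidate matches the reference, else 0."
                    ),
                }],
            ).choices[0].message.content.strip()
            return {"key": "correctness", "score": int(verdict == "1")}

        # 3. Run the experiment; scores and traces show up in the LangSmith UI.
        def target(inputs: dict) -> dict:
            # Stand-in for the application under test.
            return {"answer": "Tracing and evaluating LLM applications."}

        evaluate(target, data=dataset.name, evaluators=[correctness])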
    Lesson 3: How to trace LLMs other than OpenAI in LangSmith and track costs & token counts

    This workshop demonstrates LangSmith's capabilities for tracing and analyzing LLM calls, using Google's Gemini model for cost-effective text summarization. The lesson highlights error tracking, token counting, and cost analysis within a practical example focused on Ottawa's bootstrapping-focused tech ecosystem.

    9 min · Nov 23, 2024 · Member
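
    The core trick, as a minimal sketch (assuming the langsmith and google-generativeai SDKs, with LANGSMITH_TRACING=true, LANGSMITH_API_KEY, and GOOGLE_API_KEY in the environment; the model name and sample text are illustrative):

        import os

        import google.generativeai as genai
        from langsmith import traceable

        genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
        model = genai.GenerativeModel("gemini-1.5-flash")

        # run_type="llm" plus ls_provider / ls_model_name metadata lets LangSmith
        # attribute tokens (and, where it knows the pricing, cost) to this run.
        @traceable(
            run_type="llm",
            metadata={"ls_provider": "google", "ls_model_name": "gemini-1.5-flash"},
        )
        def summarize(text: str) -> dict:
            resp = model.generate_content(f"Summarize in two sentences:\n\n{text}")
            usage = resp.usage_metadata
            return {
                "output": resp.text,
                # Returning usage_metadata is how LangSmith picks up token counts
                # for providers it doesn't auto-instrument the way it does OpenAI.
                "usage_metadata": {
                    "input_tokens": usage.prompt_token_count,
                    "output_tokens": usage.candidates_token_count,
                    "total_tokens": usage.total_token_count,
                },
            }

        print(summarize("Ottawa's tech scene is known for bootstrapped companies.")["output"])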
    Lesson 4: Compare Tavily, Perplexity API, Google Search Grounding (Gemini), Exa with LLM as Judge in LangSmith

    This AI workshop evaluates the performance of several search providers, using LangChain and a custom Python framework to compare their accuracy and efficiency across multiple queries. It leverages LangSmith to track experiments, generate evaluation datasets, and visualize results, providing a comprehensive side-by-side analysis of the providers.

    32 min · Jan 29, 2025 · Free
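
    The comparison harness boils down to one experiment per provider against a shared dataset, sketched below for two of the four providers (assuming the tavily-python and openai SDKs with the matching API keys set; the dataset name "search-provider-qa" is illustrative, and the exact-match judge is a stand-in for the LLM-as-a-Judge evaluator from Lesson 2):

        import os

        from langsmith import traceable
        from langsmith.evaluation import evaluate
        from openai import OpenAI
        from tavily import TavilyClient

        tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
        # Perplexity exposes an OpenAI-compatible endpoint.
        perplexity = OpenAI(api_key=os.environ["PERPLEXITY_API_KEY"],
                            base_url="https://api.perplexity.ai")

        @traceable
        def tavily_target(inputs: dict) -> dict:
            result = tavily.search(inputs["question"], include_answer=True)
            return {"answer": result.get("answer") or ""}

        @traceable
        def perplexity_target(inputs: dict) -> dict:
            resp = perplexity.chat.completions.create(
                model="sonar",
                messages=[{"role": "user", "content": inputs["question"]}],
            )
            return {"answer": resp.choices[0].message.content}

        def correctness(run, example):
            # Placeholder judge (substring match); the workshop uses an LLM judge instead.
            expected = example.outputs["answer"].lower()
            return {"key": "correctness",
                    "score": int(expected in run.outputs["answer"].lower())}

        # One experiment per provider; LangSmith lines the scores up side by side.
        for name, target in [("tavily", tavily_target), ("perplexity", perplexity_target)]:
            evaluate(target, data="search-provider-qa",
                     evaluators=[correctness], experiment_prefix=name)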