    Strux

    Strux-related demonstrations for LLM / Agent evaluations

    1 lesson

    Mikhail AI

    AI Engineering Projects, Research Paper Implementations, and Vertical Agent Deployments

    Latest Video

    Recently added

    Pairwise Evaluations Paper Implementation

    This lesson walks through an implementation of the Pairwise Evaluations paper. The method evaluates LLM outputs by comparing them in pairs rather than scoring each one directly, and is presented as giving more reliable, human-aligned rankings than traditional direct scoring. The cookbook is available at: https://github.com/mikhailocampo/Strux/blob/main/cookbook/pairwise-preference/pairwise-preference.ipynb

    13 min · Feb 24, 2025
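
To make the idea of ranking by pairwise comparison concrete, here is a minimal sketch. It is not the paper's algorithm or the Strux cookbook's code: it aggregates verdicts with a plain win count, and `rank_by_pairwise_wins` and `toy_judge` are hypothetical names introduced only for illustration. The toy judge simply prefers the longer string; in practice the judge would be an LLM call, which is assumed here and not shown.

```python
from itertools import combinations
from typing import Callable, Dict, List


def rank_by_pairwise_wins(
    candidates: List[str],
    prefer: Callable[[str, str], str],
) -> List[str]:
    """Rank unique candidates by the number of head-to-head comparisons won."""
    wins: Dict[str, int] = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        # Query the judge in both orders so that a judge favoring one
        # position does not systematically favor one candidate.
        wins[prefer(a, b)] += 1
        wins[prefer(b, a)] += 1
    # Most wins first; ties keep their original order (sorted is stable).
    return sorted(candidates, key=lambda c: wins[c], reverse=True)


def toy_judge(first: str, second: str) -> str:
    """Hypothetical stand-in for an LLM judge: prefers the longer answer."""
    return first if len(first) >= len(second) else second


answers = ["ok", "a fuller answer", "short reply"]
ranking = rank_by_pairwise_wins(answers, toy_judge)
print(ranking)
```

Each verdict here only needs a relative preference between two outputs, whereas direct scoring asks the judge for an absolute number for every output separately.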

    All Lessons

    Pairwise Evaluations Paper Implementation


    Collections

    1 item

    Strux