8 results

Evaluator

⚖️ Eval

Evaluate any Gooey.AI Workflow output against a dataset of inputs and "golden" (expert-created) desired answers. Score every row of any CSV, Google Sheet, or Excel file with any LLM-as-Judge instruction prompt, then average the scores in any column to generate automated evaluations.

Gooey.AI

Updated to include Gemini 3, GPT-5.1, LLaMA, DeepSeek 3.1

967 runs

Copilot Evaluator

⚖️ Eval

Our general bulk evaluator for comparing AI-generated copilot answers against a collection of golden answers.

416 runs

Speech Recognition Model Evaluator

⚖️ Eval

This recipe is used with https://gooey.ai/bulk to evaluate the latest private and open-source speech recognition models (from Google, Meta, OpenAI, and others). It takes a CSV file of golden (i.e., human-provided) translations and compares them against a set of AI-generated translations to produce scores from 0 to 1. It then takes the mean of each model's scores to determine which model performed best.
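
The score-then-average idea described above can be sketched roughly as follows. This is a minimal illustration, not the recipe's actual implementation: the word-level `difflib` similarity is a stand-in for whatever metric the recipe uses, and the column names (`golden`, `model_a`, `model_b`) and sample rows are hypothetical.

```python
from difflib import SequenceMatcher
from statistics import mean


def similarity(golden: str, candidate: str) -> float:
    """Stand-in scorer in [0, 1]; NOT the metric the recipe actually uses."""
    # Compare word sequences so that the score reflects matching words.
    return SequenceMatcher(
        None, golden.lower().split(), candidate.lower().split()
    ).ratio()


def rank_models(rows, model_columns, golden_column="golden"):
    """Return ({model: mean score}, best model) for a list of row dicts."""
    means = {
        model: mean(similarity(row[golden_column], row[model]) for row in rows)
        for model in model_columns
    }
    best = max(means, key=means.get)
    return means, best


# Hypothetical rows, standing in for a CSV of golden and AI-generated text.
rows = [
    {
        "golden": "the crops need water",
        "model_a": "the crops need water",
        "model_b": "the cops need waiter",
    },
    {
        "golden": "plant maize in april",
        "model_a": "plant maize in april",
        "model_b": "plan maze in april",
    },
]

means, best = rank_models(rows, ["model_a", "model_b"])
print(means, best)
```

Each row contributes one score per model, and the per-model mean is what decides the ranking, so a model with a few very poor outputs can lose to one that is consistently adequate.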

Gooey.AI

212 runs

Low Resource ASR Evaluator

⚖️ Eval

114 runs

Compare Output Text

⚖️ Eval

A bulk evaluator workflow that compares AI-generated answers (copilot responses) to a set of golden reference answers. It requires two input data columns: "input_prompt" (the question or task) and "reference_answer" (the ideal response). The workflow uses custom evaluation prompts to compare outputs, scoring them for accuracy and penalizing hallucinations. It then aggregates the results into an overall performance metric for your AI answers.
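
Before uploading a dataset, it may help to confirm that the two required columns are present. The check below is a small sketch using only Python's standard library; the helper name `missing_columns` is hypothetical and is not part of Gooey.AI.

```python
import csv
import io

# Column names taken from the workflow description above.
REQUIRED_COLUMNS = {"input_prompt", "reference_answer"}


def missing_columns(csv_text: str) -> set[str]:
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])  # fieldnames is None for empty input
    return REQUIRED_COLUMNS - header


sample = "input_prompt,reference_answer\nWhat is ASR?,Automatic speech recognition\n"
print(missing_columns(sample))
```

An empty set means the header has both columns; any names returned need to be added or renamed in the sheet.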

Gooey.AI

88 runs

Evaluate Telugu Speech Reco + Translation

⚖️ Eval

Here we compare the top 5 ASR models on a set of Telugu samples. The speech output was created with https://gooey.ai/bulk/?example_id=nrkx2u17

Gooey.AI

308 runs

Evaluate Kannada Speech Reco + Translation

⚖️ Eval

Here we compare the top 3 ASR models on a set of Kannada samples. The speech output was created with https://gooey.ai/bulk/?example_id=m8c3mb98