Gates Foundation

gatesfoundation

A workspace for the Gates Foundation DPI, FairForward and Gooey teams, focused on evals for low-resource languages. It is also the home of our agriculture advisory work, e.g. https://gooey.ai/ageval

352 Public Workflows

8 Members

(30Qs) Urdu Audio to Text Benchmark | Gemini 3.1 Pro, Gemini 3.1 Flash-Lite, …

(Updated April 28, 2026)
This page presents a comparative demonstration of multiple combinations of LLMs and speech-to-text systems in Urdu → English translation pipelines. Results

Abhay Vedantham

2mo ago

Public

Abhay's Copilot Builder

Gooey.AI's base AI workflow with built-in RAG, web search, voice understanding of 1000+ languages, code creation + execution, API connections & integrations to create your own WhatsApp, Web, FB and voice AI bots. Includes follow-up and location buttons on WhatsApp. Built on Claude, this bot has functions that give it superpowers: web search, which lets it search anything on the internet, and code writing & execution, which lets it calculate or convert data results. It also makes beautiful QR codes and images. ;-)

💬

Abhay Vedantham

2mo ago

Public

Compare Audio Responses

A bulk evaluator workflow that compares AI-generated answers (copilot responses) to a set of golden reference answers. Requires input data columns: "input_prompt" (the question/task) and "reference_answer" (the ideal response). The workflow uses custom evaluation prompts to compare outputs, scoring them for accuracy and penalizing hallucinations. Aggregates results to provide an overall performance metric for your AI answers.

⚖️

Abhay Vedantham

2mo ago

87 runs

Public
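The bulk evaluation flow described for Compare Audio Responses can be sketched roughly as below. This is a minimal illustration, not the workflow's actual implementation: the "input_prompt" and "reference_answer" columns come from the description, while the "copilot_response" column name, the function names, and the token-overlap F1 score are assumptions. The real workflow scores with custom evaluation prompts; token overlap is only a crude stand-in that lowers the score when a response adds words the reference does not contain.

```python
from collections import Counter
from statistics import mean

# Columns the workflow description says the input data must contain.
REQUIRED_COLUMNS = ("input_prompt", "reference_answer")


def token_f1(response: str, reference: str) -> float:
    """Token-overlap F1 (assumed stand-in for the workflow's evaluation prompt)."""
    resp = response.lower().split()
    ref = reference.lower().split()
    if not resp or not ref:
        return 0.0
    overlap = sum((Counter(resp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(resp)  # drops when the response adds unsupported tokens
    recall = overlap / len(ref)      # drops when the response omits reference tokens
    return 2 * precision * recall / (precision + recall)


def evaluate(rows, response_column="copilot_response"):
    """Score each row against its reference and aggregate into one overall metric.

    `response_column` is a hypothetical name for the column holding the AI answer.
    """
    scores = []
    for row in rows:
        missing = [c for c in REQUIRED_COLUMNS if c not in row]
        if missing:
            raise ValueError(f"missing required columns: {missing}")
        scores.append(token_f1(row[response_column], row["reference_answer"]))
    return {"scores": scores, "overall": mean(scores) if scores else 0.0}
```

Calling `evaluate(rows)` on a list of dictionaries returns the per-row scores plus their mean, which plays the role of the overall performance metric mentioned in the description.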

1 GPT5.5 + Omni - Urdu

💬

Abhay Vedantham

2mo ago

112 runs

Public

0 GPT5.5 - Urdu

💬

Abhay Vedantham

2mo ago

105 runs

Public

10 Gemma4 26B + Omni - Urdu

💬

Abhay Vedantham

2mo ago

Public

(30Qs) Pashto Audio to Text Benchmark | Gemini 3.1 Pro, Gemini 3.1 Flash-Lite, …

(Updated April 28, 2026)
This page presents a comparative demonstration of multiple combinations of LLMs and speech-to-text systems in Pashto → English translation pipelines. Results

Abhay Vedantham

2mo ago

Public

(3Qs) Pashto Audio to Text Benchmark | Gemini 3.1 Pro, Gemini 3.1 Flash-Lite, …

(Updated April 27, 2026)
Pashto Eval test

🦾

Abhay Vedantham

2mo ago

Public

10 Gemma4 26B + Omni - Pashto

💬

Abhay Vedantham

2mo ago

Public

1 GPT5.5 + Omni - Pashto

💬

Abhay Vedantham

2mo ago

109 runs

Public

0 GPT5.5 - Pashto

💬

Abhay Vedantham

2mo ago

109 runs

Public

Compare Latency

Use this workflow to compare latency on Copilot Bulk Runs. The graph displays the median score. Lower is better!

⚖️

Milind Choudhary

- rename metric Eval Prompt / Graph Description

2mo ago

98 runs

Public
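The median comparison that Compare Latency describes could look something like the sketch below. The function names and the input shape (a mapping from a workflow label to a list of per-run latencies in seconds) are assumptions made for illustration; the source does not say how the latencies are stored or computed.

```python
from statistics import median


def median_latency(latencies_by_workflow):
    """Return the median latency per workflow, skipping workflows with no runs."""
    return {
        name: median(values)
        for name, values in latencies_by_workflow.items()
        if values
    }


def rank_by_latency(latencies_by_workflow):
    """Sort workflows by median latency, fastest first (lower is better)."""
    medians = median_latency(latencies_by_workflow)
    return sorted(medians.items(), key=lambda item: item[1])
```

The median is less sensitive than the mean to a few unusually slow runs, which is one plausible reason to chart it for bulk runs.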

1 GPT5.5 + Omni - Sindhi

💬

Abhay Vedantham

2mo ago

187 runs

Public

0 GPT5.5 - Sindhi

💬

Abhay Vedantham

2mo ago

183 runs

Public

8 Gem3.1FlashLite + Omni - Hausa

💬

Abhay Vedantham

2mo ago

339 runs

Public

9 Gem3.1FlashLite + Intron - Hausa

💬

Abhay Vedantham

2mo ago

350 runs

Public

16 Gemma 4 26B + Omni - Hausa

💬

Abhay Vedantham

2mo ago

337 runs

Public

10 Gemma4 26B + Omni - Sindhi

💬

Abhay Vedantham

2mo ago

111 runs

Public

17 Gemma 4 26B + Intron - Hausa

💬

Abhay Vedantham

2mo ago

336 runs

Public

15 GPT-OSS 120b + Intron - Hausa

💬

Abhay Vedantham

2mo ago

339 runs

Public

14 GPT-OSS 120b + Omni - Hausa

💬

Abhay Vedantham

2mo ago

336 runs

Public