AI Output Grader · LLM Evaluator · Data Labeling Specialist
Hands-on evaluator. Real projects across evaluation, annotation, tagging, and QA.
Reproducible results. Clear rubrics, versioned guidelines, audit trails.
Edge-case mindset. Stress tests for hallucinations, safety, and grounding.
Fast loops. Pilot → calibrate → scale, with daily notes and metrics.
Multilingual. English, Spanish, French.
Easy to work with. Europe/Madrid time zone, responsive comms, clean deliverables.
Below is a representative example of the evaluation documentation and project work I deliver to LLM teams.
Goal
Design and document a robust evaluation workflow so a distributed team can reliably grade LLM responses and surface weaknesses before deployment.
What I did
Translated product requirements into concrete scoring rubrics (clear criteria, weights, and severity levels).
Defined error categories (from minor issues to blocking failures) so raters could choose the right severity consistently.
Wrote step-by-step instructions for annotators: how to read the prompt, how to judge the response, and how to pick the final score.
Created realistic user prompts and scenarios to test the model on both typical and edge-case behavior.
Added examples of “good” vs “bad” ratings to calibrate the team and reduce disagreement.
Iterated on the guidelines based on pilot results, clarifying ambiguous cases and tightening definitions.
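The weighted-rubric approach described above can be sketched in a few lines of Python. This is a minimal illustration: the criterion names, weights, and scales are hypothetical placeholders, not a rubric from any specific project.

```python
# Hypothetical rubric: criterion names, weights, and scales are illustrative only.
RUBRIC = {
    "factual_accuracy":      {"weight": 0.4, "max": 5},
    "instruction_following": {"weight": 0.3, "max": 5},
    "safety":                {"weight": 0.3, "max": 5},
}

def weighted_score(ratings: dict) -> float:
    """Combine per-criterion ratings (0..max) into a single 0-100 score."""
    total = 0.0
    for name, spec in RUBRIC.items():
        # Normalize each rating to [0, 1], then apply its weight.
        total += spec["weight"] * (ratings[name] / spec["max"])
    return round(total * 100, 1)
```

Encoding the rubric as data rather than prose makes weights auditable and easy to version alongside the written guidelines.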
Impact
A self-contained evaluation pack (rubrics + guidelines + examples) that other raters could use without extra training calls.
More consistent scores across annotators, especially on subtle issues like partial correctness or borderline safety concerns.
Faster feedback loops to the model team, with clear evidence for where and why the model failed.
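Consistency across annotators can be quantified rather than asserted, for example with Cohen's kappa, a standard inter-annotator agreement metric that corrects for chance agreement. A minimal sketch (the labels and data are hypothetical):

```python
from collections import Counter

def cohens_kappa(rater_a: list, rater_b: list) -> float:
    """Cohen's kappa for two raters labeling the same items with categorical labels."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items where the raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's label distribution.
    ca, cb = Counter(rater_a), Counter(rater_b)
    expected = sum(ca[label] * cb[label] for label in ca) / (n * n)
    return (observed - expected) / (1 - expected)
```

Tracking kappa before and after a guideline revision gives concrete evidence that a calibration pass actually tightened agreement.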
Have a project in mind?
moreno@modelevaluator.com · Madrid, Spain
© 2025 Alejandro Moreno-Ramos