AI Coding Evaluator | $15/hr Remote

Position: Agentic Coding Annotator (Online / Offline Tasks)

Type: Short-Term Contract (5 weeks)

Compensation: $20 per hour

Location: Remote

Commitment: 8 hours per day, including 4 hours of overlap with PST

Role Responsibilities

  • Perform online evaluations by interacting with models on predefined coding tasks and grading outputs
  • Conduct offline evaluations by designing realistic coding tasks and defining evaluation criteria
  • Review and analyze model-generated code by reading, debugging, and validating outputs
  • Run tests, scripts, and terminal commands to verify correctness of solutions
  • Write clear, evidence-based rationales for trajectory rankings and assessments
  • Design task-specific rubrics and ensure consistent evaluation across runs
  • Identify issues in outputs, environments, or instructions and escalate with supporting evidence
  • Work with agentic coding tools and evaluation frameworks to assess model performance

Requirements

  • Strong experience in software engineering, QA, or similar code-heavy roles
  • Strong proficiency in at least 1-2 programming languages (Python, JavaScript, Java, C/C++, etc.)
  • Experience working with Linux/terminal, Git, and development tools
  • Familiarity with coding-agent tools (e.g., Cursor, Claude Code, OpenCode, or similar)
  • Ability to read unfamiliar codebases, debug issues, and evaluate correctness
  • Strong attention to detail and ability to follow structured evaluation processes
  • Comfortable with repetitive, high-precision evaluation work
  • Experience with Docker or reproducible environments is a plus
  • Ability to work independently in a remote environment

Application Process

  • Apply (or use Easy Apply), then check your email for the application form
  • Fill out the Google form
  • Complete the assessment via the link sent after shortlisting (candidates can choose between two options and must finish within 24 hours)