AI Coding Evaluator | $15/hr Remote
Position: Agentic Coding Annotator (Online/Offline Tasks)
Type: Short-Term Contract (5 weeks)
Compensation: $20 per hour
Location: Remote
Commitment: 8 hours per day, with 4 hours of overlap with PST
Role Responsibilities
- Perform online evaluations by interacting with models on predefined coding tasks and grading outputs
- Conduct offline evaluations by designing realistic coding tasks and defining evaluation criteria
- Review and analyze model-generated code by reading, debugging, and validating outputs
- Run tests, scripts, and terminal commands to verify correctness of solutions
- Write clear, evidence-based rationales for trajectory rankings and assessments
- Design task-specific rubrics and ensure consistent evaluation across runs
- Identify issues in outputs, environments, or instructions and escalate with supporting evidence
- Work with agentic coding tools and evaluation frameworks to assess model performance
Requirements
- Years of strong experience in software engineering, QA, or similar code-heavy roles
- Strong proficiency in at least 1-2 programming languages (Python, JavaScript, Java, C/C++, etc.)
- Experience working with Linux/terminal, Git, and development tools
- Familiarity with coding-agent tools (e.g., Cursor, Claude Code, OpenCode, or similar)
- Ability to read unfamiliar codebases, debug issues, and evaluate correctness
- Strong attention to detail and ability to follow structured evaluation processes
- Comfortable with repetitive, high-precision evaluation work
- Experience with Docker or reproducible environments is a plus
- Ability to work independently in a remote environment
How to Apply
- Apply (or use Easy Apply), then check your email for the application form
- Fill out the Google form
- Complete the assessment via the link sent after shortlisting; candidates can choose between two options and must finish within 24 hours