AI Safety & Evaluation Engineer

Contract Type: Contract
Location: Campbell - CA
Industry: Information Technology
Contact Name: Lucas Davis
Contact Email: ldavis@dewintergroup.com
Date Published: 03-30-2026
Salary: $50.00 - $175.00 Per Hour
Job ID: 38817

Title: AI Safety and Evaluations Engineer
Job Type: Contract
Contract Length: 12 Months
Pay Range: $50/hr – $175/hr
Start Date: ASAP
Location: Remote

About the Opportunity:

Our client, a leader in AI testing and Generative AI solutions, is looking for a skilled AI Safety and Evaluations Engineer to join their team for a 12-month engagement. The project involves designing and building rigorous evaluation frameworks that measure model bias, hallucinations, and toxicity, so that models are safe and compliant before deployment. This is a high-impact role that requires a self-motivated professional who can hit the ground running and deliver results quickly.

Key Responsibilities & Deliverables:

This role is focused on the successful completion of specific tasks and deliverables. Your responsibilities will include:

  • Designing and building rigorous evaluation frameworks to measure model bias, hallucinations, and toxicity.
  • Creating automated "Eval" datasets to benchmark new models before they are promoted to production.
  • Developing metrics for "Grounding" and "Faithfulness" in RAG-based systems.
  • Building monitoring tools that flag harmful or non-compliant AI outputs in real time.
  • Partnering with legal and ethics teams to translate policy into technical safety constraints.

Required Skills & Experience:

We are looking for someone with a proven track record of successful contract engagements. The ideal candidate will have:

  • 3+ years of experience in AI Research or Quality Engineering.
  • Deep expertise in model evaluation techniques and NLP metrics (ROUGE, BLEU, BERTScore). This isn't a learning role; you need to be a subject matter expert.
  • Demonstrated ability to work autonomously and manage your own time effectively to meet project goals.
  • Experience with Python, data analysis tools, and LLM-as-a-Judge frameworks.
  • Strong communication skills to provide clear and concise status updates to the project team.

DeWinter Group and Maris Consulting are equal opportunity employers, and all qualified applicants will receive consideration for employment without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws. We post pay scales that are based on our clients' pay ranges. DeWinter, Maris, and our clients have the right to modify the requirements of the role, which can impact the pay ranges posted.
