Job Type: Contract
Contract Length: 6-7 Month Contract (with potential for extension)
Target Start Date: January
Work Location/Structure: Remote (local to the Northeast or Midwest preferred)
About the Opportunity:
Our client, a leader in Academic Research and Higher Education, is looking for a skilled HPC Engineer to join their team for a 6-7 month contract engagement. The project involves scaling and maintaining a critical High-Performance Computing (HPC) ecosystem used by university researchers for parallel processing, AI/ML applications, and large-scale data transfers. This is a high-impact role that requires a self-motivated, seasoned professional who can contribute immediately to the stability and efficiency of a complex, large-scale research computing environment.
Key Responsibilities & Deliverables:
This role is focused on the successful completion of specific tasks and deliverables. Your responsibilities will include:
- Maintain the entire HPC ecosystem, including system specification, provisioning, OS installation (Rocky Linux), and update/change management across approximately 200 Linux systems. This includes login/file-transfer nodes, compute nodes, job scheduling (Slurm), and virtualization (VMware).
- Apply configuration management and security best practices to maintain all systems using Ansible and the Warewulf cluster management system.
- Manage the Globus data transfer software and support the storage team with VAST and TrueNAS storage maintenance. Provide support for data indexing and query tools such as Starburst.
- Maintain and support user-facing HPC web gateways and research tools (e.g., Open OnDemand, Jupyter Notebook/Lab/Hub, FastX, Open XDMoD).
- Respond to outages and urgent system issues, and develop and document continual operational improvements to the HPC system administration service. Assist with vendor management as needed.
Required Skills & Experience:
We are looking for someone with a proven track record of successful contract engagements. The ideal candidate will have:
- 5+ years of experience in a similar role within a large-scale enterprise or research environment, operating at a senior, self-directed level of system administration.
- Deep expertise in Linux systems administration, Ansible, and HPC cluster management tools such as Warewulf and the Slurm job scheduler. This is not a learning role; you need to be a subject matter expert.
- Demonstrated ability to work autonomously and manage your own time effectively to meet project goals and handle critical system issues.
- Experience installing and maintaining common research computing frameworks and software, particularly AI/ML/DL libraries (TensorFlow, PyTorch) and container platforms.
- Familiarity with high-performance storage solutions such as VAST and TrueNAS, and experience with Globus or a strong willingness to learn it quickly.
- Strong communication skills, with the ability to provide clear, concise status updates to the project team and technical guidance on network, storage administration, and data center issues.
- Scripting proficiency in Shell or Python is a plus.