Software Engineer – Big Data

Category: Information Technology
Location: San Francisco
Job ID: 17536
Date Posted: July 23rd, 2019

Title: Software Engineer – Big Data (Spark, Hadoop MapReduce, Cascading)

Location: San Francisco 

Terms: Contract through the end of the year, with a possible extension up to 18 months and potential conversion

Our creative lifestyle social media client is looking for a seasoned data engineer to empower internal users to make the most of its big data platform and infrastructure.

What You'll Do: 


  • Improve big data pipeline efficiency and tune the performance of user queries through complex query plan analysis, configuration settings, and optimization, using technologies such as Spark, Hadoop MapReduce, and Cascading
  • Help maintain and enhance our performance monitoring and analysis tools and frameworks, as well as our workflow, data compute, and execution engines
  • Actively monitor production with various internal tools, focusing on user experience in the query engines; track performance profiles over time and investigate any changes
  • Proactively analyze performance metrics and logs to identify inefficiencies and opportunities to improve scalability and performance
  • Actively engage with our internal users, who are our customers, and analyze their use cases to address scale and performance demands
  • Provide thought leadership and helpful feedback to the rest of the data team on our processes, support models, performance monitoring, analysis metrics, frameworks, and tools from the internal users' perspective, to improve their experience and productivity

What You'll Have:


  • 5+ years of hands-on experience with open-source big data technologies
  • Experience writing big data applications in Spark, Hadoop MapReduce, or Cascading
  • Strong skills in debugging performance issues in a Linux environment
  • Experience troubleshooting big data technology issues when processing petabytes of data
  • Experience deploying and maintaining distributed systems
  • Experience using big data storage technologies and their applications (HDFS, columnar storage formats, Hive, Spark, Presto, etc.)
  • Proficiency in multiple systems languages (Java, Scala, Python)
