Senior C++ Software Engineer - Apache Spark Solution

Posted:
12/11/2024, 9:18:04 PM

Location(s):
Shanghai, China

Experience Level(s):
Senior

Field(s):
AI & Machine Learning ⋅ Software Engineering

We are seeking experienced C++ Software Engineers to join our Spark Acceleration group.

Data scientists spend a considerable amount of time exploring data and iterating over machine learning (ML) experiments. NVIDIA believes that data science workflows can benefit tremendously from acceleration, enabling data scientists to explore more and larger datasets and to reach their business goals faster and more reliably.

You will work with the open source community to accelerate Apache Spark for data science. Apache Spark is the most popular data processing engine in data centers for data science. We aim to dramatically accelerate Apache Spark use cases without application code changes. You will work on open source libraries including Spark-RAPIDS (https://github.com/NVIDIA/spark-rapids), RAPIDS (https://github.com/rapidsai), and Velox (https://github.com/facebookincubator/velox).

What you'll be doing:

  • Design and implement a native Spark execution engine using RAPIDS, Velox, UCX, and other related libraries
  • Design and implement solutions to optimize data exchange between Velox and RAPIDS libraries
  • Enhance the Velox OSS library for improved performance and Spark compatibility
  • Contribute to the RAPIDS libraries to support large-scale adoption in major enterprises
  • Conduct performance benchmarking and profiling to achieve speed-of-light performance
  • Work with a team of exceptional engineers, including PMC members and committers of Apache Spark, Apache Hadoop, Apache Hive, and Apache Arrow
  • Present technical solutions at industry conferences and meetups

What we need to see:

  • BS, MS, or PhD in Computer Science, Computer Engineering, or a closely related field
  • 8+ years of work or research experience in software development
  • 3+ years of hands-on development experience with Velox, RAPIDS, or similar data processing frameworks, including memory management techniques and data serialization
  • Exceptional C++ development experience in design, programming, testing, and debugging
  • Design and development expertise in columnar data processing with SIMD (Single Instruction, Multiple Data) and vectorization techniques
  • Familiarity with operating systems and software development environments for ARM
  • Proven technical skills in designing and implementing high-quality distributed systems 
  • Able to work successfully with multi-functional teams across organizational boundaries and geographies
  • Highly motivated with strong communication skills

Ways to stand out from the crowd:

  • Committer status in major open source big-data projects
  • Working experience with GPU-accelerated libraries (CUDA, cuBLAS, NCCL, RAPIDS, UCX)

We are an AA/EEO/Disabled employer. With highly competitive salaries and a comprehensive benefits package, NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most brilliant and talented people on the planet working for us. Are you creative and autonomous? Do you love a challenge? If so, we want to hear from you.

NVIDIA

Website: https://www.nvidia.com/

Headquarters Location: Santa Clara, California, United States

Employee Count: 10001+

Year Founded: 1993

IPO Status: Public

Last Funding Type: Grant

Industries: Artificial Intelligence (AI) ⋅ GPU ⋅ Hardware ⋅ Software ⋅ Virtual Reality