Posted:
9/24/2024, 5:00:00 PM
Location(s):
California, United States
Experience Level(s):
Entry Level/New Grad ⋅ Junior ⋅ Mid Level
Field(s):
AI & Machine Learning ⋅ Software Engineering
At NVIDIA, we are at the forefront of the rapidly advancing field of large language models and their applications in agentic AI use cases. As the scale and complexity of these agentic systems continue to grow, we are seeking exceptional engineers to join our team and help shape the future of agentic inference.
Our team is committed to pushing the limits of what’s achievable with agentic LLMs by enhancing the algorithmic performance and efficiency of the systems that support them. We continuously seek ways to refine these systems, develop innovative inference algorithms and protocols, improve existing models, and seamlessly integrate enhancements to ensure NVIDIA’s solutions can effectively manage large-scale, complex tasks.
What you’ll be doing:
Research and Development: Investigate and integrate the latest advancements in generative AI, agent frameworks, and inference systems into NVIDIA’s LLM software ecosystem.
Workload Analysis and Optimization: Perform comprehensive analysis, profiling, and optimization of agentic LLM workloads to significantly reduce request latency and increase throughput while preserving workflow integrity.
System Design and Implementation: Architect and build scalable systems to accelerate agentic workflows, ensuring they can efficiently handle sophisticated, datacenter-scale applications.
Collaboration and Communication: Collaborate with diverse teams within NVIDIA and with external partners to guide future iterations of NVIDIA’s software, hardware, and systems. Define and formalize strategic requirements based on the demands of various workloads.
What we need to see:
Bachelor’s, Master’s, or Ph.D. in Computer Science, Electrical Engineering, Computer Engineering, or a related field (or equivalent experience).
Proven experience in deep learning and the design of deep learning systems.
Proficiency in Python and C++ programming languages.
Strong understanding of computer architecture, GPU computing, and parallel datacenter computing fundamentals.
Demonstrated ability and interest in analyzing, modeling, and tuning application performance.
Ways to stand out from the crowd:
Large-Scale Systems: Experience in developing large-scale LLM inference systems, particularly those involving complex AI functionalities.
Agentic Frameworks: Familiarity with agentic LLM frameworks.
Performance Modeling: Expertise in processor and system-level performance modeling.
GPU Programming: Proficiency in GPU programming using CUDA or OpenAI Triton.
NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for over 25 years. Our unique legacy of innovation is driven by cutting-edge technology and exceptional talent. Today, we are leveraging the limitless potential of AI to define the next era of computing—where our GPUs serve as the intelligent cores powering computers, robots, and autonomous vehicles that understand and interact with the world. Achieving unprecedented advancements requires vision, innovation, and the world’s best talent.
The base salary range is 104,000 USD - 189,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.
Website: https://www.nvidia.com/
Headquarters Location: Santa Clara, California, United States
Employee Count: 10001+
Year Founded: 1993
IPO Status: Public
Last Funding Type: Grant
Industries: Artificial Intelligence (AI) ⋅ GPU ⋅ Hardware ⋅ Software ⋅ Virtual Reality