About the Role
NVIDIA is seeking a talented High Performance LLM Training Engineer to join our team as a New College Graduate in 2026. In this role, you will contribute to the development and optimization of cutting-edge training methodologies for large language models (LLMs) on NVIDIA's GPU platforms. This is an exciting opportunity to work at the frontier of Generative AI, with a focus on achieving exceptional performance and scalability for LLM training. You will implement and optimize parallel algorithms, distributed training strategies, and low-level software optimizations to push the boundaries of what's possible with large-scale AI models.
Responsibilities
- Research, design, and implement high-performance training algorithms for Large Language Models (LLMs).
- Optimize LLM training pipelines for maximum efficiency and scalability on NVIDIA GPUs.
- Develop and debug distributed training strategies across multiple GPUs and nodes.
- Analyze performance bottlenecks and propose innovative solutions.
- Collaborate with research scientists and other engineers to advance the state-of-the-art in LLM training.
- Contribute to NVIDIA's software stack for AI.
Requirements
- Master's or Ph.D. in Computer Science, Electrical Engineering, or a related field (expected graduation by 2026).
- Strong foundation in deep learning, particularly with large language models.
- Proficiency in C++ and Python.
- Experience with deep learning frameworks (e.g., PyTorch, TensorFlow).
- Familiarity with GPU programming (CUDA) and parallel computing.
- Strong problem-solving and analytical skills.
Preferred Qualifications
- Experience with distributed training frameworks (e.g., DeepSpeed, Megatron-LM).
- Knowledge of compiler optimizations for AI workloads.
- Publications in top-tier AI/ML conferences.
Benefits
Competitive salary, comprehensive health benefits, retirement plans, paid time off, and opportunities for professional growth in a leading AI company. Relocation assistance may be provided.