Want to work at the forefront of AI infrastructure? NVIDIA is seeking a skilled Senior Software Engineer to join its DGX Cloud team. You’ll be responsible for ensuring the smooth operation and scaling of its cutting-edge GPU clusters used for diverse AI workloads.
What you’ll do:
- Develop and optimize Kubernetes-based solutions for scheduling GPU resources.
- Build and maintain systems for monitoring the health and performance of GPU clusters.
- Collaborate with teams across NVIDIA to ensure reliable and efficient AI infrastructure.
- Troubleshoot system failures and improve services through incident management.
What you’ll need:
- 5+ years of experience in a similar software engineering role, with a proven track record of impactful work.
- Strong experience with the Kubernetes API and frameworks (beyond just cluster operations).
- Excellent communication and collaboration skills.
- A Bachelor’s degree in Computer Science or a related field.
- Proficiency in Go or Python and a solid understanding of data structures and algorithms.
Bonus points for:
- Experience managing large-scale distributed systems.
- Deep understanding of cluster management systems like Kubernetes, Slurm, or Bright Cluster Manager.
- A passion for AI and a desire to push the boundaries of technology.
Benefits:
- Competitive salary (ranging from $148,000 to $339,250 USD, depending on location and experience).
- Equity and comprehensive benefits package.
- Opportunity to work with some of the brightest minds in the industry on groundbreaking AI projects.
If you’re a Kubernetes expert with a passion for AI and a drive to innovate, apply now!
To apply for this job, please visit nvidia.wd5.myworkdayjobs.com.