
Building and Scaling Inference Workloads on Amazon EKS (Workshop)

Steve Messenger | Senior Specialist Solution Architect at AWS

Chhavi Negi | Senior Accelerated Compute Sales Specialist at AWS

Chris Merrett | Principal Consultant at Steamhaus

Join us for an immersive hands-on workshop exploring how to build and scale production-ready inference deployments on Amazon EKS using NVIDIA GPUs. As organizations move beyond experimentation to production deployment of GenAI applications, Kubernetes has emerged as a preferred platform for managing inference workloads at scale, offering robust orchestration, cost optimization, and enterprise-grade reliability.

Whether you're looking to deploy your first language model or scale existing inference workloads, this workshop will provide you with best practices and hands-on experience using industry-leading tools and frameworks. Learn directly from AWS experts who have helped organizations successfully deploy and manage large-scale GenAI infrastructure.

Through hands-on labs and real-world examples, you'll learn to master:

  • EKS cluster setup optimized for NVIDIA GPU workloads

  • Efficient model serving and scaling using vLLM

  • Distributed inference architecture implementation with Ray

  • Comprehensive monitoring and observability using Prometheus and Grafana

  • Best practices for production GenAI deployments on Kubernetes

  • Agentic AI on Amazon EKS

Who should attend: DevOps/MLOps engineers, AI/ML developers, solution architects, and product owners (or similar roles).

Please remember to bring your laptop, as this is a hands-on workshop.

Registration for this workshop will be available for ticket holders in the days prior to the event.


More workshops TBA!
