Webinar

Mastering K8s Rightsizing - Optimizing for Costs and Performance

A practical playbook for balancing reliability, performance, and spend in Kubernetes cluster optimization

February 29, 2024
43 mins

Topics Covered

standardization, platform engineering

Webinar Summary

Are your Kubernetes clusters burning money while sitting half empty? Or worse, are they constantly hitting resource limits and causing outages? This webinar tackles the challenging balance every DevOps team faces: how to rightsize Kubernetes workloads for both cost efficiency and rock-solid performance.

Understanding Your Real Resource Needs

Learn to read workload patterns like a pro, distinguishing between steady-state usage and legitimate traffic spikes. We'll show you how to interpret CPU and memory metrics over time rather than from peak snapshots alone. You'll also discover why most teams overprovision by 2-3x and how to safely reduce that buffer while maintaining performance.
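
As a rough illustration of reading metrics over time rather than from a single peak, the sketch below derives a CPU request from a percentile of usage samples plus some headroom. The 95th percentile, the 20% headroom, and the sample data are assumptions chosen for the example, not values taken from the webinar.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_cpu_request(samples_millicores, pct=95, headroom_pct=20):
    """Suggest a CPU request in millicores from usage samples collected over time.

    pct and headroom_pct are illustrative defaults (assumptions), not recommendations.
    """
    base = percentile(samples_millicores, pct)
    return math.ceil(base * (100 + headroom_pct) / 100)


# Hypothetical usage samples in millicores: steady load with one short spike.
usage = [100] * 19 + [800]

print("peak snapshot:", max(usage))
print("percentile-based request:", suggest_cpu_request(usage))
```

Sizing from the peak alone would reserve capacity for a spike that appears in only one of twenty samples. Whether that spike should be covered by the request, by a limit, or by horizontal scaling depends on how latency-sensitive the service is.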

Practical Rightsizing Strategies

Get baseline recommendations for web servers, background jobs, databases, and caching layers. Understand the critical relationship between requests, limits, and Kubernetes QoS classes. Learn when to prioritize headroom for latency-sensitive services versus batch processing workloads, and see real examples of services that achieved a 60% cost reduction without sacrificing performance.
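
To make the relationship between requests, limits, and QoS classes concrete, here is a minimal sketch of how Kubernetes assigns a QoS class to a pod. The dictionary layout is a simplification invented for this example, and quantities are compared as literal strings, so equivalent values written differently (for example "500m" and "0.5") would not match here even though Kubernetes treats them as equal.

```python
def qos_class(containers):
    """Return the QoS class for a pod described as a list of container dicts.

    Each container dict may hold "requests" and "limits" dicts keyed by
    "cpu" and "memory". This layout is a hypothetical simplification.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # When only a limit is given, the request defaults to the limit.
            request = requests.get(resource, limit)
            if limit is not None or request is not None:
                any_set = True
            # Guaranteed needs a limit on every resource, equal to its request.
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


pod = [{"requests": {"cpu": "250m", "memory": "256Mi"},
        "limits": {"memory": "512Mi"}}]
print(qos_class(pod))
```

The class matters for rightsizing because it influences eviction order under node pressure: BestEffort pods are evicted first and Guaranteed pods last.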

Automation Without the Pain

Navigate the HPA versus VPA decision with confidence. Choose metrics sources that remain reliable under heavy load and implement safe rollout patterns for introducing autoscaling to critical production paths. Build guardrails that prevent aggressive autoscaling from causing cascading failures.
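
For a sense of how the Horizontal Pod Autoscaler reacts to a metric, the sketch below follows the scaling rule described in the Kubernetes documentation: scale the replica count by the ratio of current to target metric, round up, and skip changes when the ratio is within a tolerance. The function name, the default bounds, and the sample numbers are assumptions for illustration. The min/max clamp is one simple example of a guardrail.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for a single metric.

    min_replicas and max_replicas act as guardrails on how far scaling can go.
    Default values here are illustrative assumptions.
    """
    ratio = current_metric / target_metric
    # Within tolerance of the target: leave the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical case: 4 replicas averaging 90% CPU against a 60% target.
print(desired_replicas(4, 90, 60))
```

A real HPA also applies stabilization windows and scaling policies, which this sketch leaves out.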

Avoiding Expensive Mistakes

Prevent incidents that plague multi-tenant clusters and break free from eviction loops that destabilize your applications. Protect your SLOs while pursuing aggressive cost optimization and learn from real incidents where rightsizing went wrong.

The Business Impact

See how teams achieve a 40-60% reduction in Kubernetes infrastructure costs, a 30% improvement in application startup times, a 50% reduction in out-of-memory incidents, and a 25% increase in cluster density without new hardware.

Your Actionable Takeaways

Discover which metrics to instrument first for immediate impact, how to calculate initial resource values based on workload patterns, safe testing procedures for new settings in production, and templates that explain "why these numbers" for future teams.

This webinar includes live demonstrations using real production data, showing exactly how to analyze workloads, calculate optimal settings, and implement changes safely. Perfect for platform engineers, SREs, and DevOps teams who manage Kubernetes at scale.

What You'll Learn

• In-depth insights from industry experts

• Practical strategies you can implement today

• Real-world examples and case studies

• Interactive Q&A and community discussion

Speakers

Vishnu K.V

Platform Engineer
Facets

Rohit Raveendran

Co-Founder & VP Engg
Facets

Related Content

More Live Content

Podcast

AI x DevOps with Sanjeev Ganjihal - AWS Solutions Architect

Join Rohit Raveendran as he sits down with Sanjeev Ganjihal, Senior Container Specialist at AWS and one of the first 100 Kubernetes certified professionals globally. This deep dive conversation explores the transformative shift from traditional DevOps to AI-powered operations and what it means for the future of infrastructure management.

Evolution of DevOps and SRE

Explore Sanjeev's unique journey from being an early Kubernetes adopter in 2017 to becoming a specialist in AI/ML operations at AWS. Discover how the industry has evolved from manual operations to automated, intelligent infrastructure management and what this means for traditional SRE roles.

Multi-LLM Strategies in Practice

Get insider insights into Sanjeev's personal AI development toolkit, including how he uses Claude, Q Developer, and local models for different tasks. Learn practical multi-LLM routing strategies, code review workflows, and how to choose the right AI tool for specific infrastructure challenges.

Kubernetes Meets AI Infrastructure

Understand the unique challenges of running AI workloads on Kubernetes, from GPU resource management to model serving at scale. Sanjeev shares real-world experiences from supporting financial services customers and the patterns that work for high-performance computing environments.

The Future of AIOps

Dive into discussions about Model Context Protocol (MCP), autonomous agents, and the concept of "agentic AI" that will define 2025. Learn how these technologies are reshaping the relationship between humans and infrastructure, with the memorable analogy of "you are Krishna steering the chariot."

Security and Best Practices

Explore critical security considerations when implementing AI in DevOps workflows, including safe practices for model deployment, data handling, and maintaining compliance in enterprise environments.

Perfect for DevOps engineers, SREs, platform engineers, and technical leaders navigating the intersection of AI and infrastructure operations.

Sep 8, 2025
1 h 6 mins

Podcast

AI Security Reality Check

Nathan Hamiel, Head of Research at Kudelski Security, joins Rohit Raveendran for an essential reality check on AI security in DevOps environments. This candid conversation cuts through the hype to address real-world threats, vulnerabilities, and practical defense strategies that every team integrating AI into their infrastructure should understand.

Real-World AI Security Threats

Explore the actual security landscape facing organizations adopting AI, from model poisoning and prompt injection attacks to data exfiltration risks. Nathan shares insights from Kudelski Security's research into emerging threat vectors and how attackers are targeting AI-powered systems in production environments.

DevOps-Specific Vulnerabilities

Understand the unique security challenges that arise when AI meets DevOps workflows, including supply chain risks, model integrity issues, and the security implications of AI-generated infrastructure code. Learn how traditional security practices need to evolve for AI-augmented development pipelines.

Practical Defense Strategies

Get actionable guidance on implementing robust security measures for AI in DevOps, including model validation techniques, secure prompt engineering practices, and monitoring strategies for AI-powered infrastructure operations. Discover how to balance innovation with security requirements.

Industry Insights and Trends

Benefit from Nathan's perspective on the evolving threat landscape, emerging security standards for AI systems, and what organizations should prioritize when building security into their AI-driven DevOps practices.

Key Takeaways for Teams

Learn how to assess AI security risks in your current environment, implement baseline security controls for AI systems, and build a security-first culture around AI adoption without stifling innovation.

Essential listening for security professionals, DevOps engineers, platform teams, and anyone responsible for safely integrating AI into production infrastructure and development workflows.

Jul 14, 2025
59 mins

Office Hours

Intro to Facets Intelligence!

Get your first look at Facets Intelligence, our revolutionary AI-powered platform that transforms how teams approach infrastructure management. This episode provides a comprehensive walkthrough of generating production-ready Terraform modules with built-in compliance and context-awareness using the Model Context Protocol (MCP).

AI-Powered Infrastructure Generation

Watch live demonstrations of how Facets Intelligence understands your infrastructure context, automatically generates compliant Terraform modules, and integrates seamlessly with your existing workflows. See how AI can accelerate infrastructure provisioning while maintaining security and governance standards.

Model Context Protocol Integration

Explore how MCP enables deep infrastructure awareness, allowing AI to understand your organization's specific requirements, compliance needs, and architectural patterns. Learn how this context-awareness ensures generated infrastructure follows your established best practices and security policies.

Live Demo Highlights

Experience real-time infrastructure generation as we create production-ready modules, demonstrate compliance validation, and show how the AI adapts to different organizational contexts and requirements. See how teams can reduce infrastructure provisioning time from hours to minutes.

Key Benefits & Use Cases

Discover how Facets Intelligence enables faster time-to-market for new services, reduces infrastructure drift and inconsistencies, ensures compliance across all generated resources, and empowers developers to self-serve infrastructure while maintaining guardrails.

Perfect For

Platform Engineers building self-service infrastructure capabilities, DevOps teams looking to accelerate provisioning workflows, compliance teams needing consistent governance, and organizations scaling their infrastructure operations with AI assistance.

Jun 12, 2025
38 mins