RabbitMQ on Kubernetes Performance and Operations

A focused, advanced course for engineers who run RabbitMQ on Kubernetes. The goal is to improve cluster setup, raise throughput, reduce latency, and harden reliability using proven patterns for scheduling, storage, networking, and broker tuning.

What will you learn?

You will validate your deployment pattern, establish a performance baseline, and implement targeted optimizations in RabbitMQ and Kubernetes. You will apply observability, scaling, and reliability techniques to meet concrete SLOs while keeping operations safe during upgrades and failures.

After this training you will be confident in:

  • Choosing and hardening a deployment approach on Kubernetes and aligning it to failure domains
  • Tuning queues, connections, channels, confirms, and prefetch for predictable throughput and latency
  • Optimizing storage, networking, and resource requests and limits for stable performance
  • Operating quorum and stream queues, handling failures, and planning controlled rollouts
  • Using metrics and tracing to detect bottlenecks and prevent regressions
  • Securing traffic and access with TLS, least privilege, and network policies

Requirements:

  • Strong familiarity with RabbitMQ fundamentals and Kubernetes basics
  • Comfort with kubectl, Helm or the RabbitMQ Cluster Operator, and container registries
  • Access to a non-production cluster with permissions to create namespaces, StatefulSets, Services, Ingress, and Secrets

Course Outline*:

*Every team has its own needs and specifications, so we can adapt the training outline as required.

Module 1: Deployment patterns and cluster architecture
  • Helm charts vs RabbitMQ Cluster Operator and when to use each
  • StatefulSets, headless Services, and stable network identities
  • Node pools, topology spread, anti-affinity, and PodDisruptionBudgets
  • Version strategy and plugin selection for performance features

Module 2: Storage and durability for predictable throughput
  • StorageClasses, PVCs, throughput vs IOPS tradeoffs, and filesystem notes
  • Durable, quorum, and stream queues and their operational implications
  • Lazy queues for cold traffic, message TTL, and dead-letter routing patterns
  • Disk alarms, watermark tuning, and safe compaction expectations

Module 3: Networking and connectivity at scale
  • ClusterIP, headless, and external access via Ingress or LoadBalancer
  • Connection management: connections vs channels, heartbeats, and TCP keepalives
  • MTU, kube-proxy modes, and avoiding unnecessary hops
  • TLS termination choices and impact on latency

Module 4: Baseline performance and broker tuning
  • Perf-test methodology, representative payloads, and SLO-driven scenarios
  • Publisher confirm strategies, in-flight limits, and batching
  • Consumer flow control with basic.qos prefetch and fair dispatch
  • Memory alarms, garbage collection considerations, and VM resource tuning

Module 5: Reliability engineering on Kubernetes
  • Quorum queue behavior, leader election, and placement across failure zones
  • Stream queues for high-throughput append workloads and retention configuration
  • Drains, reschedules, and graceful shutdown to prevent message loss
  • Rolling updates and node maintenance with minimal disruption

Module 6: Observability, SLOs, and capacity planning
  • Prometheus metrics to watch: channels, confirms, acks, unacked, consumers, memory, disk
  • Grafana dashboards for capacity and saturation signals
  • Tracing message paths and identifying hot exchanges or bindings
  • Forecasting capacity and setting alert thresholds that reduce noise

Module 7: Security and governance
  • TLS between clients and nodes, certificates with cert-manager, and rotation
  • Least-privilege users, vhosts, and permissions aligned to applications
  • Kubernetes NetworkPolicies to isolate brokers and client namespaces
  • Image provenance, supply chain considerations, and secret handling

Module 8: Troubleshooting and continuous improvement
  • rabbitmq-diagnostics and rabbitmqctl techniques for live systems
  • Detecting and fixing blocked connections, flow control, and backlogs
  • Shovel and Federation throughput tuning and backlog recovery
  • Playbook for incident response, postmortems, and controlled performance experiments

Hands-on learning with expert instructors at your location for organizations.

Level: Intermediate
Duration: 14 hours (2 days)
  • Training customized to your needs
  • Immersive hands-on experience in a dedicated setting
Price may vary depending on the number of participants, changes to the outline, location, etc.

Master new skills guided by experienced instructors from anywhere.

  • Training customized to your needs
  • Reduced training costs

You can participate in a Public Course with people from other organisations.

  • Priced per trainee
  • Ideal for individuals and small groups
  • Networking opportunities with fellow participants