The 'Prime' Pod Question: Kubernetes Scheduling Demystified

2026-05-01


I recently went through a flurry of interviews trying to hire a strong Platform Engineer for my team. Since our platform is built on Kubernetes, a solid grasp of its fundamentals isn't optional; it's table stakes.

Here is my favorite question.

"Imagine a multi-tenant Kubernetes cluster running many different workloads. What is the best way to configure requests and limits for your workload so that this workload is never OOM-killed?"

To answer this correctly, you need to understand how Kubernetes thinks about resources. Kubernetes makes resource management decisions in two distinct phases: scheduling, which is driven by requests, and runtime, which is governed by limits.

Requests

Resource requests are what the Kubernetes scheduler uses to place a Pod. If a Pod requests 3 GiB of memory, the scheduler places it only on a node that has at least 3 GiB of allocatable memory not already claimed by the requests of other Pods. If no node qualifies, the Pod stays Pending.

However, if the node has additional free resources available, the container is allowed to use more than its requested amount, up to its limit if one is set. Requests are primarily about placement, not a cap on usage.
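To make this concrete, here is a minimal sketch of a Pod that sets only a memory request. The Pod name and image are hypothetical placeholders I made up for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: requests-only-demo          # hypothetical name
spec:
  containers:
    - name: app
      image: example.com/app:1.0    # hypothetical image
      resources:
        requests:
          memory: 3Gi               # only nodes with >= 3Gi of unclaimed allocatable memory are candidates
        # no limits set: the container may use more than 3Gi if the node has memory to spare
```

With no limit in place, nothing stops this container from growing past 3Gi, which is fine on a quiet node and risky on a busy one.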

Caveats to Keep in Mind

Limits

Resource limits are enforced at runtime by the Linux kernel through cgroups, which the kubelet and the container runtime configure. Unlike requests, limits define hard boundaries on how much CPU and memory a container can consume.

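CPU Limits

CPU is a compressible resource. A container that reaches its CPU limit is throttled: it runs more slowly, but it is not killed.

Memory Limits

Memory is not compressible. A container that tries to use more memory than its limit is terminated by the kernel's OOM killer (its status shows OOMKilled) and is restarted according to the Pod's restart policy.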

This asymmetry (CPU overuse is throttled, whereas a container that exceeds its memory limit is OOM-killed) is critical to understand when planning mission-critical workloads like prime.
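Here is how that asymmetry looks on an actual resources stanza. This is only a sketch: the Pod name, image, and numbers are hypothetical and chosen purely for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: limits-demo                 # hypothetical name
spec:
  containers:
    - name: app
      image: example.com/app:1.0    # hypothetical image
      resources:
        requests:
          cpu: "1"
          memory: 2Gi
        limits:
          cpu: "2"       # trying to use more than 2 cores gets the container throttled, not killed
          memory: 4Gi    # trying to use more than 4Gi gets the container OOM-killed
```

The same kind of overshoot has very different consequences: on CPU the workload slows down, on memory it is terminated.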

QoS Classes

Kubernetes assigns every Pod a QoS class based on the resource requests and limits of its component containers. QoS classes are derived, not configured.

Kubernetes uses QoS to decide which Pods are evicted first when the node is under resource pressure.
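To see how the class is derived, here is a rough sketch of three Pods, one per QoS class. The names, image, and values are hypothetical; only the shape of each resources stanza matters.

```yaml
# BestEffort: no container sets any CPU or memory request or limit
apiVersion: v1
kind: Pod
metadata:
  name: qos-besteffort              # hypothetical name
spec:
  containers:
    - name: app
      image: example.com/app:1.0    # hypothetical image
---
# Burstable: something is set, but requests do not equal limits
apiVersion: v1
kind: Pod
metadata:
  name: qos-burstable               # hypothetical name
spec:
  containers:
    - name: app
      image: example.com/app:1.0    # hypothetical image
      resources:
        requests:
          memory: 1Gi
        limits:
          memory: 2Gi
---
# Guaranteed: every container sets CPU and memory, and requests equal limits
apiVersion: v1
kind: Pod
metadata:
  name: qos-guaranteed              # hypothetical name
spec:
  containers:
    - name: app
      image: example.com/app:1.0    # hypothetical image
      resources:
        requests:
          cpu: "1"
          memory: 2Gi
        limits:
          cpu: "1"
          memory: 2Gi
```

Once a Pod is created, the derived class is recorded in its status, so something like kubectl get pod qos-guaranteed -o jsonpath='{.status.qosClass}' should show which class Kubernetes assigned.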

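Guaranteed

Every container in the Pod sets both CPU and memory requests and limits, and each request equals its limit. These Pods are the last candidates for eviction.

Burstable

At least one container sets a CPU or memory request or limit, but the Pod does not meet the Guaranteed criteria. These Pods sit between the other two classes.

BestEffort

No container sets any CPU or memory request or limit. These Pods are the first candidates for eviction.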

An Ideal Answer

For a mission-critical, memory-intensive workload like prime, the correct approach is to aim for the Guaranteed QoS class by setting requests equal to limits for both CPU and memory:

resources:
  requests:
    cpu: 4
    memory: 6Gi
  limits:
    cpu: 4
    memory: 6Gi

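Because requests equal limits, the scheduler reserves the full 6Gi on the node, so the workload never depends on overcommitted memory, and the Pod is last in line for eviction when the node is under pressure. One caveat remains: the container is still OOM-killed if it exceeds its own 6Gi limit, so the limit must be sized above the workload's peak memory usage.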

So there you go. If you ever come across a similar question, you know how to impress.