Uncovered: The Five Hidden Performance Pitfalls in Kubernetes That Are Killing Your Deployments
Introduction
When you launch an application on Kubernetes, the first thing you'll notice is how opinionated the platform is about resource management and deployment patterns. Even so, many developers (and even seasoned engineers) stumble into performance pitfalls that quietly sabotage their workloads. These hidden bottlenecks can cause high latency, pod evictions, and unexpected downtime: exactly what a DevOps team, a full-stack developer, or a coding enthusiast wants to avoid.
Below we expose the five most common, yet overlooked, performance traps in Kubernetes. Each section contains actionable tips, complete with code snippets, to help you optimize deployments, reduce costs, and improve reliability.
1. Misconfigured Resource Requests and Limits
Every container in a Kubernetes pod can declare requests (the amount of resources guaranteed to it, which the scheduler uses for placement) and limits (the maximum it may consume). If these values are off, the scheduler makes poor placement decisions, leading to either resource contention or wasted capacity.
- Requests too low → nodes get overcommitted, so pods compete for resources and can be OOM-killed or evicted under memory pressure.
- Limits too high → pods can monopolize CPU/memory, starving other workloads.
- No limits defined → a pod's memory use can grow unbounded and destabilize the whole node.
✅ Fix: perform a quick baseline test in a staging cluster and tune the YAML accordingly.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fastapi-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: fastapi
  template:
    metadata:
      labels:
        app: fastapi
    spec:
      containers:
        - name: fastapi
          image: ghcr.io/yourorg/fastapi:latest
          resources:
            requests:
              cpu: 200m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
```
Use a Horizontal Pod Autoscaler (HPA) to scale the deployment automatically within these constraints based on CPU or memory usage.
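Such an HPA can be sketched as follows, using the `autoscaling/v2` API and targeting the example Deployment above (the replica bounds and the 70% utilization target are illustrative values, not recommendations):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: fastapi-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: fastapi-app
  minReplicas: 3
  maxReplicas: 10          # illustrative upper bound; size to your node capacity
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70% of requests
```

Note that the HPA computes utilization relative to the pod's CPU *request*, which is another reason to set requests accurately.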
Monitoring tool recommendation
Deploy Prometheus + Grafana to collect metrics such as `container_cpu_usage_seconds_total` and `container_memory_working_set_bytes`. Set alerts when usage consistently exceeds 80% of limits.
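As one possible shape for such an alert, here is a sketch of a Prometheus alerting rule for the memory case (the rule name, 10-minute window, and labels are assumptions; the metric names come from the kubelet/cAdvisor endpoint and may need adjusting to your scrape configuration):

```yaml
groups:
  - name: resource-saturation
    rules:
      - alert: ContainerMemoryNearLimit
        # Fires when a container's working set has stayed above 80% of its
        # memory limit for 10 minutes. The `!= 0` filter on the limit metric
        # skips containers that have no memory limit set.
        expr: |
          container_memory_working_set_bytes{container!=""}
            / (container_spec_memory_limit_bytes{container!=""} != 0)
          > 0.8
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Container {{ $labels.container }} in pod {{ $labels.pod }} is above 80% of its memory limit"
```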
2. Inefficient Use of Persistent Volumes (PVs) and StatefulSets
Stateful applications need durable storage. A common mistake is coupling stateful workloads with ephemeral local disks, or using the same PV class across unrelated services.
- Using `emptyDir` instead of a PV when data should outlive the pod.
- Using `hostPath`, which gives no persistence across node replacements or cluster upgrades.
- Choosing an inappropriate StorageClass (e.g., using `standard` for high-IOPS workloads).
✅ Fix: define separate PV classes for different I/O profiles and bind workloads to the right one through their PersistentVolumeClaims.
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: high-io
```
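A `high-io` class has to exist in the cluster before this claim can bind. As a sketch for AWS EBS (the provisioner, volume `type`, and `iops` values are assumptions; substitute your own cloud's CSI driver and parameters):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: high-io
provisioner: ebs.csi.aws.com          # AWS EBS CSI driver; swap for your provider's
parameters:
  type: io2                           # provisioned-IOPS volume type (assumed)
  iops: "4000"                        # illustrative value
volumeBindingMode: WaitForFirstConsumer  # delay binding until the pod is scheduled
```

`WaitForFirstConsumer` is worth keeping in multi-zone clusters so the volume is provisioned in the zone where the pod actually lands.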
When defining a StatefulSet, ensure the `volumeClaimTemplates` reference the appropriate storage class:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: "mysql"
  replicas: 2
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql      # must match the selector above
    spec:
      containers:
        - name: mysql
          image: mysql:8.0
          volumeMounts:
            - name: mysql-data
              mountPath: /var/lib/mysql
  volumeClaimTemplates:
    - metadata:
        name: mysql-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
        storageClassName: high-io
```
3. Over-Provisioning Ingress Controllers and Services
Ingress controllers such as NGINX Ingress or Traefik are often deployed with high replica counts that ignore actual traffic patterns, wasting CPU cycles and memory.
- Replica count left at a copy-pasted `10` → roughly 10× the memory/CPU footprint a lightweight service needs.
- No explicit resource requests/limits → the controller runs with upstream defaults, which cannot be right for every cluster and are typically oversized for lightweight services.
✅ Fix: scale ingress replicas to the traffic you actually expect, set explicit resource requests and limits, and review usage regularly.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-ingress-controller
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
  template:
    metadata:
      labels:
        app.kubernetes.io/name: ingress-nginx
    spec:
      containers:
        - name: controller
          image: k8s.gcr.io/ingress-nginx/controller:v1.1.0
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 250m
              memory: 256Mi
```
Implement autoscaling with the Cluster Autoscaler for node capacity, or attach an HPA to the ingress controller deployment itself.
4. Missing Readiness and Liveness Probes
Probes are your first line of defense against misbehaving pods. When they are missing or misconfigured, pods may start accepting traffic before they are fully ready, or get killed and respawned repeatedly.
- No probes → traffic is routed with zero regard for whether the pod is actually healthy.
- Wrong `timeoutSeconds` → false positives or negatives.
- Overly aggressive `periodSeconds` → wasteful health checks that can themselves load the application.
✅ Fix: add both readiness and liveness probes that reflect real startup and health conditions.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-demo
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
        - name: demo
          image: yourrepo/demo:1.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /livez
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
```
5. Poor Pod Anti-Affinity and Scheduler Configuration
By default, the scheduler is free to collocate replicas on the same node, which is fine until that node saturates. Misconfigured anti-affinity is just as harmful: it can leave pods stuck in Pending and nodes underutilized.
- No anti-affinity → a single node can become both a performance bottleneck and a single point of failure.
- Incorrect weight or topology key → pods spread too lazily, making scaling uneven.
✅ Fix: define `preferredDuringSchedulingIgnoredDuringExecution` rules for most workloads, or hard `required...` rules for critical ones.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cache
spec:
  replicas: 4
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: redis
              topologyKey: "kubernetes.io/hostname"
      containers:
        - name: redis
          image: redis:6.2
          resources:
            requests:
              cpu: 500m
              memory: 1Gi
            limits:
              cpu: 1
              memory: 2Gi
```
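A hard rule like the one above leaves excess replicas Pending if there are fewer nodes than replicas. For workloads where that trade-off is wrong, the soft variant looks like this (the weight of `100` is an illustrative value; the fragment drops into the pod template's `spec`):

```yaml
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100              # strongest preference; the scheduler may still collocate if it must
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: redis
          topologyKey: "kubernetes.io/hostname"
```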
Putting It All Together: A Checklist
Here's a quick cheat sheet you can run through before a production rollout:
- ✅ Verify resource requests/limits.
- ✅ Confirm storage classes match I/O requirements.
- ✅ Scale ingress controllers to realistic load.
- ✅ Implement readiness/liveness probes.
- ✅ Validate pod anti-affinity rules.
- ✅ Enable Prometheus alerts for high CPU/memory.
Implementing these steps may feel like a chore, but once your cluster is tuned you'll see smoother deployments, fewer restarts, and a lower cost footprint. Remember: Kubernetes is powerful, but performance still has to be earned. Keep your configurations lean, your metrics humming, and your code DevOps-friendly, and you'll be amazed at the difference.