uncovered: the five hidden performance pitfalls in kubernetes that are killing your deployments

introduction

when you’re launching an application on kubernetes, the first thing you’ll notice is how opinionated the platform is about resource management and deployment patterns. however, too many developers (and even seasoned engineers) stumble into performance pitfalls that quietly sabotage their workloads. these hidden bottlenecks can lead to high latency, pod evictions, and even unexpected downtime—everything a devops team, a full‑stack developer, or a coding enthusiast wants to avoid.

below we expose the five most common, yet overlooked, performance traps in kubernetes. each section contains actionable tips—complete with code snippets—to help you optimize deployments, reduce costs, and improve reliability.

1. misconfigured resource requests and limits

every pod in kubernetes declares requests (the amount of resources the scheduler guarantees to the pod) and limits (the maximum the pod may consume). if these values are wrong, the scheduler works from bad data, leading to either resource contention or wasted capacity.

  • requests too low → nodes get overcommitted; under memory pressure, pods are oom-killed or evicted.
  • limits too high → pods monopolize cpu/memory, starving other workloads.
  • no limits defined → a pod's memory can grow unbounded and destabilize the whole node.

fix: perform a quick baseline test in a staging cluster and tweak the yaml.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: fastapi-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: fastapi
  template:
    metadata:
      labels:
        app: fastapi
    spec:
      containers:
      - name: fastapi
        image: ghcr.io/yourorg/fastapi:latest
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi

use horizontal pod autoscaler (hpa) to automatically scale within these constraints based on cpu or memory usage.
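a minimal hpa sketch for the deployment above (the replica bounds and the 80% cpu target are assumed starting points to tune against your baseline, not values from a real workload):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: fastapi-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: fastapi-app
  minReplicas: 3
  maxReplicas: 10        # assumed ceiling; size to your node capacity
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80   # scale out when avg cpu passes 80% of requests
```

note that the hpa computes utilization against the pod's requests, which is another reason to get those values right first.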

monitoring tool recommendation

deploy prometheus + grafana to collect metrics on container_cpu_usage_seconds_total and container_memory_working_set_bytes. set alerts when usage consistently exceeds 80% of limits.
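one way to express that alert, assuming the prometheus operator (e.g. kube-prometheus-stack) plus kube-state-metrics are installed — the rule name and threshold are illustrative:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: container-resource-alerts
spec:
  groups:
  - name: resources
    rules:
    - alert: ContainerMemoryNearLimit
      # ratio of working-set memory to the configured memory limit, per container
      expr: |
        max by (namespace, pod, container) (container_memory_working_set_bytes{container!=""})
          /
        max by (namespace, pod, container) (kube_pod_container_resource_limits{resource="memory"})
          > 0.8
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "container memory usage above 80% of its limit"
```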

2. inefficient use of persistent volumes (pv) and statefulsets

stateful applications need durable storage. a common mistake is coupling stateful workloads with ephemeral local disks or using the same pv class across unrelated services.

  • using emptyDir instead of a pv when data must survive pod rescheduling.
  • using hostPath, which ties data to a single node and gives no persistence across cluster upgrades.
  • choosing the wrong StorageClass (e.g., a standard class for high-iops workloads).

fix: define separate storage classes for different i/o profiles and reference them explicitly in your claims.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: high-io

when defining a statefulset, ensure the volumeClaimTemplates reference the appropriate storage class:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: "mysql"
  replicas: 2
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql   # must match the selector above
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        volumeMounts:
        - name: mysql-data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: mysql-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
      storageClassName: high-io
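the high-io class referenced above has to exist in the cluster. what it looks like depends entirely on your provisioner; as a sketch, assuming the aws ebs csi driver (swap in your own provisioner and parameters):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: high-io
provisioner: ebs.csi.aws.com      # assumption: aws ebs csi driver installed
parameters:
  type: io2                       # provisioned-iops volume type
  iops: "4000"                    # illustrative value; size to your workload
volumeBindingMode: WaitForFirstConsumer
```

waitForFirstConsumer delays volume provisioning until the pod is scheduled, so the volume lands in the same zone as the node.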

3. over‑provisioning ingress controllers and services

ingress controllers like nginx ingress or traefik are often deployed with high replica counts without considering traffic patterns, leading to wasted cpu cycles.

  • replica count copy-pasted at 10 → 10× the memory/cpu footprint, whether or not traffic justifies it.
  • controller config (helm values or its configmap) left at defaults → never tuned for lightweight services.

fix: size ingress replicas to observed traffic, and review actual resource usage before scaling up.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-ingress-controller
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
  template:
    metadata:
      labels:
        app.kubernetes.io/name: ingress-nginx
    spec:
      containers:
      - name: controller
        image: k8s.gcr.io/ingress-nginx/controller:v1.1.0
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 250m
            memory: 256Mi

implement auto-scaling with a horizontal pod autoscaler on the controller deployment; the cluster autoscaler can then add nodes when the controllers themselves need room.
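an hpa on the controller deployment can look like the sketch below. cpu utilization is only a rough proxy for request throughput, and the bounds are assumptions, but it keeps the replica count tied to real load rather than a guessed constant:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-ingress-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-ingress-controller
  minReplicas: 2           # keep two for availability
  maxReplicas: 6           # assumed ceiling for peak traffic
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # illustrative threshold; tune to observed traffic
```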

4. lacking readiness/liveness probes

probes are your first line of defense against unhealthy pods. when missing or misconfigured, pods may start accepting traffic before they’re fully ready, or get restarted repeatedly.

  • no probes → traffic is routed to pods regardless of their actual health.
  • wrong timeoutSeconds → false positives or negatives.
  • overly aggressive periodSeconds → wasteful health checks and, for liveness, needless restarts.

fix: add both readiness and liveness probes that reflect real startup and health conditions.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-demo
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
      - name: demo
        image: yourrepo/demo:1.0
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /livez
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 20
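for apps with slow or variable startup, inflating initialDelaySeconds punishes every restart. a startupProbe added to the container spec above is the cleaner option: it gates the other probes until boot completes. the thresholds here are illustrative:

```yaml
        startupProbe:
          httpGet:
            path: /healthz
            port: 8080
          failureThreshold: 30
          periodSeconds: 2     # tolerates up to ~60s of startup before restarting
```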

5. poor pod anti‑affinity and scheduler configuration

by default, pods can be co-located on the same node, which is fine until you hit saturation. conversely, misused anti-affinity rules leave pods stuck in pending and nodes underused.

  • no anti‑affinity → a single node can become both a performance bottleneck and a single point of failure.
  • wrong weight or topology key → pods spread too lazily, making scaling uneven.

fix: define preferredDuringSchedulingIgnoredDuringExecution rules, or hard requirements for critical workloads.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cache
spec:
  replicas: 4
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: redis
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: redis
        image: redis:6.2
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
          limits:
            cpu: 1
            memory: 2Gi
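note that the required rule above needs at least as many eligible nodes as replicas, or the surplus pods sit in pending. on small clusters the preferred variant is the safer sketch — the scheduler tries to spread the pods but will still co-locate them if it must:

```yaml
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100                 # highest preference for spreading
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: redis
              topologyKey: "kubernetes.io/hostname"
```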

putting it all together: a checklist

here’s a quick cheat sheet you can run before a production rollout:

  • ✅ verify resource requests/limits.
  • ✅ confirm storage class matches i/o requirements.
  • ✅ scale ingress controllers to realistic load.
  • ✅ implement readiness/liveness probes.
  • ✅ validate pod anti‑affinity rules.
  • ✅ enable prometheus alerts for high cpu/memory.

implementing these steps may feel like a chore, but once your cluster is tuned, you’ll see smoother deployments, fewer restarts, and a lower cost footprint. remember: kubernetes is powerful, but performance doesn’t tune itself. keep your configurations lean, your metrics humming, and your manifests devops-friendly, and you’ll be amazed at the difference!
