PostgreSQL on Kubernetes: Master Storage with Persistent Volumes (PVs) and PV Best Practices
Why Use Persistent Volumes in PostgreSQL on Kubernetes?
When running PostgreSQL on Kubernetes, data persistence is crucial. By default, container storage is ephemeral: anything written to the container's own filesystem is lost when the container is restarted or the pod is deleted or rescheduled. That is not acceptable for a database like PostgreSQL, where data durability and reliability are essential.
This is where Persistent Volumes (PVs) come in. PVs provide long-term storage that exists independently of the pod lifecycle. When PostgreSQL runs in a Kubernetes cluster, keeping its data directory on a persistent volume lets the data survive pod restarts and rescheduling and, provided the volume is network-attached rather than node-local, node failures and cluster upgrades as well.
Understanding Persistent Volumes and Persistent Volume Claims
In Kubernetes, storage is managed using two key resources: Persistent Volumes (PVs) and Persistent Volume Claims (PVCs).
- Persistent Volume (PV): a piece of storage in the cluster, provisioned manually by an administrator or created dynamically through a StorageClass.
- Persistent Volume Claim (PVC): a user's request for storage. It lets a pod consume storage resources without knowing the details of the underlying infrastructure.
Think of a PV as the "hard drive" in the cluster and a PVC as the "request" to use that drive. Each claim binds to exactly one volume.
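To make the split concrete, here is a minimal sketch of a statically provisioned PV that an administrator might create. The name postgres-pv-example, the manual storage class, and the hostPath location are illustrative assumptions, and hostPath in particular is only suitable for single-node test clusters.

```yaml
# Hypothetical statically provisioned volume; names and path are illustrative.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv-example
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain   # keep the data if the claim is deleted
  storageClassName: manual                # a claim must request the same class to bind
  hostPath:
    path: /mnt/data/postgres              # node-local directory, test clusters only
```

A claim that requests the manual class, ReadWriteOnce access, and 10Gi or less could bind to this volume. With dynamic provisioning, no such manifest is written by hand.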
How PostgreSQL Uses PVCs in Kubernetes
When deploying PostgreSQL, you reference a PVC from your pod or StatefulSet specification. Kubernetes binds this claim to an available PV (or dynamically provisions one) and mounts it into the PostgreSQL container, typically at /var/lib/postgresql/data, where the postgres container image stores its database files by default.
Setting Up PostgreSQL with Persistent Volumes: Step-by-Step
Let’s walk through a practical example of deploying PostgreSQL with persistent storage using a PVC.
Step 1: Define a Persistent Volume Claim
Create a postgres-pvc.yaml file:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard
This claim requests 10 GiB of storage from the standard StorageClass with the ReadWriteOnce access mode, meaning the volume can be mounted read-write by a single node at a time.
Step 2: Deploy PostgreSQL with the PVC
Use a StatefulSet to ensure stable network identities and persistent storage. Here's a basic postgres-statefulset.yaml:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_DB
              value: myappdb
            - name: POSTGRES_USER
              value: admin
            - name: POSTGRES_PASSWORD
              value: securepassword
          volumeMounts:
            - name: postgres-storage
              mountPath: /var/lib/postgresql/data
      volumes:
        - name: postgres-storage
          persistentVolumeClaim:
            claimName: postgres-pvc
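Here, the container mounts the postgres-pvc claim at /var/lib/postgresql/data, the image's default data directory, so the database files are written to the persistent volume rather than to the container's own filesystem.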
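Two refinements are commonly applied to a manifest like the one in Step 2, sketched here under stated assumptions. First, the password can be read from a Secret instead of being written in plain text; the Secret name postgres-credentials is hypothetical. Second, the postgres image's PGDATA variable can point at a subdirectory of the mount, because some volume types place a lost+found directory at the filesystem root, which can make database initialization fail.

```yaml
# Hypothetical Secret holding the database password.
apiVersion: v1
kind: Secret
metadata:
  name: postgres-credentials
type: Opaque
stringData:
  POSTGRES_PASSWORD: change-me          # placeholder value, replace before use
---
# Fragment of the container spec only (not a complete manifest):
# it would replace the plain-text POSTGRES_PASSWORD entry under env.
env:
  - name: POSTGRES_PASSWORD
    valueFrom:
      secretKeyRef:
        name: postgres-credentials
        key: POSTGRES_PASSWORD
  - name: PGDATA
    value: /var/lib/postgresql/data/pgdata   # subdirectory of the mounted volume
```

The mountPath stays at /var/lib/postgresql/data; only the directory PostgreSQL initializes inside the volume changes.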
Step 3: Expose PostgreSQL with a Service
Create a ClusterIP Service so that other workloads in the cluster can reach PostgreSQL:
apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  selector:
    app: postgres
  ports:
    - protocol: TCP
      port: 5432
      targetPort: 5432
  type: ClusterIP
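A StatefulSet's serviceName is normally expected to refer to a headless Service, which is what gives each pod its own stable DNS name. The following is a minimal sketch of such a Service; the name postgres-headless is an assumption, and the StatefulSet's serviceName would have to be set to the same value for the per-pod DNS records to be created.

```yaml
# Hypothetical headless Service for per-pod DNS names.
apiVersion: v1
kind: Service
metadata:
  name: postgres-headless
spec:
  clusterIP: None          # headless: no virtual IP, DNS resolves to pod addresses
  selector:
    app: postgres
  ports:
    - name: postgres
      port: 5432
      targetPort: 5432
```

With this in place, the first replica would be reachable inside its namespace as postgres-0.postgres-headless, while the ClusterIP Service remains the general entry point for clients.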
Persistent Volume Best Practices for PostgreSQL
To ensure reliability, performance, and scalability when using PVs with PostgreSQL, follow these best practices:
1. Use StatefulSets for Stateful Applications
Prefer StatefulSets over Deployments for databases. StatefulSets provide stable network identities, ordered deployment, and a stable association between each pod and its storage, all of which matter for databases like PostgreSQL.
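The storage association is usually expressed with volumeClaimTemplates, which let the StatefulSet create one PVC per replica instead of referencing a claim created by hand. A minimal sketch, reusing the names from the walkthrough and assuming the standard StorageClass exists in the cluster:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          # env (POSTGRES_PASSWORD and so on) omitted for brevity;
          # the image will not start without a password being set.
          volumeMounts:
            - name: postgres-storage
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: postgres-storage        # must match the volumeMounts name
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: standard
        resources:
          requests:
            storage: 10Gi
```

The generated claim is named after the template and the pod, here postgres-storage-postgres-0, and it is not deleted automatically when the pod is rescheduled.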
2. Choose the Right Storage Class
Select a StorageClass that matches your performance needs. Class names vary by cluster, but for example:
- standard: good for development and testing.
- ssd or fast-disks: recommended for production PostgreSQL workloads to reduce I/O latency.
Check available classes with:
kubectl get storageclass
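If none of the listed classes fits, a cluster administrator can define one. The sketch below assumes a cluster running the AWS EBS CSI driver; the class name fast-ssd is hypothetical, and the provisioner and parameters would differ on other platforms.

```yaml
# Hypothetical SSD-backed class; provisioner and parameters are AWS-specific.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Retain                     # keep the disk when the claim is deleted
allowVolumeExpansion: true                # allows growing the PVC later
volumeBindingMode: WaitForFirstConsumer   # provision in the zone where the pod lands
```

A PVC would then select it by setting storageClassName: fast-ssd.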
3. Set Appropriate Resource Requests and Limits
Ensure your PostgreSQL pod has enough CPU and memory. Under-resourcing can cause performance issues, especially under load.
resources:
  requests:
    memory: "1Gi"
    cpu: "500m"
  limits:
    memory: "2Gi"
    cpu: "1000m"
4. Enable Backup and Restore Strategies
PVs persist data, but they are not backups. Use tools like pg_dump or WAL-G, or Kubernetes-native tools like Velero, to take regular backups.
Example backup command:
kubectl exec postgres-0 -- pg_dump -U admin -d myappdb > backup.sql
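A one-off command like this can be turned into a scheduled job. The following CronJob is a rough sketch rather than a complete backup strategy. It assumes a Service named postgres, a Secret named postgres-credentials with a POSTGRES_PASSWORD key, and a separate claim named postgres-backups to hold the dumps; all three names are hypothetical.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
spec:
  schedule: "0 2 * * *"            # every day at 02:00
  concurrencyPolicy: Forbid        # do not start a new dump while one is running
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: pg-dump
              image: postgres:15   # provides the pg_dump client
              command: ["/bin/sh", "-c"]
              args:
                - |
                  pg_dump -h postgres -U admin -d myappdb > /backups/myappdb-`date +%F`.sql
              env:
                - name: PGPASSWORD # read by pg_dump for authentication
                  valueFrom:
                    secretKeyRef:
                      name: postgres-credentials
                      key: POSTGRES_PASSWORD
              volumeMounts:
                - name: backups
                  mountPath: /backups
          volumes:
            - name: backups
              persistentVolumeClaim:
                claimName: postgres-backups
```

Dumps stored on another volume in the same cluster still share that cluster's fate, so copying them to external storage is worth considering.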
5. Monitor Disk Usage and I/O Performance
Track disk usage via Kubernetes monitoring tools (Prometheus + Grafana) or cloud provider dashboards. PostgreSQL performance depends heavily on disk latency and throughput, so avoid HDD-backed volumes in production environments.
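As one possible way to act on disk usage, the sketch below defines an alert on free space for the postgres-pvc claim. It assumes the Prometheus Operator is installed (PrometheusRule is its custom resource, not a core Kubernetes kind) and that kubelet volume metrics are being scraped. The alert name and the 15% threshold are illustrative.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: postgres-volume-alerts
spec:
  groups:
    - name: postgres-storage
      rules:
        - alert: PostgresVolumeAlmostFull
          # Fraction of free space on the volume behind postgres-pvc.
          expr: |
            kubelet_volume_stats_available_bytes{persistentvolumeclaim="postgres-pvc"}
              / kubelet_volume_stats_capacity_bytes{persistentvolumeclaim="postgres-pvc"} < 0.15
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: PostgreSQL data volume has less than 15% free space
```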
Common Pitfalls and How to Avoid Them
- Using emptyDir for PostgreSQL: never use emptyDir in production. It stores data only for the pod’s lifetime, and the data is deleted when the pod is removed from its node.
- Forgetting to set proper access modes: use ReadWriteOnce unless you have a clustered setup requiring shared storage.
- Not labeling PVCs correctly: always label PVCs so they can be easily identified and selected by backup jobs and other tooling.
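For the labeling point, a short sketch of the claim from Step 1 with labels added. The label keys follow the commonly recommended app.kubernetes.io convention; the values are illustrative.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
  labels:
    app: postgres
    app.kubernetes.io/name: postgresql
    app.kubernetes.io/component: database
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  resources:
    requests:
      storage: 10Gi
```

Tools that accept label selectors can then target the claim with a selector such as app=postgres.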
Final Thoughts
Running PostgreSQL on Kubernetes with persistent volumes is a foundational skill for anyone in DevOps or full-stack development. Whether you're building a personal project or a scalable enterprise app, understanding PVs and PVCs helps keep your data safe and accessible.
Remember: great coding isn’t just about writing logic; it’s also about managing infrastructure wisely. With the right storage setup, your PostgreSQL databases will be durable and ready for growth.
Keep experimenting, keep learning, and take advantage of Kubernetes' storage features to build robust, scalable applications.