StatusNeo

Persistent Applications in Kubernetes with Stateful Sets

Stateful Set (General Definition)

A Stateful Set (the StatefulSet resource in Kubernetes) is used when you need to run applications that must keep their data and identity across restarts. This is different from a Deployment, which treats Pods as interchangeable and replaces them with new Pods that keep no memory of the previous ones.

Key Features of Stateful Sets

  • Pods Have Unique Names
    Example: If you have 3 database servers, their names will be db-0, db-1, and db-2. If db-1 crashes, it restarts with the same name, keeping its identity.
  • Pods Start in Order
    Example: If you start three servers, db-0 starts first, then db-1, then db-2. By default, if db-1 is not ready, db-2 won’t start.
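  • Persistent Storage (Dedicated Volume per Pod)
    Example: If you run a MySQL database, it stores data in a volume. Even if the MySQL Pod crashes, the data remains safe because it’s stored separately from the Pod, and the restarted Pod reattaches to the same volume. Kubernetes does not strictly require a volume in a Stateful Set, but stateful workloads almost always define one.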
  • Stable Network Identity
    Example: A Pod’s IP address can change when it restarts, but each database server keeps a fixed DNS name such as db-0.database-service or db-1.database-service, because every Pod in a Stateful Set has a unique, stable identity. These names are served through a headless Service (database-service in this example).
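
The features above map onto a handful of fields in a Stateful Set manifest. The following is a minimal sketch, not a production configuration: it builds the manifest as plain Python dictionaries and prints JSON, which kubectl accepts alongside YAML. The names db and database-service come from the examples above; the mysql:8.0 image tag, the volume name data, the 10Gi size, the default namespace, and the cluster.local cluster domain are assumptions chosen for illustration.

```python
from __future__ import annotations

import json

APP = "db"                    # Pod name prefix from the examples above
SERVICE = "database-service"  # headless Service name from the examples above
REPLICAS = 3


def headless_service() -> dict:
    """Headless Service (clusterIP "None") that gives each Pod its own DNS record."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": SERVICE},
        "spec": {
            "clusterIP": "None",
            "selector": {"app": APP},
            "ports": [{"name": "mysql", "port": 3306}],
        },
    }


def stateful_set() -> dict:
    """Stateful Set with ordered, named Pods and one volume claim per Pod."""
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": APP},
        "spec": {
            "serviceName": SERVICE,  # ties Pod DNS names to the headless Service
            "replicas": REPLICAS,    # yields db-0, db-1, db-2
            "selector": {"matchLabels": {"app": APP}},
            "template": {
                "metadata": {"labels": {"app": APP}},
                "spec": {
                    "containers": [{
                        "name": "mysql",
                        "image": "mysql:8.0",  # assumed tag; credentials omitted here
                        "ports": [{"containerPort": 3306}],
                        "volumeMounts": [
                            {"name": "data", "mountPath": "/var/lib/mysql"}
                        ],
                    }],
                },
            },
            # One PersistentVolumeClaim is created per Pod from this template.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": "10Gi"}},  # assumed size
                },
            }],
        },
    }


def pod_dns_names(namespace: str = "default") -> list[str]:
    """Stable per-Pod DNS names, assuming the default cluster.local domain."""
    return [
        f"{APP}-{i}.{SERVICE}.{namespace}.svc.cluster.local"
        for i in range(REPLICAS)
    ]


if __name__ == "__main__":
    manifests = {"apiVersion": "v1", "kind": "List",
                 "items": [headless_service(), stateful_set()]}
    print(json.dumps(manifests, indent=2))
    print("\n".join(pod_dns_names()))
```

Saving the JSON output to a file and applying it with kubectl apply -f would create both objects. A real MySQL Pod would also need credentials (for example from a Secret), which this sketch leaves out.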

How Stateful Set is different from Deployment

Feature            | Stateful Set                                   | Deployment
Pod Names          | Fixed (e.g., db-0, db-1, db-2)                 | Random (e.g., webapp-xyz)
Startup Order      | Starts in sequence (db-0 → db-1 → db-2)        | Any order
Shutdown Order     | Stops in reverse order (db-2 → db-1 → db-0)    | Any order
Persistent Storage | Usually needed (e.g., database storage)        | Usually not needed
Applications       | Databases, Kafka, Loki                         | Web apps, APIs
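
The two ordering rows in the comparison can be made concrete with a small simulation. This is an illustrative model of the default ordered behaviour, not Kubernetes controller code; is_ready is a hypothetical callback standing in for a Pod's readiness check.

```python
from __future__ import annotations

from typing import Callable


def startup_sequence(name: str, replicas: int,
                     is_ready: Callable[[str], bool]) -> list[str]:
    """Return the Pods that get created, in order.

    Each Pod is created only after its predecessor is ready, so the
    sequence stops at the first Pod that does not become ready.
    """
    started = []
    for ordinal in range(replicas):
        pod = f"{name}-{ordinal}"
        started.append(pod)    # this Pod is created...
        if not is_ready(pod):  # ...but its successor waits for readiness
            break
    return started


def shutdown_sequence(name: str, replicas: int) -> list[str]:
    """Return the Pods in termination order: highest ordinal first."""
    return [f"{name}-{ordinal}" for ordinal in reversed(range(replicas))]


if __name__ == "__main__":
    # All Pods healthy: db-0, db-1, db-2 start in order.
    print(startup_sequence("db", 3, lambda pod: True))
    # db-1 never becomes ready, so db-2 is not started.
    print(startup_sequence("db", 3, lambda pod: pod != "db-1"))
    # Shutdown runs in reverse: db-2, db-1, db-0.
    print(shutdown_sequence("db", 3))
```

A Deployment has no equivalent of these functions, since its Pods carry no ordinal and can start or stop in any order.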

Where Are Stateful Sets Used?

Stateful Sets are useful for applications that require consistent data storage and ordering, such as:

  • Databases (MySQL, PostgreSQL, MongoDB)
  • Message Queues (Kafka, RabbitMQ)
  • Monitoring Tools (Loki, Prometheus)

Is a Stateful Set Always the Best Choice?

Not always. While Stateful Sets provide important features like data persistence, ordered scaling, and stable network identities, they also come with challenges that can make them expensive and hard to manage.

Why Can Stateful Sets Be Expensive and Complex?

  • High CPU and Memory Usage
    Databases like MySQL, PostgreSQL, and MongoDB perform read and write operations constantly. This requires a lot of CPU power and RAM to process queries efficiently. Unlike a simple web application, which only handles user requests, a database needs to store, retrieve, and update large amounts of data, leading to high resource consumption.
  • Storage Costs
    Stateful applications require Persistent Volumes (PVs) to store data. These volumes are usually backed by SSDs for fast performance, which can be expensive. Example: If you store terabytes of data in Kubernetes using Stateful Sets, the cost of managing that storage will increase as the data grows.
  • Maintenance Overhead
    You need to manage backups, scaling, failover, and recovery yourself. If a database crashes, Kubernetes will restart the Pod, but data recovery and replication need to be configured separately.
  • Scaling is Complicated
    In a Deployment, new pods can be added or removed easily. But in a Stateful Set, scaling requires careful planning. Example: If a Stateful Set runs a 3-node MySQL cluster, you can’t simply add a 4th node without ensuring data replication is set up properly.
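
The storage and scaling points interact, which a short sketch can show. It assumes the default retention behaviour, where scaling a Stateful Set down does not delete the PersistentVolumeClaims of the removed Pods, and the usual claim naming of volume claim template name, a hyphen, then the Pod name. The function scale_plan and the claim name data are hypothetical names used only for illustration.

```python
from __future__ import annotations


def scale_plan(name: str, claim: str, current: int, target: int) -> dict:
    """Describe what changes when a Stateful Set goes from current to target replicas."""
    if target >= current:
        # Scale up: new Pods get the next ordinals, created in ascending order.
        created = [f"{name}-{i}" for i in range(current, target)]
        removed = []
    else:
        # Scale down: the highest ordinals are removed first.
        created = []
        removed = [f"{name}-{i}" for i in range(current - 1, target - 1, -1)]
    # Claims of removed Pods stay behind (assumed default retention),
    # so their storage keeps being billed until someone deletes them.
    leftover_claims = [f"{claim}-{pod}" for pod in removed]
    return {"created": created, "removed": removed,
            "leftover_claims": leftover_claims}


if __name__ == "__main__":
    # Growing the 3-node cluster adds db-3; joining it to replication
    # is still the application's job.
    print(scale_plan("db", "data", current=3, target=4))
    # Shrinking to 1 removes db-2 then db-1 and leaves their claims behind.
    print(scale_plan("db", "data", current=3, target=1))
```

The plan only covers what Kubernetes itself does. Making the new db-3 a working replica, or safely draining db-2 before removal, still has to be handled by the database setup.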

An alternative is to use a cloud-managed database. To avoid these issues, you can choose a managed service such as Azure SQL or Google Cloud SQL, which typically provides automatic backups and recovery, auto-scaling, high availability, and optimized performance.

Conclusion

  • Use Stateful Sets when you must run a database inside Kubernetes (e.g., strict data locality requirements).
  • For better performance, lower costs, and easier maintenance, a cloud-managed database is often the better choice.