At the moment, Kubernetes is one of the most exciting technologies in the world of DevOps. Recently, it's been generating a lot of hype for one simple reason: the mighty containers.
Not too long ago, we ran applications on actual server hardware. You'd have a website running on a hardware server and a database storing the website's state. Users visited the site, everything was cool, and everything worked...for a while.
The website became successful, and our company grew. Suddenly, the website became more complex and evolved into a platform with a bunch of services.
Then, along came virtual machines (VMs). These allowed us to run multiple operating systems and applications on a single physical machine. This was a game-changer! Companies could now run ten or more server instances on just one piece of hardware. But what if we could do even more? What if we could strip away the VM's heavy operating system overhead and squeeze more server programs onto a single machine? Enter: containers.
Docker
Docker containers have become the de facto development standard, but it's worth mentioning that Docker wasn't the first player in the container world.
Docker builds on namespaces and cgroups (the first isolates resources, while the second groups and limits resources). In terms of virtualization, it's not all that different from LXC/OpenVZ, which historically came first. Same native speed, same Linux kernel isolation methods. But here's where Docker changes the game: high-level simplicity. Docker allows you to deploy a fully virtual environment, run an application on it, and manage it easily.
Like a virtual machine, a Docker container runs its processes in its own pre-configured operating environment. However, all the container's processes run on a physical host server, sharing the processors and available memory with all the other processes running in the host system. This approach, somewhere between running everything directly on a physical server and full-blown VM virtualization, is called containerization.
Starting with Linux containers, the excitement moved to Docker containers, which brought about the need for container orchestration.
Back to our story. At some point, the company realized the current server was maxed out (vertical scaling). The next phase? The company buys more servers and divides the load (horizontal scaling).
The IT team's job just got a lot harder. Now, they have to deal with problems like updates, patches, monitoring, security, backups, durability, reliability... you name it — leading to pager duty at 6 AM on a Sunday...
Then came the Cloud.
"Great," they said, "now we only need to worry about application logic!" But the complexity of a highly available, distributed system didn't disappear; it simply shifted. Now, we had the complexity of orchestrating and connecting all the infrastructure components: deployments, service discovery, gateway configurations, monitoring, replication, consistency...
We've come full circle to today. Now, we have a distributed system with countless services running inside a bunch of containers in the cloud. Enter Kubernetes.
Kubernetes
Even though I’ve been working with Kubernetes for a while, I still feel like I’ve only scratched the surface. So, this won’t be a deep dive but rather a smooth flight over its key subsystems and concepts.
Kubernetes is a sophisticated system designed to make managing containerized applications scalable, resilient, and easier to deploy. It automates container orchestration — starting, scaling, and managing applications across clusters, handling everything from resource allocation, scheduling, service discovery, monitoring, to secrets management.
Kubernetes allows you to automate the start and rollback of deployments. It manages resources intelligently, scaling them up or down based on the application’s requirements, preventing resource waste. So essentially, Kubernetes is process automation for containerized applications. Applications in Kubernetes are rolled out and managed without administrators. Developers write application code, and then cloud magic happens. In the ideal Kubernetes world, operational support is shifted onto the developers. Administrators then only need to ensure that the cloud infrastructure layer — Kubernetes itself — runs steadily. This is why companies are flocking to the cloud: they want to remove the administrative burdens and focus solely on development.
One of Kubernetes' standout features is its ability to standardize work across Cloud Service Providers (CSPs). No matter which CSP you choose, working with Kubernetes feels the same. The developer specifies the requirements in a declarative manner, and Kubernetes takes care of the resource management, allowing the developer to ignore the platform's underlying complexities.
Originally, Kubernetes was a Google project that grew out of Borg, the internal system Google used to manage its massive container infrastructure. At some point, Google donated Kubernetes to the world through the Cloud Native Computing Foundation (CNCF). Today, Docker includes Kubernetes as one of its orchestrators, making Kubernetes an integral part of both Docker Community and Docker Enterprise Editions.
Kubernetes has many nicknames: kube, k8s (because who doesn’t love acronyms with numbers?). I’ll use k8s going forward.
Kubernetes itself is a massive abstraction layer that maps to both virtual and physical infrastructure. To understand how it works, let’s go over some of its basic components.
Pod
The pod is the smallest deployable unit in Kubernetes. It consists of one or more containers that should run together, sharing resources such as port numbers, Linux kernel namespaces, and network stack settings. When scaling your application in Kubernetes, you increase the number of pods rather than the number of containers within a single pod. By default, containers inside pods are automatically restarted to fix intermittent issues. This means that even at this fundamental level, Kubernetes is helping keep your containers running reliably. And with a bit of additional tuning, you can achieve even more resilience.
Quite often, there’s only a single container in a pod. However, the pod abstraction provides flexibility for cases where multiple containers need to work closely together. For instance, if two containers need to share the same data storage or communicate through interprocess communication, you can run them within the same pod.
In Kubernetes, you can draw parallels to software development primitives: a container image is like a class, a container instance is an object, a pod is a deployment unit, and composition can be achieved through a sidecar pattern.
Another benefit of pods is that they aren't tied to Docker containers. You can use other container runtimes, such as containerd or CRI-O, if needed.
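To make the abstraction concrete, here's a minimal sketch of a pod manifest. The names are illustrative, and the image is the echo server we'll deploy again later in the hands-on section:
apiVersion: v1
kind: Pod                            # The smallest deployable unit in Kubernetes
metadata:
  name: echo-pod                     # Hypothetical pod name
spec:
  containers:
  - name: echo                       # A single container inside the pod
    image: kicbase/echo-server:1.0   # Same echo-server image used later in this article
    ports:
    - containerPort: 8080            # Port the container listens on
In practice you rarely create bare pods like this; as we'll see below, higher-level objects such as Deployments create and manage pods for you.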
Desired State
The desired state is one of Kubernetes' core concepts. You define the required state for running pods rather than specifying how to achieve that state. For example, if a pod crashes or stops working, Kubernetes will automatically recreate it to match the specified desired state.
Kubernetes constantly monitors the state of the containers in the cluster, using control loops on the Kubernetes Master, a key part of the control plane. More on that in a bit.
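You can see this self-healing in action once you have a Deployment running (we'll create one later in this article): delete one of its pods and Kubernetes immediately creates a replacement to restore the declared state. The pod name below is a placeholder:
$ kubectl get pods                  # Pick any pod that is managed by a Deployment
$ kubectl delete pod <pod-name>     # <pod-name> stands for the name you noted above
$ kubectl get pods                  # A freshly created replacement pod shows up almost immediately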
Objects in Kubernetes
A Kubernetes object is a record of intent that represents the desired state of the cluster. After creating an object, Kubernetes will continuously check to ensure the object's state matches its definition. These objects serve as an additional abstraction layer over the container interface, so you can interact with them instead of directly manipulating containers.
Most Kubernetes objects include two nested fields governing their configuration: the object spec and the object status.
The pod, which we discussed earlier, is just one type of Kubernetes object.
Pods are mortal — they come and go. To facilitate communication between pods and ensure reliability, Kubernetes introduces the Service abstraction. A Service provides a stable access point for a set of pods delivering the same functionality. There are different types of services: ClusterIP, NodePort, LoadBalancer, and ExternalName. By default, Kubernetes uses ClusterIP, which exposes the service on a cluster-internal IP, so it's only reachable from within the cluster (or from outside via the Kubernetes proxy).
To expose services to the outside world, Kubernetes uses the Ingress object. Ingress isn’t a Service type but acts as an entry point for the cluster. It defines rules for routing traffic from outside the cluster to internal services, effectively functioning as a "smart router" or Application Load Balancer (ALB). Ingress lets you consolidate routing rules into a single resource, aggregating multiple services under a single IP address.
For example, your web application might have a homepage at https://example.com, a shopping cart at https://example.com/cart, and an API at https://example.com/api. You could implement all of these in a single pod, but to scale them independently, it's better to decouple them into different pods and connect them using Ingress.
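Here's a hedged sketch of what that Ingress could look like, assuming three hypothetical Services named web-service, cart-service, and api-service sit in front of those pods:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress            # Hypothetical name
spec:
  rules:
  - host: example.com
    http:
      paths:
      - path: /                    # Homepage traffic
        pathType: Prefix
        backend:
          service:
            name: web-service      # Hypothetical Service for the homepage pods
            port:
              number: 80
      - path: /cart                # Shopping cart traffic
        pathType: Prefix
        backend:
          service:
            name: cart-service     # Hypothetical Service for the cart pods
            port:
              number: 80
      - path: /api                 # API traffic
        pathType: Prefix
        backend:
          service:
            name: api-service      # Hypothetical Service for the API pods
            port:
              number: 80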
Kubernetes also includes a suite of controllers to manage different aspects of the cluster:
- ReplicaSet: Ensures that a specified number of pod replicas are running.
- StatefulSet: Used for managing stateful applications and distributed systems.
- DaemonSet: Runs a copy of a pod on all (or selected) nodes in the cluster (a small sketch follows below).
All of these implement a control loop — a non-terminating process that monitors and adjusts the state of subsystems to align with the desired state. Each controller works to bring the current state of the cluster closer to what you’ve defined.
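To make the last one concrete, here's a minimal DaemonSet sketch that schedules one pod on every node. The name is hypothetical, and busybox merely stands in for a real per-node agent (log collector, metrics exporter, and so on):
apiVersion: apps/v1
kind: DaemonSet                                # One pod per node
metadata:
  name: node-agent                             # Hypothetical name
spec:
  selector:
    matchLabels:
      app: node-agent
  template:
    metadata:
      labels:
        app: node-agent
    spec:
      containers:
      - name: agent
        image: busybox:1.36                    # Placeholder for a real per-node agent image
        command: ["sh", "-c", "tail -f /dev/null"]   # Keeps the placeholder container running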
Users expect applications to be available at all times, often deploying new versions multiple times a day. This is where the Deployment object comes in. It simplifies the otherwise tedious process of manually updating applications into a repeatable, automated workflow. Without the Deployment object, you’d have to create, update, and delete pods manually. The Deployment abstracts this process, managing replica sets and pod objects on your behalf.
If something goes wrong during a deployment, it allows you to quickly roll back to the previous working version. Also, using the Deployment object makes scaling applications straightforward — and we'll explore that in practice later.
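The rollout workflow itself is driven by a handful of standard kubectl commands; deployment/my-app below is a placeholder name:
$ kubectl rollout status deployment/my-app     # Watch an ongoing rollout until it completes
$ kubectl rollout history deployment/my-app    # List previous revisions of the Deployment
$ kubectl rollout undo deployment/my-app       # Roll back to the previous working revision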
Kubernetes supports a variety of deployment strategies. Common ones include:
- Rolling deployment: Gradually replaces old containers with new ones.
- Fixed (Recreate) deployment: Tears down all existing instances at once, then brings up the new ones.
- Blue-green deployment: Creates a new environment (blue) while keeping the current one (green) live. Traffic is switched to the blue environment once it's ready.
- Canary deployment: Slowly rolls out changes to a small subset of users before extending to the full environment.
The last two strategies, blue-green and canary, often require human oversight and interaction to manage the transition.
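The first two map directly onto a Deployment's strategy field. The rolling-update variant appears in the full manifest later in this article; the fixed variant is just this small excerpt of a Deployment spec:
  strategy:
    type: Recreate    # Tear down all existing pods before starting the new version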
Architecture
Like most high-availability (HA) systems, Kubernetes follows a Master-Slave architecture. In Kubernetes' world, this is often referred to as the Control Plane and Worker Nodes.
Master Node
The Kubernetes Control Plane is a set of processes that control the state of the cluster. Typically, these processes run on a single node called the Master Node. For redundancy and fault tolerance, the Master Node can be replicated across multiple nodes.
To interact with the Master Node, you use the kubectl command-line tool. This tool communicates with the cluster through the API server on the Master Node, sending commands and receiving responses to manage various aspects of the cluster.
The services running on the Master Node are collectively called the Kubernetes Control Plane. The Master Node itself is used for administrative tasks and does not run the actual application containers; those run on the Worker Nodes. The Master Node comprises several critical components:
etcd
etcd is a strongly consistent, distributed key-value store used for configuration management and service discovery. Think of it as the cluster’s brain. It stores both the current state and the desired state of the system. When there is a discrepancy between the current and desired states, Kubernetes makes the necessary adjustments to align them. If you’re familiar with tools like ZooKeeper or Consul, etcd plays a similar role.
kube-apiserver
The kube-apiserver is the cluster's main control endpoint. Every command you send through kubectl is an API request that the kube-apiserver processes. It handles all REST requests, validates them, authenticates and authorizes clients, and updates information in etcd. Importantly, the kube-apiserver is the only component that interacts directly with etcd. All other components must go through the kube-apiserver to modify or retrieve information.
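You can observe this for yourself: kubectl's verbosity flag prints the REST calls it makes to the kube-apiserver (the flag is standard; the exact output depends on your cluster):
$ kubectl get pods -v=8    # At verbosity 8, kubectl logs the HTTP requests and responses it sends to the API server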
kube-controller-manager
The kube-controller-manager is a daemon that runs all the core control loops in Kubernetes. It includes controllers like the Replication Controller, Endpoints Controller, and Namespace Controller that continuously work to move the cluster’s current state towards the desired state. Instead of a single control loop, Kubernetes operates with multiple, simultaneously running loops to keep everything in sync.
kube-scheduler
The kube-scheduler is responsible for task scheduling — determining which Worker Node will run a new pod based on resource requirements and the current load on each node. It’s like Kubernetes' air traffic controller, ensuring that every pod finds a suitable place to run.
Worker Node
A Worker Node can be a virtual or physical machine that runs the containerized applications. Each Worker Node runs a container runtime (such as Docker) plus two main Kubernetes components that work together to manage and run pods:
kubelet
The kubelet is the primary Kubernetes agent on each node. It communicates with the kube-apiserver to receive the specifications for the pods that should be running on that node. It then manages the containers through its container runtime (Docker or other). The kubelet also monitors the state of the containers, reporting status information back to the kube-apiserver. In short, it ensures that containers are running as they should be according to the cluster’s desired state.
kube-proxy
kube-proxy functions like a reverse proxy server within the Kubernetes network. It manages network routing and ensures that requests reach the appropriate services and applications inside the cluster. By default, kube-proxy uses iptables for handling network traffic, but it can also use other networking tools.
Kubernetes "Lite"
To get hands-on with Kubernetes, you'll need to provision a Kubernetes cluster. Luckily, there are several tools available to do this. The most popular ones include Minikube, K3s, Kind (Kubernetes in Docker), Kubeadm, MicroK8s, Kops, and kubernetes-ansible. Each of these tools has different goals, trade-offs, and ideal use cases. Some are designed for local development, while others are meant for production environments.
Minikube is the closest thing to an "official" mini Kubernetes distribution for local testing and development. It's managed by the same foundation as Kubernetes and can be easily controlled using kubectl. Minikube is cross-platform, running on Windows, macOS, and Linux. However, it heavily relies on an intermediary virtual machine (which can add some overhead), except on Linux, where it can run directly on the host.
Each of these tools makes it easier to explore Kubernetes' powerful features without committing to a complex production-grade setup. Let's dive into setting up a Kubernetes cluster using Minikube in the next section.
Running Kubernetes
Now, let's explore how to use Kubernetes with Minikube to create a local Kubernetes cluster.
The first step is to download and install Docker (latest version), followed by installing Minikube and kubectl. Minikube sets up a single-node cluster on a local virtual machine. Note: Minikube is meant for local testing and development. It’s not recommended for production use.
Setting Up Minikube
After installing Minikube, verify the installation by running:
$ minikube version
minikube version: v1.34.0
commit: 210b148df93a80eb872ecbeb7e35281b3c582c61
Next, start the Minikube cluster:
$ minikube start
This command initializes the Minikube virtual machine, sets up a single-node cluster, and starts Kubernetes. To check if everything is running correctly:
$ minikube status
minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured
If the status shows 'Running,' you’re all set to start using kubectl to interact with your cluster.
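A couple of optional sanity checks with standard kubectl commands:
$ kubectl cluster-info    # Prints the address of the Kubernetes control plane inside Minikube
$ kubectl get nodes       # Should show a single node named "minikube" in the Ready state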
Using Minikube's Docker Environment
You can speed up local experiments and testing by reusing Minikube’s built-in Docker daemon. Normally, you would push your images to an external Docker registry. But with the command below, you can use Minikube's internal Docker daemon instead. More details are available in the minikube docs linked at the end of this article.
$ eval $(minikube docker-env)
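After that eval, the docker CLI in the current shell talks to the daemon inside Minikube, so anything you build is immediately visible to the cluster. The image name below is purely illustrative:
$ docker ps                              # Now lists the containers running inside the Minikube node
$ docker build -t my-local-image:dev .   # Hypothetical image, built straight into Minikube's image cache
A pod can then reference my-local-image:dev with imagePullPolicy: Never, so Kubernetes won't try to pull it from an external registry.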
Hello World on Kubernetes
To see Kubernetes in action, let's deploy a simple application. We’ll use a basic HTTP server that echoes back client requests:
$ kubectl create deployment hello-world --image=kicbase/echo-server:1.0
deployment.apps/hello-world created
Check the running pods:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-world-66fdb8b694-2f2mt 1/1 Running 0 4s
You’ll see that a pod with a generated name has been created and is running. For more detailed information about this pod, use:
$ kubectl describe pod hello-world-66fdb8b694-2f2mt
This command provides detailed insights into the pod's status, including its IP address and any events related to it. Note that the pod has an internal IP address within the Kubernetes network, but this address isn’t accessible outside the cluster. Additionally, this IP isn't permanent — if the pod is recreated, it may receive a different IP.
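You can see that internal IP at a glance with the wide output format:
$ kubectl get pods -o wide    # The IP column shows the pod's cluster-internal address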
To expose this pod externally, we use a Service. A Service provides a stable, persistent address in front of a set of pods, can expose them to external networks (depending on its type), and load-balances requests between the pods.
Let's create a Service object that exposes the deployment on an external port, and then access it through Minikube:
$ kubectl expose deployment hello-world --type=NodePort --port=8080
service/hello-world exposed
$ minikube service hello-world
This command will open a browser showing the client/server headers:
Request served by hello-world-66fdb8b694-2f2mt
HTTP/1.1 GET /
Host: 127.0.0.1:54476
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/png,image/svg+xml,*/*;q=0.8
Accept-Encoding: gzip, deflate, br, zstd
...
Let's clean up our resources so they don't get in the way.
$ kubectl delete services hello-world
$ kubectl delete deployment hello-world
Deploying a Kubernetes Application
Let's take it a step further and orchestrate a more complex backend service on the Kubernetes cluster. This service can be as intricate as you like, but for now, we'll keep things manageable.
We’ll use a Deployment specification to delegate the low-level management to Kubernetes. The goal is to make the service resilient and highly available. We’ll create five replicas to ensure high availability: if a pod fails (due to a node failure or maintenance), the controller will automatically start a new pod elsewhere in the cluster.
Also, we’ll define livenessProbe and readinessProbe for each container. The kubelet performs these checks on all pods at regular intervals, sending the results to the kube-apiserver. These probes help Kubernetes proactively manage container health, restarting pods when necessary.
It's crucial to set the initialDelaySeconds parameter to give your application enough time to initialize. This is especially important for applications like Spring Boot, which take longer to launch their HTTP endpoints.
Also, avoid running your application as a bare, standalone pod: it won't survive a node crash. Even single-pod applications should use a ReplicaSet or Deployment to manage pods across the cluster and ensure the specified number of instances is always running.
We’ll introduce an Ingress controller as an L7 application load balancer. This will enable us to direct traffic based on URL paths, simplifying routing between services. For example, requests to URLs starting with mobile.* can be directed to the backend API designed for mobile devices.
In production, you might use a service mesh such as Istio (built on the Envoy proxy) to handle more complex routing and communication requirements between services.
Here’s the final setup, defined below as a Deployment, a Service, and an Ingress:
apiVersion: apps/v1
kind: Deployment                  # Defines a Deployment resource
metadata:
  name: nginx-dep                 # The name of the Deployment
spec:
  replicas: 5                     # Number of pod replicas to create
  selector:
    matchLabels:
      app: server                 # Label selector to identify which pods belong to this Deployment
  minReadySeconds: 10             # Minimum time for a new pod to be ready before it is considered available
  strategy:
    type: RollingUpdate           # Rolling update strategy for zero-downtime deployments
    rollingUpdate:
      maxUnavailable: 1           # Maximum number of pods that can be unavailable during the update process
      maxSurge: 1                 # Maximum number of extra pods that can be created during the update process
  template:                       # Template for the pods created by this Deployment
    metadata:
      labels:
        app: server               # Label applied to the pods, used by the selector above
    spec:
      containers:
      - name: nginx               # Name of the container
        image: nginx:1.18-alpine  # NGINX container image with a specific version
        ports:
        - containerPort: 80       # Port the container listens on
        readinessProbe:           # Probe to check if the container is ready to accept traffic
          httpGet:
            path: /               # Path to check for readiness
            port: 80              # Port to check for readiness
          initialDelaySeconds: 5  # Delay before the first check
          periodSeconds: 10       # Frequency of readiness checks
        livenessProbe:            # Probe to check if the container is alive
          httpGet:
            path: /               # Path to check for liveness
            port: 80              # Port to check for liveness
          initialDelaySeconds: 15 # Delay before the first liveness check
          periodSeconds: 20       # Frequency of liveness checks
        resources:                # Resource requests and limits for the container
          requests:
            memory: "64Mi"        # Minimum amount of memory the container will use
            cpu: "250m"           # Minimum amount of CPU the container will use
          limits:
            memory: "128Mi"       # Maximum amount of memory the container can use
            cpu: "500m"           # Maximum amount of CPU the container can use
---
apiVersion: v1
kind: Service                     # Defines a Service resource
metadata:
  name: nginx-service             # The name of the Service
spec:
  ports:
  - port: 8080                    # Port on which the Service is exposed
    targetPort: 80                # Port on which the container is listening
  selector:
    app: server                   # The label selector to identify which pods this Service targets
  type: ClusterIP                 # Default type; Service is accessible only within the cluster
---
apiVersion: networking.k8s.io/v1  # Updated API version for Ingress
kind: Ingress                     # Defines an Ingress resource
metadata:
  name: nginx-ingress             # The name of the Ingress
spec:
  rules:
  - host: localhost               # Uses localhost for local testing
    http:
      paths:
      - path: /                   # Path to route to the backend Service
        pathType: Prefix          # Defines how the path is matched ('Prefix', 'Exact', etc.)
        backend:
          service:
            name: nginx-service   # Name of the Service to route traffic to
            port:
              number: 8080        # Port on the Service to send traffic to
Yep, one more shitty DSL in your life. Yep, in shitty YAML format. I merged everything into one file for visibility and simplicity, but in practice it would normally be split into several files.
To apply this configuration:
$ kubectl create -f nginx.yaml
deployment.apps/nginx-dep created
service/nginx-service created
ingress.networking.k8s.io/nginx-ingress created
Minikube doesn’t include an Ingress controller by default, so you need to enable it:
$ minikube addons enable ingress
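Before moving on, you can verify that the ingress controller is up; in recent Minikube versions, the addon runs it in the ingress-nginx namespace:
$ kubectl get pods -n ingress-nginx    # The ingress-nginx-controller pod should reach the Running state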
Expose the services using:
$ minikube tunnel
Now, if you open http://localhost in your browser, you should see the default "Welcome to nginx!" page.
Let's try updating the application version in place. The following command triggers the rolling update and lets you watch individual pods of our service being terminated and re-created:
$ kubectl set image deployment/nginx-dep nginx=library/nginx:1.20-alpine \
&& watch -n 1 kubectl get pods
There was supposed to be a GIF here showing the process, but I was too lazy to record one and you'd be too lazy to watch it, so trust me: the old pods are magically terminated and new pods magically appear in their place.
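If you prefer something more structured than watch, kubectl can report on the rollout directly:
$ kubectl rollout status deployment/nginx-dep    # Blocks until the rolling update has finished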
Scaling the Application
With our deployment in place, we can now scale it up. Let's say you need to scale the number of nginx pods from five to ten.
There are two ways to do this: you could edit the YAML file, change the replicas value, and re-apply it, or you can do it from the command line:
$ kubectl scale deployments/nginx-dep --replicas=10 \
&& watch -n 1 kubectl get pods
You’ll see new containers being created and initialized. Pretty cool, right?
When you're done, clean up everything with:
$ kubectl delete deployment --all
$ kubectl delete service --all
$ kubectl delete pods --all
$ kubectl delete ingress --all
Conclusion
- These examples are quite simple. In real-world scenarios, you’ll need to handle configuration storage, application secrets, stateful volumes, etc. However, the concepts themselves do not change. For those interested in setting up a production-ready Kubernetes cluster yourself, you can follow this great tutorial.
- Kubernetes is designed as a collection of more than half a dozen interoperable services that together provide full functionality. At a smaller scale, Kubernetes solves most problems without much fuss. At a larger scale, it requires a lot more thought, glue code, and wrappers/safeguards on pretty much everything to make it work safely and reliably. Generally, as mentioned above, folks tend to add a service mesh to enable more advanced features and requirements. K8s supports running in a highly available configuration, but this is operationally complex to set up. In addition, securing Kubernetes is not a trivial, simple, or well-understood operation.
- K8s is a huge ecosystem that formed very quickly. Besides k8s itself, there are many tools for working with it beyond the ones we've seen: Kubebox, Containerum, Kubetail, Twistlock, Sysdig Secure, Kubesec.io, Aquasec, Searchlight, Kail... And many solutions have been built on top of it, like Kubeflow, KubeDB, KubeVault, and Voyager. This article is already too long, although I wanted to include Helm as another component of the k8s world.
- The shift from imperative to declarative models is growing. Kubernetes embodies this trend, allowing you to define desired states rather than manual processes. This approach results in cleaner, more compact, and elegant systems.
As we continue shifting from servers and VMs to containers, Kubernetes is becoming an inevitable part of modern infrastructure. It is an IT revolution.
Additional materials
- Kubernetes Patterns by Bilgin Ibryam & Roland Huss
- Logging for Success
- Different ways to spin up Kubernetes clusters [Reddit]
- minikube docs
- Kubernetes Explained in 6 Minutes by ByteByteGo