Diving Deeper into Kubernetes

Introduction

In our previous blog, we touched upon the basics of Kubernetes and why it has become so popular. In this blog, let's get into some exciting Kubernetes concepts that will help you understand what's inside the magic box of a Kubernetes cluster and how we can make the best use of it.

P.S.: This blog will focus more on informational points, rather than a story-telling approach.

Pod Helpers

Init Containers and Pause Containers

Init containers are specialized containers that run before the app containers in a Pod. They can contain utilities, custom code, security and checkup software, or setup scripts that are not present in the app image, and they can modify the file system of a volume that is later mounted and accessed by the app container. If a pod has multiple init containers, they run sequentially, and each one must complete successfully before the next one starts.
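The sequential behavior can be sketched in a few lines of Python. This is only a toy model, not how the kubelet is implemented: the names start_pod and InitStep are hypothetical, and a failed step simply stops the sequence, whereas a real kubelet would retry the init container according to the pod's restartPolicy.

```python
from typing import Callable, List, Tuple

# Hypothetical model: each init step is a (name, callable) pair whose
# callable returns True on success. None of these names are Kubernetes APIs.
InitStep = Tuple[str, Callable[[], bool]]


def start_pod(init_steps: List[InitStep],
              start_app: Callable[[], None]) -> Tuple[List[str], bool]:
    """Run init steps in order; start the app only if every step succeeds."""
    completed: List[str] = []
    for name, step in init_steps:
        if not step():
            # Simplification: a real kubelet would retry the failed init
            # container according to the pod's restartPolicy.
            return completed, False
        completed.append(name)
    start_app()
    return completed, True
```

For example, start_pod([("fetch-config", fetch), ("migrate-db", migrate)], run_app) calls run_app only after both steps have returned True, in that order.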

Pause containers, also known as Sleep containers, are also present in pods. These are empty containers that hold the network namespace and related attributes for the pod. The container runtime creates the namespaces, and the pause container holds them. You can check this with the following commands:

lsns |grep pod
lsns -p <pid from previous command>

and then see that (for example) the net, uts, and ipc namespaces are held by the /pause container.

Pod and Container Probes

To manage the startup, operation, and termination of pods effectively, Kubernetes provides the following mechanisms:

Probes - Liveness, Readiness, and Startup probes
Post-start Hooks - run something right after the app container starts
Pre-stop Hooks - run something right before the app container stops
Init Containers - explained above

Let's discuss the probes a bit more:

Liveness Probes - these probes check whether the application running in the container is still operational. In other words, they check whether the container is alive, not whether it is ready to serve traffic. If the probe succeeds, no action is taken and no events are logged. If it fails, the kubelet kills the container, and it is restarted in line with the pod's restartPolicy, which should ideally be set to either Always or OnFailure.
Readiness Probes - these probes indicate whether the application running in the container is ready to accept requests and serve traffic. They are therefore essential for pods that act as backends for your Services: while a pod fails its readiness probe, it is removed from the Service endpoints and receives no traffic. They are most useful when an application is temporarily malfunctioning and unable to serve traffic. When a pod is deleted, it automatically puts itself into an unready state, regardless of whether readiness probes are used.
Startup Probes - these probes indicate whether the application in the container has fully started. The liveness and readiness probes do not begin until the startup probe succeeds. If the startup probe fails, the kubelet kills the container, and it is restarted based on the restartPolicy.
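How the three probes interact can be summarized as a small decision function. This is a hedged sketch of the rules described above, not kubelet code: the function name kubelet_decision and the returned labels are made up for illustration, and failure thresholds and timing are ignored.

```python
from typing import Optional


def kubelet_decision(startup: Optional[bool], liveness: bool, readiness: bool) -> str:
    """Return an illustrative label for what happens to the container.

    startup is None while the startup probe has not yet reached a verdict.
    The labels are hypothetical, not Kubernetes API values.
    """
    if startup is None:
        return "waiting"            # liveness and readiness are not run yet
    if not startup:
        return "restart"            # subject to the pod's restartPolicy
    if not liveness:
        return "restart"            # subject to the pod's restartPolicy
    if not readiness:
        return "running-not-ready"  # kept alive, but receives no Service traffic
    return "running-ready"
```

Note that a readiness failure never leads to a restart in this model; only startup and liveness failures do.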

Taints and Tolerations

Taints and tolerations are a mechanism that allows you to ensure that pods are not placed on inappropriate nodes. Taints are added to nodes, while tolerations are defined in the pod specification. Node affinity is a property of Pods that attracts them to a set of nodes (either as a preference or a hard requirement). Taints are the opposite -- they allow a node to repel a set of pods.

Tolerations are applied to pods. A toleration allows (but does not require) the scheduler to place a pod onto a node with a matching taint. Only Pods that have a toleration for a taint can be admitted to a node carrying that taint.

One or more taints can be applied to a node; this marks that the node should not accept any pods that do not tolerate those taints. However, it should also be clarified that taints and tolerations are a scheduling mechanism and have nothing to do with security.
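The matching rule between a toleration and a taint can be sketched as follows. This is a simplified model under stated assumptions: the Taint and Toleration classes and the tolerates and can_schedule functions are hypothetical helpers, not the scheduler's API. It only treats NoSchedule and NoExecute as hard constraints, and it skips API validation (for instance, a real cluster rejects an empty key combined with the Equal operator).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str          # "NoSchedule", "PreferNoSchedule" or "NoExecute"
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""        # empty key with operator "Exists" matches every key
    operator: str = "Equal"
    value: str = ""
    effect: str = ""     # empty effect matches every effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """True if this single toleration matches this single taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    return tol.operator == "Equal" and tol.value == taint.value


def can_schedule(node_taints: List[Taint], pod_tolerations: List[Toleration]) -> bool:
    """A pod fits only if every hard taint on the node is tolerated."""
    hard = [t for t in node_taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in pod_tolerations) for t in hard)
```

With a node tainted as Taint("dedicated", "NoSchedule", "gpu"), a pod without tolerations is repelled, while a pod carrying Toleration("dedicated", "Equal", "gpu", "NoSchedule") may be scheduled there. Passing this check only means the node is allowed; it does not force the pod onto that node.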

Volumes in Kubernetes

A volume in Kubernetes can be thought of as a directory that is accessible to the containers in a pod. The access modes of volumes are generally categorized into ReadWriteOnce, ReadOnlyMany, and ReadWriteMany. Volumes help you persist data even if your container restarts, which is crucial since containers and pods are ephemeral. Kubernetes volumes are typically classified into the following:

Remote Storage - Cloud volumes, GlusterFS, NFS. The data is safe even if the cluster goes down.
Ephemeral Storage - emptyDir, Secrets, ConfigMaps, CSI ephemeral volumes, etc. They are suitable for caching, environment variables, and similar data. They are needed only while the pod is running, so their lifecycle generally follows the lifecycle of the pod.
hostPath - this volume mounts a file or directory (which can pre-exist or be created) from the host node's filesystem into your Pod. Since it carries significant security risks, it is generally avoided in deployments.
Persistent Volume and Persistent Volume Claim - the most popular option. A PVC talks to a PV and claims it. A PV can be provisioned statically by an administrator, or dynamically based on a StorageClass.

Persistent Volume

A persistent volume (PV) is an object that allows pods to access persistent storage in a Kubernetes cluster. In other words, this storage continues to exist even after the pods accessing it have been destroyed. PersistentVolumes are cluster-level resources, like nodes, so they don't belong to any namespace.

Persistent Volume Claim

PVs must be requested through persistent volume claims (PVCs), which are requests for storage. A PVC is essentially a request to mount a PV meeting certain requirements, such as size and access mode, on a pod. PVCs typically do not name a specific PV; instead, they specify which StorageClass the pod requires. Administrators can define StorageClasses that indicate properties of storage devices, such as performance, service levels, and back-end policies.

StorageClass

A StorageClass can be thought of as a "profile" of storage, where different classes might map to different quality-of-service levels, backup policies, or other arbitrary policies determined by the cluster administrators. Each StorageClass contains the fields provisioner, parameters, and reclaimPolicy.

Static Provisioning: A cluster administrator creates several PVs. They carry the details of the real storage, which is available for use by cluster users.

Dynamic Provisioning: When none of the static PVs match a user's PersistentVolumeClaim, the cluster may try to dynamically provision a volume specifically for the PVC. This provisioning is based on StorageClasses: the PVC must request a storage class, and the administrator must have created and configured that class for dynamic provisioning to occur.
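The order described above (try the static PVs first, then fall back to dynamic provisioning) can be sketched like this. It is a deliberately simplified model, not the real binding controller: the PV and PVC classes, the bind function, and the returned placeholder name are all hypothetical. It assumes matching on storage class, capacity, and a single access mode, picks the smallest PV that fits, and ignores PVs that are already bound.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class PV:
    name: str
    capacity_gi: int
    access_modes: FrozenSet[str]
    storage_class: str


@dataclass(frozen=True)
class PVC:
    name: str
    request_gi: int
    access_mode: str
    storage_class: str


def bind(pvc: PVC, static_pvs: List[PV], dynamic_classes: Set[str]) -> Optional[str]:
    """Return the name of the volume the claim binds to, or None if it stays pending."""
    # 1. Static provisioning: look for a pre-created PV that satisfies the claim.
    fits = [pv for pv in static_pvs
            if pv.storage_class == pvc.storage_class
            and pv.capacity_gi >= pvc.request_gi
            and pvc.access_mode in pv.access_modes]
    if fits:
        # Assumption of this sketch: prefer the smallest PV that is large enough.
        return min(fits, key=lambda pv: pv.capacity_gi).name
    # 2. Dynamic provisioning: only possible if the requested class supports it.
    if pvc.storage_class in dynamic_classes:
        return f"pvc-{pvc.name}"  # placeholder name for a freshly provisioned volume
    return None
```

A claim for 10 GiB with class "fast" would bind to a 20 GiB static "fast" PV if one exists; if none matches and "fast" is set up for dynamic provisioning, a new volume is created for it; otherwise the claim remains unbound.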


Kubernetes GVR

GVR stands for Group, Version, and Resource, and it determines the structure of the Kubernetes API server. You will mostly encounter these in manifest (YAML) files. There is another concept known as Kind, which completes the picture of the API structure.

Group and Versions

A Group is simply a collection of Kubernetes objects with related functionality. Each group has one or more versions, which, as the name suggests, allow an API to change over time by releasing it as tagged versions. Versions are divided into Alpha, Beta, and Stable. For example, Deployments, ReplicaSets, and StatefulSets are part of the apps group, and Ingress is part of the networking.k8s.io group. Function belongs to lambda.services.k8s.aws, which is a custom object and not a part of default Kubernetes; the controller manager doesn't know about it, so we need to define a custom controller and a CRD (Custom Resource Definition) as well.

Resource (and Kind)

A Kind represents the type of Kubernetes object that a YAML file creates. A resource is an endpoint in the Kubernetes API that stores a collection of API objects of a certain kind. For example, the built-in Pods resource contains a collection of Pod objects. Often, there's a one-to-one mapping between Kinds and resources. For instance, the Pods resource corresponds to the Pod Kind. However, sometimes the same Kind may be returned by multiple resources. For instance, the Scale Kind is returned by all scale subresources, like deployments/scale or replicasets/scale.
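To see how a GVR maps onto the API server, here is a small sketch that builds the REST path for a resource. The helper name api_path is hypothetical. The path layout follows the usual Kubernetes convention: the core group (written as an empty string) lives under /api/<version>, and every named group lives under /apis/<group>/<version>.

```python
from typing import Optional


def api_path(group: str, version: str, resource: str,
             namespace: Optional[str] = None, name: Optional[str] = None) -> str:
    """Build the API server path for a Group/Version/Resource triple."""
    # The core ("legacy") group has an empty name and its own URL prefix.
    prefix = f"/api/{version}" if group == "" else f"/apis/{group}/{version}"
    parts = [prefix]
    if namespace:
        parts += ["namespaces", namespace]  # omitted for cluster-scoped resources
    parts.append(resource)
    if name:
        parts.append(name)
    return "/".join(parts)
```

For instance, api_path("apps", "v1", "deployments", "default", "web") returns "/apis/apps/v1/namespaces/default/deployments/web", while api_path("", "v1", "nodes") returns "/api/v1/nodes". The manifest's Kind (Deployment, Node) does not appear in the path; only the lowercase plural resource does.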

What's Next

There are some more interesting topics that I am looking forward to covering, either in a new blog or by updating this one. These may include Networking inside Kubernetes, Inter Node (Pod-to-Pod) Communication and the role of ARP, Node-to-Node Communication, easier monitoring and deployments, etc.

Abhiram's Blog