What is Kubernetes Scheduling?
When you create a Pod on a Kubernetes cluster, scheduling it onto an available node is a key part of the process. This component works under specific rules and technicalities that I’d like to explore in this article. Understanding them can help you improve both the speed and the quality of your cluster deployments.
What Happens When You Create a Pod On a Kubernetes Cluster?
The Pod usually starts running within seconds, but a lot is going on in the background to make that happen.
- The Kubernetes Scheduler continuously scans the API server and detects any new Pod that lacks a nodeName parameter. The nodeName field identifies the node that should own this Pod.
- The Scheduler selects a suitable node for the Pod and updates the Pod definition with the node name (through the nodeName parameter).
- The kubelet on the chosen node is notified that a Pod is pending execution.
- The kubelet executes the Pod, which starts running on the node.
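You can see this flow in a minimal Pod definition. A sketch, with hypothetical names: before scheduling, spec.nodeName is unset; after the Scheduler binds the Pod, the API server shows it filled in.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-demo          # hypothetical Pod name
spec:
  containers:
    - name: nginx
      image: nginx:1.25
  # Note: spec.nodeName is absent here, so the Scheduler picks a node.
  # After binding, reading the Pod back from the API server would show:
  #   spec:
  #     nodeName: worker-node-1   # hypothetical node chosen by the Scheduler
```

Setting spec.nodeName yourself bypasses the Scheduler entirely, which is why it is normally left empty.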
Selecting The Right Node
This is the hardest part of the job, as the Scheduler must apply several algorithms to make this decision. Some of those algorithms depend on user-supplied options, while Kubernetes itself calculates others.
Determining If You Have The Resources for a Pod
A node may be overloaded with busy pods consuming most of its CPU and memory. So, when the scheduler has a Pod to deploy, it determines whether the node has the resources the Pod is requesting. If a Pod were deployed to a node that cannot provide enough memory (for example), the hosted application might behave unexpectedly or even crash.
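The way to tell the scheduler what a Pod needs is to declare resource requests in the container spec. A minimal sketch, with illustrative numbers:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: webapp              # hypothetical Pod name
spec:
  containers:
    - name: webapp
      image: nginx:1.25
      resources:
        requests:           # the Scheduler only considers nodes with this much free capacity
          cpu: "250m"       # a quarter of a CPU core
          memory: "256Mi"
        limits:             # the kubelet enforces these caps at runtime
          cpu: "500m"
          memory: "512Mi"
```

Requests drive scheduling decisions; limits are enforced later on the node, after the Pod is already placed.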
Which Node Is Capable of Running The Pod?
In addition to true/false decisions, a.k.a. predicates, the scheduler executes some calculations (or priority functions) to determine which node is best suited to host the pod in question.
How Does The K8s Scheduler Make The Decision?
How does the scheduler make the decision to run the pod after the previous steps?
- It runs status checks on the candidate nodes
- It filters out the nodes that fail the predicate checks
- It runs priority tests to score and sort the remaining nodes
- The nodes with the highest scores are moved to the final list
Alternative Decision-making Options
In busy Kubernetes clusters, the time between the Scheduler choosing the right node and the kubelet on that node executing the pod may be enough for the state of the nodes to change. Even if that window is no more than a few milliseconds, a pod may terminate on one of the nodes that was filtered out for insufficient memory. That node would have earned a higher score in the priority tests had it not been overloaded at the time; instead, a less suitable node may have been selected for the pod.
User-Defined Decision Making
You can run a pod directly on a chosen set of nodes using the nodeSelector parameter, specifically .spec.nodeSelector in the pod definition. The nodeSelector chooses nodes that carry one or more labels. However, user requirements sometimes get more complicated: a nodeSelector only selects nodes that have all the labels defined in the parameter. What if you want to make a more flexible selection?
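A minimal nodeSelector sketch; the disktype=ssd label is a hypothetical example, applied to a node beforehand with kubectl label nodes:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ssd-app             # hypothetical Pod name
spec:
  nodeSelector:
    disktype: ssd           # only nodes labeled disktype=ssd are eligible
  containers:
    - name: app
      image: nginx:1.25
```

If no node carries all the listed labels, the Pod stays in the Pending state rather than being scheduled elsewhere.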
Node Affinity lets you attach pods to nodes based on specifications you have determined. You can choose the disk state, number of cores, and other specs to find nodes of that specific nature.
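A sketch of what node affinity looks like in a pod spec, assuming hypothetical label keys; the required rule is a hard filter, while the preferred rule only adds weight during scoring:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: affinity-demo       # hypothetical Pod name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:   # hard requirement (predicate)
        nodeSelectorTerms:
          - matchExpressions:
              - key: disktype
                operator: In
                values: ["ssd", "nvme"]                 # either label value matches
      preferredDuringSchedulingIgnoredDuringExecution:  # soft preference (priority)
        - weight: 1
          preference:
            matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values: ["us-east-1a"]                  # prefer, but do not require, this zone
  containers:
    - name: app
      image: nginx:1.25
```

Note the operator field: unlike nodeSelector's exact match on all labels, expressions like In let a single rule match any of several label values.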
Some scenarios require that one or more nodes be reserved for particular pods. Think of the nodes that host your monitoring application: those nodes typically have few resources, given the nature of their role. If pods other than the monitoring pods get scheduled onto them, they hurt monitoring and also degrade the applications they host. In such a case, you need node anti-affinity to keep pods away from a set of nodes.
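There is no separate anti-affinity field for nodes; the same nodeAffinity block expresses it through the NotIn or DoesNotExist operators. A sketch with a hypothetical role=monitoring label (for the reverse direction, keeping ordinary pods off by default, Kubernetes also offers taints and tolerations):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: regular-workload    # hypothetical Pod name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: role               # hypothetical label on the monitoring nodes
                operator: NotIn
                values: ["monitoring"]  # keep this pod off those nodes
  containers:
    - name: app
      image: nginx:1.25
```

This rule must be added to every pod you want kept away, which is why tainting the monitoring nodes is often the more practical choice for this scenario.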
Hopefully, you’ve come to understand how the Kubernetes Scheduler works and how you can take advantage of it in your cluster deployments. The Kubernetes documentation guides can give you more examples and commands to use.