You’ve been working hard on your microservices application, building your deployment files, and finally the moment of truth arrives – you apply your changes… Boom: not enough resources, and your nodes can’t handle the request.
Now try to imagine the totally opposite scenario: you have a lot of nodes running at only 10% of their capacity, but the invoice will still be for 100% of the allocated resources.
For these scenarios, and probably many more, the rescue has come – Google Kubernetes Engine (GKE) Autopilot.
More experienced developers might say: “Hey, we can use the cluster autoscaler and achieve the same thing as Autopilot” – and yes, this argument is partially correct. You can set up the cluster autoscaler to take care of provisioning the required number of nodes based on your demand. But it won’t manage your node pools, and it won’t match the right Compute Engine machine type to your needs. So while this approach looks similar to GKE Autopilot, it is not the same. Autopilot was created to give you the feeling of a truly serverless infrastructure, where you can create a cluster and immediately deploy your work. In the pictures below you can see how GKE cluster management works in Standard and Autopilot mode:
The main difference between Autopilot and Standard mode GKE lies in who manages the infrastructure. Autopilot automatically creates nodes based on your workloads: whenever they need more resources, those resources will be automatically provided. And what happens in the opposite situation? Let’s say you’ve removed a few workloads, or changed them so they’re not using that much computing power – Autopilot will scale down your nodes without your intervention.
In short:
– Whenever your workloads need more resources, they will be automatically provided.
– Autopilot will scale down your nodes without your intervention.
– It will give you more time to actually develop your application, instead of working on the infrastructure underneath.
A lot of developers, including me, have huge problems with calculating how many resources they actually need. Predicting how much RAM or CPU your application will eat under stress, or in some special cases, is an art. Paying for unused resources is just a waste of money, and managing node pools, picking the correct configuration, and choosing the proper compute machines can be overwhelming.
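To make that guesswork concrete, here is a minimal sketch of the kind of deployment file where those numbers live (the name, image, and values below are illustrative, not from this article). On Autopilot you are billed for what your Pods request, so these few fields drive your invoice directly:

```yaml
# Minimal illustrative Deployment (hypothetical name and image).
# On Autopilot the bill follows the Pod's *requests*, so the values
# under resources.requests are the guess you have to get right.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: demo-api
  template:
    metadata:
      labels:
        app: demo-api
    spec:
      containers:
        - name: api
          image: gcr.io/my-project/demo-api:1.0.0   # placeholder image
          resources:
            requests:
              cpu: 500m       # how much will it eat under stress?
              memory: 1Gi     # overshoot and you pay for unused RAM
            limits:
              cpu: 500m       # Autopilot pins limits to requests anyway
              memory: 1Gi
```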
Everyone, from the developer to the end customer, just wants the application finished as quickly as possible. That’s why GKE Autopilot might be the right choice for your team or application. It will give you more time to actually develop your application, instead of working on the infrastructure underneath. Also, the barrier to entry for learning Kubernetes is a lot lower in Autopilot than in Standard GKE, because you don’t have to learn all those complicated configurations – you’re almost ready to go right after creating a cluster. In simple words: Autopilot lets you spend less time to achieve the same results, it ramps up your security with some built-in mechanisms, and it allows you and your team to work faster and more efficiently.
As always, there must be a downside too… And yes, there is. For perhaps as many as 99% of cases Autopilot will be suitable and nobody will even notice these limitations, but it’s worth mentioning that there are some:
For example, the maximum number of Pods per cluster is set to 12,800, versus 150,000 in a Standard cluster.
| Limits | GKE Standard cluster | GKE Autopilot cluster |
|---|---|---|
| Nodes per cluster | 5,000 for GKE versions up to 1.17; 15,000 for GKE versions 1.18 and later. Note: running more than 5,000 nodes requires lifting a cluster size quota – contact support for assistance. | 400. To lift this quota, contact support for assistance. |
| Nodes per node pool zone | 1,000 | Not applicable |
| Nodes in a zone | No node limitations for container-native load balancing with NEG-based Ingress, which is recommended whenever possible. In GKE versions 1.17 and later, NEG-based Ingress is the default mode. 1,000 if you are using Instance Group-based Ingress. | Not applicable |
| Pods per node | 110 | 32 |
| Pods per cluster | 150,000 | 12,800 (32 Pods/node × 400 nodes) |
| Containers per cluster | 300,000 | 25,600 |
Maximum number of pods and containers per cluster
Autopilot also forces you to set resources within the ranges in the table below; any request below the minimum will be automatically scaled up. For example, if you set 128 MiB of memory, Autopilot will scale it up to 512 MiB. However, since the minimums apply to the Pod as a whole, you are able to work around this: for example, you can run two containers in a single Pod and give each of them 256 MiB of memory (see the sketch after the table). Additionally, the ratio between CPU and memory must be in the range of 1 vCPU:1 GiB to 1 vCPU:6.5 GiB.
| Resource | Minimum (normal Pods) | Minimum (DaemonSet Pods) | Maximum (normal and DaemonSet Pods) |
|---|---|---|---|
| CPU | 250 mCPU | 10 mCPU | 28 vCPU |
| Memory | 512 MiB | 10 MiB | 80 GiB |
| Ephemeral storage | 10 MiB (per container) | 10 MiB (per container) | 10 GiB |
Minimum and maximum resources you are able to set
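To illustrate the workaround mentioned above, here is a minimal, hypothetical Pod sketch (names and images are made up): two containers requesting 256 MiB each add up to the 512 MiB per-Pod minimum, so neither request gets scaled up:

```yaml
# Hypothetical Pod: the 512 MiB minimum applies to the Pod as a whole,
# so two 256 MiB containers together satisfy it.
apiVersion: v1
kind: Pod
metadata:
  name: split-workload
spec:
  containers:
    - name: service-a
      image: gcr.io/my-project/service-a:1.0.0   # placeholder image
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
    - name: service-b
      image: gcr.io/my-project/service-b:1.0.0   # placeholder image
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
```

The combined request (500 mCPU and 512 MiB) also keeps the CPU-to-memory ratio at roughly 1 vCPU:1 GiB, which sits inside the allowed range.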
In my opinion, it is totally worth giving Autopilot a try. Setting up a new environment with a new cluster is a lot faster in Autopilot. It may be suitable even for development or proof-of-concept projects, though you will not get the full benefit of some of Autopilot’s key features there, as they were created for production environments to give you the best experience.
In case you would like to give it a try, you’ll find the whole GKE Autopilot documentation HERE: