Karpenter — AWS
On AWS, Karpenter runs as `karpenter-provider-aws` and is configured through two CRD types: `EC2NodeClass` (the EC2 node template) and `NodePool` (per-workload scheduling constraints and limits).
Bundle source: `stacks/aws/karpenter`
Infrastructure Components
The bundle deploys two resource groups in order:
infra group (runs first via `tofu-module`):
- Creates an IAM role with EC2, pricing API, and SQS permissions
- Associates the role with the `karpenter` service account via Pod Identity
- Provisions an SQS queue for EC2 Spot interruption notifications
- Installs the Karpenter Helm chart with the role ARN and queue name injected
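For orientation, a minimal sketch of the Helm values this group effectively injects. `settings.clusterName`, `settings.interruptionQueue`, and `serviceAccount.*` are real keys in the upstream Karpenter chart, but the queue naming shown here is an assumption:

```yaml
# Sketch of injected Helm values; the queue name is an assumed convention.
settings:
  clusterName: "{{ cluster_name }}"
  interruptionQueue: "{{ cluster_name }}-karpenter"  # SQS queue for Spot interruption events
serviceAccount:
  create: true
  name: karpenter  # matched by the Pod Identity association created in this group
```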
kubernetes group (runs after infra):
- Applies a default `EC2NodeClass` (node template)
- Applies all `NodePool` definitions inline via the control plane UI
EC2NodeClass
`karpenter-nodeclass-default` is the shared node template referenced by all NodePools.
```yaml
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: karpenter-nodeclass-default
spec:
  amiFamily: AL2023  # Amazon Linux 2023
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        deleteOnTermination: true
        encrypted: true
        volumeSize: 100Gi
        volumeType: gp3
  detailedMonitoring: true
  metadataOptions:
    httpTokens: required  # IMDSv2 enforced
    httpPutResponseHopLimit: 2
  role: "{{ cluster_name }}-karpenter-node-role"
  securityGroupSelectorTerms:
    - tags:
        Name: "{{ cluster_name }}-node"  # matches cluster node SG
  subnetSelectorTerms:
    - tags:
        Name: "{{ ext_account_id }}-private*"  # private subnets only
```
Key points:
- Subnet and security group selection is tag-based; no hardcoded IDs
- `{{ cluster_name }}` and `{{ ext_account_id }}` are resolved by the cluster agent at apply time
- IMDSv2 is enforced (`httpTokens: required`)
- All nodes get encrypted gp3 volumes
NodePools
airflow-worker
Source — used by Airflow KubernetesExecutor pods and Spark driver/executor pods.
| Setting | Value |
|---|---|
| Instance family | m5.large (general compute) |
| Architecture | amd64 |
| Capacity | Spot + On-Demand |
| Consolidation | WhenEmpty after 2m |
| Limits | 2000 CPU · 8000Gi memory |
| Taint | airflow-worker: NoSchedule |
| Node expiry | 720h |
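A sketch of how these settings map onto a Karpenter v1 `NodePool` manifest. The field names follow the upstream `karpenter.sh/v1` schema and the values mirror the table above, but the exact manifest shape in the bundle source may differ:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: airflow-worker
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: karpenter-nodeclass-default  # the shared node template above
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["m5.large"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
      taints:
        - key: airflow-worker
          effect: NoSchedule  # keeps non-airflow pods off these nodes
      expireAfter: 720h       # node expiry from the table
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 2m
  limits:
    cpu: "2000"
    memory: 8000Gi
```

The other pools follow the same shape, swapping instance types, disruption policy, limits, taint, and (for `trino-xsmall`) a `spec.weight` of 50.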
trino-xsmall
Source — used by Trino coordinator and worker pods.
| Setting | Value |
|---|---|
| Instance family | r8g.large (memory-optimized) |
| Architecture | arm64 |
| Capacity | On-Demand + Spot |
| Consolidation | WhenEmptyOrUnderutilized after 5m |
| Limits | 1000 CPU · 4000Gi memory |
| Taint | trino-xsmall: NoSchedule |
| Weight | 50 (lower priority than airflow pool) |
jupyterhub-small
Source — used by JupyterHub single-user notebook servers.
| Setting | Value |
|---|---|
| Instance family | t3.large / xlarge / 2xlarge (burstable) |
| Architecture | amd64 |
| Capacity | Spot + On-Demand |
| Consolidation | WhenEmptyOrUnderutilized after 5m |
| Limits | 1600 CPU · 16000Gi memory |
| Taint | jupyterhub-small: NoSchedule |
Deploying via bundle.yaml
The `bundle.yaml` registers the stack in the platform catalog. Deployment follows the standard stack workflow.
The bundle takes a single required input: the target cluster resource. All other values (cluster name, account ID) are resolved from the cluster resource at deploy time.
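A hypothetical shape for the bundle definition follows; the keys below are purely illustrative, since the platform's actual bundle schema isn't shown here:

```yaml
# Hypothetical bundle.yaml sketch; keys are illustrative, not the platform's
# real schema. Only the cluster input is required.
name: aws-karpenter
source: stacks/aws/karpenter
inputs:
  cluster:
    type: resource  # target cluster; cluster name and account ID are derived from it
    required: true
```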
Adding a New NodePool
- Define a `NodePool` manifest referencing `karpenter-nodeclass-default`
- Add a taint matching the workload type (e.g. `spark-worker: NoSchedule`)
- Update the `karpenter-resources` manifest in the catalog via the control plane UI
- Re-apply the stack to the target workspace
- Configure the workload's pod template with the matching `nodeSelector` and `toleration` (see the sketch below)
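A minimal sketch of that pod-template pairing, assuming a hypothetical `spark-worker` pool; `karpenter.sh/nodepool` is the standard label Karpenter applies to nodes it provisions:

```yaml
# Pod spec fragment for a hypothetical spark-worker pool.
spec:
  nodeSelector:
    karpenter.sh/nodepool: spark-worker  # standard Karpenter node label
  tolerations:
    - key: spark-worker                  # matches the NodePool taint
      operator: Exists
      effect: NoSchedule
```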