
Karpenter — AWS

The AWS Karpenter bundle deploys the karpenter-provider-aws controller and relies on two CRD types: EC2NodeClass (defines the EC2 node template) and NodePool (defines scheduling constraints and resource limits per workload class).

Bundle source: stacks/aws/karpenter


Infrastructure Components

The bundle deploys two resource groups in order:

infra group (runs first via tofu-module):

  • Creates an IAM role with EC2, pricing API, and SQS permissions
  • Associates the role to the karpenter service account via Pod Identity
  • Provisions an SQS queue for EC2 Spot interruption notifications
  • Installs the Karpenter Helm chart with the role ARN and queue name injected (see the values sketch below)

kubernetes group (runs after infra):

  • Applies a default EC2NodeClass (node template)
  • Applies all NodePool definitions inline via the control plane UI
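
For orientation, here is a minimal sketch of the kind of Helm values the infra group ends up passing to the karpenter chart. The queue name is an assumption about the naming convention; the actual wiring happens inside the tofu-module:

settings:
  clusterName: "{{ cluster_name }}"
  interruptionQueue: "{{ cluster_name }}-karpenter" # assumed name of the SQS interruption queue created above
serviceAccount:
  name: karpenter # the IAM role is attached via Pod Identity, so no IRSA annotation is needed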

EC2NodeClass

The karpenter-nodeclass-default is the shared node template referenced by all NodePools.

apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: karpenter-nodeclass-default
spec:
  amiFamily: AL2023 # Amazon Linux 2023
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        deleteOnTermination: true
        encrypted: true
        volumeSize: 100Gi
        volumeType: gp3
  detailedMonitoring: true
  metadataOptions:
    httpTokens: required # IMDSv2 enforced
    httpPutResponseHopLimit: 2
  role: "{{ cluster_name }}-karpenter-node-role"
  securityGroupSelectorTerms:
    - tags:
        Name: "{{ cluster_name }}-node" # matches cluster node SG
  subnetSelectorTerms:
    - tags:
        Name: "{{ ext_account_id }}-private*" # private subnets only

Key points:

  • Subnet and security group selection is tag-based — no hardcoded IDs
  • {{ cluster_name }} and {{ ext_account_id }} are resolved by the cluster agent at apply time
  • IMDSv2 is enforced (httpTokens: required)
  • All nodes get encrypted gp3 volumes

NodePools

airflow-worker

Source — used by Airflow KubernetesExecutor pods and Spark driver/executor pods.

  • Instance family: m5.large (general compute)
  • Architecture: arm64
  • Capacity: Spot + On-Demand
  • Consolidation: WhenEmpty after 2m
  • Limits: 2000 CPU · 8000Gi memory
  • Taint: airflow-worker: NoSchedule
  • Node expiry: 720h
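
A sketch of how these settings could map onto a Karpenter v1 NodePool manifest; the exact requirements and labels in the stack source may differ:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: airflow-worker
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: karpenter-nodeclass-default
      requirements:
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ["m5"]
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64"] # must be compatible with the selected instance families
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
      taints:
        - key: airflow-worker
          effect: NoSchedule
      expireAfter: 720h # node expiry
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 2m
  limits:
    cpu: "2000"
    memory: 8000Gi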

trino-xsmall

Source — used by Trino coordinator and worker pods.

  • Instance family: r8g.large (memory-optimized)
  • Architecture: arm64
  • Capacity: On-Demand + Spot
  • Consolidation: WhenEmptyOrUnderutilized after 5m
  • Limits: 1000 CPU · 4000Gi memory
  • Taint: trino-xsmall: NoSchedule
  • Weight: 50 (lower priority than the airflow pool)
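
The trino-xsmall pool mostly differs in its disruption policy and weight. A fragment showing those fields (when several NodePools match, Karpenter prefers the one with the higher weight):

spec:
  weight: 50 # per the table above, ranked below the airflow pool
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 5m
  limits:
    cpu: "1000"
    memory: 4000Gi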

jupyterhub-small

Source — used by JupyterHub single-user notebook servers.

  • Instance family: t3.large / xlarge / 2xlarge (burstable)
  • Architecture: amd64
  • Capacity: Spot + On-Demand
  • Consolidation: WhenEmptyOrUnderutilized after 5m
  • Limits: 1600 CPU · 16000Gi memory
  • Taint: jupyterhub-small: NoSchedule
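
For jupyterhub-small the notable part is the requirements block, which spans several burstable instance types on amd64. An illustrative fragment:

requirements:
  - key: node.kubernetes.io/instance-type
    operator: In
    values: ["t3.large", "t3.xlarge", "t3.2xlarge"]
  - key: kubernetes.io/arch
    operator: In
    values: ["amd64"]
  - key: karpenter.sh/capacity-type
    operator: In
    values: ["spot", "on-demand"]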

Deploying via bundle.yaml

The bundle.yaml registers the stack in the platform catalog, and deployment follows the standard stack workflow.

The bundle takes a single required input: the target cluster resource. All other values (cluster name, account ID) are resolved from the cluster resource at deploy time.
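
Purely as an illustration (the field names here are hypothetical, not the platform's actual bundle schema), a registration could look roughly like:

name: aws-karpenter # hypothetical bundle name
source: stacks/aws/karpenter
inputs:
  cluster: # the single required input: the target cluster resource
    type: resource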


Adding a New NodePool

  1. Define a NodePool manifest referencing karpenter-nodeclass-default
  2. Add a taint matching the workload type (e.g. spark-worker: NoSchedule)
  3. Update the karpenter-resources manifest in the catalog via the control plane UI
  4. Re-apply the stack to the target workspace
  5. Configure the workload's pod template with the matching nodeSelector and toleration (see the example below)
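
A sketch of steps 1, 2, and 5 for a hypothetical spark-worker pool; all names and values here are illustrative:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spark-worker
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: karpenter-nodeclass-default
      taints:
        - key: spark-worker
          effect: NoSchedule
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 2m

# Matching fragment for the workload's pod template:
nodeSelector:
  karpenter.sh/nodepool: spark-worker # nodes carry this label from the owning NodePool
tolerations:
  - key: spark-worker
    operator: Exists
    effect: NoSchedule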