Prerequisites
- An AKS cluster, and
kubectlpointed at it (az aks get-credentials …). - Helm 3.x.
- Your Client ID and password from the clients page.
Install
values.yaml (see Configuration → Authentication).
Verify
Running, your workspace Online on the clients page.
Environment-specific config
Use Azure Files for shared storage:- NetworkPolicy: create the AKS cluster with
--network-policy azure(Azure NPM) or Calico — otherwise the training-pod egress lockdown won’t enforce. metrics-serveris bundled on AKS.
Production notes
- Use GPU node pools for training workloads; size requests/limits per job.
- Day-2 upgrades and rollbacks: see Operations.