Upgrading OpenShift via ACM ClusterCurator
End-to-end procedure for upgrading a managed OpenShift cluster from the ACM hub using a ClusterCurator YAML — no console clicks, GitOps-friendly, and reusable across clusters.
This document is written against the live setup:
| Component | Value |
|---|---|
| Hub cluster | `local-cluster` (api.virt.na-launch.com:6443) — also the KubeVirt host for spoke nodes |
| Spoke cluster | `anaeem` (api.anaeem.na-launch.com:6443) |
| ACM | `multiclusterhub` 2.14.2 in `open-cluster-management` |
| Curator controller | `cluster-curator-controller` (2 replicas) in `multicluster-engine` |
| CRD | `clustercurators.cluster.open-cluster-management.io/v1beta1` |
Other managed clusters (hybrid, additional spokes) follow the same pattern — only the namespace and CR metadata.name change.
1. Background: how the curator drives an upgrade
ClusterCurator is a hub-side CR that the cluster-curator-controller reconciles. It does not itself talk to the spoke. Instead it spawns a Kubernetes Job in the cluster's namespace (anaeem in our case). That Job runs two stages as init/main containers:
curator-job-<rand>
├── initContainer: upgrade-cluster (writes the desired version to the spoke)
└── container: monitor-upgrade (polls until the spoke reaches it or times out)
Both stages communicate with the spoke through ACM's "work" channel:
- `upgrade-cluster`
  - Creates a `ManagedClusterView` named after the cluster, pointing at the spoke's `ClusterVersion` resource — this is how the hub reads spoke state.
  - Creates a short-lived `ManagedClusterAction` that asks the klusterlet on the spoke to patch `ClusterVersion.spec.desiredUpdate` (and `spec.channel`). The klusterlet executes the patch, and the action self-deletes after success/failure.
  - Marks the `upgrade-cluster` condition `True` on the curator.
- `monitor-upgrade`
  - Re-reads the `ManagedClusterView` on a poll loop.
  - Updates the `monitor-upgrade` condition with whatever the spoke's `ClusterVersion.status` currently says (e.g. `Working towards 4.20.11: 119 of 959 done (12% complete), waiting on etcd, kube-apiserver`).
  - Exits successfully when the spoke reports the new version `Completed`.
  - Exits failed when `monitorTimeout` (minutes) elapses — but the upgrade itself does not roll back; CVO and MCO continue independently.
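Conceptually, the patch request the init container sends looks roughly like the following (a sketch, not the curator's literal object: field names follow the stolostron action API, so verify with `oc explain managedclusteraction.spec` before relying on them; the metadata name is hypothetical since the curator generates its own):

```yaml
apiVersion: action.open-cluster-management.io/v1beta1
kind: ManagedClusterAction
metadata:
  name: anaeem-upgrade        # hypothetical; the curator names its own action
  namespace: anaeem
spec:
  actionType: Update
  kube:
    resource: clusterversions
    name: version
    template:
      apiVersion: config.openshift.io/v1
      kind: ClusterVersion
      metadata:
        name: version
      spec:
        channel: stable-4.20
        desiredUpdate:
          version: "4.20.11"
```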
The CR also supports prehook / posthook Ansible Tower job specs and an overrideJob that replaces the default Job entirely — out of scope for this doc.
Who does what after desiredUpdate is set
ClusterCurator (hub)
│ (writes to)
▼
ClusterVersion.spec.desiredUpdate (spoke)
│
▼
Cluster Version Operator (CVO) — sequences cluster-operator updates
│
▼
Each ClusterOperator — own controller does its rolling update
│
▼
Machine Config Operator (MCO) — when needed, generates new rendered MachineConfig
│
▼
MachineConfigPool (master, then worker)
│ ┌──────────────────────────────────┐
│ │ Per-node loop: │
│ │ 1. cordon │
│ │ 2. drain (respects PDBs) │
│ │ 3. apply rendered MachineConfig│
│ │ 4. reboot │
│ │ 5. uncordon │
│ └──────────────────────────────────┘
▼
Upgrade complete ⇒ ClusterVersion.status.history[0].state = Completed
Almost every "upgrade is stuck" symptom traces back to one of those five per-node steps failing.
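The per-node loop is tracked in annotations the machine-config daemon keeps on each node (`machineconfiguration.openshift.io/currentConfig`, `desiredConfig`, and `state`, whose values include `Done`, `Working`, `Degraded`). A minimal local sketch of the check — the sample JSON is hypothetical, standing in for `oc get nodes -o json` output:

```shell
# Sample of the relevant fields from `oc get nodes -o json` (hypothetical data)
cat > /tmp/nodes.json <<'EOF'
{"items":[
 {"metadata":{"name":"master-0","annotations":{
   "machineconfiguration.openshift.io/currentConfig":"rendered-master-old",
   "machineconfiguration.openshift.io/desiredConfig":"rendered-master-new",
   "machineconfiguration.openshift.io/state":"Working"}}},
 {"metadata":{"name":"worker-0","annotations":{
   "machineconfiguration.openshift.io/currentConfig":"rendered-worker-new",
   "machineconfiguration.openshift.io/desiredConfig":"rendered-worker-new",
   "machineconfiguration.openshift.io/state":"Done"}}}]}
EOF
# Flag nodes whose current rendered config still lags the desired one
python3 - <<'PY'
import json
P = "machineconfiguration.openshift.io/"
for node in json.load(open("/tmp/nodes.json"))["items"]:
    a = node["metadata"]["annotations"]
    lagging = a[P + "currentConfig"] != a[P + "desiredConfig"]
    print(node["metadata"]["name"], a[P + "state"], "LAGGING" if lagging else "ok")
PY
```

Against the live spoke, pipe `oc get nodes -o json` into the same snippet instead of the sample file; a node stuck at `Working` with a lagging config is the one in the cordon/drain/reboot loop.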
2. Pre-flight checks
Run all of these from the hub. Replace anaeem with your cluster name.
2.1 Confirm the cluster is reachable and healthy
oc get managedcluster anaeem
# HUB ACCEPTED JOINED AVAILABLE → expect true / True / True
AVAILABLE=Unknown (as hybrid currently shows) means the klusterlet is not reporting; the curator will create the Job but the action will never reach the spoke. Fix availability before upgrading.
2.2 Confirm the curator controller is running
oc -n multicluster-engine get pods -l app=cluster-curator-controller
Without it, the CR sits in pending.
2.3 Inspect what the spoke thinks it can upgrade to
oc get managedclusterinfo -n anaeem anaeem -o json | jq '.status.distributionInfo.ocp |
{current:.version, channel:.channel, desired:.desiredVersion,
inChannelUpdates:.availableUpdates,
conditional:[.versionAvailableUpdates[].version]}'
Two relevant lists:
- `availableUpdates` — versions reachable from the current `channel` and recommended. These do not require the not-recommended annotation.
- `versionAvailableUpdates` — the broader graph (other channels, conditional updates). To pick from this set you must:
  - set the annotation `cluster.open-cluster-management.io/upgrade-allow-not-recommended-versions: "true"`,
  - and usually set `spec.upgrade.upstream` to the OpenShift update service URL (https://api.openshift.com/api/upgrades_info/v1/graph).
Pick a version that is in availableUpdates whenever possible.
2.4 Confirm the spoke is currently healthy
The MCO will refuse to drain a degraded pool, and CVO will refuse to start an upgrade if any operator is Available=False. Spoke-side check:
oc --context anaeem get clusterversion
oc --context anaeem get co | awk '$3!="True" || $4!="False" || $5!="False"'
oc --context anaeem get mcp
oc --context anaeem get nodes
(oc --context anaeem requires the spoke kubeconfig context to be present; if not, log in: oc login https://api.anaeem.na-launch.com:6443.)
If you don't want to leave the hub, mirror these via ManagedClusterView (see §6.1).
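For the `ClusterVersion` check specifically, a minimal view might look like this (the name `anaeem-cv` is hypothetical; same mechanism as the §6.1 examples):

```yaml
apiVersion: view.open-cluster-management.io/v1beta1
kind: ManagedClusterView
metadata: { name: anaeem-cv, namespace: anaeem }
spec:
  scope:
    resource: clusterversions
    apiGroup: config.openshift.io
    name: version
```

Then read `oc get managedclusterview -n anaeem anaeem-cv -o jsonpath='{.status.result.status}'` for the spoke's version, history, and conditions.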
2.5 Make sure no other curator is already running
oc get clustercurator -n anaeem
oc get jobs -n anaeem | grep curator
A second ClusterCurator while the first is in flight will race. Delete the old CR (and its Job) before applying a new one.
3. The CR
/home/anaeem/upgrades/curator-anaeem-4.20.11.yaml:
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: ClusterCurator
metadata:
  name: anaeem              # MUST match the managed cluster name
  namespace: anaeem         # MUST be the cluster's namespace
  annotations:
    cluster.open-cluster-management.io/upgrade-allow-not-recommended-versions: "false"
spec:
  desiredCuration: upgrade
  upgrade:
    desiredUpdate: "4.20.11"  # quoted to keep it a string
    channel: stable-4.20
    monitorTimeout: 120       # minutes
    # upstream: https://api.openshift.com/api/upgrades_info/v1/graph  # only when crossing channels / conditional
    # intermediateUpdate: "4.20.99"     # EUS→EUS hop
    # prehook: [...] / posthook: [...]  # Ansible Tower
    # overrideJob: <PodTemplateSpec>    # replace default upgrade job
Field reference (verified against the live CRD via oc explain clustercurator.spec.upgrade):
| Field | Type | Notes |
|---|---|---|
| `desiredCuration` | string enum | `install`, `upgrade`, `scale`, `destroy`. `""` clears, allowing re-arm. |
| `upgrade.desiredUpdate` | string | Target X.Y.Z. Required. |
| `upgrade.channel` | string | Update channel. Should match what the spoke is on (or one it can switch to). |
| `upgrade.intermediateUpdate` | string | EUS→EUS only. Curator will hop through this version first. |
| `upgrade.monitorTimeout` | int (minutes) | Default 120. Only affects the `monitor-upgrade` container; the upgrade keeps going if it expires. |
| `upgrade.upstream` | string | Override OSUS URL. |
| `upgrade.prehook` / `posthook` | []obj | Ansible Tower job specs. |
| `upgrade.overrideJob` | obj | Full pod template; completely replaces the default job. |
| `upgrade.towerAuthSecret` | string | Tower secret for prehook/posthook. |
Server-side validation before applying
oc apply --dry-run=server -f curator-anaeem-4.20.11.yaml
Catches schema mistakes without creating anything.
Apply
oc apply -f curator-anaeem-4.20.11.yaml
The curator controller picks it up within a few seconds, populates spec.curatorJob, and creates the Job.
4. Watching progress
4.1 Curator-side (hub)
# CR conditions — the canonical view of curator state
oc get clustercurator -n anaeem anaeem \
-o jsonpath='{range .status.conditions[*]}{.type}={.status} {.message}{"\n"}{end}'
# Example output during a healthy upgrade:
# clustercurator-job=False curator-job-rtpw6 DesiredCuration: upgrade
# upgrade-cluster=True Completed executing init container
# monitor-upgrade=False Upgrade status - Working towards 4.20.11: 119 of 959 done (12% complete), waiting on etcd, kube-apiserver
# The job and pod
oc get jobs,pods -n anaeem | grep curator
# Live tail of the monitor container (this is the most useful single command during an upgrade)
JOB=$(oc get clustercurator -n anaeem anaeem -o jsonpath='{.spec.curatorJob}')
oc logs -n anaeem -l job-name=$JOB -c monitor-upgrade -f
Condition meanings:
- `clustercurator-job` — overall job lifecycle. `False` / `Job_has_finished` here is misleading; it just means a job was launched.
- `upgrade-cluster` — `True` once the init container successfully patched the spoke.
- `monitor-upgrade` — carries the live progress message. Becomes `True` on success.
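The three conditions combine into one triage answer; a local sketch of that logic, run against a hypothetical conditions array shaped like the `jsonpath` output above:

```shell
cat > /tmp/conds.json <<'EOF'
[{"type":"clustercurator-job","status":"False","message":"curator-job-rtpw6 DesiredCuration: upgrade"},
 {"type":"upgrade-cluster","status":"True","message":"Completed executing init container"},
 {"type":"monitor-upgrade","status":"False","message":"Upgrade status - Working towards 4.20.11: 119 of 959 done"}]
EOF
# Reduce the conditions to a single human-readable state
python3 - <<'PY'
import json
conds = {c["type"]: c for c in json.load(open("/tmp/conds.json"))}
mon = conds.get("monitor-upgrade", {})
if mon.get("status") == "True":
    print("done: spoke reached the target version")
elif conds.get("upgrade-cluster", {}).get("status") == "True":
    print("in progress:", mon.get("message", "no monitor message yet"))
else:
    print("init stage has not patched the spoke yet")
PY
```

With the sample data this prints the in-progress branch with the monitor message.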
4.2 Spoke-side, viewed through the curator's view
The curator already created ManagedClusterView/anaeem in the anaeem namespace, mirroring the spoke's ClusterVersion:
oc get managedclusterview -n anaeem anaeem \
-o jsonpath='{.status.result.status.conditions[?(@.type=="Progressing")].message}{"\n"}'
oc get managedclusterview -n anaeem anaeem \
-o jsonpath='{.status.result.status.history[0].version}{" → state: "}{.status.result.status.history[0].state}{"\n"}'
4.3 Spoke-side, directly
oc --context anaeem get clusterversion
oc --context anaeem get co
oc --context anaeem get mcp
oc --context anaeem get nodes
oc adm upgrade status (newer OpenShift) gives a clean summary if you have it.
4.4 Re-arming the curator for the next bump
The CR is single-shot: once the Job ends (success or fail), reconciliation stops. To run another upgrade:
# either delete + re-apply
oc delete clustercurator -n anaeem anaeem
oc apply -f curator-anaeem-4.20.12.yaml
# or patch in place (clear, then set new spec)
oc patch clustercurator -n anaeem anaeem --type=merge -p '{"spec":{"desiredCuration":""}}'
oc patch clustercurator -n anaeem anaeem --type=merge \
-p '{"spec":{"desiredCuration":"upgrade","upgrade":{"desiredUpdate":"4.20.12","channel":"stable-4.20","monitorTimeout":180}}}'
Either way, a new curator-job-* will be created. Old completed jobs are not auto-cleaned; periodically oc delete job -n <ns> <old-job> to keep the namespace tidy.
5. Post-upgrade validation
# version landed
oc get managedclusterinfo -n anaeem anaeem \
-o jsonpath='{.status.distributionInfo.ocp.version}{"\n"}'
# all operators healthy
oc --context anaeem get co | awk '$3!="True" || $4!="False" || $5!="False"' # should be empty
# all pools updated
oc --context anaeem get mcp -o wide
# UPDATED=True UPDATING=False DEGRADED=False for both pools
# nodes ready and on the new RHCOS
oc --context anaeem get nodes -o wide
oc --context anaeem get nodes -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.nodeInfo.osImage}{"\n"}{end}'
Once green: optionally delete the curator CR and old job.
6. Stuck-node / stuck-upgrade playbook
An upgrade is "stuck" when the spoke's `ClusterVersion` stops making progress for more than 15 minutes, or a `ClusterOperator` / `MachineConfigPool` reports `Degraded=True`. The curator monitor will keep polling until `monitorTimeout` and then fail — but the failure is informational, not corrective.
6.1 Diagnose from the hub without leaving it
ManagedClusterView lets you read any spoke resource. Apply once, then re-read its .status.result.
# nodes
oc apply -f - <<'EOF'
apiVersion: view.open-cluster-management.io/v1beta1
kind: ManagedClusterView
metadata: { name: anaeem-nodes, namespace: anaeem }
spec:
scope:
resource: nodes
EOF
oc get managedclusterview -n anaeem anaeem-nodes -o json |
jq '.status.result.items[] |
{name:.metadata.name,
ready:(.status.conditions[]|select(.type=="Ready")|.status),
unsched:.spec.unschedulable,
image:.status.nodeInfo.osImage}'
# machineconfigpools
oc apply -f - <<'EOF'
apiVersion: view.open-cluster-management.io/v1beta1
kind: ManagedClusterView
metadata: { name: anaeem-mcp, namespace: anaeem }
spec:
scope:
resource: machineconfigpools
apiGroup: machineconfiguration.openshift.io
EOF
oc get managedclusterview -n anaeem anaeem-mcp -o json |
jq '.status.result.items[] |
{name:.metadata.name,
desired:.status.configuration.name,
ready:.status.readyMachineCount,
updated:.status.updatedMachineCount,
degraded:.status.degradedMachineCount,
conditions:[.status.conditions[]|select(.status=="True")|.type]}'
# clusteroperators
oc apply -f - <<'EOF'
apiVersion: view.open-cluster-management.io/v1beta1
kind: ManagedClusterView
metadata: { name: anaeem-co, namespace: anaeem }
spec:
scope:
resource: clusteroperators
apiGroup: config.openshift.io
EOF
For ad-hoc patches without leaving the hub, use ManagedClusterAction (the same mechanism the curator itself uses).
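For example, pausing the spoke's worker pool from the hub might look like this (a sketch: field names per the stolostron action API, so confirm with `oc explain managedclusteraction.spec`; the metadata name is hypothetical, and the action object self-deletes once executed):

```yaml
apiVersion: action.open-cluster-management.io/v1beta1
kind: ManagedClusterAction
metadata: { name: anaeem-pause-worker, namespace: anaeem }   # hypothetical name
spec:
  actionType: Update
  kube:
    resource: machineconfigpools
    name: worker
    template:
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfigPool
      metadata:
        name: worker
      spec:
        paused: true
```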
6.2 Common failure modes and fixes
A. Drain hangs because of a PodDisruptionBudget
Symptom: MCP Updating=True for a long time on one node; oc describe node <n> shows Drain failed; an MCD log line like error when evicting pod "...": Cannot evict pod as it would violate the pod's disruption budget.
Fix on spoke:
oc get pdb -A -o wide # find the offender
oc patch pdb <name> -n <ns> --type=merge -p '{"spec":{"minAvailable":0}}'
# upgrade resumes within seconds
After the upgrade, restore the PDB and (better) switch it to `maxUnavailable: 1` so future drains can always evict one replica at a time.
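A PDB shaped to tolerate single-node drains, assuming a hypothetical `myapp` Deployment with at least two replicas:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb        # hypothetical workload
  namespace: myapp
spec:
  maxUnavailable: 1      # always lets a drain evict one replica
  selector:
    matchLabels:
      app: myapp
```

`minAvailable: 0` disables the protection entirely; `maxUnavailable: 1` keeps it while still letting the MCO move one node at a time.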
B. Pod won't terminate (long grace period, missing controller, stuck finalizer)
Symptom: MCD log shows pods with local storage or pod has no controller. oc describe node <n> lists offending pods.
Fix:
# force-delete after confirming the workload tolerates it
oc delete pod <p> -n <ns> --grace-period=0 --force
# stuck finalizer
oc patch pod <p> -n <ns> --type=merge -p '{"metadata":{"finalizers":null}}'
C. emptyDir / hostPath blocks drain
Symptom: MCD log says pods with local storage (use --delete-local-data to override).
The MCO won't pass that flag. Either annotate the pod's owning workload to evict cleanly (controller.kubernetes.io/pod-deletion-cost, accept restart) or recreate the workload elsewhere first.
D. Node reboots but comes back NotReady
Symptom: oc get nodes shows NotReady,SchedulingDisabled. oc describe node may show kubelet errors, certificate problems, or CNI not ready.
Investigation:
oc --context anaeem get csr | grep -i pending # approve any pending
oc --context anaeem adm certificate approve <csr-name>
oc --context anaeem -n openshift-machine-config-operator logs $(oc --context anaeem -n openshift-machine-config-operator get pod -o name | grep mcd-on-stuck-node)
If kubelet is dead on the node, you need console / serial access. For our setup the node is a KubeVirt VM on the virt host (see §6.2.F).
E. A ClusterOperator won't progress
Symptom: monitor-upgrade message stays on waiting on <op> for 30+ min, e.g. waiting on etcd, kube-apiserver.
oc --context anaeem get co <op> -o yaml | yq '.status.conditions'
oc --context anaeem -n openshift-<op> get pods
oc --context anaeem -n openshift-<op> logs <pod>
etcd and kube-apiserver are the most common stallers — usually quorum, certs, or a single bad master node. Stabilize that node first.
F. Node is gone / unreachable (virt-host issue)
The anaeem cluster's nodes are KubeVirt VMs on the virt host. From the hub:
oc get vm,vmi -A | grep anaeem
oc -n <vm-ns> describe vmi <vm>
# common host-side problems: VMI Failed, evicted from node, scheduling pressure on the host
# restart the VM (graceful)
virtctl restart <vm> -n <vm-ns>
# force-stop / start if hung
virtctl stop <vm> -n <vm-ns> --force
virtctl start <vm> -n <vm-ns>
If a master VM is the one that died and you've lost etcd quorum, follow the OpenShift "restore etcd quorum" runbook — the upgrade is the least of your problems at that point.
G. Pause the bleed
When in doubt, pause the affected pool so MCO stops touching the next node while you investigate:
oc --context anaeem patch mcp worker --type=merge -p '{"spec":{"paused":true}}'
# ...investigate / fix...
oc --context anaeem patch mcp worker --type=merge -p '{"spec":{"paused":false}}'
While paused, CVO will still report Progressing=True indefinitely; the curator monitor will tick toward its timeout. Pause is safe; do not leave a pool paused for days because it blocks security-critical MachineConfig changes too.
H. Manual drain when MCO refuses
If you've decided the disruption is acceptable and just want the node moved:
oc --context anaeem adm cordon <node>
oc --context anaeem adm drain <node> \
--ignore-daemonsets --delete-emptydir-data --force --disable-eviction
# MCO will then proceed to apply the new MachineConfig and reboot
--disable-eviction bypasses PDBs — last resort.
I. Abandon a bad target version
CVO doesn't truly downgrade, but you can stop chasing the target by writing the previous version back into desiredUpdate. Operators that already updated stay updated; CVO will stop trying to advance the rest.
oc --context anaeem patch clusterversion version --type=merge \
-p '{"spec":{"desiredUpdate":{"version":"4.20.8","force":false}}}'
oc delete clustercurator -n anaeem anaeem
This is rare and has consequences (mixed-version cluster); coordinate with whoever owns the cluster.
6.3 What the curator does while the cluster is stuck
| State | `monitor-upgrade` condition | `curator-job-*` Job | Underlying upgrade |
|---|---|---|---|
| Healthy progress | `False`, message updates each poll | Running | Advancing |
| Stuck < `monitorTimeout` | `False`, message stays the same | Running | Stalled |
| `monitorTimeout` reached | `False`, message ends with `... timed out` | Failed | Still trying — CVO/MCO continue |
| Spoke reaches target | `True`, `Cluster has been upgraded` | Complete | Done |
So a failed curator job ≠ failed upgrade. Always corroborate with the spoke's ClusterVersion.
A timed-out monitor does not need to be re-armed if the upgrade is still progressing on its own; but if you want the curator to resume watching:
oc patch clustercurator -n anaeem anaeem --type=merge -p '{"spec":{"desiredCuration":""}}'
oc patch clustercurator -n anaeem anaeem --type=merge \
-p '{"spec":{"desiredCuration":"upgrade","upgrade":{"desiredUpdate":"4.20.11","channel":"stable-4.20","monitorTimeout":240}}}'
7. Reference
7.1 Useful commands cheat sheet
# curator state
oc get clustercurator -n anaeem anaeem -o jsonpath='{range .status.conditions[*]}{.type}={.status} {.message}{"\n"}{end}'
# tail upgrade
oc logs -n anaeem -l job-name=$(oc get clustercurator -n anaeem anaeem -o jsonpath='{.spec.curatorJob}') -c monitor-upgrade -f
# spoke version + progress without context switch
oc get managedclusterview -n anaeem anaeem -o json |
jq '.status.result.status | {desired:.desired.version, history:.history[0], progressing:(.conditions[]|select(.type=="Progressing"))}'
# what versions are reachable
oc get managedclusterinfo -n anaeem anaeem -o json |
jq '.status.distributionInfo.ocp | {channel,available:.availableUpdates}'
7.2 Annotations the curator understands
| Annotation | Purpose |
|---|---|
| `cluster.open-cluster-management.io/upgrade-allow-not-recommended-versions` | Allow `desiredUpdate` to come from `versionAvailableUpdates` instead of `availableUpdates`. |
| `cluster.open-cluster-management.io/upgrade-clusterversion-backoff-limit` | Override the Job's `backoffLimit`. |
7.3 Files in this directory
- `README.md` — this document
- `curator-anaeem-4.20.11.yaml` — applied 2026-04-30 to take `anaeem` from 4.20.8 to 4.20.11
7.4 Source of truth
- CRD schema: `oc explain clustercurator.spec.upgrade`
- Controller: `cluster-curator-controller` in `multicluster-engine`
- Upstream: https://github.com/stolostron/cluster-curator-controller