USHIFT-6800: Add c2cc reboot tests #6943
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,146 @@ | ||
| *** Settings *** | ||
| Documentation Verify C2CC survives VM reboots in escalating scenarios. | ||
| ... Tests single cluster, two clusters simultaneously, and all three | ||
| ... clusters simultaneously. After each reboot cycle, full-stack | ||
| ... verification confirms connectivity, infrastructure, health probes, | ||
| ... and DNS all recover. | ||
|
|
||
| Resource ../../resources/c2cc.resource | ||
|
|
||
| Suite Setup Setup | ||
| Suite Teardown Teardown | ||
|
|
||
| Test Tags c2cc | ||
|
Member
If they can fit into a single scenario, then definitely. If not, it should be a separate one. Except I'd replace |
||
|
|
||
|
|
||
| *** Test Cases *** | ||
| Reboot Single Cluster | ||
| [Documentation] Reboot cluster-a while cluster-b and cluster-c remain up. | ||
| ... Verifies that the rebooted cluster re-establishes C2CC connectivity | ||
| ... with both peers. | ||
| [Setup] Ensure All Clusters Healthy | ||
| Reboot Clusters Simultaneously cluster-a | ||
| Wait For Clusters Ready | ||
| Verify Full C2CC Stack | ||
|
|
||
| Reboot Two Clusters Simultaneously | ||
| [Documentation] Reboot cluster-b and cluster-c at the same time. | ||
| ... The surviving cluster-a must wait for both peers to recover. | ||
| ... The two rebooted clusters must also reconnect with each other. | ||
| [Setup] Ensure All Clusters Healthy | ||
| Reboot Clusters Simultaneously cluster-b cluster-c | ||
| Wait For Clusters Ready | ||
| Verify Full C2CC Stack | ||
|
|
||
| Reboot All Three Clusters Simultaneously | ||
| [Documentation] Reboot all three clusters at once. | ||
| ... Every cluster starts from scratch simultaneously — no running peer | ||
| ... to reference. All must independently reconstruct C2CC state. | ||
| [Setup] Ensure All Clusters Healthy | ||
| Reboot Clusters Simultaneously cluster-a cluster-b cluster-c | ||
| Wait For Clusters Ready | ||
| Verify Full C2CC Stack | ||
|
|
||
|
|
||
| *** Keywords *** | ||
| Setup | ||
| [Documentation] Set up clusters and deploy test workloads on all. | ||
| Check Required Env Variables | ||
| Login MicroShift Host | ||
| Setup Kubeconfig | ||
| Logout MicroShift Host | ||
|
|
||
| Register Remote Cluster cluster-a ${USHIFT_HOST} ${SSH_PORT} ${KUBECONFIG} | ||
| Register Remote Cluster cluster-b ${HOST2_IP} ${HOST2_SSH_PORT} ${KUBECONFIG_B} | ||
| Register Remote Cluster cluster-c ${HOST3_IP} ${HOST3_SSH_PORT} ${KUBECONFIG_C} | ||
| Deploy Test Workloads | ||
| Verify Full C2CC Stack | ||
|
|
||
| Teardown | ||
| [Documentation] Remove test workloads and close connections. | ||
| Cleanup Test Workloads | ||
| Teardown All Remote Clusters | ||
| Remove Kubeconfig | ||
|
|
||
| Wait For Clusters Ready | ||
| [Documentation] Wait for test pods and service endpoints to become ready | ||
| ... after a reboot cycle. | ||
| Wait For Test Pods | ||
| Wait For Service Endpoints | ||
|
|
||
| Verify Full C2CC Stack | ||
|
|
||
| [Documentation] Comprehensive verification of all C2CC components across all clusters. | ||
| Wait Until Keyword Succeeds 10m 10s Verify C2CC Connectivity | ||
| Wait Until Keyword Succeeds 10m 10s Verify C2CC Infrastructure | ||
| Wait Until Keyword Succeeds 10m 10s Verify C2CC Health Probes | ||
| Wait Until Keyword Succeeds 10m 10s Verify C2CC DNS | ||
|
|
||
| Verify C2CC Connectivity | ||
| [Documentation] Verify pod-to-pod, pod-to-service connectivity and source IP preservation | ||
| ... across all 6 cluster pairs. | ||
| FOR ${src} ${dst} IN | ||
| ... cluster-a cluster-b | ||
| ... cluster-a cluster-c | ||
| ... cluster-b cluster-a | ||
| ... cluster-b cluster-c | ||
| ... cluster-c cluster-a | ||
| ... cluster-c cluster-b | ||
| Test Connectivity Between Clusters ${src} ${dst} pod | ||
| Test Connectivity Between Clusters ${src} ${dst} service | ||
| Test Source IP Preserved Between Clusters ${src} ${dst} pod | ||
| Test Source IP Preserved Between Clusters ${src} ${dst} service | ||
| END | ||
|
|
||
| Verify C2CC Infrastructure | ||
| [Documentation] Verify routes, IP rules, nftables, OVN static routes, | ||
| ... and node annotations for all cluster-peer combinations. | ||
| Verify Infra For Remote Peer cluster-a ${CLUSTER_B_POD_CIDR} ${CLUSTER_B_SVC_CIDR} ${CLUSTER_A_SVC_CIDR} | ||
| Verify Infra For Remote Peer cluster-a ${CLUSTER_C_POD_CIDR} ${CLUSTER_C_SVC_CIDR} ${CLUSTER_A_SVC_CIDR} | ||
| Verify Infra For Remote Peer cluster-b ${CLUSTER_A_POD_CIDR} ${CLUSTER_A_SVC_CIDR} ${CLUSTER_B_SVC_CIDR} | ||
| Verify Infra For Remote Peer cluster-b ${CLUSTER_C_POD_CIDR} ${CLUSTER_C_SVC_CIDR} ${CLUSTER_B_SVC_CIDR} | ||
| Verify Infra For Remote Peer cluster-c ${CLUSTER_A_POD_CIDR} ${CLUSTER_A_SVC_CIDR} ${CLUSTER_C_SVC_CIDR} | ||
| Verify Infra For Remote Peer cluster-c ${CLUSTER_B_POD_CIDR} ${CLUSTER_B_SVC_CIDR} ${CLUSTER_C_SVC_CIDR} | ||
|
|
||
| Verify Infra For Remote Peer | ||
| [Documentation] Verify all infrastructure components on a cluster for one remote peer. | ||
| [Arguments] ${alias} ${remote_pod_cidr} ${remote_svc_cidr} ${local_svc_cidr} | ||
| Verify Routes In Table 200 ${alias} ${remote_pod_cidr} ${remote_svc_cidr} | ||
| Verify IP Rules For Table 200 ${alias} ${remote_pod_cidr} ${remote_svc_cidr} | ||
| Verify Routes In Table 201 ${alias} ${local_svc_cidr} | ||
| Verify Service IP Rules ${alias} ${remote_pod_cidr} ${remote_svc_cidr} ${local_svc_cidr} | ||
| Verify NFTables Bypass Rules ${alias} ${remote_pod_cidr} ${remote_svc_cidr} | ||
| Verify OVN Static Routes ${alias} ${remote_pod_cidr} ${remote_svc_cidr} | ||
| Verify Node SNAT Annotation ${alias} ${remote_pod_cidr} ${remote_svc_cidr} | ||
| Verify C2CC Tracking Annotation ${alias} ${remote_pod_cidr} ${remote_svc_cidr} | ||
|
|
||
| Verify C2CC Health Probes | ||
| [Documentation] Verify all RemoteCluster CRs report Healthy with populated timestamps. | ||
| FOR ${alias} IN cluster-a cluster-b cluster-c | ||
| Verify RemoteCluster State ${alias} Healthy | ||
| ${stdout}= Oc On Cluster ${alias} | ||
| ... oc get remoteclusters.microshift.io -o jsonpath='{.items[*].status.lastProbeTime}' | ||
| Should Not Be Empty ${stdout} | ||
| ${stdout}= Oc On Cluster ${alias} | ||
| ... oc get remoteclusters.microshift.io -o jsonpath='{.items[*].status.lastSuccessfulProbe}' | ||
| Should Not Be Empty ${stdout} | ||
| END | ||
|
|
||
| Verify C2CC DNS | ||
| [Documentation] Verify CoreDNS Corefile contains C2CC server blocks and | ||
| ... cross-cluster DNS resolution works for all pairs. | ||
| Verify Corefile Contains C2CC Server Block cluster-a ${CLUSTER_B_DOMAIN} | ||
| Verify Corefile Contains C2CC Server Block cluster-a ${CLUSTER_C_DOMAIN} | ||
| Verify Corefile Contains C2CC Server Block cluster-b ${CLUSTER_A_DOMAIN} | ||
| Verify Corefile Contains C2CC Server Block cluster-b ${CLUSTER_C_DOMAIN} | ||
| Verify Corefile Contains C2CC Server Block cluster-c ${CLUSTER_A_DOMAIN} | ||
| Verify Corefile Contains C2CC Server Block cluster-c ${CLUSTER_B_DOMAIN} | ||
| Curl Remote Service Via DNS cluster-a cluster-b | ||
| Curl Remote Service Via DNS cluster-a cluster-c | ||
| Curl Remote Service Via DNS cluster-b cluster-a | ||
| Curl Remote Service Via DNS cluster-b cluster-c | ||
| Curl Remote Service Via DNS cluster-c cluster-a | ||
| Curl Remote Service Via DNS cluster-c cluster-b | ||
|
|
||
| Ensure All Clusters Healthy | ||
| [Documentation] Pre-condition: all clusters must have Healthy RemoteCluster CRs. | ||
| Verify All RemoteClusters Healthy | ||
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Re-registration is not failure-safe for teardown.
`Remove Values From List` drops the alias from `${C2CC_REMOTE_ALIASES}` before `Register Remote Cluster` re-adds it. If `Register Remote Cluster` errors (host still down), the alias is gone from the tracked list. Within `Wait Until Keyword Succeeds` retries this self-heals, but if all retries are exhausted, `Teardown All Remote Clusters` will never switch to or close that connection, leaking it and leaving teardown state inconsistent.
Consider only mutating the tracking list after a successful re-registration, or guarding with `TRY/FINALLY` so the list is reconciled even on the failure path.
Based on learnings: teardown state (the alias/interface list consumed by teardown keywords) must be populated reliably even when the mutating keyword errors before completing.
Source: Learnings
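One way the `TRY/FINALLY` guard suggested in the review might look is sketched below. This is a minimal, self-contained illustration, not the PR's code: `Re-Register Remote Cluster Safely` and `Simulated Register Remote Cluster` are hypothetical names, the stub always fails to stand in for a host that is still down, and the `@{C2CC_REMOTE_ALIASES}` list declared here is only a stand-in for the suite's real tracking list. It assumes Robot Framework 5.0 or later, where `TRY/FINALLY` is available.

```robotframework
*** Settings ***
Documentation     Sketch only: keep an alias in the teardown list when re-registration fails.
Library           Collections

*** Variables ***
# Stand-in for the tracking list named in the review comment (assumed shape).
@{C2CC_REMOTE_ALIASES}    cluster-a    cluster-b    cluster-c

*** Test Cases ***
Alias Stays Tracked When Re-Registration Fails
    # The wrapper still fails, but the alias must remain visible to teardown.
    Run Keyword And Expect Error    *    Re-Register Remote Cluster Safely    cluster-b
    List Should Contain Value    ${C2CC_REMOTE_ALIASES}    cluster-b

*** Keywords ***
Re-Register Remote Cluster Safely
    [Documentation]    Hypothetical wrapper, not part of the PR.
    [Arguments]    ${alias}
    TRY
        Remove Values From List    ${C2CC_REMOTE_ALIASES}    ${alias}
        Simulated Register Remote Cluster    ${alias}
    FINALLY
        # Reconcile on both paths: re-add the alias only if registration did not.
        IF    $alias not in $C2CC_REMOTE_ALIASES
            Append To List    ${C2CC_REMOTE_ALIASES}    ${alias}
        END
    END

Simulated Register Remote Cluster
    [Documentation]    Stub that always fails, standing in for an unreachable host.
    [Arguments]    ${alias}
    Fail    ${alias} is unreachable
```

The error still propagates out of the wrapper, so `Wait Until Keyword Succeeds` would keep retrying as before; the only change is that the alias is back in the list whenever the keyword exits. Whether a teardown keyword can safely close a connection for an alias whose registration never completed is a separate question that this sketch does not address.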