# KubeVirt Live Migration

## How Live Migration Works
KubeVirt live migration moves a running VM from one node to another without downtime. The process involves:
- The virt-controller creates a target `virt-launcher` pod on the destination node.
- A migration proxy establishes a connection between the source and target `virt-launcher` pods.
- Memory pages are iteratively copied from source to target while the VM continues running.
- Once the remaining set of dirty pages is small enough, the VM is briefly paused, the final state is transferred, and the VM resumes on the target node.
- The source `virt-launcher` pod is terminated.
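Behind the scenes, a migration is represented by a `VirtualMachineInstanceMigration` (VMIM) resource, which can also be created directly. A minimal manifest sketch (the VM name and namespace below are illustrative placeholders) looks like:

```yaml
# Illustrative VMIM manifest; names and namespace are placeholders.
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstanceMigration
metadata:
  name: migrate-my-windows-vm
  namespace: virtual-machines
spec:
  vmiName: my-windows-vm   # the running VMI to migrate
```

Applying a manifest like this with `kubectl apply -f` is equivalent to triggering the migration with `virtctl migrate`.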
## Network Configuration
This environment uses Multus with the bridge CNI (`vlan-13` and `vlan-12` NADs). VMs get IPs directly on the physical network. Bridge-attached VMs require the `vif-cache` to be populated during the CNI phase so the target `virt-launcher` pod can re-attach to the correct network interface.
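As a sketch, a bridge NAD of the kind referenced above generally takes this shape (the actual CNI config in this environment may differ, and the namespace here is illustrative):

```yaml
# Illustrative bridge NAD; the real vlan-13 config may differ.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: vlan-13
  namespace: virtual-machines
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "vlan-13",
      "type": "bridge",
      "bridge": "vlan-13",
      "ipam": {}
    }
```

The empty `ipam` block reflects the setup described above: the VM obtains its address directly on the physical network (for example via in-guest DHCP) rather than from the CNI.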
## Triggering a Manual Migration
```bash copy
virtctl migrate <vm-name> -n <namespace>
```
## Monitoring Migration Progress
Watch the VirtualMachineInstanceMigration (VMIM) resource:
```bash copy
kubectl get vmim -n <namespace>
```
Watch migration events in real time:
```bash copy
kubectl get events -n <namespace> -w
```
## Cancelling a Migration

To cancel an in-progress migration, use `virtctl migrate-cancel`. Example:

```bash copy
virtctl migrate-cancel my-windows-vm -n virtual-machines
```
---
## Common Failure Modes
### `client socket is closed`
!!! failure "Cause"
    Version skew between the source and target `virt-launcher` pods. This happens when a KubeVirt upgrade has been applied but existing VMs are still running with the old `virt-launcher` image.
**Fix:** Restart the affected VM so it picks up the new `virt-launcher` version:
```bash copy
virtctl restart <vm-name> -n <namespace>
```

### `vif-cache-pod*.json: no such file or directory`

!!! failure "Cause"
    Stale network state after a previously failed migration. The target `virt-launcher` pod expects cached network interface metadata that was never written or was cleaned up.

**Fix:** Restart the VM to force a clean CNI setup:

```bash copy
virtctl restart <vm-name> -n <namespace>
```

### Migration Backoff Loops
!!! failure "Cause"
    The virt-controller continuously retries a failing migration with exponential backoff. This can happen when the underlying issue (version skew, network state, resource pressure) is not resolved.

**Fix:** Cancel the migration and address the root cause before retrying:

```bash copy
virtctl migrate-cancel <vm-name> -n <namespace>
```
## When to Use `virtctl restart` Instead

Live migration is not always the right tool. Use `virtctl restart` when:
- KubeVirt version upgrades have been applied and VMs are running outdated `virt-launcher` images. Check for outdated launchers:

    ```bash copy
    kubectl get vmi -l kubevirt.io/outdatedLauncherImage --all-namespaces
    ```

- Stuck migrations cannot be cancelled cleanly.
- Persistent `vif-cache` errors remain after failed migration attempts.
```bash copy
virtctl restart <vm-name> -n <namespace>
```
!!! warning
    `virtctl restart` causes VM downtime. The VM will be stopped and started on a new node. This is a cold migration, not a live migration.
## Network Requirements
For Multus bridge-attached VMs:
- The `vif-cache` must be populated during CNI phase 1 so the target pod can reconstruct the network interface.
- Bridge CNI configuration must be consistent across all nodes that may host the VM.
- The `NetworkAttachmentDefinition` (NAD) must exist in the VM's namespace.
- All nodes must have the underlying bridge interfaces (`vlan-13`, `vlan-12`) configured.
!!! note
    VMs using Multus bridge CNI get IPs directly on the physical network. There is no pod-network NAT involved, so the VM's IP address is preserved across migrations as long as the same bridge is available on the target node.
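For reference, a VMI attaches to such a bridge network by pairing an interface with a Multus network in its spec. A sketch (the interface name `vlan13` is illustrative; `networkName` must match the NAD):

```yaml
# Fragment of a VirtualMachineInstance spec (illustrative).
# The interface name pairs with the network name below, and
# networkName points at the vlan-13 NAD in the same namespace.
spec:
  domain:
    devices:
      interfaces:
        - name: vlan13
          bridge: {}
  networks:
    - name: vlan13
      multus:
        networkName: vlan-13
```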