# KubeVirt Live Migration

## How Live Migration Works

KubeVirt live migration moves a running VM from one node to another with only a brief pause at cutover. The process involves:

  1. The virt-controller creates a target virt-launcher pod on the destination node.
  2. A migration proxy establishes a connection between source and target virt-launcher pods.
  3. Memory pages are iteratively copied from source to target while the VM continues running.
  4. Once the remaining dirty pages are small enough, the VM is briefly paused, final state is transferred, and the VM resumes on the target node.
  5. The source virt-launcher pod is terminated.
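
The steps above are driven by a `VirtualMachineInstanceMigration` (VMIM) object, which `virtctl migrate` creates on your behalf. As a sketch, the declarative equivalent looks roughly like this (the metadata name and placeholders are illustrative):

```yaml
# Hypothetical VMIM manifest; the virt-controller picks this up and runs
# the migration workflow above. Replace the placeholders for your VM.
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstanceMigration
metadata:
  name: migration-job
  namespace: <namespace>
spec:
  vmiName: <vm-name>
```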

## Network Configuration

This environment uses Multus with bridge CNI (vlan-13, vlan-12 NADs). VMs get IPs directly on the physical network. Bridge-attached VMs require the vif-cache to be populated during the CNI phase so the target virt-launcher pod can re-attach to the correct network interface.
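
For reference, a bridge-CNI NetworkAttachmentDefinition for one of these VLANs might look like the following sketch (the exact NAD spec in this environment may differ; IPAM is left empty because VMs get their addresses directly on the physical network):

```yaml
# Hypothetical NAD for vlan-13; the real config in this environment may differ.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: vlan-13
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "bridge",
      "bridge": "vlan-13",
      "ipam": {}
    }
```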


## Triggering a Manual Migration

```bash copy
virtctl migrate <vm-name> -n <namespace>
```

Example:

```bash copy
virtctl migrate my-windows-vm -n virtual-machines
```


## Monitoring Migration Progress

Watch the VirtualMachineInstanceMigration (VMIM) resource:

```bash copy
kubectl get vmim -n <namespace>
```

For detailed status:

```bash copy
kubectl get vmim -n <namespace> -o yaml
```

Watch migration events in real time:

```bash copy
kubectl get events -n <namespace> --field-selector reason=Migrating --watch
```
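
When reading the YAML, the key field is `status.phase` on the VMIM; recent KubeVirt releases also report a `migrationState` block with source/target details. An abridged example of a completed migration (field names per the KubeVirt VMIM API, values illustrative):

```yaml
# Abridged VMIM status; values are illustrative.
status:
  phase: Succeeded          # Pending -> Scheduling -> ... -> Running -> Succeeded/Failed
  migrationState:
    sourceNode: node-a
    targetNode: node-b
    completed: true
    failed: false
```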

---

## Cancelling a Migration

```bash copy
virtctl migrate-cancel <vm-name> -n <namespace>
```

Example:

```bash copy
virtctl migrate-cancel my-windows-vm -n virtual-machines
```

---

## Common Failure Modes

### `client socket is closed`

!!! failure "Cause"
    Version skew between the source and target `virt-launcher` pods. This happens when a KubeVirt upgrade has been applied but existing VMs are still running with the old `virt-launcher` image.

**Fix:** Restart the affected VM so it picks up the new `virt-launcher` version:

```bash copy
virtctl restart <vm-name> -n <namespace>
```

### `vif-cache-pod*.json: no such file or directory`

!!! failure "Cause"
    Stale network state left behind by a previously failed migration. The target `virt-launcher` pod expects cached network interface metadata that was never written or was cleaned up.

**Fix:** Restart the VM to force a clean CNI setup:

```bash copy
virtctl restart <vm-name> -n <namespace>
```

### Migration Backoff Loops

!!! failure "Cause"
    The virt-controller continuously retries a failing migration with exponential backoff. This can happen when the underlying issue (version skew, network state, resource pressure) is not resolved.

**Fix:** Cancel the migration and address the root cause before retrying:

```bash copy
virtctl migrate-cancel <vm-name> -n <namespace>
```


## When to Use `virtctl restart` Instead

Live migration is not always the right tool. Use `virtctl restart` when:

- KubeVirt version upgrades have been applied and VMs are running outdated `virt-launcher` images. Check for outdated launchers:

    ```bash copy
    kubectl get vmi -l kubevirt.io/outdatedLauncherImage --all-namespaces
    ```

- Stuck migrations that cannot be cancelled cleanly.

- Persistent vif-cache errors after failed migration attempts.

```bash copy
virtctl restart <vm-name> -n <namespace>
```

!!! warning
    `virtctl restart` causes VM downtime. The VM will be stopped and started again, possibly on a different node. This is a cold migration, not a live migration.


## Network Requirements

For Multus bridge-attached VMs:

- The vif-cache must be populated during CNI phase 1 so the target pod can reconstruct the network interface.
- Bridge CNI configuration must be consistent across all nodes that may host the VM.
- The NetworkAttachmentDefinition (NAD) must exist in the target node's namespace.
- All nodes must have the underlying bridge interface (vlan-13, vlan-12) configured.

!!! note
    VMs using Multus bridge CNI get IPs directly on the physical network. There is no pod-network NAT involved, so the VM's IP address is preserved across migrations as long as the same bridge is available on the target node.