30 Commits

Author SHA1 Message Date
Bastian Mäuser
813c54760d WIP 2024-02-26 17:25:46 +01:00
Bastian Mäuser
5588b7342e Require Unique Disk names throughout all Storage pools. 2024-02-26 17:09:13 +01:00
Bastian Mäuser
2f985df07d Added option, do specify VMs to process using Proxmox UI Tags 2024-02-26 15:32:23 +01:00
Bastian Mäuser
b7c86b0206 fix incremental ecpool 2024-02-23 12:08:50 +01:00
Bastian Mäuser
3d0babd12c Adjust Documentation 2024-02-23 11:43:00 +01:00
Bastian Mäuser
a885c9fbf9 Added Option to select ssh cipher and set decent default for it 2024-02-23 11:38:11 +01:00
root
694396255d Add support for ceph erasure-coded pools, fix a bug, when a pool was called pool 2024-02-22 17:39:36 +01:00
Bastian
8cd0472cba Fix Documentation typo 2024-01-15 12:04:50 +01:00
Bastian Mäuser
e0d1814c15 doc: Clarify on preflight checks 2023-08-09 10:56:51 +02:00
Bastian
8467bcd08e improvement: precise wording 2023-08-04 16:05:56 +02:00
Bastian
48eb3f840e fix: missing vm_id in log message 2023-08-04 15:59:07 +02:00
Bastian
514d19b9f6 added: retrieve ceph versions for compatibility checks 2023-08-04 15:54:37 +02:00
Bastian
a6e1f9342a added: support for EFI Disks 2023-08-04 15:36:18 +02:00
Bastian
59b8ab5ce2 added: default pool, feature: confirm --migrate, add: --noconfirm 2023-08-04 13:38:26 +02:00
Bastian
4bfd79e79e improved: retrieve source/destination cluster name for better insights 2023-07-13 15:18:24 +02:00
Bastian
6e8eb7ce2b fixed: preflight checks 2023-07-13 14:45:55 +02:00
Bastian
be88cb4d40 fixed: perf_vm_stopped++ never counted. 2023-07-13 13:54:20 +02:00
Bastian
1343dc6b51 fixed: Correct target host now displayed in log messsage, Add downtime counter 2023-07-13 13:51:58 +02:00
Bastian
5ce325beec Strip ansi color codes from syslog and mail 2023-06-13 16:29:06 +02:00
Bastian
b8d2386e69 Added Logging by mail functionality, added --mail parameter, added logfilehandling 2023-06-13 16:13:23 +02:00
Bastian
a5ea397d11 bump version 2023-06-13 14:33:20 +02:00
Bastian
36dabe9d79 fixed several linting issues 2023-06-13 14:21:06 +02:00
Bastian
284cfb6e76 Added features to README, minor wording changes 2023-04-26 15:41:13 +02:00
Bastian
5b7fd4986b Add preflight check: pool_exists 2023-03-23 16:14:04 +01:00
Bastian
41abd0429a Fix Regex to exclude cloud-init drive 2023-03-23 15:46:58 +01:00
Bastian
890567ad05 Remove vm from ha group before shutting down on migration 2023-03-23 14:10:55 +01:00
Bastian
f5441f4c0b Sanitize cloud-init drives from the config 2023-03-22 15:22:36 +01:00
Bastian
fb5b3a6d09 Merge pull request #1 from lephisto/feature-move
Add --migrate feature: near-live migrate between clusters
2023-03-22 14:42:05 +01:00
Bastian
5bf37e886c Add --migrate feature: near-live migrate between clusters 2023-03-22 14:40:01 +01:00
Bastian
010f04c412 make --vmids=num vorking with --prefixids, bump version 2022-12-06 14:20:28 +01:00
2 changed files with 377 additions and 107 deletions

README.md (103 changes)

@@ -2,7 +2,7 @@
[![License](https://img.shields.io/github/license/EnterpriseVE/eve4pve-barc.svg)](https://www.gnu.org/licenses/gpl-3.0.en.html)
Cross-Pool asynchronous online-replication and near-live migration for Proxmox VE
```text
@@ -11,7 +11,7 @@ ______
| --| _| . |_ -|_ -| . | | | -_| _|
|_____|_| |___|___|___|___|\_/|___|_|
Cross Pool asynchronous online-replication and near-live migration for Proxmox VE
Usage:
crossover <COMMAND> [ARGS] [OPTIONS]
@@ -25,6 +25,7 @@ Commands:
mirror Replicate a stopped VM to another Cluster (full clone)
Options:
--sshcipher SSH Cipher to use for transfer (default: aes128-gcm@openssh.com,aes128-cbc)
--vmid The source+target ID of the VM, comma separated (eg. --vmid=100:100,101:101)
(The possibility to specify a different Target VMID is to not interfere with VMIDs on the
target cluster, or mark mirrored VMs on the destination)
@@ -86,10 +87,12 @@ It'll work according this scheme:
* Retention policy: (eg. keep x snapshots on the source and y snapshots in the destination cluster)
* Rewrites VM configurations so they match the new VMID and/or poolname on the destination
* Secure an encrypted transfer (SSH), so it's safe to mirror between datacenters without an additional VPN
* Near live-migrate: To move a VM from one Cluster to another, make an initial copy and re-run with --migrate. This will shut down the VM on the source cluster and start it on the destination cluster.
## Installation of prerequisites
```
apt install git pv gawk jq curl
```
## Install the Script somewhere, eg to /opt
@@ -99,9 +102,9 @@ git clone https://github.com/lephisto/crossover/ /opt
Ensure that you can freely ssh from the Node you plan to mirror _from_ to _all_ nodes in the destination cluster, as well as localhost.
## Continuous replication between Clusters
Example 1: Mirror VM to another Cluster:
```
root@pve01:~/crossover# ./crossover mirror --vmid=all --prefixid=99 --excludevmids=101 --destination=pve04 --pool=data2 --overwrite --online
@@ -136,21 +139,102 @@ Full xmitted..........: 0 byte
Differential Bytes ...: 372.96 KiB
```
This example creates a mirror of VM 100 (in the source cluster) as VM 10100 (in the destination cluster) using the ceph pool "data2" for storing all attached disks. It will keep 4 Ceph snapshots prior to the latest (in total 5) and 8 snapshots on the remote cluster. It will keep the VM on the target Cluster locked to avoid an accidental start (thus causing split-brain issues), and will do so even if the source VM is running.
The use case is that you might want to keep a cold-standby copy of a certain VM on another Cluster. If you need to start it on the target cluster you just have to unlock it with `qm unlock VMID` there.
Another use case could be that you want to migrate a VM from one cluster to another with the least downtime possible. Real live migration as you are used to inside one cluster is hard to achieve cross-cluster, but you can easily make an initial migration while the VM is still running on the source cluster (fully transferring the block devices), shut it down on the source, run the mirror process again (which is much faster now because it only needs to transfer the diff since the initial snapshot) and start it up on the target cluster. This way the migration basically takes one boot plus a few seconds for transferring the incremental snapshot.
## Near-live Migration
To minimize downtime and achieve a near-live migration from one Cluster to another it's recommended to do an initial sync of a VM from the source to the destination cluster. After that, run the job again and add the --migrate switch. This causes the source VM to be shut down prior to snapshot + transfer, and to be restarted on the destination cluster as soon as the incremental transfer is complete. Using --migrate will always try to start the VM on the destination cluster.
Example 2: Near-live migrate a VM from one cluster to another (run the initial replication first, which works online, then run with --migrate to shut down on the source, incrementally copy and start on the destination):
```
root@pve01:~/crossover# ./crossover mirror --jobname=migrate --vmid=100 --destination=pve04 --pool=data2 --online
ACTION: Onlinemirror
Start mirror 2023-04-26 15:02:24
VM 100 - Starting mirror for testubuntu
VM 100 - Checking for VM 100 on destination cluster pve04 /etc/pve/nodes/*/qemu-server
VM 100 - Transmitting Config for to destination pve04 VMID 100
VM 100 - locked 100 [rc:0] on source
VM 100 - locked 100 [rc:0] on destination
VM 100 - Creating snapshot data/vm-100-disk-0@mirror-20230426150224
VM 100 - Creating snapshot data/vm-100-disk-1@mirror-20230426150224
VM 100 - unlocked source VM 100 [rc:0]
VM 100 - F data/vm-100-disk-0@mirror-20230426150224: e:0:09:20 r: c:[36.6MiB/s] a:[36.6MiB/s] 20.0GiB [===============================>] 100%
VM 100 - created snapshot on 100 [rc:0]
VM 100 - Disk Summary: Took 560 Seconds to transfer 20.00 GiB in a full run
VM 100 - F data/vm-100-disk-1@mirror-20230426150224: e:0:00:40 r: c:[50.7MiB/s] a:[50.7MiB/s] 2.00GiB [===============================>] 100%
VM 100 - created snapshot on 100 [rc:0]
VM 100 - Disk Summary: Took 40 Seconds to transfer 22.00 GiB in a full run
VM 100 - Unlocking destination VM 100
Finnished mirror 2023-04-26 15:13:47
Job Summary: Bytes transferred 22.00 GiB for 2 Disks on 1 VMs in 00 hours 11 minutes 23 seconds
VM Freeze OK/failed.......: 1/0
RBD Snapshot OK/failed....: 2/0
RBD export-full OK/failed.: 2/0
RBD export-diff OK/failed.: 0/0
Full xmitted..............: 22.00 GiB
Differential Bytes .......: 0 Bytes
root@pve01:~/crossover# ./crossover mirror --jobname=migrate --vmid=100 --destination=pve04 --pool=data2 --online --migrate
ACTION: Onlinemirror
Start mirror 2023-04-26 15:22:35
VM 100 - Starting mirror for testubuntu
VM 100 - Checking for VM 100 on destination cluster pve04 /etc/pve/nodes/*/qemu-server
VM 100 - Migration requested, shutting down VM on pve01
VM 100 - locked 100 [rc:0] on source
VM 100 - locked 100 [rc:0] on destination
VM 100 - Creating snapshot data/vm-100-disk-0@mirror-20230426152235
VM 100 - Creating snapshot data/vm-100-disk-1@mirror-20230426152235
VM 100 - I data/vm-100-disk-0@mirror-20230426152235: e:0:00:03 c:[1.29MiB/s] a:[1.29MiB/s] 4.38MiB
VM 100 - Housekeeping: localhost data/vm-100-disk-0, keeping Snapshots for 0s
VM 100 - Removing Snapshot localhost data/vm-100-disk-0@mirror-20230323162532 (2930293s) [rc:0]
VM 100 - Removing Snapshot localhost data/vm-100-disk-0@mirror-20230426144911 (2076s) [rc:0]
VM 100 - Removing Snapshot localhost data/vm-100-disk-0@mirror-20230426145632 (1637s) [rc:0]
VM 100 - Removing Snapshot localhost data/vm-100-disk-0@mirror-20230426145859 (1492s) [rc:0]
VM 100 - Removing Snapshot localhost data/vm-100-disk-0@mirror-20230426150224 (1290s) [rc:0]
VM 100 - Housekeeping: pve04 data2/vm-100-disk-0-data, keeping Snapshots for 0s
VM 100 - Removing Snapshot pve04 data2/vm-100-disk-0-data@mirror-20230426150224 (1293s) [rc:0]
VM 100 - Disk Summary: Took 4 Seconds to transfer 4.37 MiB in a incremental run
VM 100 - I data/vm-100-disk-1@mirror-20230426152235: e:0:00:00 c:[ 227 B/s] a:[ 227 B/s] 74.0 B
VM 100 - Housekeeping: localhost data/vm-100-disk-1, keeping Snapshots for 0s
VM 100 - Removing Snapshot localhost data/vm-100-disk-1@mirror-20230323162532 (2930315s) [rc:0]
VM 100 - Removing Snapshot localhost data/vm-100-disk-1@mirror-20230426144911 (2098s) [rc:0]
VM 100 - Removing Snapshot localhost data/vm-100-disk-1@mirror-20230426145632 (1659s) [rc:0]
VM 100 - Removing Snapshot localhost data/vm-100-disk-1@mirror-20230426145859 (1513s) [rc:0]
VM 100 - Removing Snapshot localhost data/vm-100-disk-1@mirror-20230426150224 (1310s) [rc:0]
VM 100 - Housekeeping: pve04 data2/vm-100-disk-1-data, keeping Snapshots for 0s
VM 100 - Removing Snapshot pve04 data2/vm-100-disk-1-data@mirror-20230426150224 (1313s) [rc:0]
VM 100 - Disk Summary: Took 2 Seconds to transfer 4.37 MiB in a incremental run
VM 100 - Unlocking destination VM 100
VM 100 - Starting VM on pve01
Finnished mirror 2023-04-26 15:24:25
Job Summary: Bytes transferred 4.37 MiB for 2 Disks on 1 VMs in 00 hours 01 minutes 50 seconds
VM Freeze OK/failed.......: 0/0
RBD Snapshot OK/failed....: 2/0
RBD export-full OK/failed.: 0/0
RBD export-diff OK/failed.: 2/0
Full xmitted..............: 0 Bytes
Differential Bytes .......: 4.37 MiB
```
## Things to check
From the Proxmox VE hosts you want to back up, you need to be able to ssh passwordless to all other cluster hosts that may hold VMs or Containers. This goes for the source and for the destination Cluster. Double-check this.
This is required for using the freeze/unfreeze and the lock/unlock functions, which have to be called locally on the host the guest is currently running on. Usually this works out of the box for the source cluster, but you may want to make sure that you can "ssh root@pvehost1...n" from every host to every other host in the cluster.
For the Destination Cluster you need to copy your ssh-key to the first host in the cluster, and log in once to every node in your cluster.
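A small dry-run sketch of seeding those keys (the node names dst-pve01…03 are placeholders, not from this repository); it only prints the key-distribution commands so you can review them before running anything:

```shell
# Placeholder destination nodes -- replace with your real hostnames.
dst_nodes="dst-pve01 dst-pve02 dst-pve03"
cmds=""
for node in $dst_nodes; do
  # ssh-copy-id seeds the key; the trailing 'ssh ... true' verifies the login works
  cmds="$cmds
ssh-copy-id root@$node && ssh root@$node true"
done
echo "$cmds"
```

Once the printed commands look right for your environment, run them manually from the source node you will mirror from.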
Currently the preflight checks do not verify that the destination cluster has enough resources. Check beforehand that you don't exceed the maximum safe capacity of Ceph in the destination cluster.
## Unique Disk names
There are cases where the source VM has disks on different Ceph pools. In theory you can then have identical image names for different disks. Since all disk images are migrated to one destination pool, their names need to be unique. The preflight checks detect this, skip the affected VMs and issue a warning. To solve this, give the images unique names, like vm-100-disk-0, vm-100-disk-1 and so on. `rbd mv` will help you.
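The collision the preflight check looks for can be reproduced with a toy disk list (pool and image names invented for illustration): count total image names versus unique names with the pool prefix stripped; a non-zero difference means the VM would be skipped.

```shell
# Two pools both holding a 'vm-100-disk-0' image -> one name collision.
disks="data:vm-100-disk-0
ssd:vm-100-disk-0
data:vm-100-disk-1"
total=$(echo "$disks" | wc -l)
unique=$(echo "$disks" | cut -d ':' -f 2 | sort | uniq | wc -l)
collisions=$((total - unique))
echo "$collisions"
```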
## Some words about Snapshot consistency and what qemu-guest-agent can do for you
@@ -292,4 +376,5 @@ Ceph Documentation:
[rbd manage rados block device (rbd) images](http://docs.ceph.com/docs/master/man/8/rbd/)
Proxmox Wiki:
https://pve.proxmox.com/wiki/

crossover (381 changes)

@@ -1,5 +1,8 @@
#!/bin/bash
# Cross Pool Migration and incremental replication Tool for Proxmox VMs using Ceph.
# Author: Bastian Mäuser <bma@netz.org>
LC_ALL="en_US.UTF-8"
source rainbow.sh
@@ -13,13 +16,11 @@ declare opt_influx_jobname=''
declare opt_influx_job_metrics='crossover_xmit'
declare opt_influx_summary_metrics='crossover_jobs'
name=$(basename "$0")
# readonly variables
declare -r NAME=$name
declare -r VERSION=0.9
declare -r PROGNAME=${NAME%.*}
declare -r PVE_DIR="/etc/pve"
declare -r PVE_NODES="$PVE_DIR/nodes"
declare -r QEMU='qemu-server'
@@ -27,14 +28,21 @@ declare -r QEMU_CONF_CLUSTER="$PVE_NODES/*/$QEMU"
declare -r EXT_CONF='.conf'
declare -r PVFORMAT_FULL='e:%t r:%e c:%r a:%a %b %p'
declare -r PVFORMAT_SNAP='e:%t c:%r a:%a %b'
logfile=$(mktemp)
declare -r LOG_FILE=$logfile
# associative global arrays
declare -A -g pvnode
declare -A -g dstpvnode
declare -A -g svmids
declare -A -g dvmids
declare -g scluster
declare -g dcluster
declare -g scephversion
declare -g dcephversion
# global integers
declare -g -i perf_freeze_ok=0
declare -g -i perf_freeze_failed=0
declare -g -i perf_ss_ok=0
@@ -51,11 +59,17 @@ declare -g -i perf_bytes_total=0
declare -g -i perf_vm_running=0
declare -g -i perf_vm_stopped=0
declare -g -i perf_snaps_removed=0
declare -g -i perf_vm_total=0
declare -g -i perf_vm_ok=0
# commandline parameters
declare opt_destination
declare opt_vm_ids=''
declare opt_snapshot_prefix='mirror-'
declare opt_rewrite=''
declare opt_pool='rbd'
declare opt_sshcipher='aes128-gcm@openssh.com,aes128-cbc'
declare opt_tag=''
declare -i opt_prefix_id
declare opt_exclude_vmids=''
declare -i opt_debug=0
@@ -66,6 +80,8 @@ declare -i opt_keepslock=0
declare -i opt_keepdlock=0
declare -i opt_overwrite=0
declare -i opt_online=0
declare -i opt_migrate=0
declare -i opt_noconfirm=0
declare opt_keep_local='0s'
declare opt_keep_remote='0s'
@@ -74,6 +90,7 @@ declare -r redstconf='^\/etc\/pve\/nodes\/(.*)\/qemu-server\/([0-9]+).conf$'
declare -r recephimg='([a-zA-Z0-9]+)\:(.*)'
declare -r restripsnapshots='/^$/,$d'
declare -r redateex='^([0-9]{4})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})$'
declare -r restripansicolor='s/\x1b\[[0-9;]*m//g'
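The new ANSI-stripping expression can be exercised standalone (GNU sed is assumed, since the pattern relies on the `\x1b` escape); the sample red prefix below imitates what a colorizing helper like rainbow.sh would emit:

```shell
restripansicolor='s/\x1b\[[0-9;]*m//g'
# A colored 'ERROR:' prefix, stripped down to plain text for the logfile/syslog
plain=$(printf '\033[31mERROR:\033[0m disk full' | sed -e "$restripansicolor")
echo "$plain"
# → ERROR: disk full
```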
function usage(){
shift
@@ -103,11 +120,13 @@ Commands:
mirror Replicate a stopped VM to another Cluster (full clone)
Options:
--sshcipher SSH Cipher to use for transfer (default: aes128-gcm@openssh.com,aes128-cbc)
--tag Include all VMs with a specific tag set in the Proxmox UI (if set, implies vmid=all)
--vmid The source+target ID of the VM/CT, comma separated (eg. --vmid=100:100,101:101), or all for all
--prefixid Prefix for VMID's on target System [optional]
--excludevmids Exclude VM IDs when using --vmid=all
--destination Target PVE Host in target pool. e.g. --destination=pve04
--pool Ceph pool name in target pool. e.g. --pool=data [default=rbd]
--keeplocal How many additional Snapshots to keep locally. e.g. --keeplocal=2d
--keepremote How many additional Snapshots to keep remote. e.g. --keepremote=7d
--rewrite PCRE Regex to rewrite the Config Files (eg. --rewrite='s/(net0:)(.*)tag=([0-9]+)/\1\2tag=1/g' would
@@ -116,18 +135,20 @@ Options:
--influxtoken Influx API token with write permission
--influxbucket Influx Bucket to write to (e.g. --influxbucket=telegraf/autogen)
--jobname Descriptive name for the job, used in Statistics
--mail Mail address to send report to, comma-separated (e.g. --mail=admin@test.com,admin2@test.com)
Switches:
--online Allow online Copy
--migrate Stop VM on Source Cluster before final Transfer and start on destination Cluster
--nolock Don't lock source VM on Transfer (mainly for test purposes)
--keep-slock Keep source VM locked on Transfer
--keep-dlock Keep VM locked after transfer on Destination
--overwrite Overwrite Destination
--noconfirm Don't ask for confirmation before starting --migrate mode (use with care!)
--debug Show Debug Output
Report bugs to <mephisto@mephis.to>
EOF
exit 1
}
function parse_opts(){
@@ -137,7 +158,7 @@ function parse_opts(){
local args
args=$(getopt \
--options '' \
--longoptions=sshcipher:,tag:,vmid:,prefixid:,excludevmids:,destination:,pool:,keeplocal:,keepremote:,rewrite:,influxurl:,influxorg:,influxtoken:,influxbucket:,jobname:,mail:,online,migrate,nolock,keep-slock,keep-dlock,overwrite,dry-run,noconfirm,debug,syslog \
--name "$PROGNAME" \
-- "$@") \
|| end_process 128
@@ -146,6 +167,8 @@ function parse_opts(){
while true; do
case "$1" in
--sshcipher) opt_sshcipher=$2; shift 2;;
--tag) opt_tag=$2; shift 2;;
--vmid) opt_vm_ids=$2; shift 2;;
--prefixid) opt_prefix_id=$2; shift 2;;
--excludevmids) opt_exclude_vmids=$2; shift 2;;
@@ -159,14 +182,17 @@ function parse_opts(){
--influxtoken) opt_influx_token=$2; shift 2;;
--influxbucket) opt_influx_bucket=$2; shift 2;;
--jobname) opt_influx_jobname=$2; shift 2;;
--mail) opt_addr_mail="$2"; shift 2;;
--online) opt_online=1; shift ;;
--migrate) opt_migrate=1; shift ;;
--dry-run) opt_dry_run=1; shift;;
--noconfirm) opt_noconfirm=1; shift;;
--debug) opt_debug=1; shift;;
--nolock) opt_lock=0; shift;;
--keep-slock) opt_keepslock=1; shift;;
--keep-dlock) opt_keepdlock=1; shift;;
--overwrite) opt_overwrite=1; shift;;
--syslog) opt_syslog=1; shift;;
--) shift; break;;
*) break;;
esac
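A minimal standalone reproduction of this getopt pattern (demo option names only; the enhanced getopt from util-linux is assumed). Note that the flag cases use a plain `shift`: the diff fixes exactly this for `--online`, whose earlier `shift 2` would have swallowed the following argument.

```shell
# Parse two demo long options the same way parse_opts does.
args=$(getopt --options '' --longoptions=vmid:,online --name demo -- --vmid=100:100 --online)
eval set -- "$args"
vmid=''; online=0
while true; do
  case "$1" in
    --vmid) vmid=$2; shift 2;;   # option with argument consumes two words
    --online) online=1; shift;;  # bare flag consumes one word
    --) shift; break;;
  esac
done
echo "$vmid $online"
# → 100:100 1
```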
@@ -179,12 +205,11 @@ function parse_opts(){
log info "============================================"
log info "Proxmox VE Version:"
echowhite "$(pveversion)"
log info "============================================"
fi
[ -z "$opt_vm_ids" ] && { log info "VM id is not set."; end_process 1; }
[ -z "$opt_influx_jobname" ] && { log info "Jobname is not set."; end_process 1; }
@@ -202,12 +227,28 @@ function parse_opts(){
fi
fi
if [ $opt_keepdlock -eq 1 ] && [ $opt_migrate -eq 1 ]; then
log error "--keepdlock/--migrate: Invalid parameter Combination: you can't keep the destination locked in near-live migration mode"
end_process 255
fi
if [ -n "$opt_tag" ] && [ -n "$opt_vm_ids" ] && [ "$opt_vm_ids" != "all" ]; then
log error "You can't use --tag and --vmid at the same time"
end_process 255
fi
[ -n "$opt_tag" ] && [ -z "$opt_vm_ids" ] && opt_vm_ids="all"
[ -z "$opt_vm_ids" ] && { log info "VM id is not set."; end_process 1; }
if [ "$opt_vm_ids" = "all" ]; then
local all=''
local data=''
local cnt=''
local ids=''
all=$(get_vm_ids "$QEMU_CONF_CLUSTER/*$EXT_CONF" "$LXC_CONF_CLUSTER/*$EXT_CONF")
log debug "all: $all"
all=$(echo "$all" | tr ',' "\n")
opt_exclude_vmids=$(echo "$opt_exclude_vmids" | tr ',' "\n")
for id in $all; do
@@ -218,8 +259,17 @@ function parse_opts(){
done
vm_ids=$(echo "$vm_ids" | tr ',' "\n")
else
if [ ! -z $opt_prefix_id ]; then
ids=$(echo "$opt_vm_ids" | tr ',' "\n")
for id in $ids; do
vm_ids=$(echo "$vm_ids$id:$opt_prefix_id$id,")
done
vm_ids=$(echo "$vm_ids" | tr ',' "\n")
else
vm_ids=$(echo "$opt_vm_ids" | tr ',' "\n")
fi
fi
}
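The --prefixid mapping above can be traced with a small standalone loop (the IDs and prefix are made-up demo values): each source VMID becomes a `source:prefix+id` pair for the target cluster.

```shell
opt_vm_ids='100,101'; opt_prefix_id=99; vm_ids=''
for id in $(echo "$opt_vm_ids" | tr ',' '\n'); do
  # 100 with prefix 99 -> pair 100:99100 (source VMID : target VMID)
  vm_ids="$vm_ids$id:$opt_prefix_id$id,"
done
pairs=$(echo "$vm_ids" | tr ',' '\n')
echo "$pairs"
```

Note the prefix is string concatenation, not arithmetic, so target VMIDs simply gain leading digits.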
human_readable() {
@@ -263,7 +313,14 @@ function exist_file(){
function lookupcephpool() {
pvehost=$1
pvepoolname=$2
res=$(ssh $pvehost cat /etc/pve/storage.cfg | sed -n "/rbd: $pvepoolname/,/^$/p" | grep -E "\s+pool\s" | cut -d " " -f 2)
echo $res
}
function lookupdatapool() {
pvehost=$1
pvepoolname=$2
res=$(ssh $pvehost cat /etc/pve/storage.cfg | sed -n "/rbd: $pvepoolname/,/^$/p" | grep -E "\s+data-pool\s" | cut -d " " -f 2)
echo $res
}
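The stricter `grep -E "\s+pool\s"` matters because a `data-pool` line also contains the substring `pool`. A toy storage.cfg fragment (pool names invented, layout assumed to match /etc/pve/storage.cfg for an erasure-coded storage) shows the pipeline resolving only the metadata pool:

```shell
# Tab-indented storage.cfg-style fragment for an EC-backed rbd storage.
cfg=$(printf 'rbd: data2\n\tpool ec_meta\n\tdata-pool ec_data\n\tcontent images\n')
# Same pipeline as lookupcephpool, minus the ssh hop: the \s+pool\s anchor
# skips the data-pool line, which a bare 'grep pool' would also have matched.
pool=$(echo "$cfg" | sed -n "/rbd: data2/,/^$/p" | grep -E "\s+pool\s" | cut -d " " -f 2)
echo "$pool"
# → ec_meta
```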
@@ -274,7 +331,9 @@ function get_vm_ids(){
while [ $# -gt 0 ]; do
for conf in $1; do
[ ! -e "$conf" ] && break
if [ -n "$opt_tag" ] && ! grep -qE "^tags:\s.*$opt_tag(;|$)" $conf; then
continue
fi
conf=$(basename "$conf")
[ "$data" != '' ] && data="$data,"
data="$data${conf%.*}"
@@ -285,20 +344,6 @@ function get_vm_ids(){
echo "$data"
}
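The `tags:` pattern used by the new --tag filter can be checked against sample config lines (the tag name `replicate` is hypothetical); the `(;|$)` alternation is what keeps a tag from matching a longer tag that merely starts with the same text:

```shell
opt_tag='replicate'
# matches: exact tag inside a semicolon-separated tag list
echo 'tags: replicate;prod' | grep -qE "^tags:\s.*$opt_tag(;|$)" && hit=1 || hit=0
# does not match: 'replicate2' fails the (;|$) boundary
echo 'tags: replicate2' | grep -qE "^tags:\s.*$opt_tag(;|$)" && hit2=1 || hit2=0
echo "$hit $hit2"
# → 1 0
```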
function get_config_file(){
local file_config=''
if exist_file "$QEMU_CONF_CLUSTER/$vm_id$EXT_CONF"; then
file_config=$(ls $QEMU_CONF_CLUSTER/$vm_id$EXT_CONF)
else
log error "VM $vm_id - Unknown technology or VMID not found: $QEMU_CONF_CLUSTER/$vm_id$EXT_CONF"
end_process 128
fi
echo "$file_config"
}
function get_disks_from_config(){
local disks;
local file_config=$1
@@ -310,7 +355,7 @@ function get_disks_from_config(){
[[ "$line" == "" ]] && break
echo "$line"
done < "$file_config" | \
grep -P '^(?:((?:efidisk|virtio|ide|scsi|sata|mp)\d+)|rootfs): ' | \
grep -v -P 'cdrom|none' | \
grep -v -P 'backup=0' | \
awk '{ split($0,a,","); split(a[1],b," "); print b[2]}')
@@ -318,10 +363,35 @@ function get_disks_from_config(){
echo "$disks"
}
function check_unique_disk_config() {
local file_config=$1
disks=$(while read -r line; do
[[ "$line" == "" ]] && break
echo "$line"
done < "$file_config" | \
grep -P '^(?:((?:efidisk|virtio|ide|scsi|sata|mp)\d+)|rootfs): ' | \
grep -v -P 'cdrom|none' | \
grep -v -P 'backup=0' | \
awk '{ split($0,a,","); split(a[1],b," "); print b[2]}'| wc -l)
uniquedisks=$(while read -r line; do
[[ "$line" == "" ]] && break
echo "$line"
done < "$file_config" | \
grep -P '^(?:((?:efidisk|virtio|ide|scsi|sata|mp)\d+)|rootfs): ' | \
grep -v -P 'cdrom|none' | \
grep -v -P 'backup=0' | \
awk '{ split($0,a,","); split(a[1],b," "); print b[2]}'|cut -d ':' -f 2 | sort -nr | uniq | wc -l)
# TBD: ^(vm|ct)-([0-9]+)-([a-z]+)-[\d]+.*$
difference=$(expr $disks - $uniquedisks)
echo "$difference"
}
function log(){
local level=$1
shift 1
local message=$*
local syslog_msg=''
case $level in
debug) debug)
@@ -333,28 +403,32 @@ function log(){
info)
echo -e "$message";
echo -e "$message" | sed -e 's/\x1b\[[0-9;]*m//g' >> "$LOG_FILE";
syslog_msg=$(echo -e "$message" | sed -e ${restripansicolor})
[ $opt_syslog -eq 1 ] && logger -t "$PROGNAME" "$syslog_msg"
;;
warn)
echo -n "$(echoyellow 'WARNING: ')"
echowhite "$message" 1>&2
echo -e "$message" | sed -e ${restripansicolor} >> "$LOG_FILE";
syslog_msg=$(echo -e "$message" | sed -e ${restripansicolor})
[ $opt_syslog -eq 1 ] && logger -t "$PROGNAME" -p daemon.warn "$syslog_msg"
;;
error)
echo -n "$(echored 'ERROR: ')"
echowhite "$message" 1>&2
echo -e "$message" | sed -e ${restripansicolor} >> "$LOG_FILE";
syslog_msg=$(echo -e "$message" | sed -e ${restripansicolor})
[ $opt_syslog -eq 1 ] && logger -t "$PROGNAME" -p daemon.err "$syslog_msg"
;;
*)
echo "$message" 1>&2
echo -e "$message" | sed -e ${restripansicolor} >> "$LOG_FILE";
syslog_msg=$(echo -e "$message" | sed -e ${restripansicolor})
[ $opt_syslog -eq 1 ] && logger -t "$PROGNAME" "$syslog_msg"
;;
esac
}
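The ANSI-stripping used before a message reaches the logfile or syslog can be checked in isolation. A sketch assuming `restripansicolor` holds the same sed expression shown above (GNU sed's `\x1b` escape):

```shell
# Colored message as the echoyellow/echored helpers would produce it.
msg=$(printf '\033[33mWARNING:\033[0m disk full')
# Strip all ANSI SGR sequences, as done for the logfile and syslog lines.
plain=$(printf '%s\n' "$msg" | sed -e 's/\x1b\[[0-9;]*m//g')
echo "$plain"
```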
@@ -386,6 +460,11 @@ function mirror() {
local -i endjob
local -i vmcount=0
local -i diskcount=0
local -i vmdiskcount=0
local -i skipped_vm_count=0
local -i startdowntime
local -i enddowntime
local -i ga_ping
local disp_perf_freeze_failed
local disp_perf_ss_failed
@@ -396,6 +475,9 @@ function mirror() {
log info "Start mirror $(date "+%F %T")"
startjob=$(date +%s)
get_ceph_version
log info "Local Ceph Version: $scephversion, Remote Ceph Version: $dcephversion"
#create pid file
local pid_file="/var/run/$PROGNAME.pid"
if [[ -e "$pid_file" ]]; then
@@ -410,32 +492,69 @@ function mirror() {
end_process 1
fi
scluster=$(grep cluster_name /etc/pve/corosync.conf | cut -d " " -f 4)
dcluster=$(ssh "$opt_destination" grep cluster_name /etc/pve/corosync.conf | cut -d " " -f 4)
if [ $opt_migrate -eq 1 ] && [ $opt_noconfirm -eq 0 ]; then
echo "VM(s) $opt_vm_ids will subsequently be shut down on [$scluster] and started on [$dcluster]"
read -r -p "Do you want to proceed? (yes/no) " yn
case $yn in
yes ) echo "OK, proceeding";;
no ) echo "Exiting...";
exit;;
* ) echo "Invalid response";
exit 1;;
esac
fi
map_source_to_destination_vmid
map_vmids_to_host
map_vmids_to_dsthost "$opt_destination"
if [ "$(check_pool_exist "$opt_pool")" -eq 0 ]; then
log error "Preflight check: Destination RBD-Pool $opt_pool does not exist."
end_process 255
fi
for vm_id in $svmids; do
file_config="$PVE_NODES/${pvnode[$vm_id]}/$QEMU/$vm_id.conf"
if [[ $(check_unique_disk_config "$file_config") -ge 1 ]]; then
log error "VM $vm_id - Preflight check: duplicate disk entries found - skipping to next VM. See the documentation on how to avoid this."
(( skipped_vm_count++ ))
continue
fi
if ! exist_file "$file_config"; then
log error "VM $vm_id - Preflight check: VM does not exist on source cluster [$scluster] - skipping to next VM."
(( skipped_vm_count++ ))
continue
fi
ga_ping=$(gaping "$vm_id")
log debug "ga_ping: $ga_ping"
if [ "$ga_ping" -eq 255 ]; then #vm running but no qemu-guest-agent answering
log error "VM $vm_id - Preflight check: no qemu-guest-agent answering on source cluster [$scluster] - skipping to next VM."
(( skipped_vm_count++ ))
continue
fi
(( vmcount++ ))
local disk=''
dvmid=${dvmids[$vm_id]}
vmname=$(cat $PVE_NODES/"${pvnode[$vm_id]}"/$QEMU/"$vm_id".conf | sed -e ''$restripsnapshots'' | grep "name\:" | cut -d' ' -f 2)
log info "VM $vm_id - Starting mirror for $(echowhite "$vmname")"
srcvmgenid=$(cat $PVE_NODES/"${pvnode[$vm_id]}"/$QEMU/"$vm_id".conf | sed -e ''$restripsnapshots'' | grep vmgenid | sed -r -e 's/^vmgenid:\s(.*)/\1/')
dstvmgenid=$(ssh "$opt_destination" cat $PVE_NODES/"${dstpvnode[$dvmid]}"/$QEMU/"$dvmid".conf 2>/dev/null | grep vmgenid | sed -e ''$restripsnapshots'' | sed -r -e 's/^vmgenid:\s(.*)/\1/')
log info "VM $vm_id - Checking for VM $dvmid on destination cluster $opt_destination $QEMU_CONF_CLUSTER"
log debug "DVMID:$dvmid srcvmgenid:$srcvmgenid dstvmgenid:$dstvmgenid"
conf_on_destination=$(ssh "$opt_destination" "ls -d $QEMU_CONF_CLUSTER/$dvmid$EXT_CONF 2>/dev/null")
[[ "$conf_on_destination" =~ $redstconf ]]
host_on_destination=${BASH_REMATCH[1]}
if [ -n "$host_on_destination" ]; then
dststatus=$(ssh root@${dstpvnode[$dvmid]} qm status $dvmid | cut -d' ' -f 2)
if [ "$dststatus" == "running" ]; then
log error "VM $dvmid is running on destination cluster [$dcluster]. Bailing out."
end_process 255
fi
fi
@@ -451,62 +570,81 @@ function mirror() {
log error "Source VM genid ($srcvmgenid) doesn't match destination VM genid ($dstvmgenid). This should not happen. Bailing out."
end_process 255
fi
log info "VM $vm_id - Transmitting config to destination $opt_destination VMID $dvmid"
rewriteconfig $PVE_NODES/"${pvnode[$vm_id]}"/$QEMU/"$vm_id".conf $opt_destination "$opt_pool" $PVE_NODES/"$opt_destination"/$QEMU/"$dvmid".conf "$dvmid"
map_vmids_to_dsthost "$opt_destination"
fi
#--migrate requested: shut down the VM and remove it from its HA group
if [ $opt_migrate -eq 1 ]; then
log info "VM $vm_id - Migration requested, shutting down VM on ${pvnode[$vm_id]}"
if [ "$(get_ha_status "$vm_id")" == "started" ]; then
log info "VM $vm_id - removing from HA"
do_run "ha-manager remove $vm_id"
fi
do_run "ssh root@${pvnode[$vm_id]} qm shutdown $vm_id >/dev/null"
startdowntime=$(date +%s)
fi
#Lock on source + destination
if [ $opt_lock -eq 1 ]; then
do_run "ssh root@${pvnode[$vm_id]} qm set $vm_id --lock backup" >/dev/null
log info "VM $vm_id - locked on source [rc:$?]"
do_run "ssh root@${dstpvnode[$dvmid]} qm set $dvmid --lock backup" >/dev/null
log info "VM $dvmid - locked on destination [rc:$?]"
fi
#Freeze fs only if no migration running and qemu-guest-agent okay.
if [ $opt_migrate -eq 0 ] && [ $ga_ping -eq 0 ]; then
vm_freeze "$vm_id" "${pvnode[$vm_id]}" >/dev/null
freezerc=$?
if [ $freezerc -gt 0 ]; then
log warn "VM $vm_id - QEMU-Guest could not fsfreeze on guest."
(( perf_freeze_failed++ ))
else
(( perf_freeze_ok++ ))
fi
fi
for disk in $(get_disks_from_config "$file_config"); do
src_image_spec=$(get_image_spec "$disk")
create_snapshot "$src_image_spec@$opt_snapshot_prefix$timestamp" 2>/dev/null
ssrc=$?
if [ $ssrc -gt 0 ]; then
log warn "VM $vm_id - rbd snap failed."
(( perf_ss_failed++ ))
else
(( perf_ss_ok++ ))
fi
done
if [ $opt_migrate -eq 0 ]; then
vm_unfreeze "$vm_id" "${pvnode[$vm_id]}" >/dev/null
unfreezerc=$?
if [ $unfreezerc -gt 0 ]; then
log error "VM $vm_id - QEMU-Guest could not fsunfreeze on guest."
fi
if [ ! $opt_keepslock -eq 1 ]; then
do_run "ssh root@${pvnode[$vm_id]} qm unlock $vm_id" >/dev/null
log info "VM $vm_id - unlocked source VM [rc:$?]"
fi
fi
for disk in $(get_disks_from_config "$file_config"); do
(( diskcount++ ))
(( vmdiskcount++ ))
src_image_spec=$(get_image_spec "$disk")
log debug "src_image_spec: $src_image_spec"
[ -z "$src_image_spec" ] && continue
dst_image_spec=$(echo $src_image_spec | sed -r -e "s/(.*\/[a-zA-Z0-9]+\-)([0-9]+)(\-[a-zA-Z0-9]+\-[0-9]+)/\1$dvmid\3/")
[ -z "$dst_image_spec" ] && continue
[[ $disk =~ $recephimg ]]
# src_image_pool_pve=${BASH_REMATCH[1]}
src_image_pool=$(lookupcephpool "localhost" ${BASH_REMATCH[1]})
src_image_name=${BASH_REMATCH[2]}
[[ $dst_image_spec =~ ^.*\/(.*)$ ]]
dst_image_name=${BASH_REMATCH[1]} #-$src_image_pool_pve
dst_image_pool=$(lookupcephpool $opt_destination $opt_pool)
dst_data_pool=$(lookupdatapool $opt_destination $opt_pool)
dst_data_opt=''
if [ -n "$dst_data_pool" ]; then
dst_data_opt="--data-pool $dst_data_pool"
fi
snapshot_name="@$opt_snapshot_prefix$timestamp"
localsnapcount=$(rbd ls -l $src_image_pool | grep $src_image_name@$opt_snapshot_prefix | cut -d ' ' -f 1 | wc -l)
if [ $localsnapcount -ge 2 ]; then
@@ -530,7 +668,7 @@ function mirror() {
#snapts=$(echo $currentlocal | sed -r -e 's/.*@mirror-(.*)/\1/')
snapshotsize=$(rbd du --pretty-format --format json $src_image_pool/$src_image_name | jq '.images[] | select (.snapshot_id == null) | {provisioned_size}.provisioned_size' | tail -1)
log debug "snapsize: $snapshotsize"
xmitjob="rbd export --rbd-concurrent-management-ops 8 $src_image_pool/$src_image_name$snapshot_name --no-progress - | tee >({ wc -c; } >/tmp/$PROGNAME.$pid.$dst_image_pool-$dst_image_name.size) | pv -s $snapshotsize -F \"VM $vm_id - F $src_image_pool/$src_image_name$snapshot_name: $PVFORMAT_FULL\" | ssh -c $opt_sshcipher $opt_destination rbd import --image-format 2 - $dst_image_pool/$dst_image_name $dst_data_opt 2>/dev/null"
# create initial snapshot on destination
log debug "xmitjob: $xmitjob"
startdisk=$(date +%s)
@@ -558,7 +696,7 @@ function mirror() {
#disk was not attached, or really nothing has changed..
snapshotsize=0
fi
xmitjob="rbd export-diff --no-progress --from-snap $opt_snapshot_prefix$basets $src_image_pool/$currentlocal - | tee >({ wc -c; } >/tmp/$PROGNAME.$pid.$dst_image_pool-$dst_image_name.size) | pv -F \"VM $vm_id - I $src_image_pool/$src_image_name$snapshot_name: $PVFORMAT_SNAP\" | ssh -c $opt_sshcipher $opt_destination rbd import-diff --no-progress - $dst_image_pool/$dst_image_name"
log debug "xmitjob: $xmitjob"
startdisk=$(date +%s)
do_run "$xmitjob"
@@ -586,11 +724,20 @@ function mirror() {
do_run "$cmd"
fi
unset basets
vmdiskcount=0
done
if [ $opt_keepdlock -eq 0 ]; then
ssh root@${dstpvnode[$dvmid]} qm unlock $dvmid
log info "VM $dvmid - Unlocking destination VM $dvmid"
fi
#--migrate requested: start the VM on the destination
if [ $opt_migrate -eq 1 ]; then
log info "VM $dvmid - Starting VM on node ${dstpvnode[$dvmid]} in cluster [$dcluster]"
do_run "ssh root@${dstpvnode[$dvmid]} qm start $dvmid" >/dev/null
enddowntime=$(date +%s)
log info "VM $dvmid - Downtime: $(( enddowntime - startdowntime )) seconds"
fi
done
endjob=$(date +%s)
log info "Finished mirror $(date "+%F %T")"
@@ -600,12 +747,14 @@ function mirror() {
if [ "$perf_ss_failed" -gt 0 ]; then disp_perf_ss_failed="$(echored $perf_ss_failed)"; else disp_perf_ss_failed="$(echogreen $perf_ss_failed)"; fi
if [ "$perf_full_failed" -gt 0 ]; then disp_perf_full_failed="$(echored $perf_full_failed)"; else disp_perf_full_failed="$(echogreen $perf_full_failed)"; fi
if [ "$perf_diff_failed" -gt 0 ]; then disp_perf_diff_failed="$(echored $perf_diff_failed)"; else disp_perf_diff_failed="$(echogreen $perf_diff_failed)"; fi
if [ "$skipped_vm_count" -gt 0 ]; then disp_skipped_vm_count="$(echored $skipped_vm_count)"; else disp_skipped_vm_count="$(echogreen $skipped_vm_count)"; fi
log info "VM Freeze OK/failed.......: $perf_freeze_ok/$disp_perf_freeze_failed"
log info "RBD Snapshot OK/failed....: $perf_ss_ok/$disp_perf_ss_failed"
log info "RBD export-full OK/failed.: $perf_full_ok/$disp_perf_full_failed"
log info "RBD export-diff OK/failed.: $perf_diff_ok/$disp_perf_diff_failed"
log info "Full xmitted..............: $(human_readable $perf_bytes_full)"
log info "Differential Bytes .......: $(human_readable $perf_bytes_diff)"
log info "Skipped VMs ..............: $disp_skipped_vm_count"
if [ -n "$opt_influx_api_url" ]; then
log info "Logging job summary to InfluxDB: $opt_influx_api_url"
influxlp="$opt_influx_summary_metrics,jobname=$opt_influx_jobname perf_bytes_diff=$perf_bytes_diff""i,perf_bytes_full=$perf_bytes_full""i,perf_bytes_total=$perf_bytes_total""i,perf_diff_failed=$perf_diff_failed""i,perf_diff_ok=$perf_diff_ok""i,perf_freeze_failed=$perf_freeze_failed""i,perf_freeze_ok=$perf_freeze_ok""i,perf_full_failed=$perf_full_failed""i,perf_full_ok=$perf_full_ok""i,perf_ss_failed=$perf_ss_failed""i,perf_ss_ok=$perf_ss_ok""i,perf_vm_running=$perf_vm_running""i,perf_vm_stopped=$perf_vm_stopped""i"
@@ -613,6 +762,8 @@ function mirror() {
cmd="curl --request POST \"$opt_influx_api_url/v2/write?org=$opt_influx_api_org&bucket=$opt_influx_bucket&precision=ns\" --header \"Authorization: Token $opt_influx_token\" --header \"Content-Type: text/plain; charset=utf-8\" --header \"Accept: application/json\" --data-binary '$influxlp'"
do_run "$cmd"
fi
(( perf_vm_ok++ ))
end_process 0
}
function do_housekeeping(){
@@ -654,6 +805,15 @@ function do_housekeeping(){
done
}
function gaping() {
local vmid=$1
local rc
local cmd
cmd="ssh root@${pvnode[$vmid]} qm guest cmd $vmid ping >/dev/null 2>&1"
eval "$cmd"
rc=$?
echo $rc
}
function create_snapshot(){
local snap="$1"
log info "VM $vm_id - Creating snapshot $snap"
@@ -669,8 +829,8 @@ function vm_freeze() {
status=$(ssh root@"$fhost" qm status "$fvm" | cut -d' ' -f 2)
if ! [[ "$status" == "running" ]]; then
log info "VM $fvm - Not running, skipping fsfreeze-freeze"
(( perf_vm_stopped++ ))
return
else
(( perf_vm_running++ ))
fi
@@ -710,7 +870,7 @@ function rewriteconfig(){
else
sedcmd='sed -e /^$/,$d'
fi
cat "$oldconfig" | sed -r -e "s/^(efidisk|virtio|ide|scsi|sata|mp)([0-9]+):\s([a-zA-Z0-9]+):(.*)-([0-9]+)-disk-([0-9]+).*,(.*)$/\1\2: $newpool:\4-$newvmid-disk-\6,\7/g" | $sedcmd | sed -e '/^$/,$d' | sed -e '/ide[0-9]:.*-cloudinit,media=cdrom.*/d' | grep -v "^parent:\s.*$" | ssh "$dst" "cat - >$newconfig"
}
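The disk-line rewrite performed by rewriteconfig can be sanity-checked on a single sample line (pool name and VMIDs below are hypothetical stand-ins for the function's arguments):

```shell
# Target pool and VMID as rewriteconfig would receive them (hypothetical values).
newpool="rbd-backup"
newvmid="9100"
line="scsi0: rbd:vm-100-disk-0,size=32G"
# Same substitution as in rewriteconfig: swap the storage and renumber the image.
rewritten=$(echo "$line" | sed -r -e "s/^(efidisk|virtio|ide|scsi|sata|mp)([0-9]+):\s([a-zA-Z0-9]+):(.*)-([0-9]+)-disk-([0-9]+).*,(.*)$/\1\2: $newpool:\4-$newvmid-disk-\6,\7/g")
echo "$rewritten"
```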
function checkvmid(){
@@ -741,23 +901,25 @@ function do_run(){
function end_process(){
local -i rc=$1;
local -i runtime
local -i bps
local -i ss_total
local subject
if [[ -n "$startjob" && -n "$endjob" ]]; then
runtime=$(( endjob - startjob ))
[ $runtime -gt 0 ] && bps=$(( perf_bytes_total / runtime ))
fi
ss_total=$(( perf_ss_ok + perf_ss_failed ))
subject="Crossover [VM:$perf_vm_ok/$vmcount SS:$perf_ss_ok/$ss_total]"
[ $rc != 0 ] && subject="[ERROR] $subject" || subject="[OK] $subject"
local mail;
for mail in $(echo "$opt_addr_mail" | tr "," "\n"); do
do_run "cat '$LOG_FILE' | mail -s '$subject' '$mail'"
done
#remove log
rm "$LOG_FILE"
exit "$rc";
}
@@ -774,6 +936,29 @@ function get_image_spec(){
echo "$image_spec"
}
function get_ha_status() {
local havmid="$1"
local ha_status
ha_status=$(ha-manager status | grep vm:"$havmid" | cut -d " " -f 4 | sed 's/.$//')
echo "$ha_status"
}
function check_pool_exist() {
local poolname="$1"
local pool_status
local -i exists=255
pool_status=$(ssh "$opt_destination" pvesm status | grep rbd | cut -d " " -f 1 | grep -Fx "$poolname")
if [ "$pool_status" == "$poolname" ]; then
exists=1
else
exists=0
fi
echo $exists
}
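A sketch of the pool-existence check against canned `pvesm status` output (sample lines are hypothetical): with a plain substring grep a neighbouring pool such as `rbd-ec` would also be returned and break the equality comparison, so an exact match (`grep -Fx`) keeps the check reliable:

```shell
# Canned `pvesm status` output with two similarly named rbd pools.
pvesm_out='local dir active
rbd rbd active
rbd-ec rbd active'
poolname="rbd"
# grep -Fx keeps only the exact pool name, so "rbd-ec" cannot pollute the result.
pool_status=$(echo "$pvesm_out" | grep rbd | cut -d " " -f 1 | grep -Fx "$poolname")
[ "$pool_status" = "$poolname" ] && echo "pool $poolname exists"
```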
function get_ceph_version() {
scephversion=$(ceph -v | cut -d " " -f 3)
dcephversion=$(ssh "$opt_destination" ceph -v | cut -d " " -f 3)
}
function main(){
[ $# = 0 ] && usage;