Docker Swarm uses the Raft Consensus Algorithm to make sure that all the manager nodes, which are in charge of managing and scheduling tasks in the cluster, store the same consistent state.
Having the same consistent state across the cluster means that in case of a failure, any Manager node can pick up the tasks and restore the services to a stable state.
Raft tolerates up to (N-1)/2 failures and requires a majority or quorum of (N/2)+1 members to agree on values proposed to the cluster.
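For example, a swarm with 3 managers has a quorum of 2 and tolerates the loss of 1 manager, while a swarm with 5 managers has a quorum of 3 and tolerates the loss of 2.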
Validate that the secrets were created and stored correctly:
# Get the username
$ kubectl get secret mariadb-user-creds -o jsonpath='{.data.MYSQL_USER}' | base64 --decode -
kubeuser

# Get the password
$ kubectl get secret mariadb-user-creds -o jsonpath='{.data.MYSQL_PASSWORD}' | base64 --decode -
kube-still-rocks
ConfigMap
Create a file named max_allowed_packet.cnf:
[mysqld]
max_allowed_packet = 64M
Create the ConfigMap with:
$ kubectl create configmap mariadb-config --from-file=max_allowed_packet.cnf  # could add multiple --from-file=<filename>
$ kubectl create configmap mariadb-config --from-file=max-packet=max_allowed_packet.cnf  # set max-packet as key rather than the file name
configmap/mariadb-config created
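One way to confirm what was stored (a verification step not in the original notes) is to dump the ConfigMap back out and check the data keys:

$ kubectl get configmap mariadb-config -o yaml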
The minimum requirements for Docker UCP 2.2.4 on Linux are:
• UCP Manager nodes running DTR: 8GB of RAM with 3GB of disk space
• UCP Worker nodes: 4GB of RAM with 3GB of free disk space
Recommended requirements are:
• UCP Manager nodes running DTR: 8GB RAM, 4 vCPUs, and 100GB disk space
• UCP Worker nodes: 4GB RAM, 25-100GB of free disk space
A Swarm backup is a copy of all the files in the /var/lib/docker/swarm directory:
Stop Docker on the Swarm manager node you are performing the backup from (it is not a good idea to perform the backup on the leader manager). This will stop all UCP containers on the node. If UCP is configured for HA, the other managers will make sure the control plane remains available.
$ service docker stop
Back up the Swarm config, e.g.:
$ tar -czvf swarm.bkp /var/lib/docker/swarm/
tar: Removing leading `/' from member names
/var/lib/docker/swarm/
/var/lib/docker/swarm/docker-state.json
/var/lib/docker/swarm/state.json
<Snip>
Verify that the backup file exists. Rotate and store the backup file off-site according to your corporate backup policies.
$ ls -l
-rw-r--r-- 1 root root 450727 Jan 29 14:06 swarm.bkp
Restart Docker.
$ service docker restart
To recover Swarm from a backup:
Stop Docker:
$ service docker stop
Delete any existing Swarm configuration:
$ rm -r /var/lib/docker/swarm
Restore the Swarm configuration from backup:
$ tar -zxvf swarm.bkp -C /
Initialize a new Swarm cluster:
$ docker swarm init --force-new-cluster
Swarm initialized: current node (jhsg...3l9h) is now a manager.
Check that the networks and services were recovered:
$ docker network ls
$ docker service ls
Add new manager and worker nodes to the Swarm, and take a fresh backup.
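To get the join commands for those new nodes, the standard Swarm token commands can be run on a manager (these commands are not in the original notes):

$ docker swarm join-token manager
$ docker swarm join-token worker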
If possible, you should run your DTR instances on dedicated nodes. You definitely
shouldn’t run user workloads on your production DTR nodes.
As with UCP, you should run an odd number of DTR instances. 3 or 5 is best for fault
tolerance. A recommended configuration for a production environment might be:
• 3 dedicated UCP managers
• 3 dedicated DTR instances
• However many worker nodes your application requirements demand
Install DTR:
Log on to the UCP web UI and click Admin > Admin Settings > Docker
Trusted Registry.
Fill out the DTR configuration form.
DTR EXTERNAL URL: Set this to the URL of your external load balancer.
UCP NODE: Select the name of the node you wish to install DTR on.
Disable TLS Verification For UCP: Check this box if you’re using
self-signed certificates.
Copy the long command at the bottom of the form.
Paste the command into any UCP manager node.
The command includes the --ucp-node flag telling UCP which node to
perform the install on.
The following is an example DTR install command that matches the configuration
in Figure 16.10. It assumes that you already have a load balancer
configured at dtr.mydns.com.
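A minimal sketch of such an install command, assuming the 2.4.1 image tag and a target node named dtr1 (both are assumptions, not from the original notes):

$ docker run -it --rm \
    docker/dtr:2.4.1 install \
    --dtr-external-url dtr.mydns.com \
    --ucp-node dtr1 \
    --ucp-insecure-tls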
Enter the UCP URL and port, as well as admin credentials when prompted.
Describe and demonstrate how to configure backups for UCP and DTR
You can run the backup from any UCP manager node in the cluster, and you only
need to run the operation on one node (UCP replicates its configuration to all
manager nodes, so backing up from multiple nodes is not required).
Backing up UCP will stop all UCP containers on the manager that you’re executing
the operation on. With this in mind, you should be running a highly available UCP
cluster, and you should run the operation at a quiet time for the business.
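A minimal sketch of the native UCP backup command, assuming the 2.2.4 image tag, a passphrase of secret12, and an output file named ucp.bkp (all assumptions):

$ docker container run --log-driver none --rm -i --name ucp \
    -v /var/run/docker.sock:/var/run/docker.sock \
    docker/ucp:2.2.4 backup --interactive \
    --passphrase "secret12" > ucp.bkp

A restore uses the restore sub-command of the same image, feeding the backup file back in on stdin, after which you can verify the result in the web UI as described next.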
Log on to the UCP web UI and ensure that the user created earlier is still present
(or any other UCP objects that previously existed in your environment).
Backup DTR:
As with UCP, DTR has a native backup command that is part of the Docker image that was used to install DTR. This native backup command will back up the DTR configuration that is stored in a set of named volumes, and includes:
• DTR configuration
• Repository metadata
• Notary data
• Certificates
Images are not backed up as part of a native DTR backup. It is expected that
images are stored in a highly available storage backend that has its own independent
backup schedule using non-Docker tools.
Run the following command from a UCP manager node to perform a DTR backup:
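A minimal sketch of that command, assuming the 2.4.1 image tag, the admin user, and dtr.bkp as the output file; the UCP URL and existing replica ID are placeholders you must supply:

$ read -sp 'ucp password: ' UCP_PASSWORD; \
  docker run -i --rm --log-driver none \
    --env UCP_PASSWORD=$UCP_PASSWORD \
    docker/dtr:2.4.1 backup \
    --ucp-url <ucp-url> \
    --ucp-username admin \
    --ucp-insecure-tls \
    --existing-replica-id <replica-id> > dtr.bkp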
Restoring DTR from backups should be a last resort, and only attempted when the
majority of replicas are down and the cluster cannot be recovered any other way.
If you have lost a single replica and the majority are still up, you should add a new
replica using the dtr join command.
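A minimal sketch of that join, assuming the 2.4.1 image tag; the node name is a placeholder:

$ docker run -it --rm \
    docker/dtr:2.4.1 join \
    --ucp-node <node-to-host-new-replica> \
    --ucp-insecure-tls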
To restore from backup, the workflow is like this:
Stop and delete DTR on the node (might already be stopped)
$ docker run -it --rm \
    docker/dtr:2.4.1 destroy \
    --ucp-insecure-tls
Restore images to the shared storage backend (might not be required)