DRBD9 | How to set up a basic three-node cluster

This guide assumes that you already have the DRBD9 kernel module, drbd-utils and drbdmanage installed on the servers.
For this setup we will need three servers, as identical as possible, each with a dedicated network and storage backend for DRBD replication.

HostA: drbd-host1
ip addr: 10.1.1.3
netmask: 255.255.255.0

HostB: drbd-host2
ip addr: 10.1.1.4
netmask: 255.255.255.0

HostC: drbd-host3
ip addr: 10.1.1.5
netmask: 255.255.255.0

* Make sure you set up /etc/hosts on each node so that it resolves the hostnames of the other nodes to their replication IP addresses. This is a requirement for DRBD.
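
For example, using the addresses listed at the top of this guide, each node's /etc/hosts would contain entries like these (adjust to your own addresses):

10.1.1.3    drbd-host1
10.1.1.4    drbd-host2
10.1.1.5    drbd-host3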

* All hosts must be able to connect to each other via SSH without a password. To do so, execute the following commands on each node:

ssh-keygen # Follow the wizard; make sure you don't set a passphrase!
ssh-copy-id <node name> # where <node name> is the hostname of one of the other nodes, e.g. if you are on drbd-host1 then the other hosts are drbd-host2 and drbd-host3. Do the same on all 3 nodes.

* Make sure you can connect to each node without a password:

drbd-host1:~# ssh drbd-host2 && ssh drbd-host3 # the second connection is attempted after you log out of the first one

OK, now that you have access to each node without needing to enter a password, let's configure DRBD.

First, we must select the underlying storage that we will use for DRBD. In this example each host has a /dev/sdb device which we dedicate to DRBD; /dev/sdb corresponds to a RAID10 disk array on each host.

Now let's connect to the first host, drbd-host1, and create the needed LVM VG. We should name this VG 'drbdpool', which is how drbd will recognise it and use it to allocate space:

drbd-host1~# pvcreate /dev/sdb
drbd-host1~# vgcreate drbdpool /dev/sdb
drbd-host1~# lvcreate -L 4T --thinpool drbdthinpool drbdpool # We create a thin-provisioned LV inside the drbdpool VG. It is necessary to call it drbdthinpool, otherwise operations later will fail!

* Repeat the steps above on the rest of the nodes (drbd-host2, drbd-host3). You can verify the result on each node as shown below.
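
As a quick sanity check (plain LVM commands, nothing drbdmanage-specific), you can confirm on every node that the VG and the thin pool were created:

vgs drbdpool # the drbdpool VG backed by /dev/sdb should be listed
lvs drbdpool # the drbdthinpool thin pool LV should be listed inside drbdpool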

Now we will have to use the drbdmanage utility to initialise the DRBD cluster. On drbd-host1 execute the following:

drbdmanage init 10.1.1.3 # where 10.1.1.3 is the IP address of drbd-host1 that is dedicated to DRBD replication (see the top of this guide).

If successful, then proceed with adding the remaining two nodes to the cluster (again from drbd-host1!):

drbdmanage add-node drbd-host2 10.1.1.4 # note that here you need to specify the node's hostname as well! You should be able to auto-complete each parameter (even the IP) by pressing the TAB key.

drbdmanage add-node drbd-host3 10.1.1.5

Now verify that everything is good:

drbdmanage list-nodes # all nodes should appear with OK status

Next, let’s create the first resource with a test volume within it:

drbdmanage add-resource res01

drbdmanage add-volume vol01 40G --deploy 3 # here we additionally specify on how many nodes the volume should reside; in this case all 3 nodes.

Verify that the above were created successfully:

ls /dev/drbd* # Here you should see /dev/drbd0 and /dev/drbd1, which belong to the control volumes that drbdmanage automatically creates during "drbdmanage init". Additionally there should be /dev/drbd100, which corresponds to the vol01 volume we created above. You can handle this as a usual block device, e.g. partition it with fdisk, create a filesystem with mkfs, and finally mount it and write data to it. All writes will be automatically replicated to the rest of the nodes.
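
For example, here is a minimal sketch of creating a filesystem on the new volume and mounting it. Run it on the node that is currently Primary; the choice of ext4 and the /mnt/vol01 mount point are just example assumptions:

mkfs.ext4 /dev/drbd100
mkdir -p /mnt/vol01
mount /dev/drbd100 /mnt/vol01
df -h /mnt/vol01 # verify the mount; anything written here is replicated to the other nodes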

drbdmanage list-volumes

drbdmanage list-resources

drbd-overview # Shows the current status of the cluster. Take note of the node which is elected to be in the Primary state; that is the only node which can mount the newly created DRBD volume!

drbdadm status

Proxmox VE users: if you set up the DRBD9 cluster on PVE nodes, make sure you add the following entries to /etc/pve/storage.cfg. Note that you don't need to create volumes manually like we did previously; the Proxmox storage plugin will create those automatically for each VM you create:

drbd: drbd9-stor1   # where drbd9-stor1 can be any arbitrary label to identify the storage
        content images
        redundancy 3   # this is the volume redundancy level, in this case 3
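
To confirm that PVE sees the new storage, you can use pvesm, the standard Proxmox storage command-line tool (drbd9-stor1 is simply the label chosen above):

pvesm status # the drbd9-stor1 storage should appear in the list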

OK, so if everything went right you should now have a working 3-node DRBD9 cluster. Of course you will need to spend some time familiarising yourself with the drbd-utils command line and DRBD in general. Have fun!

Proxmox cluster | Reverse proxy with noVNC and SPICE support

I have a 3 node proxmox cluster in production and I was trying to find a way to centralize the webgui management.

Currently the only way to access the Proxmox cluster web interface is by connecting to each cluster node individually, e.g. https://pve1:8006, https://pve2:8006 etc. from your web browser.

The disadvantage of this is that you either have to bookmark every single node in your web browser, or type the URL manually each time.

Obviously this can become pretty annoying, especially as you are adding more nodes into the cluster.

Below I will show how I managed to access the web interface of any of my PVE cluster nodes by using a single DNS/host name (e.g. https://pve in my case).

Note that you don't even need to type the default Proxmox port (8006) after the hostname, since Nginx will listen on the default HTTPS port (443) and forward the request to the backend Proxmox cluster nodes on port 8006.

My first target was the web management console; the second was making noVNC and SPICE work too. The latter turned out to be more tricky.

We will use Nginx to handle Proxmox web and noVNC console traffic (port 8006) and HAProxy to handle SPICE traffic (port 3128).

Note: The configuration below has been tested with the following software versions:

  • Debian GNU/Linux 8.6 (jessie)
  • nginx version: nginx/1.6.2
  • HA-Proxy version 1.5.8
  • proxmox-ve: 4.3-66 (running kernel: 4.4.19-1-pve)

What you will need

1. A basic Linux vm. My preference for this tutorial was Debian Jessie.

2. Nginx + HAProxy for doing the magic.

3. OpenSSL packages to generate the self signed certificates.

4. Obviously a working proxmox cluster.

5. Since this will be a critical VM, it would be a good idea to configure it as an HA virtual machine in your Proxmox cluster.

The steps

– Download Debian Jessie net-install.

– Assign a static IP address and create the appropriate DNS record on your DNS server (if available; otherwise just use hostnames, e.g. a hosts-file entry as shown below).
In my case, I created an A record named 'pve' pointing to 10.1.1.10. That means that once you complete this guide you will be able to access all Proxmox nodes by using https://pve (or https://pve.domain.local) in your browser! You will not even need to type the default port, which is 8006.
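
If you don't have a DNS server available, a simple alternative is a hosts-file entry on each client machine that needs to reach the GUI; the name and address below are just the ones used in this guide:

10.1.1.10    pve pve.domain.local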

– Update package repositories by entering ‘apt-get update’

– Install Nginx and HAProxy:

apt-get install nginx && apt-get install haproxy

Nginx and OpenSSL setup

– Assuming that you are logged in as root, create a backup copy of the default config file:

cp /etc/nginx/sites-enabled/default /root

– Remove /etc/nginx/sites-enabled/default:

rm /etc/nginx/sites-enabled/default

– Download OpenSSL packages:

apt-get install openssl

– Generate a private key (select a temp password when prompted):

openssl genrsa -des3 -out server.key 2048

– Generate a csr file (select the same temp password if prompted):

openssl req -new -key server.key -out server.csr

– Remove the password from the key:

openssl rsa -in server.key -out server_new.key

– Remove old private key and rename the new one:

rm server.key && mv server_new.key server.key

– Make sure only root has access to private key:

chown root server.key && chmod 600 server.key

– Generate a certificate:

openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt

– Create a directory called ssl in /etc/nginx folder and copy server.key and server.crt files:

mkdir /etc/nginx/ssl && cp server.key /etc/nginx/ssl && cp server.crt /etc/nginx/ssl

– Create an empty file:

vi /etc/nginx/sites-enabled/proxmox-gui

– Paste the code below and save the file. Make sure that you change the IP addresses to match your Proxmox nodes' IP addresses:

Edit (11-11-2017)

upstream proxmox {
    ip_hash;    # added ip_hash algorithm for session persistency
    server 10.1.1.2:8006;
    server 10.1.1.3:8006;
    server 10.1.1.4:8006;
}
server {
    listen 80 default_server;
    rewrite ^(.*) https://$host$1 permanent;
}
server {
    listen 443;
    server_name _;
    ssl on;
    ssl_certificate /etc/nginx/ssl/server.crt;
    ssl_certificate_key /etc/nginx/ssl/server.key;
    proxy_redirect off;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "Upgrade";
    proxy_set_header Host $http_host;
    location / {
        proxy_pass https://proxmox;
    }
}

– Create a symlink in /etc/nginx/sites-available pointing to /etc/nginx/sites-enabled/proxmox-gui:

ln -s /etc/nginx/sites-enabled/proxmox-gui /etc/nginx/sites-available

– Verify that the symlink has been created and it’s working:

ls -ltr /etc/nginx/sites-available && cat /etc/nginx/sites-available/proxmox-gui (you should see the config contents pasted above)

– That’s it! You can now start Nginx service:

systemctl start nginx.service && systemctl status nginx.service (verify that it is active (running))
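
Optionally, before starting the service (or after any later change) you can let Nginx validate the configuration itself; nginx -t is its built-in syntax check:

nginx -t # should report that the syntax is ok and the test was successful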

HAProxy Setup

– Create a backup copy of the default config file.

cp /etc/haproxy/haproxy.cfg /root

– Create an empty /etc/haproxy/haproxy.cfg file (or remove its contents):

vi /etc/haproxy/haproxy.cfg

– Paste the following code and save the file. Again, make sure that you change the IP addresses to match your Proxmox hosts. Also note that the server names must match your PVE hostnames, e.g. pve1, pve2, pve3:

global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    maxconn 4096
    user haproxy
    group haproxy
    daemon
    stats socket /var/run/haproxy/haproxy.sock mode 0644 uid 0 gid 107

defaults
    log global
    mode tcp
    option tcplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    timeout connect 5000
    timeout client 50000
    timeout server 50000

listen proxmox_spice *:3128
    mode tcp
    option tcpka
    balance roundrobin
    server pve1 10.1.1.2:3128 weight 1
    server pve2 10.1.1.3:3128 weight 1
    server pve3 10.1.1.4:3128 weight 1

– Note that the above configuration has been tested on HA-Proxy version 1.5.8.
If the HAProxy service fails to start, please troubleshoot by running:

haproxy -c -f /etc/haproxy/haproxy.cfg ...and check for errors.

– Start HAProxy service:

systemctl start haproxy.service && systemctl status haproxy.service (Must show active and running)

Testing our setup…

Open a web browser and enter https://pve. You should be able to access the PVE web GUI. (Remember, in my case I have assigned 'pve' as the hostname of the Debian VM and I have also created a matching entry on my DNS server. Your client machine must be able to resolve that name properly, otherwise it will fail to load the Proxmox web GUI.)
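
If you prefer a quick command-line check from a client, something like the following should work (the -k flag is only needed because the certificate we generated is self-signed):

curl -k https://pve # should return the HTML of the Proxmox login page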

You can now also test the noVNC console and SPICE. Please note that you may need to refresh the noVNC window in order to see the VM screen.

UPDATE: You can seamlessly add SSH to the proxied ports if you wish to SSH into any of the PVE hosts.

Just add the lines below to your /etc/haproxy/haproxy.cfg file. Note that I'm using port 222 instead of 22 in order to avoid a conflict with the Debian VM itself, which already listens on TCP port 22.


listen proxmox_ssh *:222
    mode tcp
    option tcpka
    balance roundrobin
    server pve1 10.1.1.2:22 weight 1
    server pve2 10.1.1.3:22 weight 1
    server pve3 10.1.1.4:22 weight 1

Now if you try to connect from your machine as root@pve on port 222 (ssh root@pve -p 222), the first time you will be asked to save the ECDSA key of the host to your .ssh/known_hosts file, and then you will log in to the first Proxmox node, e.g. pve1.
If you attempt to connect a second time your request will be rejected, since HAProxy will forward it to the second Proxmox node, e.g. pve2, which happens to have a different host key fingerprint from the first. This is of course good for security reasons, but in this case we need to disable the check for the proxied host, otherwise we will not be able to connect to it.

– On your client machine, modify /etc/ssh/ssh_config file (not sshd_config !).

– Remove the following entry:

Host *

– Add the following at the end of the file:

Host pve
    StrictHostKeyChecking no
    UserKnownHostsFile=/dev/null
    ServerAliveInterval 5

This disables the ECDSA host key checks ONLY for the host pve and keeps them enabled for ALL other hostnames, so in short it is still quite a restrictive setting. ServerAliveInterval is used in order to keep the SSH session alive during periods of inactivity; I've noticed that without that parameter the SSH client drops the session quite often.

Error when updating a Debian-based system: The following signatures couldn't be verified…

Issue

When updating a Debian-based system, apt-get may display an error message like:

W: GPG error: ftp://ftp.debian.org/ testing Release:
  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 010908312D230C5F

Solution

Simply type the following commands, taking care to replace the key ID below with the one displayed in the error message:

gpg --keyserver pgpkeys.mit.edu --recv-key 010908312D230C5F
gpg -a --export 010908312D230C5F | sudo apt-key add -