KVM on CentOS: Hyperconverged nested oVirt Cluster with Gluster vSAN


KVM on CentOS for a nested hyperconverged oVirt cluster.




WARNING: This is not a step-by-step guide; it omits many of the self-explanatory steps, such as individual installation and basic configuration instructions.

What is used?
One host with 64GB RAM, ~2TB storage (128GB SSD & 2TB SSHD) and an i5 CPU,
CentOS 7: KVM, QEMU,
oVirt v4.1.2
oVirtHost v4.1.2
Gluster 3.10
A wicked mind


From a high-level perspective, the infrastructure looks like this:




The physical host is installed onto the SSD disk.
The nested hosts are installed onto the SSHD datastore.

The final result is:

----Physical KVM
          |-----------------------Virtual oVirt
          |-----------------------Nested KVM/oVirtNode-1--------------Guest VMs
          |-----------------------Nested KVM/oVirtNode-2--------------Guest VMs
          |-----------------------Nested KVM/oVirtNode-3--------------Guest VMs


First, install your host hypervisor tools and packages:

sudo yum -y install qemu-kvm libvirt virt-install bridge-utils

In the case of a CentOS minimal install, an X window system needs to be installed in order to start virt-manager locally. 

yum install "@X Window System" xorg-x11-xauth xorg-x11-fonts-* xorg-x11-utils-y

It doesn't make sense for me to use virt-manager locally as the server will always be administered remotely, but should you wish to, there you go.
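Should you prefer to run virt-manager from your workstation instead, it can connect to the host over SSH; a minimal sketch, assuming the host resolves as kvm-host (a hypothetical name) and root SSH access is permitted:

virt-manager -c qemu+ssh://root@kvm-host/system
virsh -c qemu+ssh://root@kvm-host/system list --all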


Second, start and enable the virtualisation service: 
systemctl start libvirtd
systemctl enable libvirtd
systemctl list-unit-files | grep libvirtd

Make sure the KVM modules are loaded: 
lsmod | grep kvm



Virtualisation Internals


Location of the VM XML configuration files: 
/etc/libvirt/qemu

The management layer (API) which interacts with the hypervisor is called libvirt. The command-line client of libvirt is called virsh; this is the equivalent of esxcli. The graphical client of libvirt is called virt-manager; this is the equivalent of the vSphere Client.
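For orientation, a handful of everyday virsh commands (node01 here is just a placeholder VM name):

virsh list --all
virsh start node01
virsh shutdown node01
virsh dumpxml node01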



Validation


To check the server's virtualisation capability: 
virt-host-validate 

Check that the virsh management CLI is installed and use it to inspect hypervisor/host info: 
virsh nodeinfo
virsh domcapabilities
virsh domcapabilities | grep -i max
virsh domcapabilities | grep diskDevice -A 5



Creating Machines

List the configured networks: 
virsh net-list --all 

List the details of a specific network: 
virsh net-info default 

List the config parameters of a specific network: 
virsh net-dumpxml default 

Virtual network configuration files are stored in 
/etc/libvirt/qemu/networks/ 
as XML files.


For the default network it is 
/etc/libvirt/qemu/networks/default.xml.

Stopping and starting a virtual network: 
virsh net-destroy default
virsh net-start default
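It is also worth making the default network start automatically with libvirtd, and any persistent changes to its definition can be made with net-edit:

virsh net-autostart default
virsh net-edit default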

Finding partitions and available storage disks: 
fdisk -l
lsblk
lsblk -f
df -h


Creating a new storage pool (a partition-based storage pool):

1. Partitioning:
 [root@localhost home]# parted -a optimal /dev/sdb 
GNU Parted 3.1 
Using /dev/sdb 
Welcome to GNU Parted! Type 'help' to view a list of commands. 
(parted) mkpart primary ext4 0% 100% 
(parted) print 
Model: ATA ST2000DX001-1NS1 (scsi) 
Disk /dev/sdb: 2000GB 
Sector size (logical/physical): 512B/4096B 
Partition Table: gpt 
Disk Flags: 
Number  Start   End     Size    File system  Name     Flags 
 1      1049kB  2000GB  2000GB               primary

- Create the filesystem on the newly created partition 
mkfs.ext4 /dev/sdb1

- Label the partition 
e2label /dev/sdb1 storage_pool_01

- Check the label 
lsblk -f

- Define the pool (a partition-based storage pool in this case) 
virsh pool-define-as VMs fs - - /dev/sdb1 - "/var/lib/libvirt/filesystems/local/"

- List the pool 
virsh pool-list --all

- Start the new storage pool 
virsh pool-start VMs

- Check that the pool started successfully 
virsh pool-list --all

- Turn autostart ON 
virsh pool-autostart VMs 

- Verify storage pool configuration 
virsh pool-info VMs

- Stop (deactivate) the storage pool 
virsh pool-destroy VMs

- Undefine storage pool 
virsh pool-undefine VMs

- Delete pool directory 
virsh pool-delete VMs


2. Create the virtual disk for the machine 
qemu-img create -f raw -o size=500G /var/lib/libvirt/filesystems/local/esxi_01.img

3. Create the virtual machine
Use virt-manager
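Should you prefer the command line over virt-manager, below is a hedged virt-install sketch for one nested node; the VM name, disk name, ISO path and sizes are assumptions, so adjust them to your environment (and switch the network to one of the bridges once they exist):

virt-install \
  --name node01 \
  --memory 16384 \
  --vcpus 4 \
  --cpu host-passthrough \
  --disk path=/var/lib/libvirt/filesystems/local/node01_os.img,size=100,format=raw \
  --cdrom /var/lib/libvirt/filesystems/local/CentOS-7-x86_64-Minimal.iso \
  --os-variant centos7.0 \
  --network network=default \
  --graphics vnc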


Note: in case of hosting another hypervisor inside KVM, such as ESXi, KVM or Xen, nested hardware virtualisation needs to be enabled:
vim /etc/modprobe.d/nested.conf
options kvm ignore_msrs=1
options kvm-intel nested=y ept=y 
modprobe -r kvm_intel
modprobe kvm_intel

To check if it is enabled: 
cat /sys/module/kvm_intel/parameters/nested

Do the below or use virt-manager to tick the "Copy host cpu" box. 
virsh edit node01
<cpu mode='host-passthrough'>
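To confirm the passthrough took effect (node01 used as an example name), check the domain XML:

virsh dumpxml node01 | grep -A 2 "<cpu"

And inside the nested host itself, the virtualisation flag should be visible (vmx on Intel):
grep -c vmx /proc/cpuinfo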


Cloning Machines (a clean install is always preferred over cloning)

virsh list --all
virsh suspend node01
virsh list --all
virt-clone --connect qemu:///system --original node01 --name node02 --file /var/lib/libvirt/filesystems/local/node02_os.img


Note: it is important to remove all additional disks from the original VM before cloning. If the disks are to be cloned, however, then the --file portion of the command has to be repeated for every disk that will be cloned.

Preparing/decontextualising the cloned VM (creating a fresh copy by removing the copied MAC address, hostname, accounts, SSH keys, etc.):
yum install libguestfs-tools
virt-sysprep -d node02
virsh start node02
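If wiping everything is too aggressive, virt-sysprep can also list its operations and be limited to a subset; a minimal sketch (the operation names below ship with libguestfs, so verify them with the list command first):

virt-sysprep --list-operations
virt-sysprep -d node02 --operations ssh-hostkeys,net-hwaddr,machine-id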



Bridged Networking; Multiple Subnets


The default NAT'ed networking will allow the machines to communicate outwards from KVM, but they cannot be accessed externally. To allow both incoming and outgoing communication, bridged networking is required.

The NetworkManager service does not support bridging, hence it needs to be disabled and the plain network service, which works with network scripts, enabled. 
chkconfig NetworkManager off
chkconfig network on
service NetworkManager stop
service network start 

When running multiple bridges, each with its own subnet, things get complex. Linux will only allow incoming/outgoing communication between the virtual and the physical infrastructure through one interface/bridge/default gateway. To achieve independent routing per bridge/subnet we need to introduce an independent routing table for each bridge interface/subnet.

For example, let's assume that we have 3 physical/virtual networks:

---physical machines----->br-datacenter<---virtual machines---Mgmt Network
---physical machines------>br-gluster<------virtual machines---Storage Network
---physical machines------>br-servers <-----virtual machines---Server Network
---PC ADMIN--------------------|
                      


To create the bridged interfaces, use the settings below, or better, use the virt-manager GUI to create them. When creating them with virt-manager, always have the physical interface already preconfigured with all the IP addressing so you can copy the settings onto the bridge.

vim /etc/sysconfig/network-scripts/ifcfg-enp7s0
DEVICE="enp7s0"
ONBOOT="yes"
BRIDGE="br-datacenter"
  
touch /etc/sysconfig/network-scripts/ifcfg-br-datacenter

vim /etc/sysconfig/network-scripts/ifcfg-br-datacenter
DEVICE="br-datacenter"
ONBOOT="yes"
TYPE="Bridge"
BOOTPROTO="none"
IPADDR="172.18.0.10"
NETMASK="255.255.255.0"
GATEWAY="172.18.0.1"
DEFROUTE="yes"
IPV6INIT="yes"'
IPV6_AUTOCONF="yes"
DHCPV6C="no"
STP="on"
DELAY="0.0"
DNS1=172.18.0.1

vim /etc/sysconfig/network-scripts/ifcfg-enp4s0f0
DEVICE=enp4s0f0
ONBOOT=yes
BRIDGE="br-gluster"

vim /etc/sysconfig/network-scripts/ifcfg-br-gluster
DEVICE="br-gluster"
ONBOOT="yes"
TYPE="Bridge"
BOOTPROTO="none"
IPADDR="172.18.1.10"
NETMASK="255.255.255.0"
NETWORK="172.18.1.0"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
DHCPV6C="no"
STP="on"
DELAY="0.0"

vim /etc/sysconfig/network-scripts/ifcfg-enp4s0f1
DEVICE=enp4s0f1
ONBOOT=yes
BRIDGE="br-servers"

vim /etc/sysconfig/network-scripts/ifcfg-br-servers
DEVICE="br-servers"
ONBOOT="no"
TYPE="Bridge"
BOOTPROTO="none"
IPADDR="172.18.2.10"
NETMASK="255.255.255.0"
NETWORK="172.18.2.0"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
DHCPV6C="no"
STP="on"
DELAY="0.0"

iptables -I FORWARD -m physdev --physdev-is-bridged -j ACCEPT
  
yum install iptables-services
service iptables save

service libvirtd reload

brctl show
ip addr show dev br-datacenter

At this point we have Linux with 3 different subnets, each hooked onto a separate bridge. Linux will use a single default gateway and the main routing table. The first interface, br-datacenter, will be able to communicate out, but the second and third, br-gluster and br-servers, will not be able to forward packets to the external network.

We now need a routing table per interface: 

vi /etc/iproute2/rt_tables
#Routing tables for "br-datacenter" "br-gluster" "br-servers"
1 datacentertable
2 glustertable
3 servertable

Then we need to populate the routing tables (except for br-datacenter, which can be left to use the main routing table): 
vi /etc/sysconfig/route-br-gluster
172.18.1.0/24 dev br-gluster src 172.18.1.10 table glustertable
default via 172.18.1.1 dev br-gluster table glustertable
vi /etc/sysconfig/rule-br-gluster
from 172.18.1.10 table glustertable
to 172.18.1.10 table glustertable
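The same pattern applies to the third bridge; a sketch for br-servers, assuming 172.18.2.1 is that subnet's gateway:

vi /etc/sysconfig/route-br-servers
172.18.2.0/24 dev br-servers src 172.18.2.10 table servertable
default via 172.18.2.1 dev br-servers table servertable
vi /etc/sysconfig/rule-br-servers
from 172.18.2.10 table servertable
to 172.18.2.10 table servertable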

On the secondary and tertiary bridge interfaces we also need to disable the default route and the gateway: 
DEFROUTE="no"
#GATEWAY="not defined"
systemctl restart network.service

Checking the config: 
ip route show cache
ip route flush cache
ping -I br-gluster <target-ip>
traceroute -i br-gluster <target-ip>

ip route show
ip route show table servertable
ip rule show


oVirt

Once we have the routing in place, we continue by deploying the virtual Management Engine first and then the hosts, to form a cluster under this umbrella. 

yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release41.rpm 
yum install ovirt-engine -y
engine-setup (accept the defaults, whilst making sure the hostname is the FQDN of the server)
systemctl status ovirt-engine.service

Browse to the FQDN: https://ovirt.yourdomain.com




Creating Gluster shared storage for the hyperconverged cluster

1. Configure IP addresses on the host NICs participating in the Gluster network. Create the network using the oVirt manager, which will create the necessary bridges on the hosts and join the NICs to these bridges:







2. Add an additional virtual disk to each host, to be used for building the Gluster distributed storage:






3. Create DNS resolution for the Gluster NIC on each host. This is necessary to build the Gluster network and keep the Gluster traffic separate from the other VM traffic.
In this scenario I have Sophos UTM acting as the DNS server; you may have a Windows or Linux DNS server, so configure accordingly:
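Alternatively, if no dedicated DNS server is available, static /etc/hosts entries on each host work just as well; a sketch, assuming the Gluster NICs use 172.18.1.12-14 (the same addresses allowed through iptables further down) and that the IP-to-node mapping matches your setup:

vim /etc/hosts
172.18.1.12 node01gluster.infotron.com.au
172.18.1.13 node02gluster.infotron.com.au
172.18.1.14 node03gluster.infotron.com.au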



4. Configure the Gluster Network manually:

PHYSICAL VOLUME COMMANDS: 
lsblk
pvcreate /dev/sdb

Note: if pvcreate /dev/sdb fails, check whether multipathing is enabled and disable it:
* multipath -l (to list all multipath devices)
* blacklist all devices in /etc/multipath.conf by adding the lines below; to create the file, run the command 'vdsm-tool configure --force', which will generate it if not already there:
blacklist {
devnode "*"
}
* multipath -F (to flush the multipath devices)
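After flushing, confirm that no multipath devices remain before retrying pvcreate (the command below should return no output):

multipath -ll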

pvdisplay /dev/sdb
vgcreate vg_gluster /dev/sdb
vgdisplay vg_gluster
vgremove vg_gluster -> only needed if the volume group is to be removed

LOGICAL VOLUME CREATION: 
lvcreate -L 16G --name gluster_meta vg_gluster -> creates the LV to be used as pool metadata later on
lvdisplay /dev/vg_gluster/gluster_meta
lvcreate -L 350G --name gluster_pool vg_gluster -> creates the LV to be used as the storage pool later on
lvdisplay /dev/vg_gluster/gluster_pool
lvconvert --thinpool vg_gluster/gluster_pool --poolmetadata vg_gluster/gluster_meta -> converts the storage LV into a thin pool and attaches the metadata LV to it
lvchange --zero n vg_gluster/gluster_pool -> disables zeroing of newly provisioned blocks in the pool


FINALLY:
lvcreate -V 350G -T vg_gluster/gluster_pool -n gluster_vsan -> creates the thin logical volume from the pool
lvdisplay /dev/vg_gluster/gluster_vsan
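To double-check the thin pool and the thin volume carved out of it:

lvs -a vg_gluster
vgs vg_gluster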


VOLUME FORMATTING:
mkfs.xfs /dev/vg_gluster/gluster_vsan

--Mounting the volume 
mkdir -p /bricks/brick1

Add this entry in "/etc/fstab" 
/dev/vg_gluster/gluster_vsan /bricks/brick1 xfs defaults 0 0

Mount and test the fstab: 
mount /bricks/brick1
df -h

DISTRIBUTED GLUSTER VOLUME:

In this case we are just creating a distributed volume with no replication. If replication is required, consult the corresponding Gluster guides. 
iptables -I INPUT -p all -s 172.18.1.14 -j ACCEPT
iptables -I INPUT -p all -s 172.18.1.13 -j ACCEPT
iptables -I INPUT -p all -s 172.18.1.12 -j ACCEPT
service iptables save

Start gluster: 
systemctl enable glusterfsd.service
systemctl enable glusterd
systemctl start glusterfsd.service
systemctl start glusterd

From node01: 
gluster peer probe node02gluster.infotron.com.au
gluster peer probe node03gluster.infotron.com.au
gluster peer status

mkdir /bricks/brick1/brick

gluster>
volume create gluster_vsan node01gluster.infotron.com.au:/bricks/brick1/brick node02gluster.infotron.com.au:/bricks/brick1/brick node03gluster.infotron.com.au:/bricks/brick1/brick

gluster volume info all
gluster volume start gluster_vsan
gluster volume status

In order to be able to add the volume in oVirt, add the following permissions to the Gluster volume:
gluster volume set gluster_vsan storage.owner-uid 36
gluster volume set gluster_vsan storage.owner-gid 36
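To confirm the ownership options were applied before adding the storage domain (volume get has been available since Gluster 3.7, so it should be present in 3.10):

gluster volume info gluster_vsan
gluster volume get gluster_vsan storage.owner-uid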



5. Add the storage to the oVirt cluster 
yum install ntp
systemctl enable ntpd
systemctl start ntpd
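Time sync matters for the engine/host communication, so confirm NTP is actually synchronising on each host:

ntpq -p
timedatectl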






At this point the Data Center should also go Green/Active




Done.

Hyperconverged vSphere/VSAN alternative for free!!!!
