KVM on CentOS: Hyperconverged Nested oVirt Cluster with Gluster vSAN
KVM on CentOS for a nested hyperconverged oVirt cluster.
WARNING: This is not a step-by-step guide; it omits many of the self-explanatory steps, such as individual installation and basic configuration instructions.
What is used?
One host with 64 GB RAM, ~2 TB storage (128 GB SSD & 2 TB SSHD) and an i5 CPU
CentOS 7: KVM, QEMU
oVirt v4.1.2
oVirtHost v4.1.2
Gluster 3.10
A wicked mind
From a high-level perspective the infrastructure looks like this:
The physical host is installed onto the SSD disk.
The nested hosts are installed onto the SSHD datastore.
The final result is:
----Physical KVM
|-----------------------Virtual oVirt
|-----------------------Nested KVM/oVirtNode-1--------------Guest VMs
|-----------------------Nested KVM/oVirtNode-2--------------Guest VMs
|-----------------------Nested KVM/oVirtNode-3--------------Guest VMs
First, install the host hypervisor tools and packages:
sudo yum -y install qemu-kvm libvirt virt-install virt-manager bridge-utils
With a CentOS minimal install, an X window system needs to be installed in order to start virt-manager locally:
yum install -y "@X Window System" xorg-x11-xauth xorg-x11-fonts-* xorg-x11-utils
It doesn't make sense for me to use virt-manager locally, as the server will always be administered remotely, but should you wish to, there you go.
Second, start and enable the virtualisation service:
systemctl start libvirtd
systemctl enable libvirtd
systemctl list-unit-files | grep libvirtd
Make sure modules are loaded:
lsmod | grep kvm
Virtualisation Internals
Location of the VM (domain) XML configuration files:
/etc/libvirt/qemu
The management layer (API) which interacts with the hypervisor is called libvirt. The command line client of libvirt is called virsh; this is the equivalent of esxcli. The graphical client of libvirt is called virt-manager; this is the equivalent of the vSphere Client.
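Since the server is administered remotely anyway, both clients can simply be pointed at the host over SSH. A minimal sketch, assuming the hypervisor is reachable as kvmhost (a placeholder hostname) with root SSH access:
virsh -c qemu+ssh://root@kvmhost/system list --all (run virsh from an admin workstation against the remote host)
virt-manager --connect qemu+ssh://root@kvmhost/system (or add the same URI via File > Add Connection in the GUI)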
Validation
To check the server's virtualisation capability:
virt-host-validate
Check if management CLI virsh is installed and use it to check hypervisor/host info:
virsh nodeinfo
virsh domcapabilities
virsh domcapabilities | grep -i max
virsh domcapabilities | grep diskDevice -A 5
Creating Machines
List the configured networks:
virsh net-list --all
List the details of a specific network:
virsh net-info default
List the config parameters of a specific network:
virsh net-dumpxml default
Virtual network configuration files are stored in
/etc/libvirt/qemu/networks/
as XML files.
For the default network it is
/etc/libvirt/qemu/networks/default.xml.
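For reference, on a stock install the default network definition typically looks something like this (UUID and MAC lines omitted; treat it as an illustrative sketch rather than your exact file, and prefer virsh net-edit default over editing the file directly):
<network>
  <name>default</name>
  <forward mode='nat'/>
  <bridge name='virbr0' stp='on' delay='0'/>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.122.2' end='192.168.122.254'/>
    </dhcp>
  </ip>
</network>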
Stopping and starting virtual network:
virsh net-destroy default
virsh net-start default
Finding partition and available storage disks:
fdisk -l
lsblk
lsblk -f
df -h
Creating a new storage pool (partition-based storage pool):
1. Partitioning:
[root@localhost home]# parted -a optimal /dev/sdb
GNU Parted 3.1
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mkpart primary ext4 0% 100%
(parted) print
Model: ATA ST2000DX001-1NS1 (scsi)
Disk /dev/sdb: 2000GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:
Number  Start   End     Size    File system  Name     Flags
 1      1049kB  2000GB  2000GB               primary
- Create the filesystem on the newly created partition
mkfs.ext4 /dev/sdb1
- Label the partition
e2label /dev/sdb1 storage_pool_01
- Check the label
lsblk -f
- Define the pool (partition based storage pool in this case)
virsh pool-define-as VMs fs - - /dev/sdb1 - "/var/lib/libvirt/filesystems/local/"
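- Build the pool target directory, assuming it does not already exist (for a filesystem pool, pool-build creates the target path so the pool can be started)
virsh pool-build VMs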
- List the pool
virsh pool-list --all
- Start the new storage pool
virsh pool-start VMs
- List if pool started successfully
virsh pool-list --all
- Turn autostart ON
virsh pool-autostart VMs
- Verify storage pool configuration
virsh pool-info VMs
- Delete storage pool
virsh pool-destroy VMs
- Undefine storage pool
virsh pool-undefine VMs
- Delete pool directory
virsh pool-delete VMs
2. Create the virtual disk for the machine
qemu-img create -f raw /var/lib/libvirt/filesystems/local/esxi_01.img 500G
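A quick, optional sanity check on the freshly created image:
qemu-img info /var/lib/libvirt/filesystems/local/esxi_01.img -> should report the raw format and a 500G virtual size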
3. Create the virtual machine
Use virt-manager
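If you prefer the command line, a roughly equivalent virt-install invocation is sketched below. The VM name, memory/vCPU sizing, disk size and ISO path are illustrative placeholders rather than values from this build; the bridge names are the ones created later in this guide:
virt-install \
--name node01 \
--memory 16384 \
--vcpus 4 \
--cpu host-passthrough \
--disk path=/var/lib/libvirt/filesystems/local/node01_os.img,size=100,format=raw \
--cdrom /var/lib/libvirt/images/ovirt-node-installer.iso \
--network bridge=br-datacenter \
--network bridge=br-gluster \
--network bridge=br-servers \
--os-variant rhel7 \
--graphics vnc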
Note: in case of hosting another hypervisor inside KVM (such as ESXi, KVM or Xen), nested hardware virtualisation needs to be enabled:
vim /etc/modprobe.d/nested.conf
options kvm ignore_msrs=1
options kvm-intel nested=y ept=y
modprobe -r kvm_intel
modprobe kvm_intel
To check if enabled:
cat /sys/module/kvm_intel/parameters/nested
Do the below, or use virt-manager and tick the "Copy host cpu" box.
virsh edit node01
<cpu mode='host-passthrough'>
Cloning Machines (Clean install is always preferred over cloning)
virsh list --all
virsh suspend node01
virsh list --all
virt-clone --connect qemu:///system --original kvm01 --name kvm02 --file /var/lib/libvirt/filesystems/local/kvm02_os.img
Note: it is important to remove all additional disks from the original VM before cloning. If the disks are to be cloned as well, then the --file portion of the command has to be repeated for every disk that will be cloned.
Preparing/decontextualising the cloned VM (creating a fresh copy by removing the copied MAC address, hostname, accounts, SSH keys, etc.):
yum install libguestfs-tools
virt-sysprep -d node02
virsh start node02
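virt-sysprep can also be told exactly which clean-up operations to run, and can set a fresh hostname at the same time (the hostname below is only an example):
virt-sysprep --list-operations (lists every operation virt-sysprep knows about)
virt-sysprep -d node02 --hostname node02.example.com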
Bridged Networking; Multiple Subnets
NAT'ed default networking will allow the machines to communicate out from KVM, but they cannot be accessed externally. To allow both incoming and outgoing communication, bridged networking is required.
The NetworkManager service does not support bridging, hence it needs to be disabled and the plain network service, which works with network scripts, enabled.
chkconfig NetworkManager off
chkconfig network on
service NetworkManager stop
service network start
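On CentOS 7 the systemd-native equivalents achieve the same thing (the chkconfig/service calls above are redirected to systemd anyway):
systemctl disable NetworkManager
systemctl stop NetworkManager
systemctl enable network
systemctl start network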
When running multiple bridges, each with its own subnet, things get complex. Linux will only allow incoming/outgoing communication between the virtual and the physical infrastructure through one interface/bridge/default gateway. To achieve independent routing per bridge/subnet we need to introduce an independent routing table for each bridge interface/subnet.
For example, lets assume that we have 3 physical/virtual networks:
---physical machines----->br-datacenter<---virtual machines---Mgmt Network
---physical machines------>br-gluster<------virtual machines---Storage Network
---physical machines------>br-servers <-----virtual machines---Server Network
---PC ADMIN--------------------|
To create the bridged interfaces, use the settings below, or better, use the virt-manager GUI to create them. When creating them with virt-manager, always have the physical interface preconfigured with all the IP addressing so you can copy the settings onto the bridge.
vim /etc/sysconfig/network-scripts/ifcfg-enp7s0
DEVICE="enp7s0"
ONBOOT="yes"
BRIDGE="br-datacenter"
touch /etc/sysconfig/network-scripts/ifcfg-br-datacenter
vim /etc/sysconfig/network-scripts/ifcfg-br-datacenter
DEVICE="br-datacenter"
ONBOOT="yes"
TYPE="Bridge"
BOOTPROTO="none"
IPADDR="172.18.0.10"
NETMASK="255.255.255.0"
GATEWAY="172.18.0.1"
DEFROUTE="yes"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
DHCPV6C="no"
STP="on"
DELAY="0.0"
DNS1=172.18.0.1
vim /etc/sysconfig/network-scripts/ifcfg-enp4s0f0
DEVICE=enp4s0f0
ONBOOT=yes
BRIDGE="br-gluster"
vim /etc/sysconfig/network-scripts/ifcfg-br-gluster
DEVICE="br-gluster"
ONBOOT="yes"
TYPE="Bridge"
BOOTPROTO="none"
IPADDR="172.18.1.10"
NETMASK="255.255.255.0"
NETWORK="172.18.1.0"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
DHCPV6C="no"
STP="on"
DELAY="0.0"
vim /etc/sysconfig/network-scripts/ifcfg-enp4s0f1
DEVICE=enp4s0f1
ONBOOT=yes
BRIDGE="br-servers"
vim /etc/sysconfig/network-scripts/ifcfg-br-servers
DEVICE="br-servers"
ONBOOT="yes"
TYPE="Bridge"
BOOTPROTO="none"
IPADDR="172.18.2.10"
NETMASK="255.255.255.0"
NETWORK="172.18.2.0"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
DHCPV6C="no"
STP="on"
DELAY="0.0"
iptables -I FORWARD -m physdev --physdev-is-bridged -j ACCEPT
yum install iptables-services
service iptables save
service libvirtd reload
brctl show
ip addr show dev br-datacenter
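A couple of extra verification commands from the iproute2/bridge toolbox (the bridge should carry the IP address, while the enslaved physical NIC should have none of its own):
bridge link show (lists which ports are enslaved to which bridge)
ip addr show dev enp7s0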
At this point we have Linux with 3 different subnets, each hooked onto a separate bridge. Linux will use a single default gateway and the main routing table: the first interface, br-datacenter, will be able to communicate out, but br-gluster and br-servers will not be able to forward packets to the external network.
We now need a routing table per interface:
vi /etc/iproute2/rt_tables
#Routing tables for "br-datacenter" "br-gluster" "br-servers"
1 datacentertable
2 glustertable
3 servertable
Then we need to populate the routing tables (except for br-datacenter which can be left to use the main routing table)
vi /etc/sysconfig/network-scripts/route-br-gluster
172.18.1.0/24 dev br-gluster src 172.18.1.10 table glustertable
default via 172.18.1.1 dev br-gluster table glustertable
vi /etc/sysconfig/network-scripts/rule-br-gluster
from 172.18.1.10 table glustertable
to 172.18.1.10 table glustertable
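The br-servers bridge needs the same pair of files; a sketch following the same pattern (the 172.18.2.1 gateway is an assumption based on the addressing scheme above):
vi /etc/sysconfig/network-scripts/route-br-servers
172.18.2.0/24 dev br-servers src 172.18.2.10 table servertable
default via 172.18.2.1 dev br-servers table servertable
vi /etc/sysconfig/network-scripts/rule-br-servers
from 172.18.2.10 table servertable
to 172.18.2.10 table servertable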
On the secondary and tertiary bridge interfaces we also need to disable the default route and leave the gateway undefined:
DEFROUTE="no"
#GATEWAY="not defined"
systemctl restart network.service
Checking the config:
ip route show cache
ip route flush cache
ping -I br-gluster 172.18.1.1
traceroute -i br-gluster 172.18.1.1
ip route show
ip route show table servertable
ip rule show
oVirt
Once we have the routing in place we continue by deploying the virtual Management Engine first, and then the hosts to form a cluster under this umbrella.
yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release41.rpm
yum install ovirt-engine -y
engine-setup (accept the defaults, while making sure the hostname is the FQDN of the server)
systemctl status ovirt-engine.service
Browse to the FQDN: https://ovirt.yourdomain.com
Creating Gluster shared storage for the hyperconverged cluster
1. Configure IP addresses on the host NICs participating in the Gluster network. Create the network using the oVirt manager, which will create the necessary bridges on the hosts and join the NICs to these bridges.
2. Add an additional virtual disk, to be used for building the Gluster distributed storage, to each host.
3. Create DNS resolution for the Gluster NIC on the hosts. This is necessary to keep the Gluster traffic separate from the other VM traffic.
In this scenario I have a Sophos UTM acting as the DNS server; you may have a Windows or Linux DNS server, so configure accordingly:
4. Configure the Gluster Network manually:
PHYSICAL VOLUME COMMANDS:
lsblk
pvcreate /dev/sdb
Note: if pvcreate /dev/sdb fails, check whether multipathing is enabled and disable it:
* multipath -l (to list all multipath devices)
* blacklist all devices in /etc/multipath.conf by adding the lines below; to create the file, run the command 'vdsm-tool configure --force', which will generate it if not already there:
blacklist {
devnode "*"
}
* multipath -F (to flush the multipath devices)
pvdisplay /dev/sdb
vgcreate vg_gluster /dev/sdb
vgdisplay vg_gluster
vgremove vg_gluster (only if the volume group needs to be removed again)
LOGICAL VOLUME CREATION:
lvcreate -L 16G --name gluster_meta vg_gluster -> creates the volume for the POOL metadata, to be used later on
lvdisplay /dev/vg_gluster/gluster_meta
lvcreate -L 350G --name gluster_pool vg_gluster -> creates the volume for the storage POOL, to be used later on
lvdisplay /dev/vg_gluster/gluster_pool
lvconvert --thinpool vg_gluster/gluster_pool --poolmetadata vg_gluster/gluster_meta -> converts them into a thin POOL, attaching the metadata volume to the storage volume
lvchange --zero n vg_gluster/gluster_pool -> disables zeroing of newly allocated blocks in the POOL
FINALLY:
lvcreate -V 350G -T vg_gluster/gluster_pool -n gluster_vsan -> create the logical VOLUME from the POOL
lvdisplay /dev/vg_gluster/gluster_vsan
VOLUME FORMATTING:
mkfs.xfs /dev/vg_gluster/gluster_vsan
--Mounting the volume
mkdir -p /bricks/brick1
Add this entry in "/etc/fstab"
/dev/vg_gluster/gluster_vsan /bricks/brick1 xfs defaults 0 0
Mount and test the fstab:
mount /bricks/brick1
df -h
DISTRIBUTED GLUSTER VOLUME:
In this case we are just creating a distributed volume with no replication. If replication is required, consult the corresponding Gluster guides (a replica 3 sketch is also shown after the volume is created below).
iptables -I INPUT -p all -s 172.18.1.14 -j ACCEPT
iptables -I INPUT -p all -s 172.18.1.13 -j ACCEPT
iptables -I INPUT -p all -s 172.18.1.12 -j ACCEPT
service iptables save
Start gluster:
systemctl enable glusterfsd.service
systemctl enable glusterd
systemctl start glusterfsd.service
systemctl start glusterd
From node01:
gluster peer probe node02gluster.infotron.com.au
gluster peer probe node03gluster.infotron.com.au
gluster peer status
mkdir /bricks/brick1/brick
gluster>
volume create gluster_vsan node01gluster.infotron.com.au:/bricks/brick1/brick node02gluster.infotron.com.au:/bricks/brick1/brick node03gluster.infotron.com.au:/bricks/brick1/brick
gluster volume info all
gluster volume start gluster_vsan
gluster volume status
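For completeness: if you do want replication across the three nodes (the usual choice for a hyperconverged oVirt setup), the creation command would look roughly like this instead, with the same bricks declared as a replica 3 set:
gluster volume create gluster_vsan replica 3 node01gluster.infotron.com.au:/bricks/brick1/brick node02gluster.infotron.com.au:/bricks/brick1/brick node03gluster.infotron.com.au:/bricks/brick1/brick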
In order to be able to add the volume in oVirt, add the following permissions to the Gluster volume:
gluster volume set gluster_vsan storage.owner-uid 36
gluster volume set gluster_vsan storage.owner-gid 36
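Before adding it to oVirt it is worth a quick manual mount test from one of the nodes (this assumes the glusterfs-fuse client is installed; /mnt is used only as a temporary mount point):
mount -t glusterfs node01gluster.infotron.com.au:/gluster_vsan /mnt
df -h /mnt
umount /mnt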
5. Add the storage domain in the oVirt cluster.
Make sure time stays in sync on the hosts:
yum install ntp
systemctl enable ntpd
systemctl start ntpd
At this point the Data Center should also go Green/Active
Done.
A hyperconverged vSphere/vSAN alternative, for free!