Introduction
In the IT world, administrators try to achieve high availability in different ways, including but not limited to:
- HA applications,
- redundant NICs,
- server clusters,
- redundant power supplies,
- redundant HDDs with SAN storage, etc.
VMware provides high availability at the virtualization layer with a feature called vSphere High Availability (vSphere HA).
vSphere HA protects against the following types of failures:
- ESXi host failure: if an ESXi host fails, VMs that were running on that host are automatically restarted on other ESXi hosts.
- Guest OS failure: if the VM Monitoring option is enabled and the VM stops sending heartbeats, the guest OS is reset. The VM stays on the same ESXi host.
- Application failure: the agent on an ESXi host can monitor heartbeats of applications running inside a VM. If an application fails, the VM is restarted, but it stays on the same host. This type of monitoring requires a third-party application monitoring agent and VMware Tools.
Where is vSphere HA configured?
- vSphere HA is configured on a cluster.
- A cluster is a collection of ESXi hosts configured to share their resources. vSphere 5.5 supports up to 32 ESXi hosts and 4,000 VMs per cluster; vSphere 6.x raises the limits to 64 hosts and 8,000 VMs.
Which vSphere licenses support HA?
- vSphere HA is available in the Standard, Enterprise Plus (EP), and Platinum editions.
- vCenter HA, a separate feature that protects vCenter Server itself, does not require an additional license. A single vCenter Server Standard license is enough.
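How does vSphere HA work?
- An HA cluster has one master host, which decides where to restart VMs when a host (for example host01) goes down,
- and slave hosts, which run their own VMs, monitor them locally, and report their state to the master. HA does not keep replica copies of VMs; it restarts them from shared storage.
- vSphere HA uses TCP and UDP port 8182 for agent-to-agent communication. The firewall ports open and close automatically, so they are open only when needed.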
HA requirements
- A minimum of three hosts >> DONE
- The hosts should be running at least ESXi 5.5 >> DONE : our vSphere is 6.7
- The management network should be configured with a static IP address and reachable FQDN >> DONE
- SSH should be enabled on the VCSA >> DONE
- A port group for the HA network is required on each ESXi host >> DONE : we have configured a new port group for HA on each ESXi host
- The HA network must be on a different subnet from the management network >> DONE : the management subnet is 172.16.x.x/16 and the HA network is 172.21.x.x/16
- Network latency between the nodes must be less than 10ms >> DONE
- vCenter HA is compatible with both embedded deployment model and external PSC >> DONE
- HA requires all hosts to be connected to shared storage (iSCSI, SAN, or NFS), so that any host can provide the compute for a VM whose files stay on the shared datastore >> DONE : we have a shared iSCSI datastore and also an NFS datastore
- It is highly recommended to install VMware Tools on all VMs, since HA monitors VMs through VMware Tools heartbeats >> DONE
- HA relies on heartbeats (a message sent every second to detect whether a host is alive or not) >> DONE
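The requirement that the HA network sit on a different subnet from the management network is easy to check before touching the cluster. Below is a minimal sketch using Python's standard ipaddress module. The function name is made up for illustration, and the concrete network addresses are assumptions that fill in the x.x placeholders from the list above with zeros.

```python
import ipaddress

def on_separate_subnets(mgmt_cidr: str, ha_cidr: str) -> bool:
    """Return True when the two networks do not overlap at all."""
    mgmt = ipaddress.ip_network(mgmt_cidr)
    ha = ipaddress.ip_network(ha_cidr)
    return not mgmt.overlaps(ha)

# Assumed values: the real subnets in the article are 172.16.x.x/16 and 172.21.x.x/16.
MANAGEMENT_NET = "172.16.0.0/16"
HA_NET = "172.21.0.0/16"

if on_separate_subnets(MANAGEMENT_NET, HA_NET):
    print("OK: HA network is separate from the management network")
else:
    print("Problem: HA network overlaps the management network")
```

The same check can be rerun whenever a port group or VLAN is re-addressed.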
Admission Control
Admission Control is one of the most important features in vSphere HA.
Admission Control ensures that sufficient resources are available in an HA cluster to provide failover protection.
To understand Admission Control, let us look at the following scenarios:
Example 1:
Suppose that cluster16 contains three ESXi hosts:
- ESXI151
- ESXI152
- ESXI153
If we enable Admission Control (with one host failure to tolerate), HA reserves 33% of each ESXi host's resources in case one host goes down.
It will NOT let you power on a new VM beyond 66% of the ESXi resources, even though 33% is still free, because that share is reserved for HA [just in case any ESXi host goes down].
Example 2:
Suppose we have four ESXi hosts in cluster16:
- ESXI151
- ESXI152
- ESXI153
- ESXI154
Then Admission Control reserves 25% of each ESXi host's resources for HA, so you can use only 75% and 25% is kept for HA [just in case any ESXi host goes down].
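Both examples follow the same arithmetic: with N equally sized hosts and one host failure to tolerate, roughly 1/N of the cluster is held back. The sketch below only illustrates that rule of thumb. The function name is hypothetical, and equal host sizes are an assumption; a real cluster with mixed hosts will not split this evenly.

```python
def reserved_fraction(num_hosts: int, host_failures_to_tolerate: int = 1) -> float:
    """Share of cluster resources held back for failover (equal-size hosts assumed)."""
    if num_hosts < 2:
        raise ValueError("an HA cluster needs at least two hosts")
    if not 0 < host_failures_to_tolerate < num_hosts:
        raise ValueError("failures to tolerate must be between 1 and num_hosts - 1")
    return host_failures_to_tolerate / num_hosts

# Three hosts as in example 1, four hosts as in example 2.
for n in (3, 4):
    reserved = reserved_fraction(n)
    print(f"{n} hosts: reserved {reserved:.1%}, usable {1 - reserved:.1%}")
```

Adding hosts lowers the reserved share per host, which is one reason larger clusters use their capacity more efficiently.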
What if you disable Admission Control?
This means you can use 100% of the ESXi resources to run VMs,
BUT
if any ESXi host goes down, HA will NOT work effectively, simply because there may be no free resources on the other ESXi hosts to run the VMs of the failed host.
I hope the two examples above explain Admission Control.
Admission Control policy
You can choose between these four policies to define how Admission Control will ensure capacity for the cluster:
Define failover capacity by static number of hosts
- The number of hosts that may fail is specified.
- Spare capacity is calculated using a slot-based algorithm.
- A slot represents the amount of memory and CPU needed to satisfy the reservations of any powered-on virtual machine.
- This option is recommended in vSphere environments that have VMs with similar CPU and memory reservations.
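The slot-based calculation can be sketched as follows. This is a simplified model, not VMware's implementation: the slot is taken as the largest CPU reservation and the largest memory reservation among powered-on VMs, memory overhead is ignored, and the 32 MHz default CPU slot and the 1 MB memory floor are assumptions. All class and function names, and all the numbers, are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class VM:
    cpu_reservation_mhz: int
    mem_reservation_mb: int

@dataclass
class Host:
    cpu_mhz: int
    mem_mb: int

DEFAULT_CPU_SLOT_MHZ = 32  # assumed default when no VM has a CPU reservation
MIN_MEM_SLOT_MB = 1        # placeholder floor; real slots also include memory overhead

def slot_size(vms):
    """Slot = largest CPU reservation and largest memory reservation."""
    cpu = max([vm.cpu_reservation_mhz for vm in vms] + [DEFAULT_CPU_SLOT_MHZ])
    mem = max([vm.mem_reservation_mb for vm in vms] + [MIN_MEM_SLOT_MB])
    return cpu, mem

def slots_per_host(host, cpu_slot, mem_slot):
    """A host holds as many slots as its scarcer resource allows."""
    return min(host.cpu_mhz // cpu_slot, host.mem_mb // mem_slot)

def tolerated_host_failures(hosts, vms):
    """Count how many of the largest hosts can fail while every VM still has a slot."""
    cpu_slot, mem_slot = slot_size(vms)
    remaining = sorted(slots_per_host(h, cpu_slot, mem_slot) for h in hosts)
    failures = 0
    while len(remaining) > 1:
        remaining.pop()  # worst case: the host with the most slots fails
        if sum(remaining) < len(vms):
            break
        failures += 1
    return failures

hosts = [Host(9000, 9216), Host(9000, 6144), Host(6000, 6144)]
vms = [VM(2000, 1024), VM(2000, 1024), VM(1000, 2048), VM(1000, 1024), VM(1000, 1024)]
print(slot_size(vms))                       # slot driven by the largest reservations
print(tolerated_host_failures(hosts, vms))  # host failures the cluster can absorb
```

One VM with a very large reservation inflates the slot size for everyone, which is why this policy suits clusters whose VMs have similar reservations.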
Define failover capacity by reserving a percentage of the cluster resources
- A percentage of the cluster's aggregate CPU and memory resources is specified; it will be reserved for recovery from ESXi host failures.
- The specified percentage indicates the total amount of resources that will remain unused for vSphere HA purposes.
- This option is recommended in vSphere environments that have VMs with highly variable CPU and memory reservations.
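To make the percentage policy concrete, here is a minimal sketch of the admission check as I understand it: the free share of CPU and the free share of memory are computed separately, and a VM may power on only if both stay at or above the configured percentage. The function names and all figures are hypothetical, and reservation overheads are left out.

```python
def failover_capacity(total: float, reserved: float) -> float:
    """Fraction of a resource that would remain unreserved."""
    return (total - reserved) / total

def can_power_on(total_cpu_mhz, total_mem_mb,
                 reserved_cpu_mhz, reserved_mem_mb,
                 vm_cpu_mhz, vm_mem_mb,
                 configured_fraction):
    """Admit the VM only if CPU and memory both keep the configured headroom."""
    cpu_left = failover_capacity(total_cpu_mhz, reserved_cpu_mhz + vm_cpu_mhz)
    mem_left = failover_capacity(total_mem_mb, reserved_mem_mb + vm_mem_mb)
    return cpu_left >= configured_fraction and mem_left >= configured_fraction

# Illustrative cluster: 24 GHz and 21 GB in total, 7 GHz and 6 GB already reserved,
# with 25% configured as failover capacity.
print(can_power_on(24000, 21504, 7000, 6144, 2000, 2048, 0.25))   # small VM
print(can_power_on(24000, 21504, 7000, 6144, 12000, 2048, 0.25))  # CPU-heavy VM
```

Because the check works on aggregate reservations rather than slots, one oversized VM does not distort the result for the rest of the cluster.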
Use dedicated failover hosts
- One or more hosts are used exclusively for failover purposes.
- The failover hosts cannot have powered-on virtual machines, because they are used for failover purposes only.
Do not reserve failover capacity
- VMs can be powered on, even if the availability constraints are violated.
- This option basically disables Admission Control.
Conclusion
In HA part I:
we discussed the concept of HA and its requirements.
We also covered the great Admission Control feature, which reserves free ESXi resources in case a failover happens.
In the next article, we will see how to configure HA and set advanced options.