Runecast Analyzer makes hardware checking against the VMware HCL easy

Runecast Analyzer is a tool that helps VMware administrators proactively manage their vSphere environment. It discovers potential risks in the VMware environment before they can cause a major outage. It uses best practices, security hardening guides (VMware, DISA STIG, PCI-DSS v3.2.1 and HIPAA) and known issues from the VMware Knowledge Base to protect the Software Defined Data Center (SDDC). Runecast Analyzer supports the following VMware products:

  • VMware vSphere
  • VMware vSAN
  • VMware NSX-V
  • VMware Horizon

Runecast Analyzer introduced two new features called “Automated VMware HCL” and “ESXi Compatibility Simulation”. The “Automated VMware HCL” feature checks the VMware ESXi host hardware, driver and firmware versions against the VMware Hardware Compatibility List (HCL). The HCL lists all the physical hardware components, driver and firmware versions that are supported by VMware. Keeping the hardware aligned with the VMware HCL is essential for a healthy, stable and supported VMware environment, but it can be difficult to do by hand. For an example, see the blog post below on how to identify a network card and its supported driver.

Identify NIC driver and supported driver version for ESXi server

 

Within the “Automated VMware HCL” feature you can enable “ESXi Compatibility Simulation”. ESXi Compatibility Simulation checks the existing hardware against a newer VMware ESXi version before you upgrade to it, so you can verify whether the hardware, driver and firmware levels are supported.

Automated Hardware Compatibility

After deploying the Runecast Analyzer appliance and connecting it to one or more vCenter Servers, the first scan can be performed by clicking on the purple “Analyze Now” button. When the scan is completed, select “HW Compatibility” on the left menu bar. By default all ESXi hosts are listed. In the action pane you can select specific clusters or one or more hosts.

The screenshot shows the host, ESXi release, hardware summary and the compatibility status of the BIOS and I/O devices. The BIOS and I/O devices are red in this example, which means they need attention. All the hardware, firmware and driver results can be exported to a CSV file. Per ESXi host you can drill down to the server hardware.
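
Once the results are exported, the CSV file can be summarized with a few lines of script. The sketch below is only an illustration: the column names (host, component, status), the host names and the sample rows are my own assumptions, not the actual layout of the Runecast export, so adjust them to the real file.

```python
import csv
import io

# Hypothetical sample data; the real export will have different columns.
SAMPLE_EXPORT = """host,component,status
esx01.lab.local,BIOS,Possible incompatibility
esx01.lab.local,Intel I350 Gigabit,Possible incompatibility
esx02.lab.local,BIOS,Compatible
"""


def hosts_needing_attention(csv_text):
    """Group every component that is not marked 'Compatible' by host."""
    findings = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["status"].strip().lower() != "compatible":
            findings.setdefault(row["host"], []).append(row["component"])
    return findings


print(hosts_needing_attention(SAMPLE_EXPORT))
```

To use it on a real export, read the file contents and pass them to hosts_needing_attention after renaming the columns in the function.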

The BIOS needs an update: it is reported as a possible incompatibility, with “Not Found” in the HCL Data field. Clicking on the “HCL online” button redirects us to the VMware Compatibility List (HCL).

The VMware HCL shows that the matching BIOS level is version 1.2. After the BIOS view we go to the I/O devices by clicking on the I/O Devices tab.

The Intel I350 Gigabit and the Samsung NVMe SSD Controller need attention. Looking at the Intel I350 in the HCL, we see that the firmware level is okay but that the installed driver version, 0.1.1.0, is old. The HCL reports that version 1.4.1 is needed.
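
The driver comparison above comes down to comparing two dotted version numbers. Here is a minimal sketch of that check, assuming purely numeric versions such as 0.1.1.0 and 1.4.1. Real driver strings often carry vendor suffixes and the HCL lists specific supported versions rather than a simple minimum, so treat this as an illustration and not as how Runecast does it.

```python
def parse_version(text):
    """Turn a dotted version such as '1.4.1' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def driver_is_outdated(installed, required):
    """Return True when the installed version is older than the required one."""
    a = parse_version(installed)
    b = parse_version(required)
    # Pad the shorter tuple with zeros so '1.4.1' and '1.4.1.0' compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


# Values taken from the Intel I350 example above.
print(driver_is_outdated("0.1.1.0", "1.4.1"))
```

Comparing integer tuples instead of raw strings avoids the classic mistake where "1.10" sorts before "1.9".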

ESXi Compatibility Simulation

Alongside the Hardware Compatibility overview there is another feature called “ESXi Compatibility Simulation”. It checks the existing hardware against a newer VMware ESXi version before you upgrade to it, so you can verify whether the hardware, driver and firmware levels are supported.

After turning on the ESXi Compatibility Simulation feature and selecting the ESXi version to upgrade to, you can start the simulation. In this environment I want to upgrade to ESXi 6.7 U2, and the simulation shows that the BIOS is not compatible.

Conclusion

It can be difficult and time-consuming for VMware admins to check whether the server hardware is aligned with the VMware Hardware Compatibility List (HCL), which is needed to maintain a healthy, stable and supported environment. Runecast Analyzer makes this easy and fast: a simple scan shows whether the hardware in the VMware environment is compliant with the VMware HCL.

Another great feature is the ESXi Compatibility Simulation. Again, with a simple scan you check whether the hardware is compatible with a newer version of ESXi before actually upgrading to it. The Automated Hardware Compatibility and ESXi Compatibility Simulation are great new features that save a lot of valuable time when investigating whether the VMware environment is compliant.

You can download a 30-day full trial version of Runecast Analyzer and try it yourself.

 

Monitor vSAN with ControlUp

One of the new enhancements in ControlUp 7.3 is vSAN monitoring support. ControlUp detects the vSAN cluster(s) and objects and displays real-time vSAN-specific metrics and metadata. In this blog post I highlight the features of the new vSAN integration in ControlUp 7.3.

Installation

The vSAN cluster is automatically recognized by ControlUp when the following requirements are met:

  • PowerShell version 5.0 or later
  • VMware PowerCLI 10.1.1.x
  • .NET framework version 4.5
  • The vSAN Performance service must be turned on for the cluster
  • The user account configured for the hypervisor connection requires the “storage.View” permission.

Running ControlUp is easy: no installation is needed, you simply run a single executable (ControlUpConsole.exe). After starting ControlUp, add the vCenter server and the vSAN cluster(s) are recognized automatically. When you click on the vSAN cluster you see real-time metadata and performance metrics.

Views

There are several preset views available with vSAN metrics such as:

  • vSAN Performance. Includes vSAN performance metrics such as IOPS, latency, cache and buffers.
  • vSAN Health. Includes the vSAN health checks.
  • vSAN Host Network. Includes vSAN network I/O and packet loss metrics.

You can easily switch between predefined views in the “Column Preset”. Here is an overview of the vSAN metrics used by ControlUp:

Datastores: Name, Type, Capacity, Read/Write IOPS, Read/Write Rate, Read/Write Latency, Compression, Capacity Deduplication, Congestion, Outstanding IO, Disk Configuration, Total Used Capacity, Total Used – Physically Written, Total Used – VM Overreserved, Total Used – System Overhead, vSAN Free Capacity, vSAN Health, vSAN Cluster Health, vSAN Network Health, vSAN Physical Disk Health, vSAN Data Health, vSAN Limits Health, vSAN Hardware Compatibility Health, vSAN Performance Service Health, vSAN Build Recommendation, vSAN Online Health.
Datastores on Hosts: Name, Type, Capacity, Read/Write IOPS, Read/Write Rate, Read/Write Latency, Compression, Capacity Deduplication, Congestion, Outstanding IO, Local Client Cache Hit IOPS, Local Client Cache Hit Rate, vSAN Max Read Cache Read Latency, vSAN Max Write Buffer Write Latency, vSAN Max Read Cache Write Latency, vSAN Max Write Buffer Read Latency, vSAN Min Read Cache Hit Rate, vSAN Write Buffer Min Free Percentage, vSAN Host Network Inbound/Outbound I/O Throughput, vSAN Host Network Inbound/Outbound Packets Per Second, vSAN Host Network Inbound/Outbound Packet Loss Rate

When navigating, you see all those metrics in the vSAN cluster, vSAN datastores on hosts, virtual disks and vSAN host network utilization views. You can easily drill down by double-clicking from the vSAN datastore to the diskgroup(s) on each ESXi host and then to the virtual disk(s). From the virtual disk(s) you can drill down to the Windows process.

Example: Find the root cause of high IOPS load on the vSAN cluster.

In the following example we will identify a Windows process that is causing high IOPS stress on the vSAN cluster. We drill down from the vSAN cluster to the vSAN diskgroup of the ESXi host to the virtual disk to the process level in the VM to find the root cause of the high IOPS.

  • In the vSAN Performance view we see that the stress level has changed and that there is a high IOPS load.

  • In the IOPS column we see that the threshold of 2000 is crossed. This is the default threshold and it can be adjusted. The Virtual Expert suggests navigating to the “Datastore on Hosts (IOPS detailed View)”.

  • When double clicking on the “Datastore on Host” we see that “esxin04.lab.local” is generating the IOPS load.

  • The vSAN diskgroup of the “esxin04.lab.local” host has a virtual disk that belongs to the “ControlUp-vSAN-Test” VM that is causing the high IOPS load.

  • When double clicking on the virtual disk we go to the “Processes” view and see that the “diskspd.exe” process is causing the high IOPS load.

  • Optional: Right click on the process and select kill to end the “diskspd.exe” process. This stops the IOPS load on the vSAN cluster.

This example shows how easy it is to identify which process is causing stress on the vSAN cluster.
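
The drill-down itself is a simple idea: at every level, follow the child that contributes the most IOPS. The sketch below models that with a nested dictionary. The numbers, the second host, its VM and the extra processes are hypothetical sample data I made up for illustration, and the code says nothing about how ControlUp works internally.

```python
IOPS_THRESHOLD = 2000  # the default threshold mentioned above

# Hypothetical sample data: host -> VM -> process -> IOPS.
SAMPLE = {
    "esxin03.lab.local": {"web01": {"w3wp.exe": 120}},
    "esxin04.lab.local": {
        "ControlUp-vSAN-Test": {"diskspd.exe": 2600, "svchost.exe": 15},
    },
}


def total(node):
    """Sum the IOPS of a node and everything below it."""
    if isinstance(node, dict):
        return sum(total(child) for child in node.values())
    return node


def drill_down(tree):
    """Follow the busiest child at each level and return the path taken."""
    path = []
    node = tree
    while isinstance(node, dict):
        name = max(node, key=lambda key: total(node[key]))
        path.append(name)
        node = node[name]
    return path


if total(SAMPLE) > IOPS_THRESHOLD:
    print(" -> ".join(drill_down(SAMPLE)))
```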

Alerting and reporting

For alerting you can add triggers in ControlUp to notify you when something happens on the vSAN cluster such as a change in the stress level for a period of time.
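
A trigger that only fires when a condition holds "for a period of time" can be thought of as counting consecutive samples above a threshold. Below is a minimal sketch of that logic. The function name, the sample values and the choice of three consecutive samples are my own assumptions and not ControlUp's trigger implementation.

```python
def sustained_breach(samples, threshold, min_consecutive):
    """Return True if min_consecutive samples in a row exceed the threshold."""
    streak = 0
    for value in samples:
        # Extend the streak on a breach, reset it on a normal sample.
        streak = streak + 1 if value > threshold else 0
        if streak >= min_consecutive:
            return True
    return False


# A short spike does not fire, a sustained load does.
print(sustained_breach([2100, 1900, 2200, 1800], threshold=2000, min_consecutive=3))
print(sustained_breach([1500, 2100, 2300, 2500], threshold=2000, min_consecutive=3))
```

Requiring several samples in a row keeps a single short spike from generating an alert.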

When using triggers you’re able to start investigating right away when something happens on the vSAN cluster. All the vSAN data is transferred to ControlUp Insight for historical reporting and analytics. This is great for analyzing data and trends over time and can be very useful when investigating issues and understanding what is going on in your environment.

Conclusion

ControlUp is easy to set up and great for fast troubleshooting. Version 7.3 adds vSAN support. As shown in this blog post, with a couple of double clicks you’re able to perform a root cause analysis and find which process is causing the high IOPS on the vSAN cluster.

There is a free trial available. Give it a try here: link

New enhancements in Runecast Analyzer 2.0

Runecast Analyzer provides proactive management for VMware environments. It discovers potential risks in the VMware environment before they can cause a major outage. In 90% of the outages in VMware environments, the root cause is a known issue that is already described in the VMware Knowledge Base. Runecast Analyzer uses information from the VMware Knowledge Base, security hardening guides (VMware, DISA STIG and PCI-DSS), and best practices to proactively identify problems or outages before they occur.

In my last review of Runecast Analyzer I tested version 1.7 (link) with vSphere and vSAN support. The next version (1.8) included NSX-V support, and a couple of weeks ago version 2.0 of Runecast Analyzer was released. This version includes the following new enhancements.

New User Interface (UI)

Runecast Analyzer 2.0 has a completely redesigned User Interface (UI) that includes new widgets such as:

  • Historical Trending
  • Hosts with Most Issues

Historical Trending

It includes historical trending for at least 3 months of vSphere, vSAN and NSX-V scan results. By default a scan is performed every day (this can be changed) against one or more vCenter environment(s). The scan results contain the description, the IP address and the reason why each issue was detected. The trending information is shown in widgets in the UI.

With this functionality you can keep track of how compliant you are and what progress you have made in solving issues. All the detected issues are summarized per day or per week in the “Issue History” widget.

Hosts with Most Issues

Another new widget in the UI is “Hosts with Most Issues”. It shows which ESXi hosts have the most issues and deserve the highest priority for investigation.

History Analysis

History Analysis is new functionality that helps isolate the root cause of a reported incident as quickly as possible.

The first section shows a chart with a trend of detected and fixed issues over time. Interactive dots in the trend chart show the issues and details of each scan. The second section shows a table with detailed descriptions of the issues.

Within the history analysis you can filter on:

  • Severity (Critical, Major, Medium or Low)
  • Source (PCI-DSS, SH, BP or KB)
  • Applies to (Network, Compute, vCenter, Management or VM)
  • Products (NSX-V or vSphere)
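
To make the filter options concrete, here is a small sketch of what filtering a list of findings on these four attributes could look like. The Issue class, its field names and the three sample findings are hypothetical and only mirror the filter names above. They are not taken from Runecast's data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    """Hypothetical finding with the four filterable attributes."""
    title: str
    severity: str
    source: str
    applies_to: str
    product: str


def filter_issues(issues, **criteria):
    """Keep the issues whose attributes equal every supplied criterion."""
    return [
        issue
        for issue in issues
        if all(getattr(issue, name) == value for name, value in criteria.items())
    ]


# Made-up sample findings.
ISSUES = [
    Issue("Example finding A", "Critical", "KB", "Compute", "vSphere"),
    Issue("Example finding B", "Low", "BP", "Network", "NSX-V"),
    Issue("Example finding C", "Critical", "SH", "Network", "NSX-V"),
]

critical_nsx = filter_issues(ISSUES, severity="Critical", product="NSX-V")
print([issue.title for issue in critical_nsx])
```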

The issue results can be compared with previous scan results, and the differences are shown.
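
Comparing two scans is essentially a set difference: what is new, what has been fixed, and what is unchanged. The sketch below shows the idea with plain Python sets. The issue identifiers are placeholders I invented, and the function is an illustration, not the product's comparison logic.

```python
def compare_scans(previous, current):
    """Split issue identifiers into new, fixed and unchanged groups."""
    previous, current = set(previous), set(current)
    return {
        "new": sorted(current - previous),
        "fixed": sorted(previous - current),
        "unchanged": sorted(current & previous),
    }


# Placeholder identifiers for two scans, before and after a maintenance window.
before = ["ISSUE-1", "ISSUE-2", "ISSUE-3"]
after = ["ISSUE-2", "ISSUE-4"]

print(compare_scans(before, after))
```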

This makes the new history analysis very powerful for finding issues in the vSphere environment, for example after a maintenance window in which configuration changes were made.

vSphere 6.7 with vSphere HTML5 client support

Runecast Analyzer supports vSphere 6.7, has an HTML5 web plugin for the vSphere Client, and even integrates into the NSX dashboard.

PCI-DSS compliance 

Runecast Analyzer 2.0 includes a new profile with 226 different checks for the Payment Card Industry Data Security Standard (PCI-DSS). The profile can be enabled and automatically checks whether you are compliant with PCI-DSS (Runecast Analyzer supports PCI-DSS 3.2.1).

This helps with becoming PCI-DSS compliant and is very helpful for companies in the financial space.

The PCI-DSS results can easily be filtered and exported in different formats (PDF, CSV or clipboard copy). This can be useful during an audit, for example.

Latest VMware Knowledge Base updates

When new knowledge definitions are available, the definition database can be updated (automatically, if desired). For example, with the Spectre, Meltdown and L1TF vulnerabilities, Runecast Analyzer can quickly identify those vulnerabilities once VMware releases the KB articles.

Appliance Update

In version 2.0 of Runecast Analyzer the internal components of the appliance are updated to the latest versions (such as Ubuntu 14.04.05 LTS, PostgreSQL 10 and Apache Tomcat 9.0.10, with TLS 1.2 in use). The appliance meets the latest security compliance. The appliance and knowledge definitions can easily be updated when a new version is available.

For new users, deploying a new appliance (OVF) is a piece of cake. Runecast Analyzer is installed and operational within a couple of minutes. A free Runecast Analyzer trial or demo can be requested by using the following link.

Version 2.0 of Runecast Analyzer adds great new enhancements that help to proactively identify problems or outages before they occur and to easily check the compliance of the VMware vSphere, vSAN and NSX-V environment.