Following on from my first two blogs investigating HPE Nimble dHCI (see here and here), this time I thought I’d take a look at the ability to perform full-stack upgrades on the platform using the Nimble dHCI 1-Click Upgrade feature, which became available to all customers within the NimbleOS 5.2.x payload in mid-2020.
When considering what HCI is (which can be somewhat of a religious battle between various vendors & analysts) – if you were to ask me, someone who’s been around virtualisation since 2003 – it’s really about:
- Empowering the VM-administrator through a single user interface and removing cumbersome infrastructure management roadblocks
- Simplifying the IT hardware stack as much as possible
- Delivering seamless lifecycle management & platform upgrading solutions
- Ultimately – removing the pain and making VM-infrastructure dependable and boring!
When looking at points 1-4, the thing that stands out is that HCI is really about the management experience. And thus infrastructure upgrades and lifecycle management are a HUGE part of that requirement.
The engineers and product management for Nimble dHCI realised this need very early on in building the product, which is why the capability was released only 9 months after the first product came to market. They also understood that in order to fully allow for cross-platform upgrades across the stack of compute nodes, switches, storage, ESXi, vCenter & NCM, the capability needed to be built and engineered from scratch – OUTSIDE and away from VMware’s recent vLCM capability – which only covers a small portion of the dHCI functionality.
Furthermore, the dHCI engineering team wanted to build upon the intelligence and integrity of HPE Infosight, and the global predictive analytics & wellness the platform brings to customers, using allowlists and denylists (based on machine learning) to protect customers from upgrading to potentially detrimental firmware that could expose a problem.
So how does this work?
Providing lifecycle management and upgradability across the infrastructure is not something that’s easy – there are many moving parts, and many things that can go wrong, especially when dealing with multiple vendors. The promise of HCI is to simplify this experience drastically.
For any infrastructure solution (HCI or not) there are multiple complex moving parts within the equation, all of which require consideration on how they should be patched, upgraded and maintained. Within dHCI we’ve broken those moving parts down into the following categories:
- dHCI Compute Nodes – HPE Proliant rackmount servers*
- dHCI Storage Nodes – HPE Nimble Storage AF/HF arrays
- VMware ESXi on each compute node
- VMware vCenter for cluster administration
- Nimble dHCI Integration, comprising:
  - Stack Setup
  - Stack Manager
  - Stack Scalability
  - Configuration Checker
  - 1-click upgrades
- Nimble VMware Integration, comprising:
  - Nimble Connection Manager (MPIO & connection service)
  - VAAI primitives for storage awareness
  - VASA provider for SPBM
*Proliant Gen10 servers only.
As you can see, this is a large list of software and firmware, with plenty of potential headaches around interoperability come upgrade time. The good news is that today dHCI is capable of handling intelligent upgrades and lifecycle management of 5 out of the 6 items in the list! The only thing it doesn’t perform today is an upgrade of VMware vCenter (which, however, is now super easy in vSphere 7). Note: this will be enabled in a future NimbleOS release.
The major headaches encountered day in, day out when performing full-stack upgrades ultimately boil down to interoperability: is the component or software that I’m upgrading going to work, be compatible and not cause any potential issues (known or not) within my environment? This is a big question, and one that I’m sure you, dear reader, have asked many times previously.
Nimble dHCI handles this complexity by creating and issuing Catalogs. These are bundles of tested and ratified firmware releases by dHCI engineering for every component within the stack, made available for dHCI customers to upgrade to when needed.
Catalogs are issued and staged within HPE Infosight, and Infosight-connected dHCI systems authenticate against these catalogs to return to the customer the recommended upgrade for the environment to take. Furthermore, if for some reason there’s a potential problem within a catalog that the customer might run into within their environment (diagnosed within Infosight based on AI and Predictive Analytics), that specific dHCI catalog is marked as not applicable for the customer until the potential problem has been mitigated.
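To make the idea a bit more concrete, here’s a minimal Python sketch of how a recommended catalog could be picked from the released ones once a per-system denylist has been applied. This is purely my own illustration and not how Infosight is actually implemented: the function name `recommend_catalog`, the catalog version strings and the `blocked` set are all hypothetical.

```python
from typing import List, Optional, Set


def recommend_catalog(released: List[str], blocked: Set[str]) -> Optional[str]:
    """Return the newest released catalog that is not blocked for this system.

    `released` and `blocked` hold hypothetical catalog versions such as "2.0".
    Returns None when every released catalog is blocked.
    """
    # Drop any catalog that has been flagged as not applicable for this system.
    eligible = [c for c in released if c not in blocked]
    if not eligible:
        return None
    # Compare versions numerically so that "10.0" sorts above "2.0".
    return max(eligible, key=lambda c: tuple(int(p) for p in c.split(".")))


# Illustrative values only: catalog 2.0 is blocked, so 1.1 is recommended.
print(recommend_catalog(["1.0", "1.1", "2.0"], {"2.0"}))
```

Once the blocked catalog is cleared (an empty `blocked` set in this sketch), the same call would return the newest release again.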
When deciding to upgrade your dHCI environment, the dHCI upgrade functionality is intelligent enough to only upgrade the components needed in order to ensure you’re compliant. It will not enforce a complete upgrade of all components should it not be needed – saving you precious time within your maintenance window.
Here’s an example of a dHCI environment. It’s running an older release of NimbleOS, NCM and ESXi, but has the most up-to-date release of vCenter and the Proliant SPP. In this instance, dHCI will only upgrade NimbleOS, NCM and ESXi to ensure it conforms to the 2.0 catalog.
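The example above can be sketched in a few lines of Python. Again, this is only an illustration under my own assumptions: the helper names and every version number below are made up, and the real dHCI logic is certainly more involved. The idea is simply to compare what’s installed against what the catalog calls for, and keep only the components that are behind.

```python
from typing import Dict, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as "5.2.1" into (5, 2, 1) for comparison."""
    return tuple(int(part) for part in version.split("."))


def components_to_upgrade(
    installed: Dict[str, str], catalog: Dict[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Map each out-of-date component to its (installed, target) versions."""
    return {
        name: (installed[name], target)
        for name, target in catalog.items()
        if parse_version(installed[name]) < parse_version(target)
    }


# Hypothetical versions: vCenter and the SPP already match the catalog.
installed = {"NimbleOS": "5.1.4", "NCM": "6.0.0", "ESXi": "6.7.0",
             "vCenter": "7.0.0", "SPP": "2020.03.0"}
catalog = {"NimbleOS": "5.2.1", "NCM": "7.0.0", "ESXi": "7.0.0",
           "vCenter": "7.0.0", "SPP": "2020.03.0"}

plan = components_to_upgrade(installed, catalog)
print(sorted(plan))  # ['ESXi', 'NCM', 'NimbleOS']
```

Components that already match the catalog simply drop out of the plan, which is where the saving in maintenance-window time comes from.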
If you’re interested in taking a look at the available dHCI upgrade catalogs in more detail, they are listed within the Infosight Validated Configuration Matrix for HPE customers – I blogged about it here.
Let’s take a look at a dHCI system to see how this really works.
Note: this is performed on an engineering system and thus some things like the catalog numbers & NimbleOS firmware releases might not be correct 🙂
On a dHCI platform running NimbleOS 5.2 and above, within the dHCI management plane in VMware vCenter we can see a new “Update” menu item.
Clicking this menu item now drops us into a new part of the dHCI management plane dedicated to aiding your full-stack upgrade experience.
Firstly, we can see the current version of software that’s running within our dHCI environment, and if it conforms to a catalog or not (not a problem if it doesn’t). From there, we can see the released dHCI Upgrade Catalogs (if they’re greyed out then they’re not recommended for your dHCI platform based on Predictive Analytics). Finally, dHCI then highlights the recommended upgrade for the environment in blue.
When you click ‘update’ – the dHCI platform will proceed through the following steps. All of this is entirely non-disruptive using a combination of NimbleOS HA upgrades, VMware HA, DRS and maintenance mode.
My buddy Fred Gagne blogged a bit more about the practical steps of upgrading dHCI, which is available here.
I hope you enjoyed this post! If you have any questions or are intrigued by the process, please feel free to comment below!
Cheers for now…. and Stay Nimble!