Many years ago, back in the days of NimbleOS 2.0, the engineering teams introduced the ability to non-disruptively cluster Nimble arrays together in a scale-out fabric using a concept of groups and pools. Since then, customers have successfully used scale-out for online data migration within their Nimble estate with no application disruption. Customers and partners often ask how this actually works, so in this blog I’ll walk through exactly how to perform this action in your environment.
NimbleOS Scale-Out – Concepts
Every Nimble array running NimbleOS 2.0 or later contains the concept of a Group, which is created as part of the initial system deployment. A group is the construct in which the management domain resides; all administration tasks (such as IP/network configuration, Active Directory user configuration, and plugin integrations such as vCenter full-stack analytics) function at the group level.
Inside the group, a Nimble array (and all of its capacity expansion shelves) forms a logical entity called a Storage Pool. This is where all data services such as volumes, snapshots and clones operate, as well as data reduction features such as deduplication.
Here is an example of my environment. I have a group called “vnmbl-group” which contains a single, older Nimble iSCSI array (it could equally be FC – the functionality is the same). Inside my array group, I have a storage pool called default (which is the enforced default name prior to NimbleOS 5.1 – check my blog here to change it).
Inside my pool, there could be numerous volumes, with their snapshot policies, as well as zero-copy clones.
Whilst this array has done a fantastic job, I now wish to replace my older Nimble array with a newer Nimble platform. Perhaps I want to migrate my environments to All Flash, or perhaps I want to repurpose my older Nimble array for Disaster Recovery, or even for another project elsewhere. Most importantly, I want to do this as easily as possible, with minimal application disruption! Typically, storage array data migrations are the bugbear of the enterprise IT industry.
And this is exactly what we can use Nimble Scale-Out technology to do for us, with minimal app disruption or downtime.
Before you start, it’s important to ensure:
- Both old and new platforms reside on the same data network – i.e. iSCSI or Fibre Channel.
- Both systems need to run the same NimbleOS software release.
- Nimble Connection Manager should also be installed on your Windows, VMware or Linux servers. This is a critical component, and should not be overlooked.
If you’ve done all of the above, then we’re ready to get going.
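The prerequisite checks above can be sketched in a few lines of code. This is purely illustrative – the field names (`protocol`, `os_version`) are my own assumptions for the sketch, not Nimble API fields:

```python
# Minimal sketch of validating scale-out prerequisites before adding an
# array to a group. Field names here are illustrative assumptions only.

def check_scaleout_prereqs(old_array: dict, new_array: dict) -> list[str]:
    """Return a list of blocking issues; an empty list means ready to proceed."""
    issues = []
    # Both arrays must serve the same data protocol (iSCSI or FC).
    if old_array["protocol"] != new_array["protocol"]:
        issues.append(
            f"protocol mismatch: {old_array['protocol']} vs {new_array['protocol']}"
        )
    # Both arrays must run the same NimbleOS release.
    if old_array["os_version"] != new_array["os_version"]:
        issues.append(
            f"NimbleOS mismatch: {old_array['os_version']} vs {new_array['os_version']}"
        )
    return issues

# Example: a NimbleOS version mismatch is flagged before any migration begins.
issues = check_scaleout_prereqs(
    {"protocol": "iscsi", "os_version": "5.1.4.200"},
    {"protocol": "iscsi", "os_version": "5.2.1.0"},
)
```

In practice, the array UI performs equivalent checks for you during discovery (as we’ll see below), but it’s worth verifying these points up front.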
Once you’ve racked, cabled and powered on your new system, the array will start sending broadcast packets onto your network in order to be discovered. Therefore, you can jump into your current Nimble array UI, head to the “Hardware” tab, and click Actions->Add Array To Group.
This will discover any Nimble arrays currently on your layer 2 network domain, and return values to identify them. As you can see in my screenshot – it shows the Serial Number, the Technology Category (All Flash or Hybrid), Model and Software version. It also shows any predicted errors with this system that would cause problems if you were to scale out!
We wish to add this new array to the group. Clicking “Add“ starts the integration process and asks you for some details for the new system as a member of the group. Here, it asks me to assign an array name, some interface IP addresses, and a name for the new Storage Pool.
Once this has been done, click “OK” to finish the process. If all goes as expected, the Hardware screen now shows two arrays in my group, each residing within its own pool.
Heading to the “Software” screen, we can now see that it’s possible to upgrade both of the Nimble arrays within the group to the same NimbleOS release. This is a fully managed process which rolls the upgrade across all arrays in the group – it starts with the “group leader” (typically the first array in the group) and then concludes with the other members of the group. This is a 100% non-disruptive process – as always with NimbleOS upgrades.
Checking out the “Volumes” screen, you’ll see that we have two pools in the group, with aggregated capacity, performance, data reduction and volume counts.
It’s possible to keep running my environment scaled out like this if required (the maximum we support is four arrays in a group). However, this blog is about migrating my workloads from my old array to the new one with minimal downtime! So let’s do that 🙂
Non-Disruptive Data Migration Between Pools
In this example, there could be many applications (VMs, physical servers, containers etc) all connected and having their I/O served by the array group. Nimble Connection Manager (mentioned above as a critical component) abstracts the I/O away from the iSCSI or FC persona of each array – instead it virtualises the connections to the group itself, and can re-map and direct I/O to each array on the fly without the need for hopping between systems. This ensures that latency is kept consistently low whilst performing data migrations.
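To make the idea concrete, here’s a conceptual sketch of group-level I/O routing – this is *not* the NCM implementation, just an illustration of why a host never needs reconfiguring when a volume moves. Hosts address volumes through the group; the group maps each volume to whichever array currently owns it:

```python
# Conceptual sketch (not the NCM implementation): hosts address volumes via
# the group, and the group maps each volume to the array that currently owns
# it. Moving a volume only updates the group's mapping; the host-side target
# never changes, so I/O is redirected on the fly.

class GroupIORouter:
    def __init__(self):
        self.volume_owner = {}          # volume name -> owning array name

    def place(self, volume: str, array: str) -> None:
        """Record (or update) which array owns a volume."""
        self.volume_owner[volume] = array

    def route(self, volume: str) -> str:
        """Return the array that should service I/O for this volume."""
        return self.volume_owner[volume]

group = GroupIORouter()
group.place("sql-data", "old-array")
assert group.route("sql-data") == "old-array"

# Non-disruptive move: remap the volume to the new array. Hosts still talk
# to the group and simply get directed to the new owner.
group.place("sql-data", "new-array")
```

The real mechanism is richer (multipathing, per-protocol personas), but the key property is the same: the binding between host and volume lives at the group level, not at the individual array.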
What’s also cool here is that deduplication on Nimble occurs at the Pool level rather than volume level. Thus, if you have an older array which isn’t capable of dedupe, and you’re replacing it with a new shiny one that CAN dedupe, the systems will deduplicate the data as it transfers from the old to the new – on the fly!
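The dedupe-on-transfer behaviour can be illustrated with a simplified sketch: as blocks stream out of the old pool, the destination pool physically stores each unique block only once and records a reference for every duplicate. Block size and hashing here are simplifications of my own, not Nimble internals:

```python
# Illustrative sketch of pool-level dedupe during a migration: unique blocks
# are stored once in the destination pool; duplicates become references.
import hashlib

def migrate_with_dedupe(blocks: list[bytes]) -> tuple[dict, list[str]]:
    """Return (unique block store keyed by hash, ordered list of references)."""
    store, refs = {}, []
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            store[digest] = block   # first copy is physically written
        refs.append(digest)         # every logical block becomes a reference
    return store, refs

# Three logical blocks, two of them identical: the destination pool only
# physically stores two blocks.
store, refs = migrate_with_dedupe([b"A" * 4096, b"B" * 4096, b"A" * 4096])
```

So data that was stored fully hydrated on the old, non-dedupe array lands on the new array already reduced.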
Let’s kick off a data migration from old to new! To do this, in the Volumes screen – select your Volume(s) or Folder and select “Move“.
Then, it asks you to confirm the pool location to move the volumes:
Finally, it asks you to confirm that you want this action to happen. A few points to note here:
- It will move volumes, as well as the family of snapshots and clones to the target pool you selected in a single operation.
- It will move your data _very slowly_ in order to protect front-end I/O performance for your applications. Therefore, it could take a while for this action to complete – but it has no performance overhead, nor does it introduce any data integrity risk whilst the move is in flight.
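As a back-of-envelope illustration of why a throttled move takes a while, here’s a quick estimate. The throttle rate below is a made-up figure purely for the example – NimbleOS manages the actual rate dynamically:

```python
# Rough sketch: time to move a volume at a sustained, throttled copy rate.
# The 100 MiB/s figure is a hypothetical illustration, not a NimbleOS value.

def estimated_move_hours(volume_gib: float, throttle_mib_s: float) -> float:
    """Hours to copy volume_gib gibibytes at throttle_mib_s MiB per second."""
    return (volume_gib * 1024) / throttle_mib_s / 3600

# e.g. a 2 TiB volume at a hypothetical sustained 100 MiB/s:
hours = estimated_move_hours(2048, 100)   # just under 6 hours
```

The point being: plan for the move to run in the background over hours rather than minutes, and let it – your applications won’t notice.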
You’ll also notice a new tab under Monitor – an overall “Data Migration” pane, which shows the progress & completion time of each volume family.
The volume in this example has now been successfully migrated, with all associated snapshots & clones – with the catalog 100% intact.
Evacuating The Old Array From Group
Once the group has successfully completed the non-disruptive migration of data to the new system, all I/O is now being serviced by that system, so one can easily delete the old pool and evacuate the old array from the group. Evacuating the array also serves as a factory reset, as all group information is wiped.
To do this, first head to the Volumes panel and delete the old pool. This leaves the pool on the new All Flash array as the group’s remaining pool. This should only take a second or two.
Finally, you can remove the array from the group from Hardware -> “Remove From Group”.
You may see the following warning that the management services cannot be migrated automatically, along with a request to call Nimble Support to perform this action – do give them a call, and they can do this quickly for you. If you’re on NimbleOS 5.1 or above, management services are auto-migrated within the group without a call to support, as this forms part of the Peer Persistence feature.
So there we have it – how to use Nimble Scale Out technology to perform a non-disruptive data migration. Of course, this isn’t the only way to skin this particular cat – and in some instances it may make more sense to perform this via host-based methods. The nice thing about this technique is that it’s inbuilt to Nimble, has no performance overhead for any application IO, and is a few-click operation to complete.
Let me know if you have any further questions on how scale-out or the data migration works. But for now, be safe and stay Nimble! 🙂