NetApp SnapCenter 1.0 – a new hope…

NetApp recently released version 1.0 of a new software offering going by the name of SnapCenter. It’s a long-held tradition that 80% of NetApp’s releases contain the word ‘snap’, a nod to their age-old storage innovation: snapshot technology that provides efficient, speedy backups of your precious data.


So what does SnapCenter bring to the table that we did not have before? Well first we need some context…

SnapDrive is Windows/UNIX software which taps into a NetApp storage system, allowing the provisioning, backup, restoration, and administration of storage resources without having to log directly onto the storage system. This enables application owners to take control of their own backup/restore operations and therefore feel more able to manage their data. For applications or server roles which are not subject to consistency issues in backups, the backup/restore features in SnapDrive are fine. For applications which do have this concern, NetApp have provided another solution.

With me so far? Good. So SnapDrive is supplemented by the SnapManager suite of products. These have been built up over a long period of time by NetApp, and integrate directly with applications like:

  • SQL Server
  • Oracle
  • VMware
  • Hyper-V
  • SharePoint
  • Exchange
  • SAP

These applications have vastly different purposes, but each has its own requirements for backing up its data in an application-consistent way. Usually, creating a backup/restore strategy which produces application-consistent backups requires a detailed understanding of the application, and is not integrated with the features presented by the underlying storage.

The SnapManager suite of products fills this gap, delivering a simplified, storage-integrated, application consistent method of easily backing up and restoring data, and providing the features that application owners desire. Further to this, it gives the application owners a simple GUI to take ownership of their own backup and recovery, whilst ensuring nothing in the underlying storage will break.

But this panacea to the challenge of backup and recovery, and its place within the application stack, is not without fault. Many criticisms have been levelled at the SnapManager suite over the years. The main two criticisms which I believe SnapCenter addresses are:

  1. Inconsistent user interfaces – the SnapManager suite was built up over time by NetApp, and many of the products were developed by different internal teams. This meant that the resultant software has very different looks and feels as you transition from one product to another. This complicates administration of the product for infrastructure administrators because they end up with multiple GUIs to learn, instead of a single GUI
  2. Scalability issues – to be fair to NetApp, this is not just an issue with their solution; a previous workplace of mine were heavy users of IBM’s Tivoli Storage Manager, which had a similar problem. As your environment grows, you may end up with tens of SQL servers, which means tens of instances of SnapManager for SQL to install, update, manage, and monitor. This could mean thousands upon thousands of reports and alerts to sift through each day, and without a solution to manage this, issues can go undiscovered for days, weeks or even months. Once you add in your Exchange environments, vCenter servers, SharePoint farms, Oracle servers etc, you may be looking at tens of thousands of backups running a day, and potentially hundreds of pieces of installed software to manage and try to keep an eye on
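To put some rough numbers on that second point, here is a back-of-the-envelope sketch. Every figure in it is a made-up assumption for a hypothetical estate, not a measurement from any real environment; it only shows how quickly the daily job count multiplies.

```python
# Hypothetical estate: (servers, databases per server, backups per database per day).
# All numbers are illustrative assumptions.
estate = {
    "SQL Server": (40, 25, 24),   # hourly backups
    "Exchange": (6, 10, 4),       # every six hours
    "SharePoint": (4, 5, 4),
}


def daily_jobs(servers, databases_per_server, backups_per_day):
    """Backup jobs generated per day by one application tier."""
    return servers * databases_per_server * backups_per_day


jobs_by_app = {app: daily_jobs(*shape) for app, shape in estate.items()}
total_jobs = sum(jobs_by_app.values())

# One backup agent install per server to patch and monitor
total_installs = sum(servers for servers, _, _ in estate.values())

print(jobs_by_app)      # {'SQL Server': 24000, 'Exchange': 240, 'SharePoint': 80}
print(total_jobs)       # 24320
print(total_installs)   # 50
```

Even this modest estate produces over twenty thousand job results a day spread across fifty separately managed installs, which is the gap a central console is meant to close.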

So how does SnapCenter address this problem? Well, with the release of Clustered Data ONTAP (CDOT) 8.3 at the start of 2015, and the end of NetApp’s legacy 7-Mode operating system, there seems to have been a drive to revitalise their software and hardware lines, simplifying the available options, and pushing software interfaces to be web based, rather than thick GUIs.

So the value proposition with SnapCenter is a centrally managed point of reference to control your backups programmatically, with a modern web-based interface, and scalability to provide a workable solution regardless of the size of estate being backed up. So let’s look at these features, and how NetApp have delivered this:

1. Scalability

SnapCenter scales out using the Windows NLB (Network Load Balancing) and ARR (Application Request Routing, basically a reverse web proxy) features, which allow for the creation of a farm of SnapCenter servers up to the 32-node maximum allowed by Windows NLB.

SnapCenter utilises a SQL database as its back end; this can be either a local SQL Server Express instance (for small deployments), or a full SQL Server instance for scalable deployments.

2. Programmability

NetApp have also been pretty decent at including programmability in their more recent software offerings, and SnapCenter is no exception, providing a PowerShell cmdlet pack and the now ubiquitous REST API. SnapCenter is also policy-driven, which means that once you have created your backup policy sets you can apply them to new datasets you want to back up going forward. This helps to keep manageability of backups under control as your infrastructure grows.
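To illustrate what policy-driven means in practice, here is a minimal sketch of the idea in Python. This is not SnapCenter’s API or data model; the class names, fields and dataset names are all hypothetical, and the point is only that a policy is defined once and then attached to any number of datasets.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BackupPolicy:
    """A reusable set of backup rules (hypothetical fields, for illustration)."""
    name: str
    interval_hours: int
    retention_days: int


@dataclass
class PolicyRegistry:
    """Tracks which policy each dataset uses."""
    policies: dict = field(default_factory=dict)
    assignments: dict = field(default_factory=dict)

    def add_policy(self, policy: BackupPolicy) -> None:
        self.policies[policy.name] = policy

    def assign(self, dataset: str, policy_name: str) -> None:
        # Refuse unknown policies so a typo cannot leave a dataset unprotected
        if policy_name not in self.policies:
            raise KeyError(f"unknown policy: {policy_name}")
        self.assignments[dataset] = policy_name

    def policy_for(self, dataset: str) -> BackupPolicy:
        return self.policies[self.assignments[dataset]]


registry = PolicyRegistry()
registry.add_policy(BackupPolicy("sql-hourly", interval_hours=1, retention_days=14))

# A new database simply picks up an existing policy; nothing is redefined
for dataset in ("sales_db", "hr_db", "new_reporting_db"):
    registry.assign(dataset, "sql-hourly")
```

Changing the retention for every SQL database then becomes a single policy edit rather than a per-server reconfiguration, which is where the manageability benefit comes from.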

3. Interface

A web interface is a beautiful thing: accessing software from any browser on any OS makes life a lot easier for administrators, and not logging onto servers means less chance of breaking said servers. NetApp have chosen HTML5 for this interface, which does away with the pain of having to deal with the Java or Flash that plagues other web interfaces (UCS, VMware, I’m looking at you!). NetApp have raised the bar with the SnapCenter interface, producing a smart and stylish WUI not dissimilar to Microsoft’s Azure interface.


Once you have installed the SnapCenter software on your Windows server, you will need to use the software to deploy the Windows and SQL Server plug-ins to your SQL servers. These plug-ins replace SnapDrive and SnapManager respectively, and the deployment process promises to be quick and painless; a reboot should not be necessary. SnapCenter utilises the same licenses as SnapManager, so if this is already licensed on your storage system then you are good to go. There is a migration feature present to help you move from SnapManager to SnapCenter, although this does not support migration of databases on VMDKs at this time.

The initial release of SnapCenter only interoperates with SQL Server, and with VMware through the Virtual Storage Console (VSC), so it probably won’t replace many customers’ full SnapManager install bases just yet, but the delivery team are promising rollouts of more plug-ins over the coming months.

There are limitations even in the SQL backup/recovery capabilities, although these will likely not affect many customers. They are detailed in the product Release Notes, but the biggest from what I can see is that SnapCenter does not presently support SQL databases on SMB volumes.

Hopefully NetApp will provide regular, functionality-enhancing updates to this product so that it delivers on its promises. It would also be good to see some functionality enhancements over what is currently delivered by the SnapManager products. Top of the list from my perspective is allowing Exchange databases to reside on VMDK storage: the current restriction to LUN-based storage makes things difficult, especially where customers are not deploying iSCSI, as this means the dreaded RDMs must be used in VMware, which for a VMware admin causes no end of headaches. It would also be nice to see this offered at some point as a virtual appliance, perhaps with an embedded PostgreSQL-type database similar to what VMware offer for the vCenter Server Appliance, but I would imagine that will be way down the line, as providing an appliance that scales well is a difficult thing.

NetApp have promised to continue to deliver SnapManager products for the time being; this is needed because of the lack of 7-Mode support in SnapCenter. Having worked extensively with both CDOT and 7-Mode though, I think there are many compelling reasons to move to CDOT if possible, and this seems like a fair compromise. SnapCenter can be installed quickly and tested out without committing to moving all your databases over to it, so give it a try. It’s the future after all!

NetApp Cluster Mode Data ONTAP (CDOT) 8.3 Reversion to 8.2 7-Mode

A project came in at work to build out a couple of new NetApp FAS2552 arrays; this was to replace old FAS2020s for a customer who was using FCP in their Production datacenter, and iSCSI in their DR datacenter, with a semi-synchronous SnapMirror relationship between the two.

The new arrays arrived on site, and we set them up separately from the production network to configure them. We quickly identified that the 2552s were running Data ONTAP 8.3RC1, which is how they were sent to us from the factory. Nobody had any experience with Cluster Mode Data ONTAP, but this didn’t seem too much of a challenge, as it did not seem hugely different.

After looking into what to do next, it appeared that transitioning SAN volumes from 7-Mode to Cluster Mode Data ONTAP was not possible, so the decision was taken to downgrade the OS from 8.3RC1 to 8.2 7-Mode, to make the transition of the customer’s data, and the downtime during switchover from old arrays to new, as easy and quick as possible.

We got there in the end, but due to the tomes of documentation we had to trawl through and tie together, I decided to document the process to assist any would-be future CDOT luddites in carrying out this task.

NOTE: This has not been tested on anything other than a FAS2552 with two controllers, and if you are in any way uncertain I would suggest contacting NetApp support for assistance. As this was a brand new array, and there was no risk of data loss, we proceeded regardless. You will need a NetApp support account to access some of the documentation and downloads referenced below. This is the way we completed the downgrade; I am not saying it is the best way, and although I have many years’ experience of working with NetApp arrays, this is just a guide.

  • Downloading and updating the boot image:

We decided on 8.2.3 for our boot image; at the time, this was the latest edition of Data ONTAP with 7-Mode included. If you go to http://mysupport.netapp.com/NOW/cgi-bin/software/ and select your array type you will see the available versions for your array. There are pages of disclaimers to agree to, and documents of prerequisites and release notes for each version; these are worth reading to ensure there are no known issues with your array type. Eventually you will get the download, which will be a .tgz file.

You will now need a system with IP connectivity to both controllers, running something like FileZilla Server to host the file via FTP. This will allow you to get the file up to the controller. I am not going to include steps to set up your FTP server, but there are plenty of resources online to do this. You could also host this via HTTP using something like IIS if that is more convenient.
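If neither FileZilla nor IIS is to hand, Python’s standard library can host the file over HTTP as well. The following is a minimal sketch rather than the method we used on the day; the function name, default port and directory are my own assumptions, and it serves everything in the chosen folder with no authentication, so only run it on an isolated build network.

```python
import functools
import http.server
from pathlib import Path


def make_image_server(directory, host="0.0.0.0", port=8080):
    """Return an HTTP server (not yet started) that serves files from `directory`."""
    root = Path(directory).resolve()
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    # SimpleHTTPRequestHandler serves static files; `directory` pins the served root
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(root)
    )
    return http.server.ThreadingHTTPServer((host, port), handler)
```

Calling make_image_server("<folder holding the .tgz>").serve_forever() starts serving until interrupted with Ctrl+C, and the package location handed to the controller would then be an http:// URL pointing at the workstation’s IP address, the chosen port and the image file name.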

Now to pull the image onto the array; this will need doing on both controllers (nodes). This document was followed, specifically the following command (based on content on page 143):

 system node image get -node localhost -package <location> -replace-package true -background true

I changed the command to replace ‘-node *’ with ‘-node localhost’ so we could download the image to each node in turn; this was just to ensure we could tackle any issues with the download. When running it, I also left off the ‘-background true’ switch, which would run the download in the background; this was to give us maximum visibility.

Now, our cluster had never been properly configured. There are a bunch of checks to do at this point to ensure your node is ready for the reversion; these are all detailed in the above document and should be followed to make sure nothing is amiss. We ran through these checks prior to installing the newly downloaded image. This includes things

Once happy, the image can be installed by running:

system node image update -node localhost -package file:///mroot/etc/software/<image_name>

The image name will be the name of the .tgz file you downloaded to the controller earlier (including the extension).
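Because both commands have to be typed on each controller, a tiny helper that assembles the exact strings can save a typo or two. This is only a sketch: it builds text to paste into the console and does not talk to the array, the function names are my own, and the image file name and URL in the usage lines are placeholders rather than real values.

```python
def image_get_command(package_url, node="localhost"):
    """Build the 'system node image get' command for one node (foreground download)."""
    return (
        f"system node image get -node {node} "
        f"-package {package_url} -replace-package true"
    )


def image_update_command(image_name, node="localhost"):
    """Build the 'system node image update' command for an image already on the node."""
    # The image name must include the extension of the downloaded file
    if not image_name.endswith(".tgz"):
        raise ValueError("image name should include the .tgz extension")
    return (
        f"system node image update -node {node} "
        f"-package file:///mroot/etc/software/{image_name}"
    )


# Placeholder values for illustration only
print(image_get_command("http://192.0.2.10:8080/example_image.tgz"))
print(image_update_command("example_image.tgz"))
```

The first helper omits the ‘-background true’ switch so the download runs in the foreground, matching how we ran it for visibility.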

Once the image is installed, you can check the state of the installation with:

 system image show

This should show something like:

[Screenshot: output of ‘system image show’]

This shows the images for one controller only, but shows us the image we are reverting to is loaded into the system, and we can move on.

There are some more steps in the document to follow, ensuring the cluster is shut down and failover is disabled before we can revert; follow these from the same document as above.

Next we would normally run ‘revert_to 8.2’ to revert the firmware. However, we had issues at this point because of ADP (Advanced Drive Partitioning), which seems to mark the disks as being in a shared container. The background to this is covered in Dan Barber’s excellent article. Long story short, we decided to reboot and format the array again to get round this.

  • Re-zeroing the disks and building new vol0:

We rebooted the first controller, and saw that when it came back up it was running in 8.2.3 (yay) Cluster Mode (boo). We tried zeroing the disks and building a new vol0 by interrupting the boot sequence with Ctrl+C to get to the special boot menu, and then running option 4. This was no good for us though, because once built, the controller booted into 8.2.3 Cluster Mode. A new tactic would be required.

We found this blog post on Krish Palamadathil’s blog, which detailed how to get around this. The downloaded image contains both Cluster Mode and 7-Mode images, but boots into Cluster Mode by default when doing this reversion. Cutting to the chase, the only thing we needed to do was to get to the Boot Loader (Ctrl+C during reboot to abort the boot process), and then run the following commands:

 LOADER> set-defaults 
 LOADER> boot_ontap

We then saw the controller come up in 8.2.3 7-Mode, interrupted the boot sequence, and ran option 4 to zero the disks again and build a new vol0.

Happy to say that the array is now at the correct version and in a state where it can be configured. As usual, the NetApp documentation was great, even if we had to source steps from numerous different places. As this is still a very new version of Data ONTAP I would expect this documentation to get better over time; in the meantime, hopefully this guide can be of use to people.