How to use Azure Data Box for Migrations

This post has been republished via RSS; it originally appeared at: ITOps Talk Blog articles.

Data is something we all have. At home I have about 2TB of data stored on various hard drives; some of it’s duplicated, some of it’s backups, some of it’s super important, while the rest is just random stuff I don’t care about.  Organisations are the same: they’ll have data stored in Gigabytes, Terabytes, Petabytes, or maybe even Exabytes, and much like my data at home, some of it will be duplicates, some will be super important, etc.

 

Often when I talk to organisations who are looking to move to the cloud, the amount of data they have to move is a concern, and they are unclear on the best way to move it.  Moving data via an Internet upload is often the preferred method, especially if you leverage something like AzCopy, Azure CLI, PowerShell or another language to script and schedule the upload to happen at “quiet” times in your network.  However, if you are concerned about using your Internet connection for the transfer, whether for bandwidth or timing reasons, the Azure Data Box family can assist you.
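
To get a feel for when an Internet upload stops being practical, a quick back-of-the-envelope calculation helps. The sketch below is only an illustration: the function name, the decimal units (1 TB = 10^12 bytes) and the 80% link utilisation default are my assumptions, not figures from Azure.

```python
def transfer_days(data_tb: float, link_mbps: float, utilisation: float = 0.8) -> float:
    """Estimate the days needed to upload `data_tb` terabytes over a link.

    Assumptions (illustrative only): decimal units, and only a share
    `utilisation` of the link is actually available to the transfer.
    """
    if data_tb < 0 or link_mbps <= 0 or not 0 < utilisation <= 1:
        raise ValueError("expected data_tb >= 0, link_mbps > 0, 0 < utilisation <= 1")
    bits_to_send = data_tb * 10**12 * 8
    usable_bits_per_second = link_mbps * 10**6 * utilisation
    return bits_to_send / usable_bits_per_second / 86_400  # 86,400 seconds per day


if __name__ == "__main__":
    # Example sizes and link speeds are made up for illustration.
    for tb, mbps in [(2, 100), (80, 1000)]:
        print(f"{tb} TB over {mbps} Mbps: about {transfer_days(tb, mbps):.1f} days")
```

If the answer comes out in weeks rather than days, that is usually the point where shipping the data on a device starts to look attractive.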

 

Audit

If you are doing a migration, then an inventory of what you currently have is super important.  It sounds like a simple piece of advice, but do an audit of what you have storage-wise.
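
As a starting point for that audit, a small script can total up what sits under a share or folder. This is a minimal sketch using only the Python standard library; the function name and the idea of grouping by file extension are my own choices, and a real audit would likely also look at age, ownership and duplicates.

```python
import os
from collections import Counter
from pathlib import Path


def inventory(root):
    """Return (file_count, total_bytes, bytes_by_extension) for a directory tree."""
    count = 0
    total = 0
    by_ext = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                size = path.stat().st_size
            except OSError:
                continue  # skip files that vanish or cannot be read
            count += 1
            total += size
            # Group sizes by lower-cased extension; files without one go to "(none)".
            by_ext[path.suffix.lower() or "(none)"] += size
    return count, total, by_ext
```

Calling `inventory(r"\\NAS-01\Photos")` would walk that share and return the totals; `by_ext.most_common(10)` then shows which file types take up the most space.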

 

Let Data Box help you

In its simplest form, Data Box is storage that comes to you that helps move your data from your datacentre into Azure’s.  It comes in a variety of different sizes and flavours in order to suit your organisation’s needs.

 

Azure Data Box

 

 

The Data Box Disk has 7TB of usable space and comes in the format of small hard disks, in a similar vein to the hard drives we use at home to back up our family photos.   You can request up to five Data Box Disks at any one time.

 

The Data Box can store up to 80TB of data and is a bit bigger than the Disk; it’s more like a tower PC in size.

 

The Data Box Heavy can store up to 770TB of data.  Again, it comes in a much larger format, with wheels and a trolley handle.  If you are looking at using this device, make sure it will fit in your space.
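
Once the audit gives you a total, working out how many devices you need is simple arithmetic. The sketch below assumes you pass in the usable capacity yourself (for example the 7TB per Data Box Disk or 80TB per Data Box quoted above); the function name is hypothetical and it ignores any headroom you may want to leave.

```python
import math


def devices_needed(data_tb: float, usable_tb_per_device: float) -> int:
    """Return how many devices of a given usable capacity a data set needs."""
    if data_tb < 0 or usable_tb_per_device <= 0:
        raise ValueError("expected data_tb >= 0 and usable_tb_per_device > 0")
    return math.ceil(data_tb / usable_tb_per_device)


if __name__ == "__main__":
    # 30 TB on 7 TB disks: five disks, which is the most you can request at once.
    print(devices_needed(30, 7))
    # 200 TB on 80 TB Data Boxes: three devices.
    print(devices_needed(200, 80))
```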

 

Utilising Data Box

To get hold of a Data Box you need to place an order within the Azure Portal.   You’ll be asked which type of Data Box you want, where you want it shipped to, and where the data will be stored when it’s uploaded to Azure. The ordering process is straightforward.

 

When you are ordering your Data Box, it’s key to remember things like shipping times, weekends and public holidays, as these can all affect when you receive the Data Box and when the data will be uploaded.  So be sure to build those times into your project plan.
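
To see how weekends and public holidays stretch a plan, you can walk the stages forward in working days. Everything in this sketch is an assumption for illustration: the stage names and durations are made up, and it treats every stage as running on working days only, which will not be true of every courier or every copy job.

```python
from datetime import date, timedelta


def add_business_days(start, days, holidays=frozenset()):
    """Return the date `days` working days after `start`, skipping weekends and holidays."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5 and current not in holidays:  # Mon=0 ... Fri=4
            remaining -= 1
    return current


def plan_timeline(order_date, stages, holidays=frozenset()):
    """Return [(stage_name, finish_date)] for stages given as (name, working_days)."""
    timeline = []
    current = order_date
    for name, working_days in stages:
        current = add_business_days(current, working_days, holidays)
        timeline.append((name, current))
    return timeline


if __name__ == "__main__":
    # Hypothetical durations; replace them with figures for your own order.
    stages = [("Delivery", 5), ("On-site copy", 3), ("Return shipping", 5), ("Upload", 4)]
    for name, finish in plan_timeline(date(2024, 3, 1), stages):
        print(f"{name}: done by {finish:%a %d %b %Y}")
```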

 

Also, the storage account you select for the data to be transferred into doesn’t need to be publicly accessible; you can lock it down as per your organisation’s standards.  It’s key to point out that Microsoft staff do not have access to any of your data while it resides in the Data Box; the data is transferred straight from the Data Box into the Azure Storage Account you requested.

 

Once you have the Data Box at your location, you can copy data onto it with any tool that supports the SMB protocol, such as Robocopy.  Using Robocopy will allow you to run parallel copies that will optimise your copy performance.  Good old-fashioned drag and drop using File Explorer is an option, but be wary as it will be slow.  However you get the data onto the Data Box, it’s a good idea to validate it with the validation tool once it has been uploaded.

Once the Data Box has been returned to the Azure datacentre and your data is all safely uploaded to the right area, the data is erased from the Data Box as per NIST SP 800-88 Revision 1 guidelines.
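
If you want an independent check of your own while the device is still on site, one option is to compare checksums of the source files against the copy on the Data Box share. This is a rough sketch of that idea, not the validation tool mentioned above; the function names are hypothetical and hashing every file over the network will take a while on large data sets.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root, with forward slashes) to its SHA-256."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare(source, copy):
    """Return (missing, changed): files absent from the copy, and files whose hash differs."""
    missing = sorted(set(source) - set(copy))
    changed = sorted(k for k in source.keys() & copy.keys() if source[k] != copy[k])
    return missing, changed
```

You would run `build_manifest` once against the source folder and once against the share on the Data Box, then pass both results to `compare`; two empty lists mean every source file is present with identical content.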

 

Workloads

The Data Box range is great for when you want to move lots of data, as it can take the load off your network bandwidth.  Its greatest strengths come into play when you utilise it with the right types of workloads.

 

Moving data from your environment to Azure with Data Box is great for the following:

  • Archive Data – data you need to keep for long periods of time
  • Large Data Sets – large amounts of data you are aiming to analyse in the cloud with big data tools
  • Media files – videos, pictures, sound files, etc you need to store and potentially index with Cognitive Services
  • Backup files – your organisation’s backup files can be moved to the cloud for safe, secure storage

You can use it to pre-seed data as well, but make sure you understand how you will apply any changes made to the source data while the Data Box was in transit, so that the copy in Azure is brought back in line.
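
One simple way to find that delta is to list the files modified after the moment you finished copying to the Data Box. The sketch below assumes file modification times are trustworthy on your source system, and the function name is my own. It will not spot files that were deleted or renamed, so treat it as a starting point only.

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def changed_since(root, cutoff):
    """Return sorted relative paths of files under root modified after `cutoff`.

    `cutoff` is a datetime; a naive value is interpreted as local time.
    """
    root = Path(root)
    cutoff_ts = cutoff.timestamp()
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.stat().st_mtime > cutoff_ts:
                changed.append(path.relative_to(root).as_posix())
    return sorted(changed)
```

The resulting list can then be fed to whichever network upload tool you prefer, so only the changed files travel over the wire.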

 

Tips on using Data Box

There are some tips that can help you make the most of Data Box when using it:

  1. The temptation with large migration projects is to load everything onto a Data Box Heavy; however, it might be more beneficial to split your data across multiple Data Boxes, as this allows you to run multiple devices in parallel.
  2. When transferring your data to the Data Box it is advisable, where possible, to use a secondary network path that is not competing with the primary network connection, as using the primary network path can lead to bottlenecks and delays.
  3. Again, when you are transferring your data across to the Data Box, make use of all the options at your disposal.  By that I mean if you are using Robocopy for the transfer, use its switches to optimise your copy and lower CPU usage on the management system.

This is a tried and tested command our field engineers have helped customers use:

robocopy \\NAS-01\Photos \\databox\<proper container> /e /r:3 /w:60 /is /np /ndl /nfl /MT:32 /fft /Log+:c:\temp\logcopy.txt
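
If you script your copies, it can help to build that command in code so each switch is documented and the thread count is easy to tune. The helper below is a hypothetical wrapper of my own, written in Python; it only assembles the argument list and does not run anything. The comments describe the standard Robocopy switches used in the command above.

```python
def build_robocopy_args(source, destination, log_file,
                        threads=32, retries=3, wait_seconds=60):
    """Return a Robocopy argument list mirroring the command above."""
    if not 1 <= threads <= 128:
        raise ValueError("Robocopy /MT accepts a thread count from 1 to 128")
    return [
        "robocopy", source, destination,
        "/E",                  # copy subdirectories, including empty ones
        f"/R:{retries}",       # number of retries on a failed copy
        f"/W:{wait_seconds}",  # seconds to wait between retries
        "/IS",                 # include "same" files rather than skipping them
        "/NP",                 # no per-file progress percentage in the output
        "/NDL",                # do not log directory names
        "/NFL",                # do not log file names
        f"/MT:{threads}",      # multithreaded copy with this many threads
        "/FFT",                # assume FAT file times (2-second granularity)
        f"/LOG+:{log_file}",   # append output to the log file
    ]


if __name__ == "__main__":
    args = build_robocopy_args(r"\\NAS-01\Photos",
                               r"\\databox\<proper container>",
                               r"c:\temp\logcopy.txt")
    print(" ".join(args))
```

On a Windows machine the list could be handed to `subprocess.run(args)`. Bear in mind that Robocopy reports success with non-zero exit codes too; values of 8 and above indicate that at least one failure occurred.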

 

Data Box alternatives

At present Data Box isn’t available in all Azure regions, so be sure to check the Products Available by Region site to see what is available within your desired region.

If you can’t utilise the Data Box family, you will have access to the Azure Import/Export service, which still uses physical devices, but you provide them.  You must handle all the shipping details and device preparation in this scenario.

 

If you are looking to move databases or unstructured data into Azure, then utilising Azure Data Factory is another option, as it can help to facilitate moving data into a storage account or directly into an IaaS virtual machine in Azure.

 

Call to Action

Get in touch and let us know how you’ve utilised Data Box in your environment or if you have questions.  You can also explore more about moving data from on-premises to Azure by working through the Microsoft Learn module.
