Azure Storage Blob Count & Capacity usage Calculator

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.

Overview  

This PowerShell script lets you count Azure Storage blobs and calculate their capacity usage for soft-deleted and non-soft-deleted objects, by container, by access tier, by prefix, and by Last Modified Date.
Azure Storage blob objects are defined here as base blobs, blob snapshots, or blob versions.

 

Parameters:

Specify each value in the script, under the "Parameters - User defined" section.

Every variable must be defined, but some can be left empty, as described below and in the script.

The available options are:

$storageAccountName - no need to set this; when the script runs, you are asked to select the storage account.

$containerName - specify a container name, or leave empty (default value) to list all containers
$prefix - specify a blob prefix (excluding the container name) for scanning, or leave empty (default value) to list all objects
$deleted - specify 'True' to list only soft-deleted objects, 'False' to list only non-soft-deleted objects (active objects - default value), or 'All' to list both active and soft-deleted objects
$blobType - select 'Base' to list only base blobs (default value), 'Snapshots' to list only snapshots, 'Versions' to list only versions, 'Versions+Snapshots' to list only versions and snapshots, or 'All Types' to list all objects (base blobs, versions and snapshots)
$accessTier - select 'Hot' to list only objects in the Hot tier, 'Cool' to list only objects in the Cool tier, 'Archive' to list only objects in the Archive tier, or 'All' to list objects in all tiers (Hot, Cool and Archive)

$Year, $Month, $Day - define a date so that only objects whose Last Modified Date is on or before it are listed. If at least one of the three values is empty, the current date is used.
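
The script builds its cutoff with Get-Date -Year -Month -Day, and Get-Date fills any part you do not specify (here, the time of day) from the current clock. The sketch below is not part of the original script: Get-CutoffDate is a hypothetical helper that pins the cutoff to the end of the chosen day, assuming you want everything modified on that day to be included.

```powershell
# Hypothetical helper, not part of the original script.
# Returns the end of the given day, or the current date/time if any part is empty.
function Get-CutoffDate {
    param(
        [string]$Year = '',
        [string]$Month = '',
        [string]$Day = ''
    )

    if ($Year -ne '' -and $Month -ne '' -and $Day -ne '') {
        # Set the time explicitly; otherwise Get-Date keeps the current time of day
        return (Get-Date -Year $Year -Month $Month -Day $Day -Hour 23 -Minute 59 -Second 59)
    }

    # Same fallback as the script: at least one value empty means "now"
    return (Get-Date)
}

$cutoff = Get-CutoffDate -Year '2022' -Month '6' -Day '30'
Write-Host "Cutoff: $cutoff"
```

If you prefer the script's original behaviour, keep its own Get-Date call unchanged; only the time-of-day part differs.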

 

Notes:

- Running the script prompts for your Azure AD (AAD) credentials and asks you to select the storage account to list.
- By default (without any parameter changes), the script lists all base blobs, in all containers in the storage account, from all access tiers, with a Last Modified Date on or before the current date and time.

- All other options (above) must be set in the script.
- The script can take hours or days to complete, depending on the number of blobs, versions and snapshots in the container or storage account.
- The $logs container is not covered by this script (not supported).

 

 

 

 

 

# ====================================================================================
# Azure Storage Blob calculator:
# Base Blobs, Blob Snapshots, Versions, Deleted / not Deleted, by Container, by tier, with prefix and considering Last Modified Date
# ====================================================================================
# This PowerShell script counts blobs and calculates blob usage in each container, or in one specific container, of the selected Storage account
# Filters can be used based on
#     All containers or one specific Container
#     Base Blobs, Blob Snapshots, Versions, All
#     Hot, Cool, Archive or All Access Tiers
#     Deleted, Not Deleted or All
#     Filtered by prefix
#     Filtered by Last Modified Date
# This can take some hours to complete, depending on the number of blobs, versions and snapshots in the container or Storage account.
# $logs container is not covered by this script (not supported)
# ====================================================================================
# DISCLAIMER : please note that this script is to be considered as a sample and is provided as is with no warranties express or implied.
# This script only lists and counts blobs; it does not modify or delete any data.
# ====================================================================================
# PLEASE NOTE :
# Just run the script; you will be asked for your AAD credentials and for the storage account to list.
# All other values should be defined in the script, under 'Parameters - user defined' section.
# Uncomment the line marked '# DEBUG' to get the full list of blobs
# ====================================================================================
# For any question, please contact Luis Filipe (Msft)
# ====================================================================================

Connect-AzAccount
CLS

#----------------------------------------------------------------------
# Parameters - user defined
#----------------------------------------------------------------------
$selectedStorage = Get-AzStorageAccount | Out-GridView -Title 'Select your Storage Account' -PassThru -ErrorAction Stop
$storageAccountName = $selectedStorage.StorageAccountName

$containerName = ''     # Container Name, or empty to all containers
$prefix = ''            # Set prefix for scanning (optional)

$deleted = 'False'      # valid values: 'True' / 'False' / 'All'
$blobType = 'Base'      # valid values: 'Base' / 'Snapshots' / 'Versions' / 'Versions+Snapshots' / 'All Types'
$accessTier = 'All'     # valid values: 'Hot', 'Cool', 'Archive', 'All'

# Select blobs before Last Modified Date (optional) - if any of the three is empty, current date will be used
$Year = ''
$Month = ''
$Day = ''
#----------------------------------------------------------------------
if($storageAccountName -eq $Null) { break }

#----------------------------------------------------------------------
# Date format
#----------------------------------------------------------------------
if ($Year -ne '' -and $Month -ne '' -and $Day -ne '')
{
    $maxdate = Get-Date -Year $Year -Month $Month -Day $Day -ErrorAction Stop
} else {
    $maxdate = Get-Date
}
#----------------------------------------------------------------------

#----------------------------------------------------------------------
# Format String Details in user friendly format
#----------------------------------------------------------------------
switch($blobType)
{
    'Base'               {$strBlobType = 'Base Blobs'}
    'Snapshots'          {$strBlobType = 'Snapshots'}
    'Versions+Snapshots' {$strBlobType = 'Versions & Snapshots'}
    'Versions'           {$strBlobType = 'Blob Versions only'}
    'All Types'          {$strBlobType = 'All blobs (Base Blobs + Versions + Snapshots)'}
}

switch($deleted)
{
    'True'  {$strDeleted = 'Only Deleted'}
    'False' {$strDeleted = 'Active (not deleted)'}
    'All'   {$strDeleted = 'All (Active+Deleted)'}
}

if ($containerName -eq '') {$strContainerName = 'All Containers (except $logs)'} else {$strContainerName = $containerName}
#----------------------------------------------------------------------

#----------------------------------------------------------------------
# Show summary of the selected options
#----------------------------------------------------------------------
function ShowDetails ($storageAccountName, $strContainerName, $prefix, $strBlobType, $accessTier, $strDeleted, $maxdate)
{
    # CLS
    write-host " "
    write-host "Listing Storage usage per Container"
    write-host "-----------------------------------"
    write-host "Storage account: $storageAccountName"
    write-host "Container: $strContainerName"
    write-host "Prefix: '$prefix'"
    write-host "Blob Type: $strDeleted $strBlobType"
    write-host "Blob Tier: $accessTier"
    write-host "Last Modified Date before: $maxdate"
    write-host "-----------------------------------"
}
#----------------------------------------------------------------------

#----------------------------------------------------------------------
# Filter and count blobs in some specific Container
#----------------------------------------------------------------------
function ContainerList ($containerName, $ctx, $prefix, $blobType, $accessTier, $deleted, $maxdate)
{
    $count = 0
    $capacity = 0
    $blob_Token = $Null

    write-host -NoNewline "Processing $containerName...   "

    do
    {
        # One page of all Blobs, Versions and Snapshots (deleted or not); @() keeps a single result as an array
        $listOfBlobs = @(Get-AzStorageBlob -Container $containerName -IncludeDeleted -IncludeVersion -Context $ctx -ContinuationToken $blob_Token -Prefix $prefix -MaxCount 5000 -ErrorAction Stop)

        # Read the continuation token from the last item of the unfiltered page, before any filter can drop it
        if ($listOfBlobs.Count -gt 0) {$blob_Token = $listOfBlobs[$listOfBlobs.Count-1].ContinuationToken} else {$blob_Token = $Null}

        #------------------------------------------
        # Filtering blobs by type
        #------------------------------------------
        switch($blobType)
        {
            'Base'               {$listOfBlobs = $listOfBlobs | Where-Object { ($_.SnapshotTime -eq $null)} }   # Base Blobs
            'Snapshots'          {$listOfBlobs = $listOfBlobs | Where-Object { ($_.SnapshotTime -ne $null)} }   # Snapshots
            'Versions+Snapshots' {$listOfBlobs = $listOfBlobs | Where-Object { ($_.IsLatestVersion -ne $true)} }   # Versions & Snapshots
            'Versions'           {$listOfBlobs = $listOfBlobs | Where-Object { ($_.IsLatestVersion -ne $true) -and $_.SnapshotTime -eq $null} }   # Versions only
            'All Types'          { }   # All - Base Blobs + Versions + Snapshots (no filter)
        }

        #------------------------------------------
        # filter by Deleted / not Deleted / all
        #------------------------------------------
        switch($deleted)
        {
            'True'  {$listOfBlobs = $listOfBlobs | Where-Object { ($_.IsDeleted -eq $true)} }    # Deleted
            'False' {$listOfBlobs = $listOfBlobs | Where-Object { ($_.IsDeleted -eq $false)} }   # Not Deleted
            'All'   { }                                                                          # All Deleted + Not Deleted
        }

        # filter by Last Modified Date
        $listOfBlobs = $listOfBlobs | Where-Object { ($_.LastModified -le $maxdate)}   # <= Last Modified Date

        # Filter by Access Tier
        if($accessTier -ne 'All')
        {$listOfBlobs = $listOfBlobs | Where-Object { ($_.accesstier -eq $accessTier)} }

        #------------------------------------------
        # Count and used Capacity
        # Count includes folder/subfolders on ADLS Gen2 Storage accounts
        #------------------------------------------
        foreach($blob in $listOfBlobs)
        {
            # DEBUG write-host $blob.Name " Content-length:" $blob.Length " Access Tier:" $blob.accesstier " LastModified:" $blob.LastModified " SnapshotTime:" $blob.SnapshotTime " URI:" $blob.ICloudBlob.Uri.AbsolutePath " latestVersion:" $blob.IsLatestVersion " Lease State:" $blob.ICloudBlob.Properties.LeaseState

            $count++
            $capacity = $capacity + $blob.Length
        }

    } while ($blob_Token -ne $Null)

    write-host " Count: $count Capacity: $capacity"

    return $count, $capacity
}
#----------------------------------------------------------------------

$totalCount = 0
$totalCapacity = 0

ShowDetails $storageAccountName $strContainerName $prefix $strBlobType $accessTier $strDeleted $maxdate

$ctx = New-AzStorageContext -StorageAccountName $storageAccountName -UseConnectedAccount -ErrorAction Stop

$arr = "Container", "Count", "Used capacity"
$arr = $arr + "-------------", "-------------", "-------------"

$container_Token = $Null

#----------------------------------------------------------------------
# Looping Containers
#----------------------------------------------------------------------
do {
    $containers = @(Get-AzStorageContainer -Context $ctx -Name $containerName -MaxCount 5000 -ContinuationToken $container_Token -ErrorAction Stop)

    if ($containers.Count -gt 0)
    {
        # Same variable as the loop condition, so the next page of containers is actually requested
        $container_Token = $containers[$containers.Count - 1].ContinuationToken

        for ([int] $c = 0; $c -lt $containers.Count; $c++)
        {
            $container = $containers[$c].Name

            $count, $capacity = ContainerList $container $ctx $prefix $blobType $accessTier $deleted $maxdate
            $arr = $arr + ($container, $count, $capacity)

            $totalCount = $totalCount + $count
            $totalCapacity = $totalCapacity + $capacity
        }
    } else {
        $container_Token = $Null
    }

} while ($container_Token -ne $null)

write-host "-----------------------------------"
#----------------------------------------------------------------------

#----------------------------------------------------------------------
# Show details in user friendly format and Totals
#----------------------------------------------------------------------
ShowDetails $storageAccountName $strContainerName $prefix $strBlobType $accessTier $strDeleted $maxdate

$arr | Format-Wide -Property {$_} -Column 3 -Force

write-host "-----------------------------------"
write-host "Total Count: $totalCount"
write-host "Total Capacity: $totalCapacity "
write-host "-----------------------------------"
#----------------------------------------------------------------------
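
The script prints capacity as a raw byte count, which gets hard to read for large accounts. As a minimal sketch, not part of the original script, the hypothetical helper Format-Capacity below converts a byte count into binary units (1 KiB = 1024 bytes). The function name and the choice of units are assumptions made for illustration.

```powershell
# Hypothetical helper, not part of the original script.
# Converts a byte count to a readable string using binary units (factor 1024).
function Format-Capacity {
    param(
        [Parameter(Mandatory = $true)]
        [double]$Bytes
    )

    $units = 'B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB'
    $i = 0

    # Divide by 1024 until the value fits the current unit or the largest unit is reached
    while ($Bytes -ge 1024 -and $i -lt ($units.Count - 1)) {
        $Bytes = $Bytes / 1024
        $i++
    }

    # Invariant culture keeps the decimal separator the same on every machine
    return [string]::Format([cultureinfo]::InvariantCulture, '{0:N2} {1}', $Bytes, $units[$i])
}

Write-Host (Format-Capacity -Bytes 5368709120)
```

It could be applied to the totals at the end of the script, for example Write-Host "Total Capacity: $(Format-Capacity -Bytes $totalCapacity)".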

 

 

 

 

This script was tested on PSVersion 5.1.19041.1682 and Az.Storage module 4.6.0, for Blob flat namespace and Hierarchical namespace (ADLS Gen2) Storage accounts. 
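
To compare your own environment with the versions mentioned above, a short check like the following can help. It is only a sketch: it reads the local PowerShell version and the newest locally installed Az.Storage module, and the variable names are made up for illustration.

```powershell
# Versions the article reports testing with
$testedPs = [version]'5.1.19041.1682'
$testedAzStorage = [version]'4.6.0'

# Newest Az.Storage version installed locally, or $null if the module is missing
$installed = Get-Module -ListAvailable -Name Az.Storage |
    Sort-Object -Property Version -Descending |
    Select-Object -First 1

Write-Host "PowerShell: $($PSVersionTable.PSVersion) (article tested on $testedPs)"

if ($null -eq $installed) {
    Write-Host 'Az.Storage: not installed'
} elseif ($installed.Version -lt $testedAzStorage) {
    Write-Host "Az.Storage: $($installed.Version) is older than the tested version $testedAzStorage"
} else {
    Write-Host "Az.Storage: $($installed.Version)"
}
```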

 

Other tools:

Azure Portal - Because the Blob service in Azure Storage has historically used a flat namespace, the Azure Portal does not show the total number of blobs or the usage per container or per folder. Also, blob snapshots and blob versions can only be viewed for one blob at a time; the Portal cannot list them all at container level.

Azure Storage Explorer (version 1.24.3 at the time of writing) can list, count and show usage for active blobs, at container or subfolder level, using the "Folder Statistics" option.
It can list "Active and Soft Deleted blobs", "Active Blobs and blobs without current version", and "All blobs and blobs without current version". Count and capacity, however, are only available for active blobs.

Snapshots and versions can only be listed at blob level.

In summary, Azure Storage Explorer can list soft-deleted blobs, versions and snapshots, but it cannot calculate their count and usage.


Azure Storage data protection features:

Blob Snapshot  

A snapshot is a read-only version of a blob that's taken at a point in time. A snapshot of a blob is identical to its base blob, except that the blob URI has a DateTime value appended to the blob URI to indicate the time at which the snapshot was taken. A blob can have any number of snapshots. Snapshots persist until they are explicitly deleted, either independently or as part of a Delete Blob operation for the base blob.

 

Blob versioning  

Azure Blob storage versioning lets you automatically maintain previous versions of an object. When blob versioning is enabled, you can access earlier versions of a blob to recover your data if it is modified or deleted.

 

Soft delete for blobs  

Blob soft delete protects an individual blob, snapshot, or version from accidental deletes or overwrites by maintaining the deleted data in the system for a specified period of time. During the retention period, you can restore a soft-deleted object to its state at the time it was deleted. After the retention period has expired, the object is permanently deleted.

 

Conclusion:  

Calculating used capacity on Azure Storage is not always an easy task. Each client application can interact with blobs in its own way, creating snapshots and versions to protect the blobs while processing them. Soft delete can also be used to protect against accidental deletions.
All these features create additional objects that are usually not shown in base blob listings, and even those listings are per container or per subfolder.
For that reason, it is hard to get an overall view of the number of blobs and the storage used per container or per subfolder, which makes costs hard to predict.

This PowerShell script should help you calculate blob count and blob usage, based on the most commonly used filters.

 

Related documentation:  

Soft delete for blobs
Blob versioning
Blob Snapshots

Other techcommunity articles:  
Calculate the size/capacity of storage account and it services (Blob/Table)
Analyzing Storage Capacity

Other PowerShell scripts:  
Calculate the total billing size of a blob container
Calculate the size of a blob container with PowerShell

I hope this can be useful!!!

 
