In real-world scenarios, large file uploads to Azure blob storage are usually performed from a worker role. Here I will show the code as a console application. Don’t worry, it can easily be converted to worker-role-specific code.

Alright so, let’s start with it!!

Again, this might be a long post because of the heavy code blocks, so be prepared.

The CodePlex sample is a great solution for uploading large files to Azure blob storage. Just awesome!! I will change it a bit so that we can upload large files in parallel.

Let’s understand a few important components.

First I created a simple console application named AzureBlobUploadParallelSample.

Then I added a class library named AzureBlobOperationsManager to the same solution.

This class library performs the upload of large files to Azure blob storage, so it needs a reference to the Azure storage libraries. NuGet is the best way to get the latest DLLs, so I opened Tools -> Library Package Manager -> Package Manager Console and ran the install command for the storage libraries.

Also add a reference to the latest version of Microsoft.WindowsAzure.ServiceRuntime from the Add Reference dialog box.

Here I define a class named FileBlobNameMapper. It has two properties, BlobName and FilePath, which the caller uses to specify the name of the blob and the path of the file to upload. Passing a list of these objects lets the caller submit multiple large files for upload to Azure blob storage.

    /// <summary>
    /// Class to be used for holding the file-blobname mapping.
    /// </summary>
    public class FileBlobNameMapper
    {
        public FileBlobNameMapper(string blobName, string filePath)
        {
            BlobName = blobName;
            FilePath = filePath;
        }

        public string BlobName { get; set; }

        public string FilePath { get; set; }
    }

After invoking the async upload of multiple blobs to Azure storage, we need to know which uploads succeeded and which failed. To capture this status information I defined another class named BlobOperationStatus, as follows:

    public class BlobOperationStatus
    {
        public string Name { get; set; }

        public Uri BlobUri { get; set; }

        public OperationStatus OperationStatus { get; set; }

        public Exception ExceptionDetails { get; set; }
    }

    public enum OperationStatus
    {
        Failed, Succeded
    }

Now we need a class that actually performs the upload of large files to blobs in parallel, so I added a class named AsyncBlockBlobUpload.

Into this class I copied the GetFileBlocks method and the internal FileBlock class from the CodePlex link mentioned above.

I set the MaxBlockSize class constant to 2 MB as follows. This means every file block will be at most 2 MB in size (the last block of a file may be smaller).

    private const int MaxBlockSize = 2097152; // 2 MB chunk size (2 * 1024 * 1024 bytes)
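The copied GetFileBlocks method and FileBlock class are not reproduced in this post. As a rough idea of what such a splitter does, here is a minimal sketch written from scratch. It is not the CodePlex code: the class name FileBlockSplitter, the optional maxBlockSize parameter and the way block IDs are generated are all my assumptions. The one firm constraint is that Azure expects block IDs to be Base64 strings of equal length within a blob.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the FileBlock class copied from the CodePlex sample.
public class FileBlock
{
    public string Id { get; set; }
    public byte[] Content { get; set; }
}

public static class FileBlockSplitter
{
    private const int MaxBlockSize = 2097152; // 2 MB (2 * 1024 * 1024 bytes)

    // Lazily yields consecutive blocks of at most maxBlockSize bytes each.
    public static IEnumerable<FileBlock> GetFileBlocks(byte[] fileContent, int maxBlockSize = MaxBlockSize)
    {
        int blockNumber = 0;
        for (int offset = 0; offset < fileContent.Length; offset += maxBlockSize)
        {
            // The last block may be smaller than maxBlockSize.
            int size = Math.Min(maxBlockSize, fileContent.Length - offset);
            byte[] chunk = new byte[size];
            Buffer.BlockCopy(fileContent, offset, chunk, 0, size);

            // Encode a fixed-width 4-byte counter so every ID is a
            // Base64 string of the same length (assumed ID scheme).
            string id = Convert.ToBase64String(BitConverter.GetBytes(blockNumber));
            blockNumber++;

            yield return new FileBlock { Id = id, Content = chunk };
        }
    }
}
```

With the default block size, a 5 MB file would be split into three blocks: two of 2 MB and a final one of 1 MB.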

Next I defined a method that uses Parallel.ForEach to start the upload of all blobs in parallel, meaning each blob upload can run on a different thread and the overall upload is faster.

This is where I use the FileBlobNameMapper class defined earlier. The method has a containerName parameter, meaning all the files and blob names in the FileBlobNameMapper list are uploaded to the specified container. Whether you want to upload a single file or multiple files, this method serves the purpose. The method code is as follows:

        // Retrieve a reference to the blob and set the stream write size and minimum read size to 1 MB
        CloudBlockBlob blockBlob = container.GetBlockBlobReference(blobName);
        blockBlob.StreamWriteSizeInBytes = 1048576;
        blockBlob.StreamMinimumReadSizeInBytes = 1048576;

        // Set the blob upload timeout and retry strategy
        BlobRequestOptions options = new BlobRequestOptions();
        options.ServerTimeout = new TimeSpan(0, 180, 0);
        options.RetryPolicy = new ExponentialRetry(TimeSpan.Zero, 20);

        // Get the file blocks of 2 MB each and upload each block.
        // A List keeps the block IDs in upload order, which is the order they are committed in.
        List<string> blocklist = new List<string>();
        foreach (FileBlock block in GetFileBlocks(fileContent))
        {
            blockBlob.PutBlock(
                block.Id,
                new MemoryStream(block.Content, true),
                null,
                null,
                options,
                null);

            blocklist.Add(block.Id);
        }

        // Commit the blocks that were uploaded in the loop above
        blockBlob.PutBlockList(blocklist, null, options, null);

        // Set the status of the blob upload to succeeded, as there was no exception
        blobStatus.BlobUri = blockBlob.Uri;
        blobStatus.Name = blockBlob.Name;
        blobStatus.OperationStatus = OperationStatus.Succeded;

        return blobStatus;
    }
    catch (Exception ex)
    {
        // Set the status of the blob upload to failed, along with the exception details
        blobStatus.Name = blobName;
        blobStatus.OperationStatus = OperationStatus.Failed;
        blobStatus.ExceptionDetails = ex;

        return blobStatus;
    }
}

The comments in the method above are self-explanatory and simple to understand. That completes our library classes for large file upload to blob storage. Build the class library project and add a reference to it from the console application project.
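The Parallel.ForEach method described earlier could look roughly like the sketch below. This is only an illustration under assumptions: the method name UploadBlockBlobsInParallel and the uploadSingleBlob delegate parameter are hypothetical, and the delegate stands in for the real single-blob upload method so that the sketch compiles without the storage library. The small classes are repeated here, in reduced form, only to keep the block self-contained.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public class FileBlobNameMapper
{
    public FileBlobNameMapper(string blobName, string filePath)
    {
        BlobName = blobName;
        FilePath = filePath;
    }
    public string BlobName { get; set; }
    public string FilePath { get; set; }
}

public enum OperationStatus { Failed, Succeded }

public class BlobOperationStatus
{
    public string Name { get; set; }
    public OperationStatus OperationStatus { get; set; }
    public Exception ExceptionDetails { get; set; }
}

public static class ParallelUploadSketch
{
    // uploadSingleBlob is a hypothetical stand-in for the real upload method;
    // it receives the container name and one file/blob mapping.
    public static List<BlobOperationStatus> UploadBlockBlobsInParallel(
        string containerName,
        IEnumerable<FileBlobNameMapper> files,
        Func<string, FileBlobNameMapper, BlobOperationStatus> uploadSingleBlob)
    {
        // ConcurrentBag is safe to add to from several threads at once.
        var results = new ConcurrentBag<BlobOperationStatus>();

        Parallel.ForEach(files, file =>
        {
            try
            {
                results.Add(uploadSingleBlob(containerName, file));
            }
            catch (Exception ex)
            {
                // One failed file must not abort the uploads of the others.
                results.Add(new BlobOperationStatus
                {
                    Name = file.BlobName,
                    OperationStatus = OperationStatus.Failed,
                    ExceptionDetails = ex
                });
            }
        });

        return new List<BlobOperationStatus>(results);
    }
}
```

The returned list holds one status per file, in no particular order, so callers should match results by Name rather than by position.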

Now, in the client project (my console app, or a worker role app), I need to invoke these blob upload methods asynchronously. When the upload operation completes, the result is retrieved in a callback method, where the necessary action can be taken.

Let’s now look at the console application code from which we call the upload operation asynchronously. I highly recommend going through this link - http://msdn.microsoft.com/en-us/library/2e08f6yc(v=vs.110).aspx to understand how any method can be called asynchronously from C#. Based on this approach I defined a delegate, AsyncBlockBlobUploadCaller, with the same signature as the actual blob upload method. I then use an object of this delegate to call its BeginInvoke and EndInvoke methods.
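As a minimal sketch of the BeginInvoke/EndInvoke pattern from that MSDN article: the names AsyncCallerSketch, AsyncUploadCaller, FakeUpload and UploadCompleted below are hypothetical, and FakeUpload only simulates an upload so the example runs without a storage account. Note that delegate BeginInvoke works on the .NET Framework only; on .NET Core and later it throws PlatformNotSupportedException.

```csharp
using System;
using System.Threading;

public static class AsyncCallerSketch
{
    // Delegate with the same signature as the method it will call.
    public delegate string AsyncUploadCaller(string containerName, string blobName);

    public static string LastResult;
    public static readonly ManualResetEvent Done = new ManualResetEvent(false);

    // Hypothetical stand-in for the real upload method.
    public static string FakeUpload(string containerName, string blobName)
    {
        Thread.Sleep(100); // simulate some work
        return containerName + "/" + blobName;
    }

    // Starts the call on a ThreadPool thread and returns immediately.
    public static void StartUpload(string containerName, string blobName)
    {
        var caller = new AsyncUploadCaller(FakeUpload);
        // The last two arguments are the callback and a state object;
        // passing the delegate as state lets the callback call EndInvoke.
        caller.BeginInvoke(containerName, blobName, UploadCompleted, caller);
    }

    // Runs automatically when the asynchronous call finishes.
    private static void UploadCompleted(IAsyncResult ar)
    {
        var caller = (AsyncUploadCaller)ar.AsyncState;
        LastResult = caller.EndInvoke(ar); // returns the result or rethrows the exception
        Done.Set();
    }
}
```

Calling AsyncCallerSketch.StartUpload("mycontainer", "myblob") returns at once; the result becomes available in LastResult after Done is signalled.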

I declared the delegate in the Program class of the console application as a class variable.

    // To keep the main thread alive I am using while (true), because the async operations here
    // run on ThreadPool threads; if the main thread ends, those background threads end too.
    // Note: a worker role usually runs its work in a while (true) loop inside the Run method,
    // which keeps the main thread alive anyway.
    while (true)
    {
        Console.WriteLine("continue the main thread work...");
        Thread.Sleep(90000);
    }
}

As you can see, I have added a while (true) loop. It does no useful work here; it only simulates the main thread of the console application doing some work while the async uploads to Azure blob storage run in the background. If you are using a worker role, you will not need it. In the code above, change the file paths to your own; the files can be of different sizes. You may also change the container name and blob names as you like.

Now it was time to define the callback method, which is called automatically when the async blob upload operation fails or succeeds.

        // Note: this is where you can write the failed blob operation entry to a table or queue
        // and have the worker role traverse through those entries to retry the upload.
    }
}

That’s it. If you run the application, you will see that the main thread starts its work and keeps going; once all the blob uploads have succeeded, the messages for those uploads appear, followed again by the main thread's "continue" message.

So all my large file uploads to Azure blob storage ran asynchronously and in parallel.

Let’s check that the sample also reports the correct result when an async blob upload fails. The easiest way to make a blob upload fail is to give the blob a name longer than 1024 characters. So I typed some random sentences into a Word file, made sure the text was longer than 1024 characters (mine was 2019 characters long), and then, in debug mode, changed the name of my blob to this very long random name.
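If you would rather not paste text from a Word file, an over-long name can also be generated in code. The helper below is a hypothetical addition (BlobNameCheck is not part of the sample); it only checks the documented 1 to 1024 character limit on blob names, not the other naming rules.

```csharp
using System;

public static class BlobNameCheck
{
    // Azure blob names must be between 1 and 1024 characters long.
    public const int MaxBlobNameLength = 1024;

    public static bool IsLengthValid(string blobName)
    {
        return !string.IsNullOrEmpty(blobName) && blobName.Length <= MaxBlobNameLength;
    }
}
```

For example, BlobNameCheck.IsLengthValid(new string('x', 2019)) returns false, so a name built this way should trigger the failure path.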

As expected, the upload failed and I got the correct failure result.

Enhancements

Right now the code uploads multiple files in parallel, but the blocks of each file are uploaded sequentially. A possible enhancement is to upload the blocks of a single file in parallel as well.
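One possible shape for that enhancement is sketched below. It is an untested idea rather than part of the sample: the class ParallelBlockUploadSketch and the putBlock and putBlockList delegates are hypothetical stand-ins for the blob's PutBlock and PutBlockList calls, and the default of 4 parallel uploads is an arbitrary assumption. The key point is that blocks may be uploaded in any order, but the list of IDs must be committed in file order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelBlockUploadSketch
{
    // blocks: block ID -> block content, listed in file order.
    // putBlock / putBlockList: hypothetical stand-ins for the storage calls.
    public static void UploadBlocksInParallel(
        IList<KeyValuePair<string, byte[]>> blocks,
        Action<string, byte[]> putBlock,
        Action<IList<string>> putBlockList,
        int maxParallelism = 4)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelism };

        // Upload the blocks concurrently. If any upload throws, Parallel.ForEach
        // throws an AggregateException and the commit below is never reached.
        Parallel.ForEach(blocks, options, block => putBlock(block.Key, block.Value));

        // Commit the IDs in the original file order, not in completion order.
        putBlockList(blocks.Select(b => b.Key).ToList());
    }
}
```

Capping the degree of parallelism keeps a single large file from opening too many connections when several files are already being uploaded in parallel.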

About Me

I am Kunal Chandratre, working as a Cloud Solution Architect @Microsoft. My speciality is the Microsoft Azure cloud platform.
Awarded Most Valuable Professional (MVP) in Microsoft Azure for 3 consecutive years. Passionate speaker and trainer. In my free time (which I don't usually get) I write blogs and answer forum questions. I was doing it just to pass the time, but now I am addicted to blogging. Apart from work, I do a variety of things which I can't tell here :). I am a trekker, singer, actor, painter, F1 racer and superhero in my dreams, and now trying my luck with technologies. Keep posting...
