"Stream is too long" when uploading big files on Azure Blob Storage


I'm trying to upload big files (4 GB) to Azure Blob Storage, but it fails. Following this article (https://learn.microsoft.com/en-us/azure/storage/storage-dotnet-how-to-use-blobs), this is my code:

CloudBlobContainer blobContainer = blobClient.GetContainerReference("my-container-name");
blobContainer.CreateIfNotExistsAsync().Wait();
CloudBlockBlob blockBlob = blobContainer.GetBlockBlobReference("blob-name");
await blockBlob.UploadFromFileAsync(@"C:\test.avi");

But I get this error:

Message: Stream was too long.
Source: System.Private.CoreLib
StackTrace: at System.IO.MemoryStream.Write(Byte[] buffer, Int32 offset, Int32 count) at Microsoft.WindowsAzure.Storage.Blob.BlobWriteStream.d__5.MoveNext() in C:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\WindowsRuntime\Blob\BlobWriteStream.cs:line 144 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.Core.Util.StreamExtensions.d__1`1.MoveNext() in C:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\Common\Core\Util\StreamExtensions.cs:line 308 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob.<>c__DisplayClass20_0.<b__0>d.MoveNext() in C:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\WindowsRuntime\Blob\CloudBlockBlob.cs:line 301 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob.<>c__DisplayClass23_0.<b__0>d.MoveNext() in C:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\WindowsRuntime\Blob\CloudBlockBlob.cs:line 397 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
at MyCompany.AzureServices.Blob.BlobService.d__7.MoveNext() in C:\MyProjectSource\MyCompany.AzureServices\Blob\BlobService.cs:line 79 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
at MyCompany.AzureServices.Blob.MyProject.RecordBlobService.<>c__DisplayClass1_0.<b__0>d.MoveNext() in C:\MyProjectSource\MyCompany.AzureServices\Blob\MyProject\RecordBlobService.cs:line 25

Following this article (https://www.simple-talk.com/cloud/platform-as-a-service/azure-blob-storage-part-4-uploading-large-blobs/), I tried adding more options for large files. This is my new code:

TimeSpan backOffPeriod = TimeSpan.FromSeconds(2);
int retryCount = 1;
BlobRequestOptions bro = new BlobRequestOptions()
{
    //If the file to upload is more than 64 MB, we send it in multiple parts
    SingleBlobUploadThresholdInBytes = 67108864, //64 MB (maximum)
    //Number of threads used to send data
    ParallelOperationThreadCount = 1,
    //If a block fails, we retry once (retryCount) after 2 seconds (backOffPeriod)
    RetryPolicy = new ExponentialRetry(backOffPeriod, retryCount),
};
blobClient.DefaultRequestOptions = bro;
CloudBlockBlob blockBlob = blobContainer.GetBlockBlobReference("blob-name");
//If the file is sent in multiple parts, each part is 4 MB
blockBlob.StreamWriteSizeInBytes = 4194304; //4 MB (maximum)
await blockBlob.UploadFromFileAsync(@"C:\test.avi");

But I got the same error again (Stream was too long).

I found in the Microsoft.WindowsAzure.Storage library that the function "UploadFromFileAsync" uses "UploadFromStreamAsync", which uses a MemoryStream. I think my error comes from that MemoryStream, but the blob storage article says the max size of a blob is 195 GB. So how am I supposed to use it?
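For reference, the 195 GB figure comes from the block-blob limits of the service version this SDK targets (to my understanding, up to 50,000 committed blocks of at most 4 MiB each), while a MemoryStream is backed by a single byte array and caps out near 2 GB. A quick sanity check of the blob-side arithmetic:

```python
# Block blob size limit sketch (assumed limits for the 2016-era service version):
# at most 50,000 committed blocks, each up to 4 MiB (the StreamWriteSizeInBytes cap).
BLOCK_SIZE_BYTES = 4 * 1024 * 1024   # 4 MiB max block size
MAX_BLOCKS = 50_000                  # max committed blocks per block blob

max_blob_bytes = BLOCK_SIZE_BYTES * MAX_BLOCKS
print(max_blob_bytes)                        # 209715200000
print(round(max_blob_bytes / 1024 ** 3, 1))  # 195.3 -> the "195 GB" figure
```

This suggests a 4 GB file is well within the service-side limit; the failure happens earlier, in the client-side MemoryStream buffering.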

I'm using Microsoft.WindowsAzure.Storage version 7.2.1.

Thanks!

UPDATE 1: Thanks to Tom Sun and Zhaoxing Lu, I tried using Microsoft.Azure.Storage.DataMovement.
Sadly, I get an error from the "TransferManager.UploadAsync" function. I tried to google it, but found nothing...
Any ideas?

This is my code:

    string storageConnectionString = "myStorageConnectionString";
    string filePath = @"C:\LargeFile.avi";
    string blobName = "large_file.avi";

    CloudStorageAccount account = CloudStorageAccount.Parse(storageConnectionString);
    CloudBlobClient blobClient = account.CreateCloudBlobClient();
    CloudBlobContainer blobContainer = blobClient.GetContainerReference("mycontainer");
    blobContainer.CreateIfNotExists();

    CloudBlockBlob destBlob = blobContainer.GetBlockBlobReference(blobName);
    // Setup the number of the concurrent operations
    TransferManager.Configurations.ParallelOperations = 64;

    // Setup the transfer context and track the upload progress
    var context = new SingleTransferContext();
    UploadOptions uploadOptions = new UploadOptions
    {
        DestinationAccessCondition = AccessCondition.GenerateIfExistsCondition()
    };
    context.ProgressHandler = new Progress<TransferStatus>(progress =>
    {
        Console.WriteLine("Bytes uploaded: {0}", progress.BytesTransferred);
    });

    // Upload a local blob
    TransferManager.UploadAsync(filePath, destBlob, uploadOptions, context, CancellationToken.None).Wait();

This is the error:
Message: One or more errors occurred. (The transfer failed: The format of value '*' is invalid..)
Source: System.Private.CoreLib
StackTrace:

at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions) at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken) at System.Threading.Tasks.Task.Wait() at MyCompany.AzureServices.Blob.BlobService.d__7.MoveNext() in C:\MyProjectSource\MyCompany.AzureServices\Blob\BlobService.cs:line 96 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
at MyCompany.AzureServices.Blob.MyProject.RecordBlobService.<>c__DisplayClass1_0.<b__0>d.MoveNext() in C:\MyProjectSource\MyCompany.AzureServices\Blob\MyProject\RecordBlobService.cs:line 25

And the inner exception:
Message: The transfer failed: The format of value '*' is invalid..
Source: Microsoft.WindowsAzure.Storage.DataMovement
StackTrace:

at Microsoft.WindowsAzure.Storage.DataMovement.TransferScheduler.d__22.MoveNext() in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferScheduler.cs:line 214 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.DataMovement.SingleObjectTransfer.d__7.MoveNext() in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferJobs\SingleObjectTransfer.cs:line 226 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.DataMovement.TransferManager.d__72.MoveNext() in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferManager.cs:line 1263

And the next inner exception:
Message: The transfer failed: The format of value '*' is invalid..
Source: Microsoft.WindowsAzure.Storage.DataMovement
StackTrace:

at Microsoft.WindowsAzure.Storage.DataMovement.TransferControllers.BlockBlobWriter.HandleFetchAttributesResult(Exception e) in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferControllers\TransferWriters\BlockBlobWriter.cs:line 196 at Microsoft.WindowsAzure.Storage.DataMovement.TransferControllers.BlockBlobWriter.d__18.MoveNext() in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferControllers\TransferWriters\BlockBlobWriter.cs:line 157 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.DataMovement.TransferControllers.BlockBlobWriter.d__16.MoveNext() in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferControllers\TransferWriters\BlockBlobWriter.cs:line 83 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.DataMovement.TransferControllers.SyncTransferController.d__13.MoveNext() in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferControllers\SyncTransferController.cs:line 81 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.DataMovement.TransferControllers.TransferControllerBase.d__33.MoveNext() in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferControllers\TransferControllerBase.cs:line 178 --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at 
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Microsoft.WindowsAzure.Storage.DataMovement.TransferScheduler.d__22.MoveNext() in C:\Local\Jenkins\jobs\DM_Hotfix\workspace\lib\TransferScheduler.cs:line 208

And the last inner exception:
Message: The format of value '*' is invalid.
Source: System.Net.Http
StackTrace:

at System.Net.Http.Headers.HttpHeaderParser.ParseValue(String value, Object storeValue, Int32& index) at System.Net.Http.Headers.EntityTagHeaderValue.Parse(String input) at Microsoft.WindowsAzure.Storage.Shared.Protocol.RequestMessageExtensions.ApplyAccessCondition(StorageRequestMessage request, AccessCondition accessCondition) in C:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\WindowsRuntime\Shared\Protocol\RequestMessageExtensions.cs:line 125 at Microsoft.WindowsAzure.Storage.Blob.CloudBlob.<>c__DisplayClass116_0.b__0(RESTCommand1 cmd, Uri uri, UriQueryBuilder builder, HttpContent cnt, Nullable1 serverTimeout, OperationContext ctx) in C:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\WindowsRuntime\Blob\CloudBlob.cs:line 1206 at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.d__4`1.MoveNext() in C:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\WindowsRuntime\Core\Executor\Executor.cs:line 91

And finally, my project.json:

{
  "version": "1.0.0-*",

  "dependencies": {
    "Microsoft.Azure.DocumentDB.Core": "0.1.0-preview",
    "Microsoft.Azure.Storage.DataMovement": "0.4.1",
    "Microsoft.IdentityModel.Protocols": "2.0.0",
    "NETStandard.Library": "1.6.1",
    "MyProject.Data.Entities": "1.0.0-*",
    "MyProject.Settings": "1.0.0-*",
    "WindowsAzure.Storage": "7.2.1"
  },

  "frameworks": {   
    "netcoreapp1.0": {
      "imports": [
        "dnxcore50",
        "portable-net451+win8"
      ],
      "dependencies": {
        "Microsoft.NETCore.App": {
          "type": "platform",
          "version": "1.0.0-*"
        }
      }
    }
  }
}

Thanks for your help!

UPDATE 2 (working)

Thanks to Tom Sun, this is the working code:

    string storageConnectionString = "myStorageConnectionString";
    CloudStorageAccount account = CloudStorageAccount.Parse(storageConnectionString);
    CloudBlobClient blobClient = account.CreateCloudBlobClient();
    CloudBlobContainer blobContainer = blobClient.GetContainerReference("mycontainer");
    blobContainer.CreateIfNotExistsAsync().Wait();
    string sourcePath = @"C:\Tom\TestLargeFile.zip";
    CloudBlockBlob destBlob = blobContainer.GetBlockBlobReference("LargeFile.zip");
    // Setup the number of the concurrent operations
    TransferManager.Configurations.ParallelOperations = 64;
    // Setup the transfer context and track the upload progress
    var context = new SingleTransferContext
    {
        ProgressHandler =
            new Progress<TransferStatus>(
                progress => { Console.WriteLine("Bytes uploaded: {0}", progress.BytesTransferred); })
    };
    // Upload a local blob
    TransferManager.UploadAsync(sourcePath, destBlob, null, context, CancellationToken.None).Wait();
    Console.WriteLine("Upload finished!");
    Console.ReadKey();

I also added

ShouldOverwriteCallback = (source, destination) =>
               {
                   return true;
               },

in the SingleTransferContext to overwrite a blob if it already exists.


There are 3 answers

Tom Sun (BEST ANSWER)

We can easily use the Azure Storage Data Movement Library to upload large files to Azure Blob Storage. It works correctly for me; please give the following code a try. For more info about the Azure Storage Data Movement Library, please refer to its documentation:

    string storageConnectionString = "storage connection string";
    CloudStorageAccount account = CloudStorageAccount.Parse(storageConnectionString);
    CloudBlobClient blobClient = account.CreateCloudBlobClient();
    CloudBlobContainer blobContainer = blobClient.GetContainerReference("mycontainer");
    blobContainer.CreateIfNotExists();
    string sourcePath = @"C:\Tom\TestLargeFile.zip";
    CloudBlockBlob destBlob = blobContainer.GetBlockBlobReference("LargeFile.zip");
    // Setup the number of the concurrent operations
    TransferManager.Configurations.ParallelOperations = 64;
    // Setup the transfer context and track the upload progress

    var context = new SingleTransferContext();

    UploadOptions uploadOptions = new UploadOptions
    {
        DestinationAccessCondition = AccessCondition.GenerateIfExistsCondition()
    };
    context.ProgressHandler = new Progress<TransferStatus>(progress =>
    {
        Console.WriteLine("Bytes uploaded: {0}", progress.BytesTransferred);
    });
    // Upload a local blob
    TransferManager.UploadAsync(sourcePath, destBlob, uploadOptions, context, CancellationToken.None).Wait();

For SDK info, please refer to the packages.config file:

<?xml version="1.0" encoding="utf-8"?>
<packages>
  <package id="Microsoft.Azure.KeyVault.Core" version="1.0.0" targetFramework="net452" />
  <package id="Microsoft.Azure.Storage.DataMovement" version="0.4.1" targetFramework="net452" />
  <package id="Microsoft.Data.Edm" version="5.6.4" targetFramework="net452" />
  <package id="Microsoft.Data.OData" version="5.6.4" targetFramework="net452" />
  <package id="Microsoft.Data.Services.Client" version="5.6.4" targetFramework="net452" />
  <package id="Microsoft.WindowsAzure.ConfigurationManager" version="1.8.0.0" targetFramework="net452" />
  <package id="Newtonsoft.Json" version="6.0.8" targetFramework="net452" />
  <package id="System.Spatial" version="5.6.4" targetFramework="net452" />
  <package id="WindowsAzure.Storage" version="7.2.1" targetFramework="net452" />
</packages>

Check the uploaded file in the Azure portal.


Update:

For a .NET Core project:

    string storageConnectionString = "myStorageConnectionString";
    CloudStorageAccount account = CloudStorageAccount.Parse(storageConnectionString);
    CloudBlobClient blobClient = account.CreateCloudBlobClient();
    CloudBlobContainer blobContainer = blobClient.GetContainerReference("mycontainer");
    blobContainer.CreateIfNotExistsAsync().Wait();
    string sourcePath = @"C:\Tom\TestLargeFile.zip";
    CloudBlockBlob destBlob = blobContainer.GetBlockBlobReference("LargeFile.zip");
    // Setup the number of the concurrent operations
    TransferManager.Configurations.ParallelOperations = 64;
    // Setup the transfer context and track the upload progress
    var context = new SingleTransferContext
    {
        ProgressHandler =
            new Progress<TransferStatus>(
                progress => { Console.WriteLine("Bytes uploaded: {0}", progress.BytesTransferred); })
    };
    // Upload a local blob
    TransferManager.UploadAsync(sourcePath, destBlob, null, context, CancellationToken.None).Wait();
    Console.WriteLine("Upload finished!");
    Console.ReadKey();


Zhaoxing Lu

We're actively looking into the issue in Azure Storage Client Library.

Please note that since UploadFromFileAsync() is not a reliable or efficient operation for a huge blob, I'd suggest you consider the following alternatives:

If you can accept a command-line tool, you can try AzCopy, which transfers Azure Storage data with high performance, and its transfers can be paused and resumed.

If you want to control the transfer jobs programmatically, please use the Azure Storage Data Movement Library, which is the core of AzCopy.

Michael Roberson - MSFT

The original issue is fixed in version 8.0 of WindowsAzure.Storage.