Transfer data with AzCopy on Linux

AzCopy is a command-line utility designed for copying data to/from Microsoft Azure Blob and File storage, using simple commands designed for optimal performance. You can copy data between a file system and a storage account, or between storage accounts.

There are two versions of AzCopy that you can download. AzCopy on Linux targets Linux platforms offering POSIX style command-line options. AzCopy on Windows offers Windows style command-line options. This article covers AzCopy on Linux.

Note

Starting with AzCopy version 7.2, the .NET Core dependencies are packaged with AzCopy. If you use version 7.2 or later, you no longer need to install .NET Core as a prerequisite.

You can remove the extracted files once AzCopy on Linux is installed. Alternatively, if you do not have superuser privileges you can also run azcopy using the shell script azcopy in the extracted folder.

Writing your first AzCopy command

The basic syntax for AzCopy commands is:

azcopy --source <source> --destination <destination> [Options]

The following examples demonstrate various scenarios for copying data to and from Microsoft Azure Blobs and Files. Refer to the azcopy --help menu for a detailed explanation of the parameters used in each sample.

Download blobs with specified prefix

Assume the following blobs reside in the specified container. All blobs beginning with the prefix a are downloaded.

abc.txt
abc1.txt
abc2.txt
xyz.txt
vd1\a.txt
vd1\abcd.txt

After the download operation, the folder /mnt/myfiles includes the following files:

/mnt/myfiles/abc.txt
/mnt/myfiles/abc1.txt
/mnt/myfiles/abc2.txt

The prefix applies to the virtual directory, which forms the first part of the blob name. In the example shown above, the virtual directory vd1 does not match the specified prefix, so the blobs under it are not downloaded. In addition, if the option --recursive is not specified, AzCopy does not download any blobs.
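The prefix download described above might look like the following sketch. The account name, container name, and key are placeholders, and the --include and --source-key flags are assumptions not named in this article, so confirm them with azcopy --help. The run helper is a hypothetical dry-run wrapper, not part of AzCopy; it only prints the command unless DRY_RUN=0 is set.

```shell
# Dry-run helper (not part of AzCopy): prints the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

# Placeholder account, container, and key; replace with real values.
SRC="https://myaccount.blob.core.windows.net/mycontainer"
KEY="<key>"

# Download every blob whose name starts with "a" into /mnt/myfiles.
# --include and --source-key are assumed flags; check azcopy --help.
run azcopy \
    --source "$SRC" \
    --destination /mnt/myfiles \
    --source-key "$KEY" \
    --include "a" \
    --recursive
```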

Set the last-modified time of exported files to be same as the source blobs

You can also exclude blobs from the download operation based on their last-modified time. For example, to exclude blobs whose last-modified time is the same as or newer than that of the destination file, add the --exclude-newer option:
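A minimal sketch of both behaviors in this section follows. The --exclude-newer option is the one named above; --preserve-last-modified-time and --source-key are assumed flag names that this article does not spell out, so verify them with azcopy --help. The account, container, and key are placeholders, and run is a hypothetical dry-run wrapper.

```shell
# Dry-run helper (not part of AzCopy): prints the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

SRC="https://myaccount.blob.core.windows.net/mycontainer"   # placeholder
KEY="<key>"                                                  # placeholder

# Keep each downloaded file's last-modified time equal to its source blob.
# --preserve-last-modified-time is an assumed flag name.
run azcopy --source "$SRC" --destination /mnt/myfiles \
    --source-key "$KEY" --preserve-last-modified-time

# Skip blobs that are the same age as or newer than the destination file.
run azcopy --source "$SRC" --destination /mnt/myfiles \
    --source-key "$KEY" --recursive --exclude-newer
```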

Upload all files

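Specifying option --recursive uploads the contents of the specified directory to Blob storage recursively, meaning that all subfolders and their files are uploaded as well. For instance, assume the following files reside in folder /mnt/myfiles:

/mnt/myfiles/abc.txt
/mnt/myfiles/abc1.txt
/mnt/myfiles/abc2.txt
/mnt/myfiles/subfolder/a.txt
/mnt/myfiles/subfolder/abcd.txt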

After the upload operation, the container includes the following files:

abc.txt
abc1.txt
abc2.txt
subfolder/a.txt
subfolder/abcd.txt

When the option --recursive is not specified, AzCopy skips files that are in sub-directories:

abc.txt
abc1.txt
abc2.txt
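The two upload variants above could be written roughly as follows. The destination URL and key are placeholders, --dest-key is an assumed flag not named in this article, and run is a hypothetical dry-run wrapper that prints instead of executing.

```shell
# Dry-run helper (not part of AzCopy): prints the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

DEST="https://myaccount.blob.core.windows.net/mycontainer"   # placeholder
KEY="<key>"                                                   # placeholder

# With --recursive: files in /mnt/myfiles and in its subfolders are uploaded.
run azcopy --source /mnt/myfiles --destination "$DEST" \
    --dest-key "$KEY" --recursive

# Without --recursive: files in sub-directories are skipped.
run azcopy --source /mnt/myfiles --destination "$DEST" \
    --dest-key "$KEY"
```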

Specify the MIME content type of a destination blob

By default, AzCopy sets the content type of a destination blob to application/octet-stream. However, you can explicitly specify the content type via the option --set-content-type [content-type]. This syntax sets the content type for all blobs in an upload operation.
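As an illustration, the sketch below uploads a folder and marks every resulting blob as video/mp4. The folder, destination URL, key, and the chosen content type are placeholders, --dest-key is an assumed flag, and run is a hypothetical dry-run wrapper.

```shell
# Dry-run helper (not part of AzCopy): prints the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

DEST="https://myaccount.blob.core.windows.net/myvideos"   # placeholder
KEY="<key>"                                                # placeholder
CONTENT_TYPE="video/mp4"                                   # example value

# Every blob created by this upload gets the same content type.
run azcopy --source /mnt/myvideos --destination "$DEST" \
    --dest-key "$KEY" --recursive \
    --set-content-type "$CONTENT_TYPE"
```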

Customizing the MIME content type mapping

AzCopy uses a configuration file that contains a mapping of file extension to content type. You can customize this mapping and add new pairs as needed. The mapping is located at /usr/lib/azcopy/AzCopyConfig.json

After the copy operation, the target container includes the following blob and its snapshots:

abc.txt
abc (2013-02-25 080757).txt
abc (2014-02-21 150331).txt

Synchronously copy blobs across Storage accounts

AzCopy by default copies data between two storage endpoints asynchronously. Therefore, the copy operation runs in the background using spare bandwidth capacity that has no SLA in terms of how fast a blob is copied.

The --sync-copy option ensures that the copy operation gets consistent speed. AzCopy performs the synchronous copy by downloading the blobs to copy from the specified source to local memory, and then uploading them to the Blob storage destination.

--sync-copy might generate additional egress cost compared to asynchronous copy. To avoid egress cost, the recommended approach is to use this option in an Azure VM that is in the same region as your source storage account.
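A synchronous copy between two storage accounts might be invoked as in this sketch. Both account URLs and keys are placeholders, --source-key and --dest-key are assumed flags not named in this article, and run is a hypothetical dry-run wrapper.

```shell
# Dry-run helper (not part of AzCopy): prints the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

SRC="https://myaccount1.blob.core.windows.net/mycontainer"   # placeholder
DEST="https://myaccount2.blob.core.windows.net/mycontainer"  # placeholder
SRC_KEY="<key1>"                                             # placeholder
DEST_KEY="<key2>"                                            # placeholder

# Data flows source -> local memory -> destination, so run this from a VM
# in the same region as the source account when possible.
run azcopy --source "$SRC" --destination "$DEST" \
    --source-key "$SRC_KEY" --dest-key "$DEST_KEY" \
    --recursive --sync-copy
```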

File: Download

Download single file

If the specified source is an Azure file share, you must either specify the exact file name (e.g. abc.txt) to download a single file, or specify option --recursive to download all files in the share recursively. Specifying both a file pattern and option --recursive results in an error.
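The two permitted forms could look like the sketch below. The share URL, folder name, and key are placeholders, --source-key is an assumed flag, and run is a hypothetical dry-run wrapper.

```shell
# Dry-run helper (not part of AzCopy): prints the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

SHARE="https://myaccount.file.core.windows.net/myfileshare"   # placeholder
KEY="<key>"                                                    # placeholder

# Form 1: name the exact file to download a single file.
run azcopy --source "$SHARE/myfolder1/abc.txt" \
    --destination /mnt/myfiles/abc.txt --source-key "$KEY"

# Form 2: download the whole share recursively (no file pattern).
run azcopy --source "$SHARE" --destination /mnt/myfiles \
    --source-key "$KEY" --recursive
```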

Copy from blob to file share

When you copy a file from blob to file share, a server-side copy operation is performed.

Synchronously copy files

You can specify the --sync-copy option to copy data synchronously from File Storage to File Storage, from File Storage to Blob Storage, and from Blob Storage to File Storage. AzCopy runs this operation by downloading the source data to local memory and then uploading it to the destination. In this case, standard egress cost applies.

Note that --sync-copy might generate additional egress cost compared to asynchronous copy. To avoid egress cost, the recommended approach is to use this option in an Azure VM that is in the same region as your source storage account.

Other AzCopy features

Only copy data that doesn't exist in the destination

The --exclude-older and --exclude-newer parameters allow you to exclude older or newer source resources from being copied, respectively. If you only want to copy source resources that don't exist in the destination, you can specify both parameters in the AzCopy command:
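Combining the two parameters might look like this sketch. The source folder, destination URL, and key are placeholders, --dest-key is an assumed flag, and run is a hypothetical dry-run wrapper.

```shell
# Dry-run helper (not part of AzCopy): prints the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

DEST="https://myaccount.blob.core.windows.net/mycontainer"   # placeholder
KEY="<key>"                                                   # placeholder

# Sources that already exist at the destination are either older or newer
# (or the same age), so both exclusions together leave only missing items.
run azcopy --source /mnt/myfiles --destination "$DEST" \
    --dest-key "$KEY" --recursive \
    --exclude-older --exclude-newer
```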

Use a configuration file to specify command-line parameters

azcopy --config-file "azcopy-config.ini"

You can include any AzCopy command-line parameters in a configuration file. AzCopy processes the parameters in the file as if they had been specified on the command line, performing a direct substitution with the contents of the file.

Assume a configuration file named copyoperation that contains the following lines. Each AzCopy parameter can be specified on a single line.
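A sketch of such a file, created in a temporary directory for illustration, is shown below. The parameter values are placeholders, --source-key is an assumed flag, and run is a hypothetical dry-run wrapper that prints the command instead of executing it.

```shell
# Dry-run helper (not part of AzCopy): prints the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

# Write the configuration file, one AzCopy parameter per line.
config_dir=$(mktemp -d)
config_file="$config_dir/copyoperation"
cat > "$config_file" <<'EOF'
--source https://myaccount.blob.core.windows.net/mycontainer
--destination /mnt/myfiles
--source-key <sourcekey>
--recursive
EOF

# AzCopy substitutes the file contents as if typed on the command line.
run azcopy --config-file "$config_file"
```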

Journal file folder

Each time you issue a command to AzCopy, it checks whether a journal file exists in the default folder, or in a folder that you specified via the --resume option. If the journal file does not exist in either place, AzCopy treats the operation as new and generates a new journal file.

If the journal file does exist, AzCopy checks whether the command line that you input matches the command line in the journal file. If the two command lines match, AzCopy resumes the incomplete operation. If they do not match, AzCopy prompts the user to either overwrite the journal file to start a new operation, or to cancel the current operation.

If you omit option --resume, or specify option --resume without a folder path, AzCopy creates the journal file in the default location, which is ~/Microsoft/Azure/AzCopy. If the journal file already exists, then AzCopy resumes the operation based on the journal file.

This example creates the journal file if it does not already exist. If it does exist, then AzCopy resumes the operation based on the journal file.
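Both forms of the option might be used as in this sketch. The journal folder /mnt/myjournal, the source URL, and the key are placeholders, --source-key is an assumed flag, and run is a hypothetical dry-run wrapper.

```shell
# Dry-run helper (not part of AzCopy): prints the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

SRC="https://myaccount.blob.core.windows.net/mycontainer"   # placeholder
KEY="<key>"                                                  # placeholder
JOURNAL_DIR="/mnt/myjournal"                                 # placeholder

# No folder given: the journal goes to the default location.
run azcopy --source "$SRC" --destination /mnt/myfiles \
    --source-key "$KEY" --resume

# Folder given: the journal is created in, or resumed from, that folder.
run azcopy --source "$SRC" --destination /mnt/myfiles \
    --source-key "$KEY" --resume "$JOURNAL_DIR"
```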

If you want to resume an AzCopy operation, repeat the same command. AzCopy on Linux will then prompt for confirmation:

Incomplete operation with same command line detected at the journal directory "/home/myaccount/Microsoft/Azure/AzCopy", do you want to resume the operation? Choose Yes to resume, choose No to overwrite the journal to start a new operation. (Yes/No)

Output verbose logs

Specify the number of concurrent operations to start

Option --parallel-level specifies the number of concurrent copy operations. By default, AzCopy starts a certain number of concurrent operations to increase the data transfer throughput. The default number of concurrent operations is equal to eight times the number of processors you have. If you are running AzCopy across a low-bandwidth network, you can specify a lower number for --parallel-level to avoid failures caused by resource competition.
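The default can be estimated from the processor count, as in the sketch below. Using a quarter of the default on a slow link is an arbitrary illustration, not a recommendation from AzCopy. The destination URL and key are placeholders, --dest-key is an assumed flag, and run is a hypothetical dry-run wrapper.

```shell
# Dry-run helper (not part of AzCopy): prints the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

DEST="https://myaccount.blob.core.windows.net/mycontainer"   # placeholder
KEY="<key>"                                                   # placeholder

# Default concurrency is eight times the number of processors.
cores=$(getconf _NPROCESSORS_ONLN)
default_level=$((cores * 8))

# Illustrative choice for a low-bandwidth link: a quarter of the default.
level=$((default_level / 4))
echo "processors: $cores, default level: $default_level, using: $level"

run azcopy --source /mnt/myfiles --destination "$DEST" \
    --dest-key "$KEY" --recursive --parallel-level "$level"
```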

Tip

To view the complete list of AzCopy parameters, check the azcopy --help menu.

Installation Steps for AzCopy 7.1 and earlier versions

AzCopy on Linux (v7.1 and earlier only) requires the .NET Core framework. Installation instructions are available on the .NET Core installation page.

You can remove the extracted files once AzCopy on Linux is installed. Alternatively, if you do not have superuser privileges, you can run azcopy using the shell script azcopy in the extracted folder.

Known issues and best practices

Error Installing AzCopy

If you encounter issues with AzCopy installation, you may try to run AzCopy using the bash script in the extracted azcopy folder.

cd azcopy
./azcopy

Limit concurrent writes while copying data

When you copy blobs or files with AzCopy, keep in mind that another application may be modifying the data while you are copying it. If possible, ensure that the data you are copying is not being modified during the copy operation. For example, when copying a VHD associated with an Azure virtual machine, make sure that no other applications are currently writing to the VHD. A good way to do this is by leasing the resource to be copied. Alternatively, you can create a snapshot of the VHD first and then copy the snapshot.

If you cannot prevent other applications from writing to blobs or files while they are being copied, then keep in mind that by the time the job finishes, the copied resources may no longer have full parity with the source resources.

Running multiple AzCopy processes

You may run multiple AzCopy processes on a single client, provided that you use a different journal folder for each process. Using a single journal folder for multiple AzCopy processes is not supported.
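One way to keep the journals separate is to pass a distinct folder to --resume for each process, as in this sketch. The source folders, destination URLs, journal folders, and key are placeholders, --dest-key is an assumed flag, and run is a hypothetical dry-run wrapper. If AzCopy needs to prompt for confirmation, start each command in its own terminal instead of in the background.

```shell
# Dry-run helper (not part of AzCopy): prints the command unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

ACCOUNT="https://myaccount.blob.core.windows.net"   # placeholder
KEY="<key>"                                         # placeholder
JOURNAL_A="/mnt/journal1"                           # placeholder
JOURNAL_B="/mnt/journal2"                           # placeholder, must differ

# Each process gets its own journal folder.
run azcopy --source /mnt/data1 --destination "$ACCOUNT/container1" \
    --dest-key "$KEY" --recursive --resume "$JOURNAL_A" &
run azcopy --source /mnt/data2 --destination "$ACCOUNT/container2" \
    --dest-key "$KEY" --recursive --resume "$JOURNAL_B" &

# Wait for both background jobs to finish.
wait
```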