Windows Azure Storage Logging: Using Logs to Track Storage Requests

Windows Azure Storage Logging provides a trace of the executed requests against your storage account (Blobs, Tables and Queues). It allows you to monitor requests to your storage accounts, understand performance of individual requests, analyze usage of specific containers and blobs, and debug storage APIs at a request level.

What is logged?

You control what types of requests are logged for your account. We categorize requests into 3 kinds: READ, WRITE and DELETE operations. You can set the logging properties for each service to indicate the types of operations you are interested in. For example, opting to have delete requests logged for the blob service will result in blob or container deletes being recorded. Similarly, logging reads and writes will capture reads/writes on properties or the actual objects in the service of interest. Each of these options must be explicitly set to true (data is captured) or false (no data is captured).

What requests are logged?

The following authenticated and anonymous requests are logged.

Authenticated Requests:

Signed requests to user data. This includes failed requests such as throttled, network exceptions, etc.

Failed anonymous requests (other than those listed above) are not logged so that the logging system is not spammed with invalid anonymous requests.

During normal operation all requests are logged, but it is important to note that logging is provided on a best-effort basis. We do not guarantee that every message will be logged, because log data is buffered in memory at the storage front-ends before being written out, and if a role is restarted its buffer of logs is lost.

What data do my logs contain and in what format?

Each request is represented by one or more log records. A log record is a single line in the blob log and the fields in the log record are ‘;’ separated. We opted for a ‘;’ separated log file rather than a custom format or XML so that existing tools like LogParser, Excel, etc. can be easily extended or used to analyze the logs. Any field that may contain ‘;’ is enclosed in quotes and HTML encoded (using the HtmlEncode method).
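As a minimal sketch of how such a record could be split, the helper below walks one log line, treats ‘;’ as a separator only outside quotes, and HTML-decodes the fields that were quoted. The names LogRecordParser and SplitFields are hypothetical, and the sketch assumes quoted fields are wrapped in double quotes with any quote characters inside them HTML encoded.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

static class LogRecordParser
{
    // Splits one log line into its fields.
    // Assumption: quoted fields use double quotes, and a '"' inside a field
    // appears only in its HTML-encoded form (&quot;).
    public static List<string> SplitFields(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;
        bool wasQuoted = false;

        foreach (char c in line)
        {
            if (c == '"')
            {
                // Quotes delimit the field; they are not part of its value.
                inQuotes = !inQuotes;
                wasQuoted = true;
            }
            else if (c == ';' && !inQuotes)
            {
                fields.Add(Finish(current, wasQuoted));
                current.Clear();
                wasQuoted = false;
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(Finish(current, wasQuoted));
        return fields;
    }

    // Only quoted fields are HTML encoded, so only those are decoded.
    private static string Finish(StringBuilder raw, bool wasQuoted)
    {
        string value = raw.ToString();
        return wasQuoted ? WebUtility.HtmlDecode(value) : value;
    }
}
```

For a made-up, shortened line such as `1.0;GetBlob;"/myaccount/mycontainer/a;b.txt"`, the third field comes back as `/myaccount/mycontainer/a;b.txt` with the embedded ‘;’ preserved.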

Each record will contain the following fields:

Log Version (string): The log version. We currently use “1.0”.

Transaction Start Time (timestamp): The UTC time at which the request was received by our service.

REST Operation Type (string): One of the REST APIs. See the “Operation Type: What APIs are logged?” section below for more details.

HTTP Status Code (string): E.g. “200”. This can also be “Unknown” in cases where communication with the client is interrupted before we can set the status code.

E2E Latency (duration): The total time in milliseconds taken to complete a request in the Windows Azure Storage Service. This value includes the required processing time within Windows Azure Storage to read the request, send the response, and receive acknowledgement of the response.

Server Latency (duration): The total processing time in milliseconds taken by the Windows Azure Storage Service to process a request. This value does not include the network latency specified in E2E Latency.

Authentication type (string): authenticated, SAS, or anonymous.

Requestor Account Name (string): The account making the request. For anonymous and SAS requests this will be left empty.

Owner Account Name (string): The owner of the object(s) being accessed.

Service Type (string): The service the request was for (blob, queue, table).

Object Key (string): This is the key of the object the request is operating on. E.g. for Blob: “/myaccount/mycontainer/myblob”. E.g. for Queue: “/myaccount/myqueue”. E.g. For Table: “/myaccount/mytable/partitionKey/rowKey”. Note: If custom domain names are used, we still have the actual account name in this key, not the domain name. This field is always quoted.

Request ID (guid): The x-ms-request-id of the request. This is the request id assigned by the service.

Operation Number (int): There is a unique number for each operation executed for the request that is logged. Though most operations will just have “0” for this (See examples below), there are operations which will contain multiple entries for a single request.

Copy Blob will have 3 entries in total and operation number can be used to distinguish them. The log entry for Copy will have operation number “0” and the source read and destination write will have 1 and 2 respectively.

Table Batch command. An example is a batch command with two Insert’s: the Batch would have “0” for operation number, the first insert would have “1”, and the second Insert would have “2”.

Client IP (string): The client IP from which the request came.

Request Version (string): The x-ms-version used to execute the request. This is the same x-ms-version response header that we return.

Request Header Size (long): The size of the request header.

Request Packet Size (long) : The size of the request payload read by the service.

Response Header Size (long): The size of the response header.

Response Packet Size (long) : The size of the response payload written by the service. NOTE: The above request and response sizes may not be filled if a request fails.

Request Content Length (long): The value of the Content-Length header. This should be the same as Request Packet Size (except for error scenarios) and helps confirm the content length sent by clients.

Request MD5 (string): The value of Content-MD5 (or x-ms-content-md5) header passed in the request. This is the MD5 that represents the content transmitted over the wire. For PutBlockList operation, this means that the value stored is for the content of the PutBlockList and not the blob itself.

Server MD5 (string): The md5 value evaluated on the server. For PutBlob, PutPage, PutBlock, we store the server side evaluated content-md5 in this field even if MD5 was not sent in the request.

ETag (string): For objects for which an ETag is returned, the ETag of the object is logged. Please note that we will not log this in operations that can return multiple objects. This field is always quoted.

Last Modified Time (DateTime): For objects where a Last Modified Time (LMT) is returned, it will be logged. If the LMT is not returned, it will be ‘’ (empty string). Please note that we will not log this in operations that can return multiple objects.

ConditionsUsed (string): Semicolon-separated list of ConditionName=value. This is always quoted. ConditionName can be one of the following:

If-Modified-Since

If-Unmodified-Since

If-Match

If-None-Match

User Agent (string): The User-Agent header.

Referrer (string): The HTTP “Referer” header. We log up to the first 1 KB chars.

Client Request ID (string): This is custom logging information which can be passed by the user via x-ms-client-request-id header (see below for more details). This is opaque to the service and has a limit of 1KB characters. This field is always quoted.

Copy Blob: A Copy Blob request will have 3 log lines logged. The request id will be the same but the operation id (highlighted in the examples below) will be incremented. The line with operation ID 0 represents the entire copy blob operation. Operation ID 1 represents the source blob retrieval for the copy. This operation is called CopyBlobSource. Operation ID 2 represents destination blob information and the operation is called CopyBlobDestination. The CopyBlobSource entry will not have information like request size or response size set and is meant only to provide information like name, conditions used, etc. about the source blob.

Table Batch: A Table Batch request (entity group transaction) with 2 inserts will have 3 log lines. The first one stands for the batch request, and the latter two represent the actual inserts. NOTE: the key for the batch is the key of the first command in the batch. The operation number is used to identify the individual commands in the batch.

Operation Type: What APIs are logged?

The APIs recorded in the logs are listed by service below, and match the REST APIs for Windows Azure Storage. Note that a request that executes multiple operations (e.g. Batch) produces multiple records: a main record (with Operation Number ‘0’) and an individual record for each sub-operation, each with its own OperationType.

Transaction Status

This table shows the different statuses that can be written to your log.

Status (this is a string field)

Description

Success

Indicates the request was successful

AnonymousSuccess

Indicates that the anonymous request was successful

SASSuccess

Indicates that the SAS request was successful

ThrottlingError

Indicates that the request was throttled with http status code ServerBusy i.e. 503. The user is expected to back-off on throttling errors.

AnonymousThrottlingError

Indicates that the anonymous request was throttled with http status code ServerBusy i.e. 503. The user is expected to back-off on throttling errors.

SASThrottlingError

Indicates that the SAS request was throttled with http status code ServerBusy i.e. 503. The user is expected to back-off on throttling errors.

ClientTimeoutError

Indicates that the authenticated request timed out.

These are seen as timeout errors or Http Status code 500 with Storage error code as “Timeout”.

This is marked as a client side timeout error when the user’s network IO is slow and the timeout provided was less than the expected value for a slow network. In other words, if a request spends most of its time reading/writing from/to the client and does not allow sufficient time for the server to process, we will term these timeouts as ClientTimeoutError. To determine the minimum time required by a server to process a request, please refer to the times posted here.

Any other timeout will be deemed as ServerTimeoutError.

AnonymousClientTimeoutError

Indicates that the anonymous request timed out.

These are seen as timeout errors or Http Status code 500 with Storage error code as “Timeout”.

This is marked as a client side timeout error when the user’s network IO is slow and the timeout provided was less than the expected value for a slow network. In other words, if a request spends most of its time reading/writing from/to the client and does not allow sufficient time for the server to process, we will term these timeouts as AnonymousClientTimeoutError. To determine the minimum time required by a server to process a request, please refer to the times posted here.

Any other timeout will be deemed as AnonymousServerTimeoutError.

SASClientTimeoutError

Indicates that the SAS request timed out.

These are seen as timeout errors or Http Status code 500 with Storage error code as “Timeout”.

This is marked as a client side timeout error when the user’s network IO is slow and the timeout provided was less than the expected value for a slow network. In other words, if a request spends most of its time reading/writing from/to the client and does not allow sufficient time for the server to process, we will term these timeouts as SASClientTimeoutError. To determine the minimum time required by a server to process a request, please refer to the times posted here.

Any other timeout will be deemed as SASServerTimeoutError.

ServerTimeoutError

Indicates that the request timed out because of a problem on the server end. Users are expected to back-off on these errors.

These are seen as timeout errors or Http Status code 500 with Storage error code as “Timeout”.

AnonymousServerTimeoutError

Indicates that the anonymous request timed out because of a problem on the server end. Users are expected to back-off on these errors.

These are seen as timeout errors or Http Status code 500 with Storage error code as “Timeout”.

SASServerTimeoutError

Indicates that the SAS request timed out because of a problem on the server end. Users are expected to back-off on these errors.

These are seen as timeout errors or Http Status code 500 with Storage error code as “Timeout”.

ServerOtherError

Indicates that the authenticated request failed due to an unknown server error.

These are typically Http Status code 500 with Storage error code other than Timeout.

AnonymousServerOtherError

Indicates that anonymous request failed due to unknown server error.

These are typically Http Status code 500 with Storage error code other than Timeout.

SASServerOtherError

Indicates that SAS request failed due to unknown server error.

These are typically Http Status code 500 with Storage error code other than Timeout.

AuthorizationError

Indicates that the authenticated requests failed authorization check.

Example: write requests from users to logs under $logs will be treated as an authorization error.

SASAuthorizationError

Indicates that the SAS requests failed authorization check.

Example: write requests using SAS when only read access was provided will be treated as an authorization error.

NetworkError

Indicates that the authenticated request failed because of network error.

Example: Network errors occur when a user prematurely closes the connection before the timeout expires or if there are problems in any of the intermediate switches.

AnonymousNetworkError

Indicates that the anonymous request failed because of a network error.

Example: Network errors occur when a user prematurely closes the connection before the timeout expires or if there are problems in any of the intermediate switches.

SASNetworkError

Indicates that the SAS request failed because of network error.

Example: Network errors occur when a user prematurely closes the connection before the timeout expires or if there are problems in any of the intermediate switches.
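The descriptions above say that callers are expected to back off on throttling errors and on server timeout errors. A minimal sketch of that check follows; the class and method names are hypothetical, and it relies only on the status names listed in the table (the authenticated, anonymous, and SAS variants share a common suffix).

```csharp
using System;

static class TransactionStatus
{
    // True for the statuses where the table says the user is expected to back off:
    // ThrottlingError and ServerTimeoutError, including their Anonymous and SAS variants.
    public static bool ShouldBackOff(string status)
    {
        return status.EndsWith("ThrottlingError", StringComparison.Ordinal)
            || status.EndsWith("ServerTimeoutError", StringComparison.Ordinal);
    }
}
```

Counting how many logged requests satisfy this check over a time window is one simple way to see whether clients are retrying too aggressively.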

Client Request Id

One of the logged fields is called the Client Request Id. A client can choose to pass this client perspective id, up to 1KB in size, as the HTTP header “x-ms-client-request-id” with every request, and it will be logged for the request. Note that if you use the optional client request id header, it is used in constructing the canonicalized header, since all headers starting with “x-ms” are part of the resource canonicalization used for signing a request.

Since this is treated as an id that a client may associate with a request, it is very helpful to investigate requests that failed due to network or timeout errors. For example, you can search for requests in the log with the given client request id to see if the request timed out, and see if the E2E latency indicates that there is a slow network connection. As noted above, some requests may not get logged due to node failures.

Where can I find the logs and how does the system store them?

The analytics logging data is stored as block blobs in a container called $logs in your blob namespace for the storage account being logged. The $logs container is created automatically when you turn logging on and once created this container cannot be deleted, though the blobs stored in the container can be deleted. The $logs container can be accessed using “http://<accountname>.blob.core.windows.net/$logs” URL. Note that normal container listing operations that you issue to list your containers will not list the $logs container. But you can list the blobs inside the $logs container itself. Logs will be organized in the $logs namespace per hour by service type (Blob, Table and Queue). Log entries are written only if there are eligible requests to the service and the operation type matches the type opted for logging. For example, if there was no table activity on an account for an hour but we had blob activity, no logs would be created for the table service but we would have some for blob service. If in the next hour there is table activity, table logs would be created for that hour.

We use block blobs to store the log entries because block blobs represent files better than page blobs. In addition, the 2 phase write semantics of block blobs allow our system to write a set of log entries as a block in the block blob. Log entries are accumulated and written to a block when the size reaches 4 MB or when up to 5 minutes have passed since entries were last flushed to a block. The system will commit the block blob if the size of the uncommitted blob reaches 150MB or if up to 5 minutes have passed since the first block was uploaded, whichever is reached first. Once a blob is committed, it is not updated with any more blocks of log entries. Since a block blob is available to read only after the commit operation, you can be assured that once a log blob is committed it will never be updated again.

NOTE: Applications should not take any dependency on the above mentioned size and time trigger for flushing a log entry or committing the blob as it can change without notice.

What is the naming convention used for logs?

Logging in a distributed system is a challenging problem. What makes it challenging is the fact that there are many servers that can process requests for a single account and hence be a source for these log entries. Logs from various sources need to be combined into a single log. Moreover, clock skews cannot be ruled out and the number of log entries produced by a single account in a distributed system can easily run into thousands of log entries per second. To ensure that we provide a pattern to process these logs efficiently despite these challenges, we have a naming scheme and we store additional metadata for the logs that allow easy log processing.

The log name under the $logs container will have the following format:

<service name>/YYYY/MM/DD/hhmm/<Counter>.log

Service Name: “blob”, “table”, “queue”

YYYY – The four digit year for the log

MM – The two digit month for the log

DD – The two digit day for the log

hh – The two digit hour (24 hour format) representing the starting hour for all the logs. All timestamps are in UTC

mm – The two digit number representing the starting minute for all the logs. This is always 00 for this version and kept for future use

Counter – A zero based counter as there can be multiple logs generated for a single hour. The counter is padded to be 6 digits. The counter progress is based on the last log’s counter value within a given hour. Hence if you delete the last log blob, you may have the same blob name repeat again. It is recommended to not delete the blob logs right away.
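The naming scheme above can be taken apart mechanically. The sketch below parses a log name (relative to the $logs container) into its service, starting hour, and counter; the LogBlobName type is hypothetical and not part of the StorageClient library.

```csharp
using System;
using System.Globalization;

sealed class LogBlobName
{
    public string Service { get; private set; }
    public DateTime HourStartUtc { get; private set; }
    public int Counter { get; private set; }

    // Parses a name of the form <service name>/YYYY/MM/DD/hhmm/<Counter>.log.
    // Returns null when the name does not follow that pattern.
    public static LogBlobName TryParse(string name)
    {
        string[] parts = name.Split('/');
        if (parts.Length != 6 || !parts[5].EndsWith(".log", StringComparison.Ordinal))
        {
            return null;
        }

        // Rebuild "YYYY/MM/DD/hhmm" and parse it as a UTC timestamp.
        string stamp = string.Join("/", parts, 1, 4);
        string counterText = parts[5].Substring(0, parts[5].Length - 4);

        DateTime hourStart;
        int counter;
        bool ok = DateTime.TryParseExact(
                stamp,
                "yyyy'/'MM'/'dd'/'HHmm",
                CultureInfo.InvariantCulture,
                DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal,
                out hourStart)
            && int.TryParse(counterText, NumberStyles.None, CultureInfo.InvariantCulture, out counter);

        if (!ok)
        {
            return null;
        }

        return new LogBlobName { Service = parts[0], HourStartUtc = hourStart, Counter = counter };
    }
}
```

For example, `LogBlobName.TryParse("blob/2011/03/04/0500/000000.log")` yields the service "blob", an hour start of 2011-03-04 05:00 UTC, and counter 0.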

The following are properties of the logs:

A request is logged based on when it ends. For example, if a request starts at 13:57 and lasts for 5 minutes, it will make it into the log at the time it ended. It will therefore appear in a log with hhmm=1400.

Log entries can be recorded out of order which implies that just inspecting the first and last log entry is not sufficient to figure if the log contains the time range you may be interested in.
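Because a request is filed under the hour in which it ended, the folder to look in follows from the start time plus the duration. A small sketch, with hypothetical names, that computes the YYYY/MM/DD/hhmm part of the log name for a request:

```csharp
using System;
using System.Globalization;

static class LogHour
{
    // Returns the "YYYY/MM/DD/hhmm" segment of the log that should hold a request,
    // based on when the request ended. Minutes are always 00 in this log version.
    public static string SegmentFor(DateTime requestStartUtc, TimeSpan duration)
    {
        DateTime end = requestStartUtc + duration;
        return end.ToString("yyyy'/'MM'/'dd'/'HH", CultureInfo.InvariantCulture) + "00";
    }
}
```

A request that starts at 13:57 and runs for 5 minutes maps to the 1400 folder of the same day, matching the example above.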

To aid log analysis we store the following metadata for each of the log blobs:

LogType = The type of log entries that a log contains. It is described as combination of read, write and delete. The types will be comma separated. This allows you to download blobs which have the operation type that you are interested in.

StartTime = The minimum time of a log entry in the log. It is of form YYYY-MM-DDThh:mm:ssZ. This represents the start time for a request that is logged in the blob.

EndTime = The maximum time of a log entry in the log of form YYYY-MM-DDThh:mm:ssZ. This represents the maximum start time of a request logged in the blob.

LogVersion = The version of the log format. Currently 1.0. This can be used to process a given blob as all entries in the blob will conform to this version.

With the above you can list all of the blobs with the “include=metadata” option to quickly see which blob logs have the given time range of logs for processing.

For example, assume we have a blob log that contains write events generated at 05:10:00 UTC on 2011/03/04 and contains requests that started at 05:02:30.0000001Z until 05:08:05.1000000Z. The log name will be: $logs/blob/2011/03/04/0500/000000.log and the metadata will contain the following properties:

LogType=write

StartTime=2011-03-04T05:02:30.0000001Z

EndTime=2011-03-04T05:08:05.1000000Z

LogVersion=1.0

Note, duplicate log records may exist in logs generated for the same hour and can be detected by checking for duplicate RequestId and Operation number.
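Duplicate detection as described here can be sketched as a filter keyed on request id plus operation number. The class below is hypothetical; the caller supplies how those two values are read from a record, since that depends on how the records were parsed.

```csharp
using System;
using System.Collections.Generic;

static class LogDeduplicator
{
    // Yields each record once, keyed by request id plus operation number.
    // Request ids are GUIDs, so they are compared case-insensitively.
    public static IEnumerable<T> Distinct<T>(
        IEnumerable<T> records,
        Func<T, string> requestId,
        Func<T, int> operationNumber)
    {
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (T record in records)
        {
            string key = requestId(record) + "#" + operationNumber(record);
            if (seen.Add(key))
            {
                yield return record;   // first time this (request id, operation number) pair is seen
            }
        }
    }
}
```

Keeping the operation number in the key matters: the three entries of a Copy Blob or the sub-operations of a batch share one request id and must not be collapsed into a single record.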

Operations on $logs container and manipulating log data

As we mentioned above, once you enable logging, your log data is stored as block blobs in a container called $logs in the blob namespace of your account. You can access your $logs container using http://<accountname>.blob.core.windows.net/$logs. To list your logs you can use the list blobs API. The logs are stored organized by service type (blob, table and queue) and are sorted by generation date/time within each service type. The log name under the $logs container will have the following format: <service name>/YYYY/MM/DD/hhmm/<Counter>.log

The following operations are the operations allowed on the $logs container:

List blobs in $logs container. (Note that $logs will not be displayed in result of listing all containers in the account namespace).

Read committed blobs under $logs.

Delete specific logs. Note: logs can be deleted but the container itself cannot be deleted.

To improve enumerating the logs you can pass a prefix when using the list blobs API. For example, to filter blobs by date/time the logs are generated on, you can pass a date/time as the prefix (blob/2011/04/24/) when using the list blobs API.

It is important to note that log entries are written only if there are requests to the service. For example, if there was no table activity on an account for an hour but you had blob activity, no logs would be created for the table service but you would have some for blob service. If in the next hour there is table activity, table logs would be created for that hour.

What is the versioning story?

The following describes the versioning for logs:

The version is stored in the blob metadata and each log entry as the first field.

All records within a single blob will have the same version.

When new fields are added, they may be added to the end and will not incur a version change if this is the case. Therefore, applications responsible for processing the logs should be designed to interpret only the first set of columns they are expecting and ignore any extra columns in a log entry.

Examples of when a version change could occur:

The representation needs to change for a particular field (example – data type changes).

A field needs to be removed.
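One way to follow this guidance is to check the version in the first field and then read only the columns the application knows about. The sketch below assumes the record has already been split into fields; the names and the choice to throw on an unknown version are illustrative only.

```csharp
using System;
using System.Collections.Generic;

static class VersionedRecord
{
    // Returns the first expectedColumns fields of a version 1.0 record and ignores
    // any extra columns appended later. Throws if the version is not 1.0 or the
    // record is shorter than expected.
    public static IList<string> TakeKnownColumns(IList<string> fields, int expectedColumns)
    {
        if (fields.Count == 0 || fields[0] != "1.0")
        {
            throw new NotSupportedException("Unsupported log version.");
        }
        if (fields.Count < expectedColumns)
        {
            throw new FormatException("Record has fewer columns than expected.");
        }

        var known = new List<string>(expectedColumns);
        for (int i = 0; i < expectedColumns; i++)
        {
            known.Add(fields[i]);
        }
        return known;
    }
}
```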

What are the scalability targets and capacity limits for logs and how do they relate to my storage account?

Capacity and scale for your analytics account is separate from your ‘regular’ account. A separate 20TB is allocated for analytics data (this includes both metrics and logs data). This is not included as part of the 100TB limit for an individual account. In addition, $logs are kept in a separate part of the namespace for the storage account, so it is throttled separately from the storage account, and requests issued by Windows Azure Storage to generate or delete these logs do not affect the per partition or per account scale targets described in the Storage Scalability Targets blog post.

How do I clean up my logs?

To ease the management of your logs, we have provided a retention policy feature which will automatically clean up ‘old’ logs without you being charged for the cleanup. It is recommended that you set a retention policy for logs such that your analytics data will be within the 20TB limit allowed for analytics data (logs and metrics combined) as described above.

A maximum of 365 days is allowed for the retention policy. Once a retention policy is set, the system will delete the logs when they age beyond the number of days set in the policy. This deletion will be done lazily in the background. The retention policy can be turned off at any time, but if set, it is enforced even if logging is turned off. For example: if you set the retention policy for logging to be 10 days for the blob service, then all the logs for the blob service will be deleted once they are more than 10 days old. If you do not set a retention policy you can manage your data by manually deleting log blobs (as you would delete blobs in a regular container) whenever you wish to do so.

The capacity used by $logs is billable and the following actions performed by Windows Azure are billable:

Requests to create blobs for logging

The following actions performed by a client are billable:

Read and delete requests to $logs

If you have configured a data retention policy, you are not charged when Windows Azure Storage deletes old logging data. However, if you delete $logs data, your account is charged for the delete operations.

Turning Logging On

A REST API call, as shown below, is used to turn on Logging. In this example, logging is turned on for deletes and writes, but not for reads. The retention policy is enabled and set to ten days - so the analytics service will take care of deleting data older than ten days for you at no additional cost. Note that you need to turn on logging separately for blobs, tables, and queues. The example below demonstrates how to turn logging on for blobs.

The above logging and metrics section shows you how to configure what you want to track in your analytics logs and metrics data. The logging configuration values are described below:

Version - The version of Analytics Logging used to record the entry.

Delete (Boolean) – set to true if you want to track delete requests and false if you do not

Read (Boolean) – set to true if you want to track read requests and false if you do not

Write (Boolean) – set to true if you want to track write requests and false if you do not

Retention policy – this is where you set the retention policy to help you manage the size of your analytics data

Enabled (Boolean) – set to true if you want to enable a retention policy. We recommend that you do this.

Days (int) – the number of days you want to keep your analytics logging data. This can be a max of 365 days and a min of 1 day.
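To make the settings above concrete, here is a minimal sketch that builds the logging portion of the request body with LINQ to XML. The element names follow the Storage Analytics service properties format as we understand it (the full request body wraps this element in a StorageServiceProperties root alongside the metrics settings); the class and parameter names are hypothetical, so check the result against the MSDN documentation before relying on it.

```csharp
using System;
using System.Xml.Linq;

static class LoggingSettingsXml
{
    // Builds the <Logging> element for the given settings.
    // Pass null for retentionDays to disable the retention policy.
    public static XElement Build(bool logDelete, bool logRead, bool logWrite, int? retentionDays)
    {
        if (retentionDays.HasValue && (retentionDays.Value < 1 || retentionDays.Value > 365))
        {
            throw new ArgumentOutOfRangeException("retentionDays", "Retention must be between 1 and 365 days.");
        }

        var retention = new XElement("RetentionPolicy",
            new XElement("Enabled", retentionDays.HasValue));
        if (retentionDays.HasValue)
        {
            retention.Add(new XElement("Days", retentionDays.Value));
        }

        // XElement serializes booleans as lowercase "true" / "false".
        return new XElement("Logging",
            new XElement("Version", "1.0"),
            new XElement("Delete", logDelete),
            new XElement("Read", logRead),
            new XElement("Write", logWrite),
            retention);
    }
}
```

Calling `LoggingSettingsXml.Build(true, false, true, 10)` mirrors the scenario described earlier: deletes and writes are logged, reads are not, and data is retained for ten days.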

For more information, please see the MSDN Documentation.

To turn on analytics, here are extensions to the StorageClient library’s CloudBlobClient, CloudTableClient and CloudQueueClient. The extension methods and utility methods that accomplish this are just sample code in which error handling has been removed for brevity.

Let us start by listing sample code that uses the extension samples:

We have added here a new self-explanatory settings class called AnalyticsSettings to contain the settings that can be set / retrieved. Each property listed in settings above has a property representing it.

Now that we have covered the basic class, let us go over the extension class that provides the ability to set/get settings. This class provides the extension methods SetServiceSettings and GetServiceSettings on each one of the client objects. The rest is self-explanatory, as the extension method takes the settings and then calls a single method to dispatch the settings by serializing/deserializing the settings class.

NOTE: Because CloudQueueClient does not expose the BaseUri property, the extension takes the base Uri explicitly.

Now to the crux of the code which handles serialization/deserialization. This code provides a SerializeAnalyticsSettings method that serializes AnalyticsSettings class into the format expected by the service and provides DeserializeAnalyticsSettings to reconstruct the AnalyticsSettings class from the response for GET REST method.

Downloading your log data

Since listing normal containers does not list out $logs container, existing tools will not be able to display these logs. In absence of existing tools, we wanted to provide a quick reference application with source code to make this data accessible.

The following application takes the service to download the logs for, the start time and end time for log entries and a file to export to. It then exports all log entries to the file in a csv format.

For example the following command will select all logs that have log entries in the provided time range and download all the log entries in those logs into a file called mytablelogs.txt:

Case Study

To put the power of log analytics into perspective, we will end this post with a sample application. Below, we cover a very simple console application that allows search (e.g., grep) functionality over logs. One can imagine extending this to download the logs, analyze them, and store data in structured storage such as Windows Azure Tables or SQL Azure for additional querying over the data.

Description: A console program that takes as input: service name to search for, log type to search, start time for the search, end time for the search and the keyword to search for in the logs. The output is the log entries that match the search criteria and contain the keyword.

We will start with the method “ListLogFiles”. The method uses the input arguments to create a prefix for the blob listing operation. It makes use of the utility method GetSearchPrefix to get the prefix for the listing operation. The prefix is built from the service name, start time and end time. The start and end times are compared and only the parts that match are used in the prefix search. For example: if the start time is "2011-06-27T02:50Z" and the end time is "2011-06-27T03:08Z", then the prefix for the blob service will be: “$logs/blob/2011/06/27/”. If the hour had matched then the prefix would be: “$logs/blob/2011/06/27/0200”. The listing result contains metadata for each blob in the result. The start time, end time and logging level are then matched to see if the log contains any entries that may match the time criteria. If it does, then we add the log to the list of logs we will be interested in downloading. Otherwise, we skip the log file.

NOTE: Exception handling and parameter validation are omitted for brevity.

/// <summary>
/// Given a service, start time, end time, and operation types (i.e. READ/WRITE/DELETE) to search for, this method
/// iterates through blob logs and selects the ones that match the service and time range.
/// </summary>
/// <param name="blobClient">The blob client for the storage account whose logs are searched</param>
/// <param name="serviceName">The name of the service interested in</param>
/// <param name="startTimeForSearch">Start time for the search</param>
/// <param name="endTimeForSearch">End time for the search</param>
/// <param name="operationTypes">A ',' separated operation types used as logging level</param>
/// <returns>The log blobs that may contain entries matching the search criteria</returns>
static List<CloudBlob> ListLogFiles(
    CloudBlobClient blobClient,
    string serviceName,
    DateTime startTimeForSearch,
    DateTime endTimeForSearch,
    string operationTypes)
{
    List<CloudBlob> selectedLogs = new List<CloudBlob>();

    // convert a ',' separated log type to a "flag" enum
    LoggingLevel loggingLevelsToFind = GetLoggoingLevel(operationTypes);

    // form the prefix to search. Based on the common parts in start and end time, this prefix is formed
    string prefix = GetSearchPrefix(serviceName, startTimeForSearch, endTimeForSearch);
    Console.WriteLine("Prefix used = {0}", prefix);

    // List the blobs using the prefix
    IEnumerable<IListBlobItem> blobs = blobClient.ListBlobsWithPrefix(
        prefix,
        new BlobRequestOptions()
        {
            UseFlatBlobListing = true,
            BlobListingDetails = BlobListingDetails.Metadata
        });

    // iterate through each blob and figure the start and end times in the metadata
    foreach (IListBlobItem item in blobs)
    {
        CloudBlob log = item as CloudBlob;
        if (log != null)
        {
            DateTime startTime = DateTime.Parse(log.Metadata[LogStartTime]).ToUniversalTime();
            DateTime endTime = DateTime.Parse(log.Metadata[LogEndTime]).ToUniversalTime();
            string logTypes = log.Metadata[LogEntryTypes].ToUpper();
            LoggingLevel levelsInLog = GetLoggoingLevel(logTypes);

            // we will exclude the file if the time range does not match or it does not contain the log type
            // we are searching for
            bool exclude = (startTime > endTimeForSearch
                || endTime < startTimeForSearch
                || (loggingLevelsToFind & levelsInLog) == LoggingLevel.None);

            Console.WriteLine("{0} Log {1} Start={2:U} End={3:U} Types={4}.",
                exclude ? "Ignoring" : "Selected",
                log.Uri.AbsoluteUri,
                startTime,
                endTime,
                logTypes);

            if (!exclude)
            {
                selectedLogs.Add(log);
            }
        }
    }

    return selectedLogs;
}

/// <summary>
/// Given service name, start time for search and end time for search, creates a prefix that can be used
/// to efficiently get a list of logs that may match the search criteria
/// </summary>
/// <param name="service">The service name (blob, table or queue)</param>
/// <param name="startTime">Start time for the search (UTC)</param>
/// <param name="endTime">End time for the search (UTC)</param>
/// <returns>The longest blob name prefix shared by the start and end times</returns>
static string GetSearchPrefix(string service, DateTime startTime, DateTime endTime)
{
    StringBuilder prefix = new StringBuilder("$logs/");
    prefix.AppendFormat("{0}/", service);

    // if year is same then add the year
    if (startTime.Year == endTime.Year)
    {
        prefix.AppendFormat("{0}/", startTime.Year);
    }
    else
    {
        return prefix.ToString();
    }

    // if month is same then add the month
    if (startTime.Month == endTime.Month)
    {
        prefix.AppendFormat("{0:D2}/", startTime.Month);
    }
    else
    {
        return prefix.ToString();
    }

    // if day is same then add the day
    if (startTime.Day == endTime.Day)
    {
        prefix.AppendFormat("{0:D2}/", startTime.Day);
    }
    else
    {
        return prefix.ToString();
    }

    // if hour is same then add the hour
    if (startTime.Hour == endTime.Hour)
    {
        prefix.AppendFormat("{0:D2}00", startTime.Hour);
    }

    return prefix.ToString();
}

Once we have a list of logs, then we just need to iterate and see if the log entry in the log contains the keyword.
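The per-log search step can be sketched as a simple line filter. The helper below is hypothetical and is not the original sample's implementation; it takes any TextReader over a downloaded log (for instance a StreamReader over the blob's content) and yields the entries that contain the keyword, ignoring case.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class LogSearch
{
    // Reads log entries line by line and yields those containing the keyword.
    // The comparison is case-insensitive, which is an assumption of this sketch.
    public static IEnumerable<string> FindEntries(TextReader log, string keyword)
    {
        string line;
        while ((line = log.ReadLine()) != null)
        {
            if (line.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                yield return line;
            }
        }
    }
}
```

Because the method streams lines lazily, a large log does not need to be held in memory in order to be searched.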

Given the above methods, to search for log entries between 02:50 and 03:05 for DeleteContainer, the following can be executed. The output from the console application is self-describing. It lists all the logs that were created between the two times, selects only the ones that have “Delete” in the LogType, and then, for each eligible log, downloads it and outputs any lines that contain the search keyword:

Great feature, and great writeup and sample code to go with it. This blog entry will definitely save me some effort, and I really appreciate it.

One question (that I'm guessing I already know the answer to): what happens to logging/metrics when I'm serving a blob through Windows Azure CDN, where Blob Storage only gets hit once-per-TTL?

And, if the answer is "yeah, we're not logging CDN hits" then you can guess what my next formal feature request is going to be. 🙂 (Not that I'm not incredibly grateful for what you've done here already.) In effect, the logging/metrics feature seems so good that it almost acts as a disincentive to use the CDN, when you're looking to track hits.

When the Azure CDN accesses your storage account (e.g., when the TTL expires), that request will be logged inside of the storage account. The hits at the Azure CDN edges are not logged as part of the storage account (only the requests that directly hit the storage account). So you won’t see logging/metrics for Azure CDN hits as part of the storage account’s logging/metrics.

Yes, the Azure CDN team is looking at providing logging for requests coming into Azure CDN in the future.