So you’ve decided that it’s time to optimize the memory footprint of your Silverlight application. Before you can do this analysis and optimization, you’ll need to understand a little bit about the way that Silverlight is structured.

Though the Silverlight programming model that you’ve come to know and love is accessed via managed code, much of Silverlight’s internals are written in native code (C++). The actions that you take in managed code can cause Silverlight to allocate large amounts of native memory on your behalf, be it for layout, rendering, image allocations, you name it.

Especially important to remember is that your managed Silverlight elements will have a native memory counterpart. Reducing the element count in your visual tree will have a direct and favorable impact on your native working set. Because of these native allocations, this article will go over analyzing native memory in addition to managed memory.

Where to start?

One of the most intimidating parts of performing memory analysis can be deciding where to start. Often developers don’t focus on memory usage until it becomes a problem, at which point the project has already grown to a rather complex scale. The good news is that there is already a great free tool that can help you divide the problem into more manageable pieces: VMMap.

VMMap enables you to look inside your process to see both visual and statistical breakdowns of your working set. Once you’ve attached to your Silverlight application you should see three color-coded horizontal bar charts across the top of the window, where each color corresponds to the similarly colored category row in the grid below. You can click each of the column headers in the grid to sort the data in that dimension. Clicking the row for a particular category will filter the per-allocation view at the bottom of the window.

Take a moment to familiarize yourself with the terminology displayed in the window. Even if you think you already understand what each of the categories represents, I recommend you crack open the VMMap help file (via the help menu) and read through the ‘Memory Types’ and ‘The VMMap Window’ nodes. Throughout the rest of this article I’ll be using these definitions as a reference.

Of these terms, ‘Working Set’ is probably the most inconsistently used term in the entire domain of performance analysis. Working set is most accurately defined as “the amount of committed virtual memory that is in physical memory and owned by the process”. Working set can be broken down into three components: private, shareable, and shared.

Shareable memory is memory that can be shared with other processes, but isn’t necessarily. The most interesting types of shareable memory are generally images (the executable file kind, not the pictographic kind) and mapped files (usually .mui files, or fonts). If your Silverlight application is the first one to start up on the system, then many of your images and mapped files will be loaded under the ‘Shareable WS’ column. If another Silverlight application subsequently starts, these images and mapped files don’t need to be reloaded, so much of this memory will move to the ‘Shared WS’ column as it becomes shared with the other application.

If memory isn’t shareable or shared, it is private. Private memory is memory that can’t be shared with other processes, and is thus private to your application. Since this is the category that application developers have the most control over and impact on, most of this article will be devoted to investigating and understanding process private memory.
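For a rough cross-check of the numbers VMMap reports, the desktop .NET ‘System.Diagnostics.Process’ class exposes a few related counters. The following is only a sketch under stated assumptions: it is meant to run as a separate desktop program (not inside Silverlight), the class and method names are made up for illustration, and ‘PrivateMemorySize64’ reports private committed bytes, which is related to but not the same as VMMap’s ‘Private WS’ column.

```csharp
using System;
using System.Diagnostics;

public static class WorkingSetSketch
{
    // Formats a byte count as whole kilobytes, e.g. 2048 -> "2 K".
    public static string ToKilobytes(long bytes)
    {
        return (bytes / 1024) + " K";
    }

    // Builds a one-line summary of a process's working set and private bytes.
    public static string Describe(Process process)
    {
        process.Refresh(); // discard any cached counter values
        return process.ProcessName
            + ": working set " + ToKilobytes(process.WorkingSet64)
            + ", private bytes " + ToKilobytes(process.PrivateMemorySize64);
    }

    // Looks up a process by id (for example the browser or sllauncher.exe).
    public static string DescribeById(int processId)
    {
        using (Process target = Process.GetProcessById(processId))
        {
            return Describe(target);
        }
    }
}
```

Calling WorkingSetSketch.DescribeById with the process id of your browser or sllauncher.exe gives a quick before/after number while you work through a scenario; VMMap remains the tool to use for the actual per-category breakdown.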

Now that you have some background on the terminology of memory analysis, let’s get back to analyzing your application. Click on the ‘Private WS’ column until your private working set is sorted in descending order, then take a look at the largest category. Chances are you are staring at either ‘Heap’ or ‘Managed Heap’. Over the next couple of sections, I’ll go over the tools that are available for analyzing that particular type of memory and how you can use them to your advantage.

Tracking Managed Memory

Analyzing managed memory allocations in Silverlight takes a little bit of work, as the tools in this space haven’t had a chance to mature yet. At the time of writing, there are no managed memory profilers available that work with Silverlight. However, Silverlight 4.0 RTM ships with a copy of the Son of Strike (SOS) debugging extension for WinDbg. This will allow you to perform all the SOS actions that you’re already familiar with on the desktop in Silverlight. If you haven’t used the SOS debugging extension before, head over to the MSDN “Investigating Memory Issues” article for a great primer. Though that article is a bit dated, most of the content is still directly applicable to performing managed memory analysis in Silverlight.

There are a few tricks you’ll need to know in order to attach to your Silverlight application using SOS. The MSDN article previously mentioned is written in reference to the desktop CLR, so while reading through the document you’ll need to make a few changes to get things working properly.

To load SOS, find the SOS.dll in your Silverlight installation directory. Here is an example command for loading SOS on my system; you may need to adjust the path accordingly (notice that backslashes are escaped):

.load "C:\\Program Files (x86)\\Microsoft Silverlight\\4.0.50401.0\\sos.dll"

In the section on ‘Measure Managed Heap Size’, the article mentions setting a breakpoint in mscorwks.dll, the v2.0 desktop CLR. For Silverlight 4.0, the assembly you are looking for is ‘coreclr.dll’. Change the breakpoint from the MSDN article to:

bp coreclr!WKS::GCHeap::RestartEE "j (dwo(coreclr!WKS::GCHeap::GcCondemnedGeneration)==2) 'kb';'g'"

At minimum, there are two SOS commands that you should make sure you are familiar with: !DumpHeap and !GCRoot. Both commands are quite aptly named; DumpHeap shows you a dump of the managed heap, and GCRoot finds roots to a particular object.

The per-type statistics that DumpHeap prints (use !DumpHeap -stat to see only these) are sorted by total allocation size in ascending order. In this sample you can see that the type responsible for allocating the most memory is System.String, across 4196 instances. Looking in the left column you can see the MT (method table) value for the System.String type, 79580758. Passing this value to DumpHeap (!DumpHeap -mt 79580758) retrieves a listing of all allocations of that type on the GC heap, with their addresses and sizes.

Now that you have a list of all instances on the heap, you want to find out why they are still alive; for this you can use GCRoot. Simply copy the address (the first column) corresponding to the instance that you’re interested in, then run !GCRoot with that address.

Sometimes GC root stack information alone won’t be enough for you to understand what’s going on. In these cases it can help to inspect the managed object’s actual fields; this can be achieved using the !DumpObj command (abbreviated !do).

In this case you can see that the value of the string was text/xml; charset=utf-8.

Given managed heap dumps, GC root stacks, and the ability to inspect individual objects, you should have everything that you need to start trimming your managed memory usage. There are plenty of good articles on the web on using the SOS extension in conjunction with WinDbg to optimize your managed memory usage, so I won’t go into more detail here. Instead, in subsequent posts, I’ll focus on Silverlight-specific techniques, or scenarios commonly run into while building Silverlight applications.

Future Managed Memory Tools

In Silverlight 4.0 we added the ability to use the CLR v4.0 profiling API in Silverlight. Because of this, third parties now have the opportunity to write managed memory profilers (or adapt existing ones) to work with Silverlight. As mentioned at the beginning of this section, at the time of writing no profilers are available, but as time goes on I expect we’ll see more and more managed memory profiling tools for Silverlight.

Tracking Native Memory

As I mentioned earlier, much of Silverlight’s internals are written in native code. The actions that you take at the application layer in managed code will cause Silverlight to make native allocations on your behalf for a variety of reasons. Because of this, you should know how to inspect these allocations should an issue arise.

Before we get started, take a look at a VMMap snapshot of your application. Focus on the ‘Heap’ and ‘Private Data’ sections of the snapshot, as these will tell you where the larger share of your allocations is going. In either case we will use XPerf from the Windows Performance Toolkit for tracking allocations, though the method differs slightly depending on whether you investigate heap or private data.

Native Heap Analysis

So VMMap is reporting that you are holding large amounts of memory on the native heap. The first step in correcting this problem is to collect a heap trace and use the XPerf heap plugin to dissect it. Collecting a heap trace is a little tricky, but I’ve written a handy script to ease the process; you can get it here [Download HeapMonitor.zip]. If you’d prefer to control XPerf directly, or if you’d just like more background information, see the “Exploring Process Heaps Using WPA” article on MSDN.

Attaching to Silverlight

Attaching to an in-browser Silverlight application:

Launch Internet Explorer with your homepage set to “about:blank”.

Open an administrative command prompt with XPerf in your path and run:

HeapMonitor.cmd -p <process-id-of-internet-explorer>

Wait for the trace sessions to start; when it’s ready you should see the text “Hit enter to stop profiling...”

Attaching to an out-of-browser Silverlight application:

Install your Silverlight out-of-browser application on the system and run it.

Run ‘Process Explorer’ and find the copy of sllauncher.exe associated with your app (you can use ‘right-click => Window => Bring to front’ to help discover which sllauncher.exe process is for your app).

Right-click that copy of sllauncher.exe and click ‘Properties’, then note the OOB token that shows up under “Command line”. The token is the final argument on the command line (3800347859.testapp in this example):

"C:\Program Files (x86)\Microsoft Silverlight\sllauncher.exe" 3800347859.testapp

Open an administrative command prompt with XPerf in your path and run:

HeapMonitor.cmd -p <process-id-of-sllauncher>

Wait for the trace sessions to start; when it’s ready you should see the text “Hit enter to stop profiling…”

In the WinDbg command window run the “g” command to allow the process to run.

Run Your Scenario

Now that heap tracing is on, use your application as you normally would. Perform any actions that are known to cause the native heap to grow to large sizes in your application so that we can look over them in the trace. Once you are finished reproducing your scenario, press enter in the command prompt where you are running the heap trace.

It will take some time for the trace sessions to be shut down and merged, so you’ll need to be patient. When your trace is ready you’ll see the word “Finished” printed in the command prompt. The line above it should tell you the name of your output trace file.

Analyzing Your Heap Trace

Open the output ETL file using XPerf and you’ll see a series of horizontal charts. You can refer back to the XPerf documentation for the details on what each graph displays; this article will only talk about the ‘Heap Outstanding Allocation Size’ graph as shown here.

The graph shows the number of outstanding bytes allocated over time. You can use the mouse to select a region of the graph, and right-click to zoom in or perform other operations. You can use this to select an interesting area of the graph, such as a spike in allocation, then right-click and select ‘Summary Table’ to get a textual breakdown of heap allocations.

Resolving Symbols

In order to make any of this information useful you’ll need to make sure you’re properly set up to resolve symbols. In XPerf click “Trace => Configure Symbol Paths” and set the following values, adjusting your symbol cache accordingly.

Next, in order to make the most use of XPerf, you’ll want to configure your view in the following way. Arrange your columns from left to right in the same order as shown below, taking care to put the yellow divider in the right location. Columns to the left of the yellow divider are intelligently grouped; columns to the right are not. This configuration is just to get you started; feel free to customize your view once you get more familiar with the tool.

Once you’ve set up your columns, from left to right you have the following:

Allocation Type. This will be one of the four values below. These categories are relative to your selection on the allocation graph.

AIFI – (A)llocated (I)nside (F)reed (I)nside

AIFO – (A)llocated (I)nside (F)reed (O)utside

Likely the most interesting category: these allocations were created but never freed inside your selection. These have essentially “leaked” from your selection, even though they may be freed later down the line.

AOFI – (A)llocated (O)utside (F)reed (I)nside

AOFO – (A)llocated (O)utside (F)reed (O)utside

HeapHandle – The address of a particular heap.

Stack – The call stack at which the allocation happened. This is hierarchical and can be expanded to drill down to a more specific set of allocations.

Count – The number of inclusive heap allocations for a particular type, heap handle, and call stack.

Size – The inclusive size of all allocations in bytes for a particular type, heap handle, and call stack.
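One way to read the four allocation types above is as a simple classification of each allocation against the selected time range. The sketch below is my own illustration of that reading, not XPerf code; the class name, parameter names, and the choice to treat the selection boundaries as inclusive are all assumptions.

```csharp
using System;

public static class AllocationTypeSketch
{
    // Classifies one allocation relative to a selected time range.
    // allocTime and freeTime are timestamps in any consistent unit;
    // freeTime is null when the allocation was never freed in the trace.
    public static string Classify(double allocTime, double? freeTime,
                                  double selectionStart, double selectionEnd)
    {
        bool allocatedInside =
            allocTime >= selectionStart && allocTime <= selectionEnd;

        bool freedInside = freeTime.HasValue
            && freeTime.Value >= selectionStart
            && freeTime.Value <= selectionEnd;

        // "AI"/"AO" = allocated inside/outside, "FI"/"FO" = freed inside/outside.
        return (allocatedInside ? "AI" : "AO") + (freedInside ? "FI" : "FO");
    }
}
```

With a selection from 0 to 10, an allocation made at 5 and never freed classifies as AIFO, which is why that category is the first place to look when a selection shows unexplained growth.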

Now let’s take a look at an example of a potentially problematic call stack. Here is a seemingly innocuous piece of XAML that loads an image from Bing and displays it as a thumbnail at 150x80.

Doing the math, we’d expect that this image would take 150 x 80 x 4 bytes per pixel, or about 48 KB of memory; no big deal, right? Wrong, unfortunately. As of Silverlight 4.0 there is no way to set the size at which an image is decoded; images are always decoded at full size. The actual resolution of the image at that URL was a much larger 956x512, weighing in at 1,961,984 bytes not counting headers. This can be seen in the call stack below:

Not all native allocations are this obvious to track back to their design-time source, but oversized image allocations are a common problem we see in customer applications so this is a particularly useful example.To fix this issue, ideally you would explicitly define the DecodePixelWidth/Height for the image; as Silverlight doesn’t support this yet you’ll need to employ a workaround such as this.
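The arithmetic behind the image example above is easy to check before an image ever ships. This is a minimal sketch, assuming 4 bytes per pixel as in the text; the class and method names are hypothetical and this is not how Silverlight itself computes the allocation.

```csharp
using System;

public static class DecodedImageSizeSketch
{
    // Assumes 4 bytes per pixel (32-bit color), as in the example above.
    public const int BytesPerPixel = 4;

    // Estimated size in bytes of a fully decoded bitmap, ignoring any
    // headers, row padding, or allocator overhead.
    public static long DecodedBytes(int pixelWidth, int pixelHeight)
    {
        return (long)pixelWidth * pixelHeight * BytesPerPixel;
    }
}
```

DecodedBytes(150, 80) gives 48,000 bytes for the thumbnail size, while DecodedBytes(956, 512) gives 1,957,888 bytes for the source image, roughly 40 times more. That estimate is within a few kilobytes of the 1,961,984 bytes observed in the trace; the remaining difference is not explained by this arithmetic and may come from row padding or allocator overhead, which is a guess on my part.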

Conclusion

Hopefully this article has supplied you with some new and useful information when it comes to gathering data about your Silverlight application's memory usage. These tools and techniques, coupled with your domain knowledge about your application, should prove to be quite a powerful combination for locating suspicious allocations in your application. In subsequent posts on analyzing Silverlight memory usage we will cover some more specific scenarios that build on the foundation techniques in this article.

Startup is important because it is the first interaction that your user has with your application. You get one chance to impress, and failure to do so could mean the user closing and/or uninstalling your application permanently. What follows is a set of tips and tricks you can use to supercharge the startup path of your Silverlight application.

The Cardinal Rule

There is, essentially, a single cardinal rule.

Do the absolute minimum required to display your main screen.

Write this on a sticky note and hang it on your monitor. This may seem like common sense (I like to think so), but I’ve analyzed many an application that violates this rule in more ways than one. The less deterministic the code is that you’re executing, the more you’re asking for trouble. To subdivide this rule into some specific examples:

Minimize your download size.

Never wait on network I/O before displaying your main screen.

Minimize disk I/O; delay-load any data or business logic that you can.

Minimize your download size

Since your application has to be downloaded before it can start up, your download size directly affects startup time. Consider dividing your application into multiple XAP files. The first XAP should include only what is necessary to display your main screen and provide core functionality. Tim Heuer has put together a great video on silverlight.net titled “Loading Dynamic XAPs and Assemblies” that explains this concept in detail. Use this method to componentize and delay-load parts of your application that pull in large dependencies and/or don’t absolutely need to be available when your application starts up.

Another great tip on Silverlight XAP compression comes from David Anson. A Silverlight XAP is just a renamed Zip file, but as of Silverlight 4 our XAP compression algorithm isn’t as efficient as it could be. By simply re-zipping your XAP files using an optimal compression algorithm you can shave about 20% off your download size (as high as 70% in extreme cases). Check out David’s post, “Smaller is better!” for the details and a useful script to help you automate the process.

Never wait on network I/O

No doubt, this is one of the riskiest things you can do during startup. Network I/O latency is non-deterministic; when you make a call to a web service or access data from a network share, your request could return in 2 milliseconds, 2 seconds, 2 minutes, or it may never return at all! If your application waits for this data to be retrieved before showing your UI, the UI may never show up. At best, you are gambling on the speed of your network connection to determine your startup time.
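The ordering this rule asks for can be sketched as follows: show the main screen first, then start the request and let a callback fill in the data whenever it arrives. Everything here is illustrative; ‘FetchDataAsync’ is a hypothetical stand-in that simulates latency on a thread-pool thread rather than calling a real web service, and the event list exists only so the ordering can be observed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class DeferredLoadSketch
{
    // Records what happened, in order, so the sequence can be inspected.
    public static readonly List<string> Events = new List<string>();

    // Hypothetical stand-in for a web service call: completes on a
    // background thread after a simulated delay.
    private static void FetchDataAsync(Action<string> onCompleted)
    {
        ThreadPool.QueueUserWorkItem(state =>
        {
            Thread.Sleep(100); // simulated network latency
            onCompleted("server data");
        });
    }

    private static void Record(string message)
    {
        lock (Events) { Events.Add(message); }
    }

    public static void Run()
    {
        var finished = new ManualResetEvent(false);

        // 1. Show the main screen first, with no network dependency.
        Record("main screen shown");

        // 2. Only then start the request; the callback fills in the data.
        FetchDataAsync(data =>
        {
            Record("loaded: " + data);
            finished.Set();
        });

        // Blocking here is only so this demo can observe the result;
        // a real UI would return to its message loop instead of waiting.
        finished.WaitOne();
    }
}
```

In a real Silverlight application the completion callback would typically need to hop back to the UI thread (for example via Dispatcher.BeginInvoke) before touching any controls, and the UI should show a loading state rather than block the way the final wait does here.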

Minimize disk I/O

When you increase the amount of data that you load from disk (whether raw data files or loading unnecessary assemblies) you increase the amount of time that you’re waiting for physical media. It takes a substantial amount of time for your hard disk to seek to each new read location, and it takes time to read the data once you seek there.

Load Fewer Assemblies at Startup

One way to minimize disk I/O is to reduce the number of assemblies that your application loads. In Silverlight, a new assembly is loaded the first time that the CLR just-in-time (JIT) compiles a method that accesses a type from it. For example, if you took a simple “Hello World” Silverlight application and added a DataGrid to it, this would cause System.Windows.Controls.Data.dll (where DataGrid lives) to be loaded. You can use the VMMap tool from Windows Sysinternals to discover which assemblies you are loading, and how much data from each one. Assemblies can be found under the ‘Images’ category as seen in the screenshot below:

Since an assembly isn’t loaded until you pull in one of its types, there are two scenarios to keep in mind:

Static members can cause dependencies to be loaded eagerly. For example, if you have a static field of type DataGrid, System.Windows.Controls.Data.dll will be loaded at application startup during static initialization. This can be avoided by leveraging System.Lazy<T>; see the MSDN documentation for details. More information on lazy initialization can be found here.

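A method that conditionally loads a new dependency can be refactored so that the assembly is loaded when it’s actually needed.

private void OnLoaded(object sender, RoutedEventArgs e)
{
    bool b = true;
    if (b == true)
    {
        // Instead of this, refactor into a method; this will
        // prevent System.Windows.Controls.Data.dll, where
        // DataGrid lives (and which is not one of the core
        // SL assemblies), from getting loaded.
        // DataGrid myDataGrid = new DataGrid();
        // LayoutRoot.Children.Add(myDataGrid);
        AddDataGrid();
    }
}

// The DataGrid code now lives in its own method, so the assembly is
// only loaded when this method is first JIT-compiled (assuming the
// JIT does not inline it into the caller).
private void AddDataGrid()
{
    DataGrid myDataGrid = new DataGrid();
    LayoutRoot.Children.Add(myDataGrid);
}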
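For the static-member case in the first item above, a lazily initialized field might look like the following sketch. ‘ExpensiveControl’ is a hypothetical stand-in for a type such as DataGrid so that the example stays self-contained; the sketch only demonstrates that construction is deferred until first use. Whether the assembly load itself is deferred in your application is something to confirm with VMMap rather than take from this example.

```csharp
using System;

// Hypothetical stand-in for an expensive type such as DataGrid.
public class ExpensiveControl
{
    public static int InstancesCreated;

    public ExpensiveControl()
    {
        InstancesCreated++;
    }
}

public static class LazyFieldSketch
{
    // Eager version (avoid): constructed during static initialization.
    // private static readonly ExpensiveControl Grid = new ExpensiveControl();

    // Lazy version: the factory delegate runs only on first access to Value.
    private static readonly Lazy<ExpensiveControl> LazyGrid =
        new Lazy<ExpensiveControl>(() => new ExpensiveControl());

    // True once the control has actually been constructed.
    public static bool IsCreated
    {
        get { return LazyGrid.IsValueCreated; }
    }

    // First access constructs the control; later accesses reuse it.
    public static ExpensiveControl Grid
    {
        get { return LazyGrid.Value; }
    }
}
```

Until something reads LazyFieldSketch.Grid, no ExpensiveControl instance exists; after the first read, every caller gets the same instance.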

Load Less Data

Avoid loading content/configuration data that isn’t needed to display your main screen. An example could be an email client where a user’s messages are all serialized into a flat file. You should wait to load your message data until after your main window has loaded, and you’ve had a chance to display a loading animation or otherwise show that your application is responsive.

Another example could be an application that has a rich extensibility and plugin model. Make sure that you are loading and displaying the main shell of your application before you load each of your plugins. A single misbehaving plugin could add lots of extra time to your startup sequence!

Optimizing Templates and Styles

Having some Xaml parsed at startup to define the initial appearance of your application is unavoidable. Because of this, be sure to optimize your initial Xaml design for startup time. Here are a few things you should keep in mind when optimizing your Xaml.

Minimize your element count. Every element that you add to the visual tree adds to the amount of time that it takes to parse. When refactoring your Xaml, you may find yourself with elements left over in the visual tree that no longer contribute to function or appearance. A good example of this could be a Grid with a single child and no meaningful properties set. These types of elements should be removed!

Remove dead XAML. If a style or a part of your tree is no longer used, remove it! Why pay for something that you’re never going to see or use?

Prefer Templates over UserControls. UserControls need to be re-parsed per instantiation; templates are parsed only once. This is especially important in, say, a DataGrid cell template, where hundreds of cells are styled the same way. If you style your cell using a UserControl, you are going to parse the same Xaml hundreds of times over; by using a template you ensure that this Xaml is only parsed once!

Don’t set properties to their default values. This includes things like setting Opacity="1", or setting RenderTransforms to non-parameterized values. Some of our design tools have a nasty habit of doing this, so you may need to police their output. We’re working to make this better in the future.

Consider Using a Splash Screen

So you’ve implemented the rest of the startup best practices, but your application is still not quite snappy enough for you? At this point you should consider using a splash screen. Out of the box, Silverlight applications use a default splash screen (the spinning orbs that I’m sure you’re familiar with). By replacing the default splash screen you can greatly improve perceived performance and get your own custom branding in front of the user as soon as possible. There is an MSDN article available here that explains the process of setting a custom splash screen.

A custom splash screen in conjunction with the XAP chaining example in Tim’s “Loading Dynamic XAPs and Assemblies” video mentioned under the “Minimize your download size” section can go a long way toward providing a responsive user experience.

Conclusion

Application startup can be a tricky thing to master, but once you’ve successfully driven your first real application to a fast startup the principles learned will serve you for years to come. We’ll be updating and correcting this page going forward, so add a bookmark and send the link out to other developers on your team.