Standard Music Font Layout (abbreviated SMuFL) is a font standard initiated by Steinberg and currently developed by the W3C Music Notation Community Group. Music fonts compliant with the SMuFL specification are distributed with a JSON file containing metadata that describes default engraving settings (such as barline thickness, beam thickness, etc.) and relations between glyphs, such as glyph sizes and glyph cutouts (for positioning glyphs close to one another). The full specification of SMuFL can be found here: http://w3c.github.io/smufl/gitbook/specification/.

All these settings are stored in a JSON file containing thousands of nodes. The single class Manufaktura.Controls.Model.SMuFL.GlyphBBoxes, which maps only a fragment of the JSON file, contains 2964 properties. From my measurements, deserializing the whole metadata file with the popular Newtonsoft.Json framework takes 4.8 seconds on my machine (a 7th generation i7 CPU). Other users reported that it can take up to 8 seconds on some machines.

Most of the data contained in the JSON metadata file is unnecessary for most applications. If you are familiar with music, you probably know that most scores do not use microtonal alterations, yet the SMuFL specification offers characters from several microtonal systems. You probably also don't need glyphs like PictJingleBells, PictMusicalSawPeinkofer or WindRimOnly, whatever that means…

The obvious solution is to read only parts of the JSON file, but it is tempting to map the JSON tree structure to a strongly typed object that lets you access all the needed properties without manually traversing the JSON tree. Notice how simple it looks in this method (where ISMuFLFontMetadata is a deserialized JSON file):
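The original code sample is not reproduced here, but the idea can be sketched as follows. The interface and property names below are illustrative assumptions modeled on the SMuFL specification, not the exact Manufaktura.Controls API:

```csharp
using Newtonsoft.Json;

// Hypothetical fragment of a strongly typed SMuFL metadata model.
public interface IEngravingDefaults
{
    double BeamThickness { get; set; }
    double ThinBarlineThickness { get; set; }
}

public class EngravingDefaults : IEngravingDefaults
{
    public double BeamThickness { get; set; }
    public double ThinBarlineThickness { get; set; }
}

public static class MetadataExample
{
    // With a strongly typed model, a setting is one property access away;
    // no manual traversal of the JSON tree is needed.
    public static double GetBeamThickness(string json)
    {
        var defaults = JsonConvert.DeserializeObject<EngravingDefaults>(json);
        return defaults.BeamThickness;
    }
}
```

Newtonsoft.Json matches the camelCase keys of SMuFL metadata to PascalCase .NET properties case-insensitively by default, so no extra attributes are needed for this fragment.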

A way to read the JSON partially while maintaining the simplicity of strongly typed objects is to implement a dynamic proxy.

How Does It Work

There are many implementations of dynamic proxies, but I want my code to target .NET Standard, so I decided to use Microsoft's official System.Reflection.DispatchProxy. A dynamic proxy basically creates a type dynamically at runtime. This generated type decorates another type and adds some additional logic. It can, for example, override virtual methods and wrap them in blocks for measuring performance, handling exceptions, etc. This technique is a part of the aspect-oriented programming methodology.

DispatchProxy is a simple proxy that implements a given interface and lets the programmer provide an implementation in the Invoke method, which is called every time any method of the interface is invoked:
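A minimal generic illustration of this mechanism (not the article's actual implementation) might look like the following; every call on the proxied interface is routed through Invoke:

```csharp
using System;
using System.Reflection;

public interface ICalculator
{
    int Add(int a, int b);
}

public class Calculator : ICalculator
{
    public int Add(int a, int b) => a + b;
}

// Decorates any interface implementation with logging.
public class LoggingProxy<T> : DispatchProxy
{
    private T _target;

    // Called for every method invoked on the proxied interface.
    protected override object Invoke(MethodInfo targetMethod, object[] args)
    {
        Console.WriteLine($"Invoking {targetMethod.Name}");
        return targetMethod.Invoke(_target, args);
    }

    public static T Decorate(T target)
    {
        // Create generates a runtime type implementing T that derives
        // from LoggingProxy<T>.
        var proxy = Create<T, LoggingProxy<T>>();
        ((LoggingProxy<T>)(object)proxy)._target = target;
        return proxy;
    }
}
```

Calling `LoggingProxy<ICalculator>.Decorate(new Calculator()).Add(2, 3)` prints "Invoking Add" and returns 5.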

The code attached to this article contains performance measurements, but I removed them from this example for the sake of clarity.

Every time a property is accessed, a JSON serializer deserializes only the part of the JSON file associated with that property. Some properties contain large objects; for example, GlyphBBoxes contains more than 2000 nodes. Deserializing such a property is time consuming, so I decided to implement proxy nesting: if a property type is an interface, another proxy is created instead of deserializing the whole subtree at once.
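This lazy, per-property scheme with nested proxies can be sketched roughly as follows. All names and details here are assumptions for illustration; the real Manufaktura.Controls code differs, and only property getters are handled in this sketch:

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using Newtonsoft.Json.Linq;

public class LazyJsonProxy<T> : DispatchProxy
{
    private JObject _json;   // the JSON subtree backing this proxy
    private readonly Dictionary<string, object> _cache = new Dictionary<string, object>();

    protected override object Invoke(MethodInfo targetMethod, object[] args)
    {
        // Property getters compile to methods named get_<PropertyName>.
        var name = targetMethod.Name.StartsWith("get_")
            ? targetMethod.Name.Substring(4)
            : targetMethod.Name;

        if (_cache.TryGetValue(name, out var cached)) return cached;

        // SMuFL metadata keys are camelCase; .NET properties are PascalCase.
        var token = _json[char.ToLowerInvariant(name[0]) + name.Substring(1)];

        object value;
        if (token == null)
            value = null;
        else if (targetMethod.ReturnType.GetTypeInfo().IsInterface)
            // Nested interface: wrap the subtree in another proxy instead of
            // deserializing the whole subtree eagerly.
            value = CreateNested(targetMethod.ReturnType, (JObject)token);
        else
            // Leaf value or concrete type: deserialize just this node.
            value = token.ToObject(targetMethod.ReturnType);

        _cache[name] = value;
        return value;
    }

    public static T Create(JObject json)
    {
        var proxy = DispatchProxy.Create<T, LazyJsonProxy<T>>();
        ((LazyJsonProxy<T>)(object)proxy)._json = json;
        return proxy;
    }

    private static object CreateNested(Type interfaceType, JObject subtree)
    {
        // Instantiate LazyJsonProxy<interfaceType> via reflection.
        var proxyType = typeof(LazyJsonProxy<>).MakeGenericType(interfaceType);
        var create = proxyType.GetMethod("Create", BindingFlags.Public | BindingFlags.Static);
        return create.Invoke(null, new object[] { subtree });
    }
}
```

With this in place, `LazyJsonProxy<ISMuFLFontMetadata>.Create(JObject.Parse(jsonText))` returns an object whose subtrees are only deserialized (or further proxied) when the corresponding properties are read.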

In these two examples, the metadata is of type ISMuFLFontMetadata, but the implementations are different. In the first example, we have a normal object, SMuFLFontMetadata (implemented with auto-properties), and in the second one, we have our proxy object.

The first example measures the performance of deserializing the whole JSON at once. The second creates a proxy object and touches some random properties. No more than a few dozen properties are really used by Manufaktura.Controls, so I decided to pick 20 random ones.

In real situations, it is very unlikely that 20 properties are accessed at one time. I created a test to check the performance in real rendering:
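The test itself is not reproduced above; a generic Stopwatch harness in the same spirit might look like this, where the rendering call is only a placeholder:

```csharp
using System;
using System.Diagnostics;

public static class Perf
{
    // Measures a single cold invocation; relevant here because the lazy
    // proxy shifts deserialization cost into the first render.
    public static TimeSpan Measure(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}

// Usage (RenderScore and its arguments are placeholders, not a real API):
// var elapsed = Perf.Measure(() => RenderScore(score, metadata));
// Console.WriteLine($"First render: {elapsed.TotalMilliseconds} ms");
```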

The startup time of the test WPF application Manufaktura.Controls.WPF.Test (with the debugger attached) has also been reduced on my machine from 12 seconds to 5.

Conclusion

The proposed method offers a significant performance improvement while still allowing the programmer to use strongly typed models. Deserialization of specific nodes is postponed until they are requested, which may lead to some unpredictability. To ensure stable performance, the programmer must know the structure of the JSON file and decide which subtrees will be deserialized at once and which will be lazily deserialized by proxies.

You should also keep in mind that if you are debugging the application and try to inspect the contents of the proxy, the whole JSON will be deserialized at once because all properties will be resolved. This may cause some performance issues while debugging.

License

This article, along with any associated source code and files, is licensed under The MIT License.


About the Author

I graduated from Adam Mickiewicz University in Poznań, where I completed an MA degree in computer science (MA thesis: Analysis of Sound of Viola da Gamba and Human Voice and an Attempt of Comparison of Their Timbres Using Various Techniques of Digital Signal Analysis) and a bachelor's degree in musicology (BA thesis: Continuity and Transitions in European Music Theory Illustrated by the Example of the 3rd Part of Zarlino's Institutioni Harmoniche and Bernhard's Tractatus Compositionis Augmentatus). I also graduated from a solo singing class at the Fryderyk Chopin Musical School in Poznań. I'm a self-taught composer and a member of the informal international group Vox Saeculorum, which gathers composers whose common goal is to revive old (mainly baroque) styles and composing traditions in contemporary written music. I'm an annual participant of the International Summer School of Early Music in Lidzbark Warmiński.

Comments and Discussions

12 -> 5 seconds is a big improvement. I am not using your library or anything, just read the article out of interest, but a 5 second startup still seems excessive to me as a user. I'd ask: how often does the JSON file change? If never or rarely, I'd pre-process it on your end and distribute the pre-processed file with your app instead. Pre-processing would mean loading the file on your end and writing out just what you need to either a more compact JSON, or better yet, a binary file that you can load "instantly".

The JSON file never changes. It can be replaced by another JSON file if the user changes the font.
A binary format was my first thought. I tried to implement BSON, but to my surprise it was slower than JSON. It looks like I'm not the only one with this problem: node.js - Why is JSON faster than BSON? - Stack Overflow[^]
An ideal solution would be to load a byte array into memory and "assume" it's a parsed object. Is that even possible in C# if you want to target .NET Standard?

For now, creating a smaller JSON file with only the used features seems like a good solution. Thanks.
I also suppose that most users will not notice a lag, because the deserialization time is now blended into the rendering time, so it only occurs when the score is rendered for the first time.

I wasn't suggesting BSON, but rather binary (de)serialization into your destination structures. .NET Core supports binary serialization (Binary serialization | Microsoft Docs). That would be way faster, since no parsing is required. Of course, it would add a pre-processing step to serialize out to a binary file, but there's definitely no point in re-parsing huge JSON files every time. If you want to support the user changing the file, you can do a hybrid solution: check the MD5 of the JSON to see if it has changed; if not, use the current binary file, otherwise re-read the JSON and create a new binary file. Of course, if the lag is not noticeable, then you are probably OK with what you have now. Depends on whether your users all have i7's or i3's.
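The hybrid check described in this comment could be sketched like this (file names and the class name are placeholders, not code from the article):

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class MetadataCache
{
    // Returns true when the JSON source has changed since the last run,
    // i.e. when the caller should rebuild its binary cache.
    public static bool JsonChanged(string jsonPath, string hashPath)
    {
        using (var md5 = MD5.Create())
        using (var stream = File.OpenRead(jsonPath))
        {
            var hash = Convert.ToBase64String(md5.ComputeHash(stream));
            var previous = File.Exists(hashPath) ? File.ReadAllText(hashPath) : null;
            if (hash == previous) return false;  // cached binary is still valid
            File.WriteAllText(hashPath, hash);   // remember the new hash
            return true;                         // caller should rebuild binary
        }
    }
}
```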

If only it were that simple... Some time ago, I tried to use BinaryFormatter, but I quickly gave up because it required me to add the Serializable attribute to the serialized class. The problem is that this class is in a .NET Standard 1.1 library, and BinaryFormatter and the Serializable attribute were introduced in .NET Standard 2.0. Recently, I moved all SMuFL classes to a new project in .NET Standard 1.3, because this is the first version of .NET Standard that supports DispatchProxy, which forced me to update all .NET Framework projects that reference it from .NET Framework 4.5 to 4.6.
Now I had to update it further to 2.0 to allow binary serialization, so I had to update all .NET Framework projects from 4.6 to 4.6.1 and give up Windows Vista support (earlier I had to give up Windows XP and Silverlight support, but that's a different story).
The one thing left to do is to add SerializableAttribute to a few dozen classes, but first I have to move them to a project that supports Serializable and do some refactoring to allow that.
One stupid attribute totally changes the architecture of my solution and forces me to drop support for one operating system. Sometimes I think that dependencies in .NET are a new kind of DLL hell.

Binary serialization of the SMuFL metadata takes about 1.5 seconds and deserialization takes 0.02 seconds. I don't need the Serializable attribute anymore, and I was able to revert back to .NET Standard 1.3.

UPDATE2: OK, it doesn't work as expected. I wrote that it takes 0.02 seconds to deserialize, but I had been deserializing just after the serialization, so I suppose it was cached. When I deserialize a binary resource, it takes 2 seconds, so SharpSerializer is slower than my dynamic proxy (1.5 seconds). Of course, SharpSerializer would be a much better choice if I wanted to deserialize the whole file at once.

I'm surprised that binary deserialization is slower than the dynamic proxy. Isn't SharpSerializer one of the older generation ones? If you're bored, I'd experiment with some of the newer generation ones (e.g., see this benchmark: GitHub - neuecc/Utf8Json: Definitely Fastest and Zero Allocation JSON Serializer for C#(NET, .NET Core, Unity, Xamarin)). A Google search also shows some other newer generation ones that claim to be the fastest. You'd be surprised at some of the differences. For example, older generation DI engines like Autofac and Ninject that are based on reflection are THOUSANDS of times slower than the newer generation ones that are based on expression trees. Same with older gen mappers like AutoMapper: HORRIBLE performance compared to newer IL based ones.

You really have to find the deserializer with the fastest one-time performance, since most of them assume you are going to deserialize the same objects over and over, and they might have higher start-up costs that pay off over repeated deserializations if used correctly. Your case is a one-time deserialization at startup.

One other option might be to read directly into a struct as a blob (not type safe) rather than doing serialization with all the overhead of type safety.