Several years back, my father started a project to digitize our home videos. He purchased an old computer from the IMVU automated builds cluster, bought an AVerMedia C027 capture card, digitized a few tapes… and then his digitization workstation sat there for years, untouched.

Sadly, he passed away last year, so I picked up the project. There were four types of analog media to digitize: VHS tapes, 8mm tapes, audio cassettes, and old Super 8 film reels.

Super 8

The Super 8 film went to a service in Redwood City. I don’t have any relevant equipment to play it and they do a good job: they clean the film, take high-resolution photos of each frame, and then apply color adjustments to correct for any age-related fading, overexposure, or underexposure. The output format is up to you. I selected an MP4 movie file and a 1080p JPEG for every captured frame. (30 GB of 1080p JPEGs for 20 minutes of video!)

The service worked out pretty well. My only complaint was that I gave them seven individually labeled 3″ film reels but, presumably to make it easier to process, they taped six of the reels into one larger 6″ reel, so I had to split the files back up. Avidemux made lossless splitting on the I-frame boundaries trivial.

Audio Cassettes

The audio was similarly easy. ION makes an inexpensive tape deck that advertises itself as a stereo USB microphone. You can capture the audio straight into Audacity and clip, process, and encode as needed.

VHS and 8mm

The bulk of the project was VHS and 8mm: we had two medium-sized moving boxes plus a shoebox of VHS tapes and a medium-sized box of 8mm. Probably close to 100 tapes in all.

Home videos are not worth much if nobody can watch them, so my primary goal was to make the video conveniently accessible to family. I also wanted to minimize unnecessary re-encodes and quality loss. The film and VHS had already degraded over time. Some quality loss, unfortunately, is inevitable without spending $$$ on dedicated equipment that captures frames from the tape.

My parents happened to own a very high-quality VCR that’s still in great shape. The capture sequence ended up something like this:

Video Cassette -> VCR -> Composite Cables -> Capture Card -> MPEG-2

Since each tape contained a hodgepodge of home videos (sometimes interleaved with TV recordings!), they had to be split up. The excellent, open source dvbcut software is perfect for this: it has a quadratic slider for frame-accurate scrubbing and it only recompresses frames when your splits don’t line up precisely with I-frames. I recommend doing your dvbcut work on an SSD. Scrubbing is painful on a spinny disk.

Converting the 8mm tapes was similar, except the VCR is replaced with the (again, still in great shape) Sony camcorder in playback mode. Also, since the 8mm tapes are mono but the capture card always records in stereo, you have two options. You can run a post-split ffmpeg -map_channel step to convert the stereo MPEG-2 files into mono. (This has to happen after splitting because dvbcut can’t read videos after ffmpeg processes them for some reason.) Or you can tell HandBrake to mix down the audio to mono from the right channel only. The latter avoids an audio re-encode, but it’s easier to forget when setting up long HandBrake jobs.

Finally, because the captured MPEG-2 files are large (4 GB per hour of video), I recompressed in HandBrake to H.264. I don’t notice a material quality difference (besides some “free” noise reduction), and the H.264 MP4 files are smaller and have more responsive seeking.

In the end, the steps that involve quality loss are:

Real-time playback. Tracking glitches, for example, result in a few missed frames. But, like I mentioned, it would take $$$ to do a precise, frame-accurate digitization of each VHS frame.

Composite cables instead of S-Video. I couldn’t find a VCR on Craigslist that supported S-Video output.

Capturing in MPEG-2. I’m not convinced the real-time AVerMedia MPEG-2 encoder is very good – I’d occasionally notice strips of artifacty blocks in high-frequency regions like tree lines.

A few frames of dvbcut’s re-encoding at the beginning and end of every split.

YouTube / HandBrake. Might be slightly better to upload the split MPEG-2 into YouTube and let it recompress, but uploading 2 TB of video to YouTube didn’t seem very fun.

The bulk of the time in this project went towards capturing the video. It has to play in real time. Each 8mm cassette was 2 hours, and VHS tapes range between 2 and 8 hours.

The bulk of the effort, on the other hand, went into splitting, labeling, and organizing. I had to rely on clues to figure out when and where some videos were set. There were many duplicate recordings, too, so I had to determine which was higher quality.

Now that all that’s done, I plan to upload everything to YouTube and make a Google Doc to share with family members, in case anyone wants to write stories about the videos or tag people in them.

But by the time a project gets to this stage, it’s often a nontrivial amount of work to switch to Protocol Buffers or Cap’n Proto or Thrift or whatever. There might be thousands of lines of code for mapping model objects to and from JSON (that is arrays, objects, numbers, strings, and booleans). And if you’re talking about some hand-rolled binary format, it’s even worse: you need implementations for all of your languages and to make sure they’re fuzzed and secure.

The fact is, the activation energy required to switch from JSON to something else is high enough that it rarely happens. The JSON data model tends to stick. However, if we could insert a single function into the network layer to represent JSON more efficiently, then that could be an easy, meaningful win.

True, but, at first glance, none of those formats actually seem all that great. JSON documents tend to repeat the same strings over and over again, but those formats don’t support reusing string values or object shapes.

This led me to perform some experiments. What might a tighter binary representation of JSON look like?

What’s in a format?

First, what are some interesting properties of a file format?

Flexibility and Evolvability: Well, we’re talking about representing JSON here, so they’re all going to be similar. However, some of the binary JSON replacements also have support for dates, binary blobs, and 64-bit integers.

Size: How efficiently is the information encoded in memory? Uncompressed size matters because it affects peak memory usage and it’s how much data the post-decompression parser has to touch.

Compressibility: Since you often get some sort of LZ compression “for free” in the network or storage interface, there’s value in the representation being amenable to those compression algorithms.

Code Size: How simple are the encoders and decoders? Beyond the code size benefits, simple encoders and decoders are easier to audit for correctness, resource consumption bounds, and security vulnerabilities.

Decoder Speed: How quickly can the entire file be scanned or processed? For comparison, JSON can be parsed at a rate of hundreds of MB/s.

Queryability: Often we only want a subset of the data given to us. Does the format allow O(1) or O(lg N) path queries? Can we read the format without first parsing it into memory?

Size and parse-less queryability were my primary goals with JND. My hypothesis was that, since many JSON documents have repeating common structures (including string keys), storing strings and object shapes in a table would result in significant size wins.

Quickly glancing at the mainstream binary JSON encodings…

MsgPack

Each value starts with a tag byte followed by its payload. e.g. “0xdc indicates array, followed by a 16-bit length, followed by N values”. Big-endian integers.

Must be parsed before querying.

BSON

Per spec, does not actually seem to be a superset of JSON? Disallows nuls in object keys, and does not support arrays as root elements, as far as I can tell.

Otherwise, similar encoding as MsgPack. Each value has a tag byte followed by a payload. At least it uses little-endian integers.

UBJSON

Same idea. One-byte tags for each value, followed by a payload. Notably, lists and objects have terminators, and may not have an explicit length. Kind of an odd decision IMO.

Big-endian again. Weird.

CBOR

IETF standard. Has a very intentional specification with documented rationales. Supports arrays of known sizes and arrays that terminate with a special “break” element. Smart 3-bit major tag followed by 5-bit length with special values for “length follows in next 4 bytes”, etc.

Big-endian again… Endianness doesn’t matter all that much, but it’s kind of weird to see formats using the endianness that’s less common these days.

CBOR does not support string or object shape tables, but at first glance it does not seem like CBOR sucks. I can imagine legitimate technical reasons to use it, though it is quite a complicated specification.

JND!

Okay! All of those formats have roughly the same shape. One byte prefixes on every value, value payloads in line (and thus values are variable-width).

Now it’s time to look at the format I sketched up.

The file consists of a simple header marking the locations and sizes of three tables: values, strings, and object shapes.

The string table consists of raw UTF-8 string data.

In the value table, every value starts with a tag byte (sound familiar?). The high nibble encodes the type. The low nibble contains two 2-bit values, k and n.

0011 indicates this is a string value. k and n are “size tags” which indicate how many bytes encode the integers. The string offset is 1 << k bytes (little-endian) and the string length is 1 << n bytes. Once the size tags are decoded, the actual offset and length values are read following the tag byte, and the resulting indices are used to retrieve the UTF-8 text at the given offset and length from the string table.

The following 1 << k bytes encode the index into the object shape table, which holds the number and sorted list of object keys. Afterwards is a simple list of indices into the value table, each of size 1 << n bytes. The values are matched up with the keys in the object shape table.

Arrays are similar, except that instead of using an object index, they simply store their length.
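To make the tag byte layout concrete, here is a minimal Python sketch of decoding a string value. The text above doesn’t say which two bits of the low nibble hold k and which hold n, so the bit positions here are an assumption, and the helper names (read_uint_le, decode_string) are made up for illustration.

```python
TYPE_STRING = 0b0011  # high nibble for string values, per the description above


def read_uint_le(buf: bytes, pos: int, width: int) -> int:
    """Read an unsigned little-endian integer that is `width` bytes wide."""
    return int.from_bytes(buf[pos:pos + width], "little")


def decode_string(values: bytes, pos: int, strings: bytes) -> str:
    """Decode the string value whose tag byte sits at values[pos]."""
    tag = values[pos]
    if tag >> 4 != TYPE_STRING:
        raise ValueError("not a string value")
    # Assumption: k lives in bits 3..2 and n in bits 1..0 of the low nibble.
    k = (tag >> 2) & 0b11
    n = tag & 0b11
    offset_width = 1 << k  # bytes used to encode the string offset
    length_width = 1 << n  # bytes used to encode the string length
    offset = read_uint_le(values, pos + 1, offset_width)
    length = read_uint_le(values, pos + 1 + offset_width, length_width)
    return strings[offset:offset + length].decode("utf-8")


# Tag 0b0011_01_00: string, 2-byte offset, 1-byte length.
string_table = b"hellokey"
value_table = bytes([0b00110100, 5, 0, 3])
print(decode_string(value_table, 0, string_table))
```

Objects and arrays would follow the same pattern: decode the size tags first, then read fixed-width integers after the tag byte.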

This encoding has the property that lookups into an array are O(1) and lookups into an object are O(lg N), giving efficient support for path-based queries.
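The O(lg N) object lookup falls out of the shape table keeping its keys sorted: a binary search over the keys yields a position, and the same position indexes the object’s list of value offsets. Here is a rough sketch using plain Python lists in place of the packed tables; the function name and in-memory representation are illustrative assumptions, not the real file layout.

```python
from bisect import bisect_left


def lookup(shape_keys, value_indices, key):
    """Find `key` in an object.

    shape_keys: the sorted key list from the object's shape table entry.
    value_indices: the object's value-table indices, parallel to shape_keys.
    Returns the value-table index for `key`, or None if the key is absent.
    """
    i = bisect_left(shape_keys, key)  # binary search: O(lg N) comparisons
    if i < len(shape_keys) and shape_keys[i] == key:
        return value_indices[i]
    return None


shape = ["id", "name", "tags"]  # one shape, shared by every object with these keys
obj = [7, 2, 9]                 # this particular object's value-table indices
print(lookup(shape, obj, "name"))
```

Array lookups are simpler still: element i is just the i-th fixed-width index in the list.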

But there’s a pretty big downside relative to MsgPack, CBOR, and the like. The cost of efficient random access is that the elements of arrays and objects must have a known size. Thus, the (variable-width) values themselves cannot be stored directly into the array’s element list. Instead, arrays and objects have a list of fixed-width numeric offsets into the value table. This adds a level of indirection, and thus overhead, to JND. The payoff is that once a particular value is written (like a string or double-precision float), its index can be reused and referenced multiple times.
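The reuse half of that trade-off can be sketched as an encoder that interns each scalar value the first time it is written and hands back the existing index afterwards. This is only an illustration of the idea: the ValueTable class is hypothetical, and it ignores the byte-level layout and fixed-width index sizes entirely.

```python
class ValueTable:
    """Append-only table that hands out one index per distinct scalar value."""

    def __init__(self):
        self.values = []
        self._index = {}

    def intern(self, value):
        # Key on (type, value) so 1, 1.0, and True are not merged together.
        key = (type(value), value)
        idx = self._index.get(key)
        if idx is None:
            idx = len(self.values)
            self.values.append(value)
            self._index[key] = idx
        return idx


def encode_array(table, items):
    """An array becomes a list of indices into the value table."""
    return [table.intern(item) for item in items]


table = ValueTable()
print(encode_array(table, ["a", "b", "a", 1.5, "b"]))
print(len(table.values))
```

Five array elements end up referencing only three stored values, which is where the uncompressed size savings would come from.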

getinitialstate: I can’t share the contents of this document as it’s from a project at work, but it’s 1.7 MB of various entity definitions, where each entity type is an object with maybe a dozen or two fields.

instruments: A huge number of repeated structures and data, so it’s very compressible.

mesh: 3D geometry. Basically a giant list of floating point and integer numbers.

svg_menu: Only 600 bytes – used to test startup and base overhead costs.

twitter: List of fairly large objects, many long strings.

update-center: From Jenkins. Mostly consists of an object representing a mapping from plugin name to plugin description.

Conclusions

We can draw a few conclusions from these results.

As a wire replacement for the JSON data model, there’s no apparent reason to use BSON or UBJSON. I’d probably default to MsgPack because it’s simple, but CBOR might be okay too.

The current crop of binary formats doesn’t compress any better than JSON itself, so if size over the wire is your primary concern, you might as well stick with JSON.

Except for small documents, where JND’s increased representation overhead dominates, JND is pretty much always smaller uncompressed. As predicted, reusing strings and object shapes is a big win in practice.

LZ-style compression algorithms don’t like JND much. Not a big surprise, since they don’t do a good job with sequences of numeric values. I expect delta coding value offsets in arrays and objects would help a lot, at the cost of needing to do a linear pass from delta to absolute at encode and decode time.
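I haven’t implemented delta coding in JND, so the following is a sketch of what I have in mind rather than something measured. Each offset is replaced by its difference from the previous one, which turns runs of nearby offsets into small, repetitive numbers that LZ-style compressors should handle better. The function names are made up for illustration.

```python
from itertools import accumulate


def delta_encode(offsets):
    """Replace each offset with its difference from the previous offset."""
    deltas = []
    prev = 0
    for offset in offsets:
        deltas.append(offset - prev)
        prev = offset
    return deltas


def delta_decode(deltas):
    """Linear pass that turns deltas back into absolute offsets."""
    return list(accumulate(deltas))


offsets = [1000, 1004, 1008, 1016]
deltas = delta_encode(offsets)
print(deltas)
print(delta_decode(deltas) == offsets)
```

The catch is that the deltas have to be converted back to absolute offsets before random access works, which gives up some of the read-in-place property.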

JND’s disadvantage is clear in the above graphs: while it’s smaller uncompressed, it does not compress as well as JSON or MsgPack. (Except in cases where its uncompressed size is dramatically smaller because of huge amounts of string or object shape reuse.)

Where would something like JND be useful? JND’s big advantage is that it can be read directly without allocating or parsing into another data structure. In addition, the uncompressed data is relatively compact. So I could imagine it being used if IO is relatively cheap but tight memory bounds are required.

Another advantage of JND is a bit less obvious. In my experience using sajson from other languages, the parse itself is cheaper than converting the parsed AST into values in the host language. For example, constructing Swift String objects from binary UTF-8 data was an order of magnitude slower than the parse itself. In JND, every unique string would naturally be translated into host language String objects once.

Future Work?

It was more work than I could justify for this experiment, but I’d love to see how Thrift or Protocol Buffers compare to JND. JND is a self-describing format, so it’s going to lose on small messages, but when data sizes get big I wouldn’t be surprised if it at least ties protobufs.

Update: Forgot to mention decode times. I didn’t have time to set up a proper benchmark with high-performance implementations of each of these formats (rather than whatever pip gave me), but I think we can safely assume CBOR, MsgPack, BSON, and UBJSON all have similar parse times, since the main structure of the decoder loop would be the same. The biggest question would be: how does that compare to JSON? And how does JND compare to both?

Several holidays ago, I got a bee in my bonnet and wrote a fast JSON parser whose parsed AST fits in a single contiguous block of memory. The code was small and simple and the performance was generally on-par with RapidJSON, so I stopped and moved on with my life.

Well, at the end of 2016, Rich Geldreich shared that he’d also written a high-performance JSON parser.

A while back I wrote a *really* fast JSON parser in C++ for fun, much faster than RapidJSON (in 2012 anyway). Is this useful tech to anyone?

I dropped his pjson into my benchmarking harness and discovered it was twice as fast as both RapidJSON and sajson! Coupled with some major JSON parsing performance problems in a project at work, I was nerd-sniped again.

I started reading his code but nothing looked particularly extraordinary. In fact, several things looked like they ought to be slower than sajson… Oh wait, what’s this?

How do we think about the performance of this loop? Remember that mainstream CPUs are out-of-order and can execute four (or more!) instructions in parallel. So an approximate mental model for reasoning about CPU performance is that they decode multiple instructions per cycle, then stick them all into an execution engine that can execute multiple instructions per cycle. Instructions will execute simultaneously if they’re independent of each other. If an instruction depends on the result of another, it must wait N cycles, where N is the first instruction’s latency.

So, assuming all branches are correctly predicted (branch predictors operate in the frontend and don’t count towards dependency chains), let’s do a rough estimate of the cost of the loop above:

increment p is the only instruction on the critical path – it carries a dependency across iterations of the loop. Is there enough work to satisfy the execution resources during the increment? Well, the comparisons are independent so they can all be done in parallel, and there are four of them, so we can probably keep the execution units busy. But it does mean that, at most, we can only check one byte per cycle. In reality, we need to issue the load and increment too, so we’re looking at a loop overhead of about 2-3 cycles per byte.

Now let’s look more closely at Rich’s code.

Replacing the comparisons with a lookup table increases comparison latency (3-4 cycles latency from L1) but also increases comparison throughput (multiple loads can be issued per cycle).
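To illustrate the shape of the lookup-table idea (this is not Rich’s code, and Python only shows the logic; the latency and throughput reasoning applies to a compiled loop), here is a sketch that classifies bytes inside a JSON string with a 256-entry table instead of several comparisons. Which bytes count as “special” is my assumption for the example: the quote, the backslash, and control characters.

```python
# 256-entry table: 1 where a byte needs special handling inside a JSON string.
# Assumption for this sketch: quote, backslash, and control characters.
IS_SPECIAL = bytearray(256)
for byte in range(0x20):
    IS_SPECIAL[byte] = 1
IS_SPECIAL[ord('"')] = 1
IS_SPECIAL[ord("\\")] = 1


def scan_plain(buf: bytes, p: int) -> int:
    """Return the index of the first special byte at or after p, or len(buf)."""
    end = len(buf)
    # One table load per byte replaces a chain of comparisons per byte.
    while p < end and not IS_SPECIAL[buf[p]]:
        p += 1
    return p


print(scan_plain(b'hello"rest', 0))
```

In a compiled language, the table loads for several consecutive bytes are independent of each other, which is what makes unrolling the loop by four plausible.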

So let’s spell out the instructions for Rich’s code (reordered for clarity):

Again, the critical path is only add p, 4, but we still need to issue the other instructions. The difference is that now all of the loads happen in parallel and the comparisons for 4 bytes happen in parallel too, rather than doing four comparisons per byte.

It’s still hard to say if this is a win on paper: Haswell can only issue 2 loads per cycle. (But the loads can overlap with the comparisons from previous bytes.) However, we still have to issue all of these instructions. So maybe we’re looking at something closer to 2 cycles per byte?

Empirically, at least on my Broadwell, replacing four comparisons with a LUT was definitely a win. 55cc213

But is unrolling the loop necessary? If I take out the unrolling but leave the LUT, clang gets slower but gcc stays the same. I checked – neither gcc nor clang does any automatic unrolling here. What’s going on? Branch predictor effects? Also, the Intel Architecture Code Analyzer tool says that the unrolled and non-unrolled loops are both frontend-bound and have the same loop throughput.

I’m not yet convinced that a tight dependency chain is why unrolling is a win. More investigation is required here. But the important thing to keep in mind is to pay attention to the critical path latency through a loop body as well as the number of independent operations that can soak up spare execution bandwidth.

Lead Bullets

After playing around with LUTs and unrolling, I started implementing a bunch of small optimizations that I’d long known were available but didn’t consider to be that important.

Well, as it often turns out in performance projects, a sequence of 2% gains adds up to significant wins over time! If you’ve read Ben Horowitz’s lead bullets story or how SQLite achieved a 50% speed up, this will sound familiar.

Here are the optimizations that mattered:

Moved the input and output pointers into locals instead of members, which helps VC++ and Clang understand that they can be placed in registers. (gcc was already performing that optimization.) 71078d3 4a07c77

Store the tag bits at the bottom of the element index instead of the top, which avoids a shift on 64-bit. e7f2351

Static Branch Prediction

I also spent a bit of time on static branch prediction. It’s a questionable optimization; in theory, you should just use profile-guided optimization (PGO) and/or rely on the CPU’s branch predictors, but in practice few people actually bother to set up PGO. Plus, even though the CPU will quickly learn which branches are taken, the compiler doesn’t know. Thus, by using static branch prediction hints, the compiler can line up the instructions so the hot path flows in a straight line and all the cold paths are off somewhere else, sometimes saving register motion in the hot path.

I can’t recommend spending a lot of time on annotating your branches, but it does show up as a small but measurable win in benchmarks, especially on smaller and simpler CPUs.

Things I Didn’t Do

Unlike Rich’s parser and RapidJSON, I chose not to optimize whitespace skipping. Why? Not worth the code size increase – the first thing someone who cares about JSON parsing performance does is minify the JSON. b05082b

I haven’t yet optimized number parsing. Both RapidJSON and Rich’s parser are measurably faster there, and it would be straightforward to apply the techniques. But the documents I regularly deal with are strings and objects and rarely contain numbers.

Benchmark Results

Desktop

Dell XPS 13

iPhone SE

Atom D2700

The charts aren’t very attractive, but if you look closely, you’ll notice a few things:

Parsing JSON on modern CPUs can be done at a rate of hundreds of megabytes per second.

gcc does a much better job with RapidJSON than either clang or MSVC.

JSON parsing benefits from x64 – it’s not a memory-bound or cache-bound problem, and the extra registers help a lot.

The iPhone SE is not much slower than my laptop’s Broadwell. :)

The Remainder of the Delta

As you can see in the charts above, sajson is often faster than RapidJSON, but still not as fast as Rich’s pjson. Here are the reasons why:

sajson does not require the input buffer to be null-terminated, meaning that every pointer increment requires a comparison with the end of the buffer (to detect eof) in addition to the byte comparison itself. I’ve thought about changing this policy (or even adding a compile-time option) but I really like the idea that I can take a buffer straight from a disk mmap or database and pass it straight to sajson without copying. On the other hand, I measured about a 10% performance boost from avoiding length checks.

sajson sorts object keys so that object lookup takes logarithmic rather than linear time. The other high-performance parsers have linear-time object lookup by key. This is an optimization that, while not necessary for most use cases, avoids any accidental worst-case quadratic-time usage.

sajson’s contiguous AST design requires, for every array, shifting N words in memory where N is the number of elements in the array. The alternative would be to use growable arrays in the AST (requiring that they get shifted as the array is realloc’d). Hard to say how much this matters.

Aside: The “You Can’t Beat the Compiler” Myth

There’s this persistent idea that compilers are smart and will magically turn your code into something that maps efficiently to the machine. That’s only approximately true. It’s really hard for compilers to prove the safety (or profitability) of certain transformations and, glancing through the produced code for sajson, I frequently noticed just plain dumb code generation. Like, instead of writing a constant into a known memory address, it would load a constant into a register, then jump to another location to OR it with another constant, and then jump somewhere else to write it to memory.

Also, just look at the charts – there are sometimes significant differences between the compilers on the same code. Compilers aren’t perfect and they appreciate all the help you can give!

Benchmarking

Measuring the effect of microoptimizations on modern computers can be tricky. With dynamic clock frequencies and all of today’s background tasks, the easiest way to get stable performance numbers is to take the fastest time from all the runs. Run your function a couple thousand times and record the minimum. Even tiny optimizations will show up this way.

I also had my JSON benchmarks report MB/s, which normalizes the differences between test files of different sizes. It also helps understand parser startup cost (when testing small files) and the differences in parse speed between files with lots of numbers, strings, huge objects, etc.
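The measurement approach itself is language-independent, so here is a small Python sketch of it: run the function many times, keep the minimum, and report MB/s. The helper names and the test document are made up for illustration, and Python’s built-in json module stands in for whatever parser is being measured.

```python
import json
import time


def best_time(fn, runs=2000):
    """Run fn repeatedly and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        if elapsed < best:
            best = elapsed
    return best


def throughput_mb_per_s(num_bytes, seconds):
    """Normalize a timing by input size so different files are comparable."""
    return num_bytes / seconds / 1e6


# A small synthetic document, just to have something to parse.
doc = json.dumps([{"id": i, "name": "item"} for i in range(100)]).encode("utf-8")
fastest = best_time(lambda: json.loads(doc), runs=200)
print(f"{throughput_mb_per_s(len(doc), fastest):.1f} MB/s")
```

Taking the minimum rather than the mean discards runs that were disturbed by frequency scaling or background tasks.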

Swift Bindings

Dropbox (primarily @aeidelson) contributed high-quality Swift bindings for sajson. The challenge in writing these bindings was finding a way to efficiently expose the sajson parse tree to Swift. It turns out that constructing Swift arrays and objects is REALLY expensive; we once benchmarked 10 ms in sajson’s parser and 400 ms of Swift data structure construction.

Fortunately, Swift has decent APIs for unsafely manipulating pointers and memory, and with those we implemented the ability to access sajson AST nodes through a close-to-natural ValueReader type.

An iOS team at Dropbox replaced JSONSerialization with sajson and cut their initial load times by two thirds!

Summary

I used to think JSON parsing was not something you ever wanted in your application’s critical path. It’s certainly not the kind of algorithm that modern computers love (read byte, branch, read byte, branch). That said, this problem has been beaten to death. We now have multiple parsers that can parse data at hundreds of megabytes per second — around the same rate as SHA-256! If we relaxed some of the constraints on sajson, it could even go faster.

So how much faster was Rich’s parser, after all? When measured in Clang and MSVC, quite a lot, actually. But when measured in GCC, RapidJSON, sajson, and pjson were (and remain) very similar. Many of the differences come down to naive, manually-inlined code, which we know the compiler can reliably convert to a fast sequence of instructions. It’s annoying to eschew abstractions and even duplicate logic across several locations, but, in the real world, people use real compilers. For critical infrastructure, it’s worth taking on some manual optimization so your users don’t have to care.

Update

Arseny Kapoulkine pointed out an awesome trick to me: replace the last byte of the buffer with 0, which lets you avoid comparing against the end of the buffer. When you come across a 0 byte, if it’s the last one, treat it as if it’s the byte that got replaced. This would be a material performance win in sajson _and_ avoid the need for specifying null-terminated buffers.
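Here is a sketch of that sentinel trick, written in Python only to show the control flow (Python bounds-checks every index anyway, so there is no speedup here). It searches for a closing quote; the function name is hypothetical, and restoring the overwritten byte afterwards is my own addition rather than part of the trick as described.

```python
QUOTE = 0x22  # '"'


def find_quote(buf: bytearray) -> int:
    """Return the index of the first '"' in buf, or -1 if there is none.

    The last byte is temporarily overwritten with a 0 sentinel so the scan
    loop never has to compare the position against the end of the buffer.
    """
    if not buf:
        return -1
    last = len(buf) - 1
    saved = buf[last]
    buf[last] = 0  # sentinel
    try:
        p = 0
        while True:
            b = buf[p]
            if b == QUOTE:
                return p
            if b == 0:
                if p == last:
                    # Hit the sentinel: act on the byte that was replaced.
                    return p if saved == QUOTE else -1
                # Otherwise it is an ordinary embedded 0 byte; keep scanning.
            p += 1
    finally:
        buf[last] = saved  # put the original byte back


print(find_quote(bytearray(b'abc"def')))
```

The end-of-buffer check only runs when a 0 byte is seen, which should be rare in real JSON, so the hot path is a single byte comparison per character.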

Email and text messaging are cold. It’s easy to assume the person on the other side of the screen is aggressive or upset or thinks poorly of you, and respond in kind.

Many years ago, I worked on the SCons build system. Much of the development occurred on mailing lists. Every so often someone would come in, guns blazing, asking why X was so bad or why Y didn’t work. The project lead, Steven Knight, always managed to either get useful information from the person or, if they were truly trolling, deflect and ignore the onslaught.

I once asked him how he did such a great job keeping the community calm. He said “I assume everyone has a good heart.” If a person comes in with a problem or does something you don’t like, gently figure out what the real issue is. It’s probably something you can solve.

Imagine you have a bunch of JSON documents that you want to keep around without reencoding in some way, and you want to occasionally query the document for a selection of fields. Parsing JSON is not terribly efficient, and you have to parse most of the file just to read, say, a single key.

The key observation is that you can store a bitstream of information alongside the original JSON document that describes enough of the parse tree to be able to find a key without parsing the entire document. JSON is LL(1) so the type of each node is determined by its first byte in the file. Thus, the locations of each JSON node are a monotonically-increasing set of integer indices into the source file. Elias-Fano codes are a very efficient (quasi-succinct) encoding for such a set. In addition, the nesting hierarchy is encoded with a balanced parentheses bit pattern code.
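For a feel of how Elias-Fano coding represents a monotonically increasing list of offsets, here is a toy Python sketch. It splits each value into low bits (stored verbatim) and high bits (stored as positions in a bit vector). A real implementation would pack everything into bit arrays and add a select index for constant-time access; this version uses a Python int as the bit vector and decodes linearly, and the function names are made up for illustration.

```python
def ef_encode(values, universe):
    """Encode a non-decreasing list of integers, each in [0, universe)."""
    n = len(values)
    # Number of low bits kept verbatim: floor(log2(universe / n)), at least 0.
    low_bits = max(0, (universe // n).bit_length() - 1) if n else 0
    mask = (1 << low_bits) - 1
    lows = [v & mask for v in values]
    high = 0  # Python int used as a bit vector
    for i, v in enumerate(values):
        # Element i sets bit (high part + i); positions are strictly increasing.
        high |= 1 << ((v >> low_bits) + i)
    return low_bits, lows, high


def ef_decode(low_bits, lows, high):
    """Recover the original list by walking the set bits of the high vector."""
    out = []
    pos = 0
    i = 0
    while i < len(lows):
        if (high >> pos) & 1:
            out.append(((pos - i) << low_bits) | lows[i])
            i += 1
        pos += 1
    return out


# Hypothetical node offsets into a 64-byte JSON document.
node_offsets = [0, 5, 17, 18, 40]
encoded = ef_encode(node_offsets, 64)
print(ef_decode(*encoded) == node_offsets)
```

The high-bit vector needs roughly two bits per element, plus the low bits, which is how the index stays small relative to the document.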

The result is that it’s possible to preparse the document into a data structure about 25% of the size of the original JSON document that allows direct field lookups.

The Quasi-Succinct Indices paper gives some background information on the encoding and its efficient implementation. An implementation of a set of succinct data structures is available on GitHub.

All of that said, I do slightly question this approach’s utility: it’s probably a bigger win to come up with an isomorphic encoding of JSON that allows direct access and still allows reconstruction of the original document if necessary.

Companies are built of many projects and repositories. Each tends to develop its own culture and tools for interacting with it.

The client is compiled with MSVC, and tests are run with ./runner.py --test foo/bar/baz_test.py.

The website builds with SCons and tests are run with bin/run-tests.sh.

The utility library has ./build and ./test.

Over time, some projects put scripts in the root, some in bin, maybe some in a directory named scripts. Some projects might use all three.

This makes it hard for people coming to the project to know 1) how to do common project tasks and 2) what the set of available commands even is.

When code is collectively owned and engineers contribute across the entire stack, it’s important that anyone can easily check out a repository and start developing in it, no matter the language or tooling.

At IMVU we solved this problem by introducing a directory to every project named s/.

How do you build the project? s/build

How do you run the tests? s/test

How do you deploy? s/deploy

How do you lint? s/lint

How do you launch the program? s/run

What if there’s a server component? s/server

How do you see what commands are available? ls s

At first there was some resistance, primarily to the name. “s is too short. Nobody will know what it means.” But after living with it, we found it was perfect:

It’s short.

It’s on home row for both qwerty and dvorak.

Unlike bin and scripts, s doesn’t already have a semantic meaning. bin, for example, is often used for programs generated by the build script. Conflating that with project tooling is confusing.

s is a home only for project commands – the interface that developers use to interact with the project. These commands should use the same nomenclature as your company (e.g. if people say build, call it s/build; if people say compile, call it s/compile).

The scripts shouldn’t have extensions, because, importantly, the programming language is an implementation detail. That is, s/build.sh or s/build.py are wrong. s/build lets you be consistent across projects and have the option to migrate from Python to Bash or whatever.

s/ is a simple trick, but it goes a long way towards helping people migrate between projects!

I’ve intentionally tried not to break much new ground with Crux’s type system. I mostly want the language to be a pleasant, well-executed, modern language and type system for the web.

That said, when porting some web APIs to Crux, I ran into a bothersome issue with an interesting solution. I’ll motivate the problem a bit and then share the solution below.

Records and Row Polymorphism

Crux has row-polymorphic record types. Think of records as anonymous little structs that happen to be represented by JavaScript objects. Row polymorphism means that functions can be written to accept many different record types, as long as they have some subset of properties.

I lived in a TypeScript codebase for nine months or so and the convenience of being able to quickly define anonymous record types is wonderful. Sometimes you simply want to return a value that contains three named fields, and defining a new type and giving it a name is busywork that contributes to the common belief that statically typed languages feel “heavy”.

Records and Traits

json.encode’s type is fun encode<V: ToJSON>(value: V). That is, it accepts any value which implements the ToJSON trait. Why don’t records implement traits? Well, record types are anonymous, as described before. They don’t necessarily have unique definitions or names, so how would we define a ToJSON instance for this record?

Haskell, and other statically-typed languages, simply have the programmer manually construct a HashMap and JSON-encode that. In Crux, that approach might look something like:

let map = hashmap.new()
map["x"] = 10
map["y"] = 20
json.encode(map)

Not too bad, but not as nice as using record syntax here.

To solve this problem, which is “only” a human factors problem (but remember that Crux is intended to be delightful), I came up with something that I believe to be novel. I haven’t seen this technique before in the row polymorphism literature or any languages I’ve studied.

There are two parts to this problem. First we want to use record literals to construct key-value maps, and then we want to define trait instances that can use said maps.

Dictionaries

As I mentioned, Crux records are implemented as JavaScript objects. This is pretty convenient when interfacing with JavaScript APIs.

Sometimes, however, JavaScript objects are used as key-value maps instead of records. Crux provides a Dict type for that case. Dict is like Map, except keys are restricted to strings to allow implementation on a plain JavaScript object.

Read ... as “arbitrary set of fields” and : V as “has type V”. Thus, {...: V} is the new type of record constraint, and it’s read as “record with arbitrary set of fields where each field’s value has type V”.

So now we can write:

json.encode(dict.from({
happy: True,
sad: False,
}))

Better, but we’re still not done. For one, dict.from requires the values to have the same type — not necessary for JSON encoding. Two, it’s annoying to have to call two functions to quickly create a JSON object.

Defining Record Instances

Here’s where record trait instances come in. First we need to convert all the record’s field values into a common type T. Then, once the record has been converted into a record of type {...: T}, it can be passed to dict.from and then json.encode. The resulting syntax is: