Category: Swords

I post about Power BI dataflows a lot, but that’s mainly because I love them. My background in data preparation and ETL, combined with dataflows’ general awesomeness, makes them a natural fit for my blog. This means that people often think of me as “the dataflows guy” even though dataflows are actually a small part of my role on the Power BI CAT team. Most of what I do at work is help large enterprise customers successfully adopt Power BI, and help make Power BI a better tool for their scenarios[1].

As part of my ongoing conversations with senior stakeholders from these large global companies, I’ve noticed an interesting trend emerging: customers describing self-service BI as a two-edged sword. This trend is interesting for two main reasons:

It’s a work conversation involving swords

Someone other than me is bringing swords into the work conversation[2]

As someone who has extensive experience with both self-service BI and with two-edged swords, I found myself thinking about these comments more and more – and the more I reflected, the more I believed this simile holds up, but not necessarily in the way you might suspect.

The two sharp edges of a sword each serve distinct and complementary purposes.

A competent swordsperson knows how and when to use each, and how to use them effectively in combination.

Having two sharp edges is only dangerous to the wielder if they are ignorant of their tool.

A BI tool like Power BI, which can be used for both “pro” IT-driven BI and self-service business-driven BI, has the same characteristics. To use it successfully at scale, an organization needs to understand its capabilities and know how to use both “edges” effectively in combination.

As you can imagine, there’s more to it than this, so you should probably watch the session recording.

For those who are coming to the Microsoft Business Applications Summit next week, please consider joining the CAT team’s “Enterprise business intelligence with Power BI” full-day pre-conference session on Sunday. Much of the day will be deep technical content, but we’ll be wrapping up with a revised and refined version of this content, with a focus on building a center of excellence and a culture of data in your organization.

There is also a video of the final demo where Adam Saxton joined me to illustrate how business and IT can work together to effectively respond to unexpected challenges. If you ever wondered what trust looks like in a professional[5] environment, you definitely want to watch this video.

[1] This may be even more exciting for me than Power BI dataflows are, but it’s not as obvious how to share this in blog-sized pieces.

[2] Without this second point, it probably wouldn’t be noteworthy. I have a tendency to bring up swords more often in work conversations than you might expect[3].

[3] And if you’ve been paying attention for very long, you’ll probably expect this to come up pretty often.

I think about metadata a lot.[1] I probably think about metadata more than I think about swords, and that’s saying something.

I believe my love affair with metadata may have its roots in my college years when I took several anthropology courses from Dr. Ivan Brady. Dr. Brady changed the way I looked at the world, and I will never forget his most frequently used saying:

“Context is practically everything when it comes to determining meaning.”

— Dr. Ivan Brady

Dr. Brady wasn’t talking about metadata, but the statement still applies. Metadata provides context that is lacking from data. Metadata allows a user to understand the meaning of the data – its source, its purpose, its scope, its intended uses – without needing to explore the data itself in exhaustive detail.

In the context of enterprise data, metadata is absolutely vital. But not all metadata is created equal. Some metadata is swords, and some metadata is WiFi.

Please bear with me for a moment – I promise I’m going somewhere with this.

Oakeshott’s typology of medieval and early renaissance swords is among his most influential and most lasting works. Though his work was not entirely original, it was certainly groundbreaking. Dr. Jan Petersen had previously developed a typology for Viking swords consisting of twenty-six categories. Petersen’s typology was simplified by Dr. R. E. M. Wheeler in short order to only seven categories (Types I–VII). This simplified typology was then slightly expanded by Oakeshott by the addition of two transitional types into its current nine categories (Types I–IX). From this basis, Oakeshott began work on his own thirteen-category typology of the medieval sword ranging from Type X to Type XXII.

What made Oakeshott’s typology unique was that he was one of the first people either within or outside of academia to seriously and systematically consider the shape and function of the blades of European Medieval swords as well as the hilt, which had been the primary criteria of previous scholars. His typology traced the functional evolution of European swords over a period of five centuries, starting with the late Iron Age Type X, and took into consideration many factors: the shape of blades in cross section, profile taper, fullering, whether blades were stiff and pointed for thrusting or broad and flexible for cutting, etc. This was a breakthrough. Oakeshott’s books also dispelled many popular cliches about Western swords being heavy and clumsy. He listed the weights and measurements of many swords in his collection which have become the basis for further academic work as well as templates for the creation of high quality modern replicas.

And although the quote above doesn’t mention it, in addition to the primary types X through XXII, there are multiple subtypes as well, denoted by a lower-case letter following the roman numeral of the primary type.[4]

To summarize:

Oakeshott was working from a sample of data that wasn’t necessarily representative, and for which no meaningful metadata existed. He needed to reverse engineer the metadata from the available data, and to manually assign structure and consistency to it.

Earlier efforts to provide metadata for this data domain had focused on structural characteristics of the data, rather than the functional characteristics in which Oakeshott was interested.

Oakeshott was building on the efforts of earlier data stewards and expanded the work that they had done in one data domain, while also defining more comprehensive metadata for a new, larger, data domain.

Oakeshott’s work revealed significant discrepancies between the actual data and users’ perceptions of the data, and in doing so it enabled significant new opportunities to work with that data at scale.

Each metadata category is defined using an arcane and obtuse combination of letters and numbers to describe its members, such as Xa, XIIIb, and XVIIIb.

Even if you’ve never held a sword[3], this probably sounds familiar.

A lot of the data used in enterprise analytics wasn’t created with any metadata in mind. Other than table names, object names, and data types[5], there often isn’t much to go on. In order to understand the data, you need to look at and work with the data, at length. Efforts to develop structured metadata for these existing sources are more data archaeology than data science, and it is often difficult to know if you have all of the data, if you have taken into consideration every possible permutation of values… You get the idea. It’s hard, and it’s often very difficult to have strong confidence in the results you reach. Reverse-engineered metadata is better than no metadata, but…

But it’s better to take metadata into account right from the beginning, and to build it at the same time you’re building the data. Like WiFi.

The standard and amendments provide the basis for wireless network products using the Wi-Fi brand. While each amendment is officially revoked when it is incorporated in the latest version of the standard, the corporate world tends to market to the revisions because they concisely denote capabilities of their products. As a result, in the marketplace, each revision tends to become its own standard.

Let’s summarize this as well:

The metadata was defined before the data was created, rather than being inferred from existing data.

The metadata includes functional and structural characteristics, based on agreed-upon requirements.

All data is validated against the metadata in a consistent and standard manner as it is created.

Each metadata category is defined using an arcane and obtuse combination of letters and numbers to describe its members, such as 802.11ax, 802.11b, and 802.11n.

Each approach to metadata adds value, but it should be obvious that prioritizing metadata in your data architecture is key to data consistency, interoperability, and reuse.
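To make the contrast a little more concrete, here is a rough sketch in Python. Everything in it is hypothetical and invented purely for illustration: the `ColumnSpec` helper, the `SALES_SCHEMA` column names, and the sample rows. The `validate` function stands in for the WiFi approach, where the metadata exists first and every row is checked against it as it is created. The `infer_schema` function stands in for the swords approach, where all you can do is look at the rows you happen to have and guess.

```python
from dataclasses import dataclass


# Hypothetical "WiFi-style" metadata: declared before any data exists.
@dataclass(frozen=True)
class ColumnSpec:
    name: str
    dtype: type
    required: bool = True


# Illustrative schema; these column names are assumptions for the sketch.
SALES_SCHEMA = [
    ColumnSpec("order_id", int),
    ColumnSpec("region", str),
    ColumnSpec("amount", float),
    ColumnSpec("note", str, required=False),
]


def validate(row: dict, schema: list[ColumnSpec]) -> list[str]:
    """Check one row against the declared schema.

    Returns a list of problems; an empty list means the row conforms.
    """
    problems = []
    for col in schema:
        value = row.get(col.name)
        if value is None:
            if col.required:
                problems.append(f"missing required column: {col.name}")
            continue
        if not isinstance(value, col.dtype):
            problems.append(
                f"{col.name}: expected {col.dtype.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems


def infer_schema(rows: list[dict]) -> dict[str, set[str]]:
    """"Sword-style" metadata: reverse-engineer type names from existing rows.

    The result only describes the rows that were sampled, so it can be
    incomplete or contradictory.
    """
    observed: dict[str, set[str]] = {}
    for row in rows:
        for name, value in row.items():
            observed.setdefault(name, set()).add(type(value).__name__)
    return observed


good = {"order_id": 1, "region": "EMEA", "amount": 19.5}
bad = {"order_id": "2", "amount": 5.0}

print(validate(good, SALES_SCHEMA))  # no problems reported
print(validate(bad, SALES_SCHEMA))   # wrong type for order_id, region missing
print(infer_schema([good, bad]))     # order_id shows up as both int and str
```

With the schema defined up front, the bad row is rejected the moment it shows up. With inference, the same bad row quietly becomes part of the "metadata," and you are left wondering whether `order_id` is a number or a string.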

When I buy a sword[6], I can use the Oakeshott type as a concrete way to describe and discuss the sword with its maker, or with my sword-loving friends. This is inherently valuable. But there are many swords that don’t fall neatly into this classification, which reduces that value.

When I buy wireless networking equipment, all I need to do is to look at the standards it implements. From this metadata I can immediately and authoritatively know what other networking equipment it will work with, and what functional characteristics it will implement.

Is your metadata swords, or is it WiFi? Would you rather have swords, or WiFi?

I really think about metadata a lot…

[1] I never metadata I didn’t like.

[2] If you’ve been watching Forged in Fire: Knife or Death, you’ve heard this name before. And if you know anything about swords and their classification, you cringed and cried out in pain when you heard this term misused by the hosts of the show.

[5] And if you’re using a data lake, you’ll be lucky to have this much.

[6] It will be an Angus Trim type XVII longsword, the younger twin of this one. It will be ready in January. I know this because I ordered it already. No, I haven’t told my wife yet, but she will understand.

I participated last night in The Sword Experience. This is a delightfully fun event organized as part of Microsoft’s annual “Giving” campaign, and I was very happy to donate to support a great cause and spend 3+ hours pretending to sword fight with actor Adrian Paul[1] and a bunch of like-minded Microsoft employees.

I can’t wait to do it again, but there were a few things that bothered me, especially when Mr. Paul “corrected” my footwork during some of the warmup exercises. I’ve spent much of the last four years studying various historical martial arts[2] and practicing them as a full-contact combat sport, and footwork, balance, and structure are the foundation of all of that. Damn it, man, I know how to do this the right way, and you’re trying to make me do it wrong!!

Sigh.

Of course I didn’t say this, and of course it would have completely missed the point if I had. The event was about stage combat and fight choreography, not about actual sword fighting. Even though the two things may look the same from a distance, they have fundamentally different goals.

And this got me thinking about software demos, and how they relate to building production software. These things look similar from a distance as well, but they also have fundamentally different goals.

In an actual sword fight, you want to make small, fast movements that can’t be predicted, and which make contact before their threat is recognized. You want the fight to be over immediately and decisively. In a stage fight, you want to make large, easily visible movements that are clearly expressive of threat, but which are not actually presenting one. You want the fight to last for a long time, and to be interesting to observe.

When building production software, you want to make a solution that is secure, that performs well, and that is easy to maintain and extend. The structure of your solution, and the processes used to deploy and support it, reflect these goals. Typically you do not optimize production software around its ease of understanding, and instead invest in training new team members over time.

When building a software demo, you want to make a solution that is easy to understand, and that communicates and reinforces the concepts and information that are the foundation of the demo. The structure of the demo is simplified to eliminate any details that do not directly support the demo’s goals, even at the expense of fundamental characteristics that would be required in any production system that uses the concepts and technologies in the demo.

A demo tells a story that reflects the reality of a production system, but deliberately glosses over the complicated and messy bits – just like a choreographed fight reflects the reality of an actual fight, minus all the violence and consequences.

You can learn from stage combat, and you can learn from demos. Just don’t mistake them for the real thing.[3] [4]

And now it’s time for me to go cut something with a real sword, just to make sure the things I practiced last night don’t stick around…

Isn’t that a sight just overflowing with promise? Like an empty Visual Studio solution, where anything is possible…

[4] In both situations, don’t be “that guy.” Don’t be the guy who complains about how a demo isn’t “real world” because it isn’t production ready. And don’t be the guy who complains that the stage combat moves you’re being shown aren’t martially sound. Seriously.