Tag: trust

The components in which you have constant trust relationships are “islands in the stream”.

I wrote a fairly complex post a few months ago called “Zero-trust”: my love/hate relationship, in which I discussed in some detail what “zero-trust” networks are, and why I’m not convinced. The key point turned out to be that I’m not happy about the way the word “zero” is being thrown around here, as I think what’s really going on is explicit trust.

Since then, there hasn’t been a widespread movement away from using the term, to my vast lack of surprise. In fact, I’ve noticed the term “zero-trust” being used in a different context: in p2p (peer-to-peer) and Web 3.0 discussions. The idea is that there are some components of the ecosystem that we don’t need to trust: they’re “just there”, doing what they were designed to do, and are basically neutral in terms of the rest of the actors in the network. Now, I like the idea that there are neutral components in the ecosystem: I think it’s a really important distinction to draw from other parts of the system. What I’m not happy about is the suggestion that we have zero trust in those components. For me, these are the components that we must trust the most of all the entities in the system. If they don’t do what we expect them to do, then everything falls apart pretty quickly. I think the same argument probably applies to “zero-trust” networking, too.

I started thinking quite hard about this, and I think I understand where the confusion arises. I’ve spent a lot of time over nearly twenty years thinking about trust, and what it means. I described my definition of trust in another post, “What is trust?” (which goes into quite a lot of detail, and may be worth reading for a deeper understanding of what I’m going on about here):

“Trust is the assurance that one entity holds that another will perform particular actions according to a specific expectation.”

For the purposes of this discussion, it’s the words “will perform particular actions according to a specific expectation” that are key here. This sounds to me like exactly what is being described in the requirement above that components are “doing what they’re designed to do”. It is this trust in their correct functioning which is a key foundation in the systems being described. As someone with a background in security, I always (try to) have these sorts of properties in mind when I consider a system: they are, as above, typically foundational.
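To make the definition concrete, here’s a minimal sketch of trust as a directional, scoped relationship between entities, actions and expectations. This is entirely my own framing (the names are invented for illustration), not a formalism from the original post:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrustRelationship:
    """One entity's assurance that another will perform
    particular actions according to a specific expectation."""
    truster: str      # the entity holding the assurance
    trustee: str      # the entity expected to act
    actions: tuple    # the particular actions covered
    expectation: str  # what "correct" performance looks like

# Trust is directional and scoped: Alice trusting a router to forward
# packets says nothing about the router trusting Alice for anything,
# nor about Alice trusting it for any other action.
t = TrustRelationship(
    truster="Alice",
    trustee="edge router",
    actions=("forward packets",),
    expectation="unmodified, to the correct next hop",
)
print(t.truster, "->", t.trustee, "for", t.actions)
```

The point of making the fields explicit is that “zero trust” in a neutral component would mean an empty `actions` tuple, which is clearly not what p2p systems rely on.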

What I think most people are interested in, however – because it’s a visible and core property of many p2p systems – is the building, maintaining and decay of trust between components. In this equation, the components have zero change in trust unless there’s a failure in the system (which, being a non-standard state, is not a property that is top-of-mind). If you’re interested in a p2p world where you need constantly to be evaluating and re-evaluating the level of trust you have in other actors, then the components in which you have (hopefully) constant trust relationships are “islands in the stream”. If they can truly be considered neutral in terms of their trust – they are neither able to be considered “friendly” nor “malevolent” as they are neither allied to nor can be suborned by any of the actors – then their static nature is uninteresting in terms of the standard operation of the system which you are building.

This does not mean that they are uninteresting or unimportant, however. Their correct creation and maintenance are vital to the system itself. It’s for this reason that I’m unhappy about the phrase “zero-trust”, as it seems to suggest that these components are not worthy of our attention. As a security bod, I reckon they’re among the most fascinating parts of any system (partly because, if I were a Bad Person[tm], they would be my first point of attack). So you can be sure that I’m going to be keeping an eye out for these sorts of components, and trying to raise people’s awareness of their importance. You can trust me.

Last week, Bloomberg published a story detailing how Chinese state actors had allegedly forced employees of Supermicro (or companies subcontracting to them) to insert a small chip – the silicon in the title – into motherboards destined for Apple and Amazon. The article talked about how an investigation into these boards had uncovered this chip and the steps that Apple, Amazon and others had taken. The story was vigorously denied by Supermicro, Apple and Amazon, but that didn’t stop Supermicro’s stock price from tumbling by over 50%.

I have heard strong views expressed by people with expertise in the topic on both sides of the argument: that it probably didn’t happen, and that it probably did. One side argues that the denials by Apple and Amazon, for instance, might have been impacted by legal “gagging orders” from the US government. An opposing argument suggests that the Bloomberg reporters might have confused this story with a similar one that occurred a few months ago. Whether this particular story is correct in every detail, or a fabrication – intentional or unintentional – is not what I’m interested in at this point. What I’m interested in is the clear message: it could have happened, and it could be happening now.

I’ve written before about State Actors, and whether you should worry about them. There’s another question which this story brings up, which is possibly even more germane: what can you do about it if you are worried about them? This breaks down further into two questions:

how can I tell if my systems have been compromised?

what can I do if I discover that they have?

The first of these is easily enough to keep us occupied for now[1], so let’s spend some time on that. Let’s first define six types of compromise, think about how they might be carried out, and then consider the questions above for each:

supply-chain hardware compromise;

supply-chain firmware compromise;

supply-chain software compromise;

post-provisioning hardware compromise;

post-provisioning firmware compromise;

post-provisioning software compromise.

There isn’t space in this article to go into detail on each of these types of attack, so instead I’ll provide an overview of each[2].

Terms

Supply-chain – all of the steps up to when you start actually running a system. From manufacture through installation, including vendors of all hardware components and all software, OEMs, integrators and even shipping firms that have physical access to any pieces of the system. For all supply-chain compromises, the key question is the extent to which you, the owner of a system, can trust every single member of the supply chain[3].

Post-provisioning – any point after which you have installed the hardware, put all of the software you want on it, and started running it: the time during which you might consider the system “under your control”.

Hardware – the physical components of a system.

Software – software that you have installed on the system and over which you have some control: typically the Operating System and application software. The amount of control depends on factors such as whether you use proprietary or open source software, and how much of it is produced, compiled or checked by you.

Firmware – special software that controls how the hardware interacts with the standard software on the machine, the hardware that comprises the system, and external systems. It is typically provided by hardware vendors and its operation is opaque to owners and operators of the system.

Compromise types

See the table at the bottom of this article for a short summary of the points below.

Supply-chain hardware – there are multiple opportunities in the supply chain to compromise hardware, but the harder a compromise is to detect, the more difficult it is to perform. The attack described in the Bloomberg story would be extremely difficult to detect, but the addition of a keylogger to a keyboard just before delivery (for instance) would be correspondingly simpler.

Supply-chain firmware – of all the options, this has the best return on investment for an attacker. Assuming good access to an appropriate part of the supply chain, inserting firmware that (for instance) impacts network performance or leaks data over a wifi connection is relatively simple. The difficulty in detection comes from the fact that although it is possible for the owner of the system to check that the firmware is what they think it is, that measurement confirms only that the firmware matches what the vendor says was supplied. So the “medium” rating relates only to firmware implanted by members of the supply chain who did not source the original firmware: otherwise, it’s “high”.
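To illustrate the limit of that measurement: checking firmware typically amounts to comparing a hash of what you received against a digest published by the vendor. A minimal sketch in Python (the file path and reference digest here are hypothetical, and a real system would use signed measurements via a TPM or similar):

```python
import hashlib

def measure_firmware(path: str) -> str:
    """Return the SHA-256 digest of a firmware image on disk."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical vendor-published digest. Matching it proves only that
# you hold the image the vendor *claims* to have shipped - it says
# nothing about implants introduced before that claim was made.
VENDOR_DIGEST = "..."  # placeholder: take from the vendor's site

def verify(path: str) -> bool:
    return measure_firmware(path) == VENDOR_DIGEST
```

This is why the rating above splits: the check catches tampering downstream of the vendor, but not a compromise at (or before) the original firmware source.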

Supply-chain software – by this, I mean software that comes installed on a system when it is delivered. Some organisations will insist on “clean” systems being delivered to them[4], and will install everything from the Operating System upwards themselves. This means that they basically now have to trust their Operating System vendor[5], which is maybe better than trusting other members of the supply chain to have installed the software correctly. I’d say that it’s not too simple to mess with this in the supply chain, if only because checking isn’t too hard for the legitimate members of the chain.

Post-provisioning hardware – this is where somebody with physical access to your hardware – after it’s been set up and is running – inserts or attaches hardware to it. I nearly gave this a “high” rating for difficulty below, assuming that we’re talking about servers rather than laptops or desktop systems, as one would hope that your servers are well-protected. However, the ease with which attackers have shown they can typically gain physical access to systems, using techniques like social engineering, means that I’ve downgraded it to “medium”. Detection, on the other hand, should be fairly simple given sufficient resources (hence the “medium” rating), and although I don’t believe anybody who says that a system is “tamper-proof”, tamper-evidence is a much simpler property to achieve.

Post-provisioning firmware – when you patch your Operating System, it will often also patch firmware on the rest of your system. This is generally a good thing to do, as patches may provide security, resilience or performance improvements, but you’re stuck with the same problem as with supply-chain firmware that you need to trust the vendor: in fact, you need to trust both your Operating System vendor and their relationship with the firmware vendor.

Post-provisioning software – is it easy to compromise systems via their Operating System and/or application software? Yes: this we know. Luckily – though depending on the sophistication of the attack – there are generally good tools and mechanisms for detecting such compromises, including behavioural monitoring.
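As a toy illustration of the behavioural-monitoring approach (the metric and thresholds here are invented for the example): establish a baseline for something you expect to be stable, then flag observations that deviate too far from it.

```python
from statistics import mean, stdev

def find_anomalies(baseline, observed, threshold=3.0):
    """Flag observations more than `threshold` standard deviations
    from the baseline mean - a crude behavioural monitor."""
    mu, sigma = mean(baseline), stdev(baseline)
    return [x for x in observed if abs(x - mu) > threshold * sigma]

# Baseline: outbound connections per minute during normal operation.
baseline = [10, 12, 11, 9, 10, 13, 11, 10, 12, 11]
# A compromised process suddenly starts exfiltrating data...
observed = [11, 10, 250, 12]
print(find_anomalies(baseline, observed))  # the 250 stands out
```

Real tools are far more sophisticated than this, of course, but the principle – model the expected behaviour, alert on deviation – is the same.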

Table

Compromise type            | Attacker difficulty | Detection difficulty
Supply-chain hardware      | High                | High
Supply-chain firmware      | Low                 | Medium
Supply-chain software      | Medium              | Medium
Post-provisioning hardware | Medium              | Medium
Post-provisioning firmware | Medium              | Medium
Post-provisioning software | Low                 | Low

Conclusion

What are your chances of spotting a compromise on your system? I would argue that they are generally pretty much in line with the difficulty of performing the attack in the first place: with the glaring exception of supply-chain firmware. We’ve seen attacks of this type, and they’re very difficult to detect. The good news is that there is some good work going on to help detection of these types of attacks, particularly in the world of Linux[6] and open source. In the meantime, I would argue our best forms of defence are currently:

for supply-chain: build close relationships, use known and trusted suppliers. You may want to restrict as much as possible of your supply chain to “friendly” regimes if you’re worried about State Actor attacks, but this is very hard in the global economy.

for post-provisioning: lock down your systems as much as possible – both physically and logically – and use behavioural monitoring to try to detect anomalies in what you expect them to be doing.

1 – I’ll try to write something on this other topic in a different article.

2 – depending on interest, I’ll also consider a series of articles to go into more detail on each.

3 – how certain are you, for instance, that your delivery company won’t give your own government’s security services access to the boxes containing your equipment before they deliver them to you?

4 – though see above: what about the firmware?

5 – though you can always compile your own Operating System if you use open source software[6].

The problem is not the autonomy. The problem isn’t even particularly with the intelligence…

Autonomous, intelligent agents offer some great opportunities for our digital lives*. There, look, I said it. They will book meetings for us, negotiate cheap holidays, order our children’s complete school outfit for the beginning of term, and let us know when it’s time to go to the nurse for our check-up. Our business lives, our personal lives, our family relationships – they’ll all be revolutionised by autonomous agents. Autonomous agents will learn our preferences, have access to our diaries, pay for items, be able to send messages to our friends.

This is all fantastic, and I’m very excited about it. The problem is that I’ve been excited about it for nearly 20 years, when I was involved in a project around autonomous agents in Java. It was very neat then, and it’s still very neat now***.

Of course, technology has moved on. Some of the underlying capabilities are much more advanced now than then. General availability of APIs, consistency of data formats, better Machine Learning (or Artificial Intelligence, if you must), less computationally expensive cryptography, and the rise of blockchains and distributed ledgers: they all bring the ability for us to build autonomous agents closer than ever before. We talked about disintermediation back in the day, and that looked plausible. We really can build scalable marketplaces now in ways which just weren’t as feasible two decades ago.

The problem, though, isn’t the technology. It was never the technology. We could have made the technology work 20 years ago, even if it wasn’t as fast, secure or wide-ranging as it could be today. It isn’t even vested interests from the large platform players, who arguably own much of this space at the moment – though these interests are much more consolidated than they were when I was first looking at this issue.

The problem is not the autonomy. The problem isn’t even particularly with the intelligence: you can program as much or as little in as you want, or as the technology allows. The problem is with the agency.

How much of my life do I want to hand over to what’s basically a ‘bot? Ignore***** the fact that these things will get hacked******, and assume we’re talking about normal, intended usage. What does “agency” mean? It means acting for someone: being their agent – think of what actors’ agents do, for example. When I engage a lawyer or a builder or an accountant to do something for me, or when an actor employs an agent for that matter, we’re very clear about what they’ll be doing. This is to protect both me and them from unintended consequences. There’s a huge legal corpus around defining, in different fields, exactly the scope of work to be carried out by a person or a company who is acting as an agent. There are contracts, and agreed restitutions – basically punishments – for when things go wrong. Say that an accountant buys 500 shares in a bank, and then I turn round and say that she never had the authority to do so: if we’ve set up the relationship correctly, it should be entirely clear whether or not she did, and whose responsibility it is to deal with any fall-out from that purchase.

Now think about that in terms of autonomous, intelligent agents. Write me that contract, and make it equivalent in software and the legal system. Tell me what happens when things go wrong with the software. Show me how to prove that I didn’t tell the agent to buy those shares. Explain to me where the restitution lies.

And these are arguably the simple problems. How do I rebuild the business reputation that I’ve built up over the past 15 years when my agent posts a tweet about how I use a competitor’s products, when I’m just trialling them for interest? How does an agent know not to let my wife see the diary entry for my meeting with that divorce lawyer********? What aspects of my browsing profile are appropriate for suggesting – or even buying – online products or services with my personal or business credit card*********? And there’s the classic “buying flowers for the mistress and having them sent to the wife” problem**********.

I don’t think we have an answer to these questions: not even close. You know that virtual admin assistant we’ve been promised in sci-fi movies for decades now: the one with the futuristic haircut who appears as a hologram outside our office? Holograms – nearly. Technology behind it – pretty much. Trust, reputation and agency? Nowhere near.

*I hate this word: “digital”. Well, not really, but it’s used far too much as a shorthand for “newest technology”**.

***this is one of those words that my kids hate me using. There are two types of word that come into this category: old words and new words. Either I’m showing how old I am, or I’m trying to be hip****, which is arguably worse. I can’t win.

****yeah, they don’t say hip. That’s one of the “old person words”.

*****for now, at least. Let’s not forget it.

******_everything_ gets hacked*******.

*******I could say “cracked”, but some of it won’t be malicious, and hacking might be positive.

********I’m not. This is an example.

*********this isn’t even about “dodgy” things I might have been browsing on home time. I may have been browsing for analyst services, with the intent to buy a subscription: how sure am I that the agent won’t decide to charge these to my personal credit card when it knows that I perform other “business-like” actions like pay for business-related books myself sometimes?

… what’s the fun in having an Internet if you can’t, well, “net” on it?

Sometimes – and I hope this doesn’t come as too much of a surprise to my readers – sometimes, there are bad people, and they do bad things with computers. These bad things are often about stopping the good things that computers are supposed to be doing* from happening properly. This is generally considered not to be what you want to happen**.

For this reason, when we architect and design systems, we often try to enforce isolation between components. I’ve had a couple of very interesting discussions over the past week about how to isolate various processes from each other, using different types of isolation, so I thought it might be interesting to go through some of the different types of isolation that we see out there. For the record, I’m not an expert on all different types of system, so I’m going to talk some history****, and then I’m going to concentrate on Linux*****, because that’s what I know best.

In the beginning

In the beginning, computers didn’t talk to one another. It was relatively difficult, therefore, for the bad people to do their bad things unless they physically had access to the computers themselves, and even if they did the bad things, the repercussions weren’t very widespread because there was no easy way for them to spread to other computers. This was good.

Much of the conversation below will focus on how individual computers act as hosts for a variety of different processes, so I’m going to refer to individual computers as “hosts” for the purposes of this post. Isolation at this level – host isolation – is still arguably the strongest type available to us. We typically talk about “air-gapping”, where there is literally an air gap – no physical network connection – between one host and another, but we also mean no wireless connection either. You might think that this is irrelevant in the modern networking world, but there are classes of usage where it is still very useful, the most obvious being for Certificate Authorities, where the root certificate is so rarely accessed – and so sensitive – that there is good reason not to connect the host on which it is stored to any other computer, and to use other means, such as smart-cards, a printer, or good old pen and paper to transfer information from it.

And then…

And then came networks. These allow hosts to talk to each other. In fact, by dint of the Internet, pretty much any host can talk to any other host, given a gateway or two. So along came network isolation to try to stop that. Network isolation is basically trying to re-apply host isolation, after people messed it up by allowing hosts to talk to each other******.

Later, some smart alec came up with the idea of allowing multiple processes to be on the same host at the same time. The OS and kernel were trusted to keep these separate, but sometimes that wasn’t enough, so then virtualisation came along, to try to convince these different processes that they weren’t actually executing alongside anything else, but had their own environment to do their own thing. Sadly, the bad processes realised this wasn’t always true and found ways to get around this, so hardware virtualisation came along, where the actual chips running the hosts were recruited to try to convince the bad processes that they were all alone in the world. This should work, only a) people don’t always program the chips – or the software running on them – properly, and b) people decided that despite wanting to let these processes run as if they were on separate hosts, they also wanted them to be able to talk to processes which really were on other hosts. This meant that networking isolation needed to be applied not just at the host level, but at the virtual host level, as well*******.

A step backwards?

Now, in a move which may seem retrograde, it occurred to some people that although hardware virtualisation seemed like a great plan, it was also somewhat of a pain to administer, and introduced inefficiencies that they didn’t like: e.g. using up lots of RAM and lots of compute cycles. These were often the same people who were of the opinion that processes ought to be able to talk to each other – what’s the fun in having an Internet if you can’t, well, “net” on it? Now we, as security folks, realise how foolish this sounds – allowing processes to talk to each other just encourages the bad people, right? – but they won the day, and containers came along. Containers allow lots of processes to be run on a host in a lightweight way, and rely on kernel controls – mainly namespaces – to ensure isolation********. In fact, there’s more you can do: you can use techniques like system call trapping to intercept the things that processes are attempting and stop them if they look like the sort of things they shouldn’t be attempting*********.
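The namespaces containers rely on are pleasingly concrete: on a Linux host you can inspect the namespaces any process belongs to directly under /proc. A minimal sketch (Linux-only; the exact entries vary by kernel version):

```python
import os

# Each process's namespace membership is exposed as symlinks under
# /proc/<pid>/ns; two processes in the same namespace see the same
# inode there. Container runtimes build their isolation by giving
# processes fresh entries (new mnt, net, pid, ... namespaces).
ns = sorted(os.listdir("/proc/self/ns"))
print(ns)
```

Comparing the inodes behind these entries for two processes (e.g. one inside a container and one outside) shows exactly which views of the system they do and don’t share.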

And, of course, you can write frameworks at the application layer to try to control what the different components of an application system can do – that’s basically the highest layer, and you’re just layering applications on applications at this point.

Systems thinking

So here’s where I get to the chance to mention one of my favourite topics: systems. As I’ve said before, by “system” here I don’t mean an individual computer (hence my definition of host, above), but a set of components that work together. The thing about isolation is that it works best when applied to a system.

Let me explain. A system, at least as I’d define it for the purposes of this post, is a set of components that work together but don’t have knowledge of external pieces. Most importantly, they don’t have knowledge of the layers below them. Systems may impose isolation on applications at higher layers, because they provide abstractions which allow higher systems to ignore them, but by virtue of that, systems aren’t – or shouldn’t be – aware of the layers below them.

A simple description of the layers – and it doesn’t always hold, partly because networks are tricky things, and partly because there are various ways to assemble the stack – may look like this.

As I intimated above, this is a (gross) simplification, but the point holds that the basic rule is that you can enforce isolation upwards in the layers of the stack, but you can’t enforce it downwards. Lower layer isolation is therefore generally stronger than higher layer isolation. This shouldn’t come as a huge surprise to anyone who’s used to considering network stacks – the principle is the same – but it’s helpful to lay out and explain the principles from time to time, and the implications for when you’re designing and architecting.

Because if you are considering trust models and are defining trust domains, you need to be very, very careful about defining whether – and how – these domains spread across the layer boundaries. If you miss a boundary out when considering trust domains, you’ve almost certainly messed up, and need to start again. Trust domains are important in this sort of conversation because the boundaries between trust domains are typically where you want to be able to enforce and police isolation.

The conversations I’ve had recently basically ran into problems because what people really wanted to do was apply lower layer isolation from layers above which had no knowledge of the bottom layers, and no way to reach into the control plane for those layers. We had to remodel, and I think that we came up with some sensible approaches. It was as I was discussing these approaches that it occurred to me that it would have been a whole lot easier to discuss them if we’d started out with a discussion of layers: hence this blog post. I hope it’s useful.

*although they may well not be, because, as I’m pretty sure I’ve mentioned before on this blog, the people trying to make the computers do the good things quite often get it wrong.

**unless you’re one of the bad people. But I’m pretty sure they don’t read this blog, so we’re OK***.

***if you are a bad person, and you read this blog, would you please mind pretending, just for now, that you’re a good person? Thank you. It’ll help us all sleep much better in our beds.

****which I’m absolutely going to present in an order that suits me, and generally neglect to check properly. Tough.

*****s/Linux/GNU Linux/g; Natch.

******for some reason, this seemed like a good idea at the time.

*******for those of you who are paying attention, we’ve got to techniques like VXLAN and SR-IOV.

********kernel purists will try to convince you that there’s no mention of containers in the Linux kernel, and that they “don’t really exist” as a concept. Try downloading the kernel source and doing a search for “container” if you want some ammunition to counter such arguments.

Don’t increase the technical complexity of a process just because you’ve got a cool technology that you could throw at it.

I’m attending Open Source Summit 2017* this week in L.A., and went to an interesting “fireside chat” on blockchain moderated by Brian Behlendorf of Hyperledger, with Jairo*** of Wipro and Siva Kannan of Gem. It was a good discussion – fairly basic in terms of the technical side, and some discussion of identity in blockchain – but there was one particular part of the session that I found interesting and which I thought was worth some further thought. As in my previous post on this topic, I’m going to conflate blockchain with Distributed Ledger Technologies (DLTs) for simplicity.

Siva presented three questions to ask when considering whether a process is a good candidate for moving to the blockchain. There’s far too much bandwagon-jumping around blockchain: people assume that all processes should be blockchained. I was therefore very pleased to see this come up as a topic. I think it’s important to spend some time looking at when it makes sense to use blockchains, and when it doesn’t. To paraphrase Siva’s points:

is the process time-consuming?

is the process multi-partite, made up of multiple steps?

is there a trust problem within the process?

I liked these as a starting point, and I was glad that there was a good conversation around what a trust problem might mean. I’m not quite sure it went far enough, but there was time pressure, and it wasn’t the main thrust of the conversation. Let’s spend some time looking at why I think the points above are helpful as tests, and then I’m going to add another.

Is the process time-consuming?

The examples that were presented were two of the classic ones used when we’re talking about blockchain: inter-bank transfer reconciliation and healthcare payments. In both cases, there are multiple parties involved, and the time it takes for completion seems completely insane for those of us used to automated processes: in the order of days. This is largely because the processes are run by central authorities when, from the point of view of the process itself, the transactions are actually between specific parties, and don’t need to be executed by those authorities, as long as everybody trusts that the transactions have been performed fairly. More about the trust part below.

Is the process multi-partite?

If the process is simple, and requires a single step or transaction, there’s very little point in applying blockchain technologies to it. The general expectation for multi-partite processes is that they involve multiple parties, as well as multiple parts. If there are only a few steps in a transaction, or very few parties involved, then there are probably easier technological solutions for it. Don’t increase the technical complexity of a process just because you’ve got a cool technology that you can throw at it*****.

Is there a trust problem within the process?

Above, I used the phrase “as long as everybody trusts that the transactions have been performed fairly”******. There are three interesting words in this phrase*******: “everybody”, “trusts” and “fairly”. I’m going to go through them one by one:

everybody: this might imply full transparency of all transactions to all actors in the system, but we don’t need to assume that – that’s part of the point of permissioned blockchains. It may be that only the actors involved in the particular process can see the details, whereas all other actors are happy that they have been completed correctly. In fact, we don’t even need to assume that the actors involved can see all the details: secure multi-party computation means that only restricted amounts of information need to be exposed********.

trusts: I’ve posted on the topic of trust before, and this usage is a little less tight than I’d usually like. However, the main point is to ensure sufficient belief that the process meets expectations to be able to accept it.

fair: as anyone with children knows, this is a loaded word. In this context, I mean “according to the rules agreed by the parties involved – which may include parties not included in the transaction, such as a regulatory body – and encoded into the process”.

This point about encoding rules into a process is a really, really major one, to which I intend to return at a later date, but for now let’s assume (somewhat naively, admittedly) that this is doable and risk-free.

One more rule: is there benefit to all the stakeholders?

This was a rule that I suggested, and which caused some discussion. It seems to me that there are some processes where a move to a blockchain may benefit certain parties, but not others. For example, the amount of additional work required by a small supplier of parts to a large automotive manufacturer might be such that there’s no obvious benefit to the supplier, however much benefit is manifestly applicable to the manufacturer. At least one of the panellists was strongly of the view that there will always be benefit to all parties, but I’m not convinced that the impact of implementation will always outweigh such benefit.
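Pulling the four tests together, the decision might be sketched as a simple checklist. This framing is my own (not code or criteria from the talk itself), and real decisions will of course weigh the answers rather than treating them as booleans:

```python
def blockchain_candidate(time_consuming: bool,
                         multi_partite: bool,
                         trust_problem: bool,
                         benefits_all_stakeholders: bool) -> bool:
    """A process is a reasonable blockchain/DLT candidate only if
    it passes all four tests discussed above."""
    return all((time_consuming, multi_partite,
                trust_problem, benefits_all_stakeholders))

# Inter-bank reconciliation: slow, many parties, a trust problem,
# and (arguably) benefit to everyone involved.
print(blockchain_candidate(True, True, True, True))    # True
# A simple two-party, single-step payment: no.
print(blockchain_candidate(False, False, True, True))  # False
```

The strictness of `all()` is deliberate: a single “no” – particularly on stakeholder benefit – is, in my view, enough to send you looking for a simpler technology.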

Conclusion: blockchain is cool, but…

… it’s not a perfect fit for every process. Organisations – and collections of organisations – should always carefully consider how good a fit blockchain or DLT may be before jumping to a decision which may be more costly and less effective than they might expect from the hype.

*was “LinuxCon and ContainerCon”**.

**was “LinuxCon”.

***he has a surname, but I didn’t capture it in my notes****.

****yes, I write notes!

*****this is sometimes referred to as the “hammer problem” (if all you’ve got is a hammer, then everything looks like a nail)

******actually, I didn’t: I missed out the second “as”, and so had to correct it in the first appearance of the phrase.

*******in the context of this post. They’re all interesting in their own way, I know.

********this may sound like magic. Given my grasp of the mathematics involved, I have to admit that it might as well be*********.

… “explicit-trust networks” really is a much better way of describing what’s going on here.

A few weeks ago, I wrote a post called “What is trust?”, about how we need to be more precise about what we mean when we talk about trust in IT security. I’m sure it’s case of confirmation bias*, but since then I’ve been noticing more and more references to “zero-trust networks”. This both gladdens and annoys me, a set of conflicting emotions almost guaranteed to produce a new blog post.

Let’s start with the good things about the term. “Zero-trust networks” are an attempt to describe an architectural approach which addresses the disappearance of macro-perimeters within the network. In other words, people have realised that putting up a firewall or two between one network and another doesn’t have a huge amount of effect when traffic flows across an organisation – or between different organisations – are very complex and don’t just follow one or two easily defined – and easily defended – routes. This problem is exacerbated when the routes are not only multiple – but also virtual. I’m aware that all network traffic is virtual, of course, but in the old days**, even if you had multiple routing rules, ingress and egress of traffic all took place through a single physical box, which meant that this was a good place to put controls***.

These days (mythical as they were) have gone. Not only do we have SDN (Software-Defined Networking) moving packets around via different routes willy-nilly, but networks are overwhelmingly porous. Think about your “internal network”, and tell me that you don’t have desktops, laptops and mobile phones connected to it which have multiple links to other networks which don’t go through your corporate firewall. Even if they don’t******, when they leave your network and go home for the night, those laptops and mobile phones – and those USB drives that were connected to the desktop machines – are free to roam the hinterlands of the Internet******* and connect to pretty much any system they want.

And it’s not just end-point devices, but components of the infrastructure which are much more likely to have – and need – multiple connections to various other components, some of which may be on your network, and some of which may not. To confuse matters yet further, consider the “Rise of the Cloud”, which means that some of these components may start on “your” network, but may migrate – possibly in real time – to a completely different network. The rise of micro-services (see my recent post describing the basics of containers) further exacerbates the problem, as placement of components seems to become irrelevant, so you have an ever-growing (and, if you’re not careful, exponentially-growing) number of flows around the various components which comprise your application infrastructure.

What the idea of “zero-trust networks” says about this – and rightly – is that a classical, perimeter-based firewall approach becomes pretty much irrelevant in this context. There are so many flows, in so many directions, between so many components, which are so fluid, that there’s no way that you can place firewalls between all of them. Instead, it says, each component should be responsible for controlling the data that flows in and out of itself, and should assume that it has no trust for any other component with which it may be communicating.

I have no problem with the starting point for this – which is as far as some vendors and architects take it: all users should always be authenticated to any system, and authorised before they access any service provided by that system. In fact, I’m even more in favour of extending this principle to components on the network: it absolutely makes sense that a component should control access to its services with API controls. This way, we can build distributed systems made of micro-services or similar components which can be managed in ways which protect the data and services that they provide.
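As a sketch of what per-component API controls might look like, here is a hypothetical component that authenticates and authorises every single request itself, rather than relying on a network perimeter. The token and permission tables are illustrative inventions, not from any real framework.

```python
# Hypothetical sketch of a component enforcing authentication and
# authorisation on every API call. TOKENS and PERMISSIONS are
# illustrative stand-ins for a real identity/authz backend.

TOKENS = {"tok-alice": "alice", "tok-bob": "bob"}           # authn: token -> identity
PERMISSIONS = {"alice": {"read", "write"}, "bob": {"read"}}  # authz: identity -> actions

class AccessDenied(Exception):
    pass

def handle_request(token, action, payload):
    # Step 1: authenticate the caller.
    identity = TOKENS.get(token)
    if identity is None:
        raise AccessDenied("unauthenticated caller")
    # Step 2: authorise this specific action for this identity.
    if action not in PERMISSIONS.get(identity, set()):
        raise AccessDenied(f"{identity} may not {action}")
    # Only now do we perform the service.
    return f"{action} performed for {identity}"

print(handle_request("tok-alice", "write", {}))  # write performed for alice
```

The point is that the check happens at the component itself, on every call, so the component’s protection does not depend on where on the network the caller happens to sit.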

And that’s where the problem arises. Two words: “be managed”.

In order to make this work, there needs to be one or more policy-dictating components (let’s call them policy engines) from which other components can derive their policy for enforcing controls. The client components must have a level of trust in these policy engines so that they can decide what level of trust they should have in the other components with which they communicate.

This exposes a concomitant issue: these components are not, in fact, in charge of making the decisions about who they trust – which is how “zero-trust networks” are often defined. They may be in charge of enforcing these decisions, but not the policy with regards to the enforcement. It’s like a series of military camps: sentries may control who enters and exits (enforcement), but those sentries apply orders that they’ve been given (policies) in order to make those decisions.
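The sentry analogy can be made concrete with a small sketch of the split between policy decision and policy enforcement. All the class and rule names here are mine, for illustration: the essential shape is that the component never authors its own trust decisions, it only applies what the (trusted) policy engine hands it.

```python
# Sketch of the policy-engine / enforcement-point split. The engine is
# the one component everybody must trust; the others merely enforce.
# All names are illustrative.

class PolicyEngine:
    """Central source of policy (the 'orders' in the sentry analogy)."""
    def __init__(self, rules):
        self._rules = rules  # e.g. {"billing": {"inventory", "orders"}}

    def policy_for(self, component):
        return self._rules.get(component, set())

class Component:
    """Enforcement point: applies, but does not author, its policy."""
    def __init__(self, name, engine):
        self.name = name
        # Bootstrap step: this fetch is where trust in the engine is required.
        self._allowed_peers = engine.policy_for(name)

    def accept_connection(self, peer):
        # Pure enforcement -- no local judgement about whom to trust.
        return peer in self._allowed_peers

engine = PolicyEngine({"billing": {"inventory", "orders"}})
billing = Component("billing", engine)
print(billing.accept_connection("inventory"))      # True
print(billing.accept_connection("random-laptop"))  # False
```

Note that `billing` cannot function at all until it has trusted the engine enough to accept a policy from it, which is exactly the bootstrapping point made below.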

Here, then, is what I don’t like about “zero-trust networks” in a few nutshells:

although components may start from a position of little trust in other components, they move to a position of known trust rather than maintaining a level of “zero-trust”

components do not decide what other components to trust – they enforce policies that they have been given

components absolutely do have to trust some other components – the policy engines – or there’s no way to bootstrap the system, nor to enforce policies.

I know it’s not so snappy, but “explicit-trust networks” really is a much better way of describing what’s going on here. What I do prefer about this description is that it’s a great starting point to think about trust domains. I love trust domains, because they allow you to talk about how to describe shared policy between various components, and that’s what you really want to do in the sort of architecture that I’ve talked about above. Trust domains allow you to talk about issues such as how placement of components is often not irrelevant, about how you bootstrap your distributed systems, about how components are not, in the end, responsible for making decisions about how much they trust other components, or what they trust those other components to do.

So, it looks like I’m going to have to sit down soon and actually write about trust domains. I’ll keep you posted.

*one of my favourite cognitive failures

**the mythical days that my children believe in, where people had bouffant hairdos, the Internet could fit on a single Winchester disk, and Linus Torvalds still lived in Finland.

***of course, there was no such perfect time – all I should need to say to convince you is one word: “Joshua”****

Blockchains are big news at the moment. There are conferences, start-ups, exhibitions, open source projects (in fact, pretty much all of the blockchain stuff going on out there is open source – look at Ethereum, zcash and bitcoin as examples) – all we need now are hipster-run blockchain-themed cafés*. If you’re looking for an initial overview, you could do worse than the Wikipedia entry – that’s not the aim of this post.

Before we go much further, one useful thing to know about many blockchain projects is that they aren’t. Blockchains, that is. They are, more accurately, distributed ledgers****. For now, however, let’s roll blockchain and distributed ledger technologies together and assume we’re talking about the same thing: it’ll make it easier for now, and in most cases, the difference is immaterial for our discussions.

I’m not planning to go into the basics here, but we should briefly talk about the main link with crypto and blockchains, and that’s the blocks themselves. In order to build a block – a set of transactions to put into a blockchain – and then to link it into the blockchain, cryptographic hashes are used. This is the most obvious relationship that the various blockchains have with cryptography.
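To make the hash-linking idea concrete, here is a minimal sketch: each block records the hash of its predecessor, so altering any earlier block invalidates every hash that follows it. Real blockchains add a great deal more (Merkle trees, consensus, proof-of-work or other block-selection schemes), but the chaining itself is this simple.

```python
import hashlib
import json

# Minimal sketch of hash-linked blocks. Each block stores the hash of
# the previous block, so tampering anywhere breaks the chain from that
# point on. Real blockchains are much richer than this.

def block_hash(block):
    """Deterministic SHA-256 hash of a block's contents."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def make_block(transactions, previous_hash):
    return {"transactions": transactions, "previous_hash": previous_hash}

genesis = make_block(["coinbase"], "0" * 64)
block1 = make_block(["alice->bob: 5"], block_hash(genesis))
block2 = make_block(["bob->carol: 2"], block_hash(block1))

# Tampering with an earlier block breaks the link to its successor:
genesis["transactions"] = ["coinbase", "mallory->mallory: 999"]
print(block1["previous_hash"] == block_hash(genesis))  # False
```

Because each hash depends on everything in the block, including the previous hash, rewriting history means recomputing every subsequent block, which is exactly the property the consensus mechanisms discussed later are built on.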

There’s another, equally important one, however, which is about identity*****. Now, for many blockchain-based crypto-currencies, a major part of the point of using them at all is that identity isn’t, at one level, important. There are many actors in a crypto-currency who may be passing each other vanishingly small or eye-wateringly big amounts of money, and they don’t need to know who each other is in order to make transactions. To be more clear, the uniqueness of each actor absolutely is important – I want to be sure that I’m sending money to the entity who has just rendered me a service – but being able to tie that unique identity to a particular person IRL****** is not required. To use the technical term, such a system is pseudonymous. Now, if pseudonymity is a key part of the system, then protecting that property is likely to be important to its users. Crypto-currencies do this with various degrees of success. The lesson here is that you should do some serious reading and research if you’re planning to use a crypto-currency, and this property matters to you.
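A sketch of what pseudonymity looks like in practice: an actor’s handle is derived from key material, not from who they are IRL. A real crypto-currency would use an asymmetric keypair and signatures; here random bytes stand in for a public key purely for illustration, and the truncated-hash address scheme is a loose nod to how Bitcoin-style addresses work.

```python
import hashlib
import secrets

# Sketch of a pseudonymous identity: unique and stable, but not tied to
# a real-world person. Random bytes stand in for a public key here;
# a real system would use proper asymmetric cryptography.

def pseudonymous_address(public_key_bytes):
    # Hash the key material and truncate to form a stable handle.
    return hashlib.sha256(public_key_bytes).hexdigest()[:40]

key_a = secrets.token_bytes(32)  # stand-in for actor A's public key
key_b = secrets.token_bytes(32)  # stand-in for actor B's public key

# Stable per actor: I can be sure I'm paying the same entity again...
assert pseudonymous_address(key_a) == pseudonymous_address(key_a)
# ...and distinct between actors, yet nothing here names a real person.
assert pseudonymous_address(key_a) != pseudonymous_address(key_b)
```

This is the sense in which uniqueness matters while real-life identity does not: the address distinguishes actors perfectly well without binding to anyone.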

On the other hand, there are many blockchain/distributed ledger technologies where pseudonymity is not a required property, and may actually be unwanted. These are the types of system in which I am most generally interested from a professional point of view.

In particular, I’m interested in permissioned blockchains. Permissionless (or non-permissioned) blockchains are those where you don’t need permission from anyone in order to participate. You can see why pseudonymity and permissionless blockchains can fit well today: most (all?) crypto-currencies are permissionless. Permissioned blockchains are a different kettle of fish, however, and they’re the ones at which many businesses are looking at the moment. In these cases, you know the people or entities who are going to be participating – or, if you don’t know now, you’ll want to check on them and their identity before they join your blockchain (or distributed ledger). And here’s why blockchains are interesting in business*******. It’s not just that identity is interesting, though it is, because how you marry a particular entity to an identity and make sure that this binding is not spoofable over the lifetime of the system is difficult, difficult, lemon difficult******** – but there’s more to it than that.

What’s really interesting is that if you’re thinking about moving to a permissioned blockchain or distributed ledger with permissioned actors, then you’re going to have to spend some time thinking about trust. You’re unlikely to be using a proof-of-work system for making blocks – there’s little point in a permissioned system – so who decides what comprises a “valid” block, that the rest of the system should agree on? Well, you can rotate around some (or all) of the entities, or you can have a random choice, or you can elect a small number of über-trusted entities. Combinations of these schemes may also work. If these entities all exist within one trust domain, which you control, then fine, but what if they’re distributors, or customers, or partners, or other banks, or manufacturers, or semi-autonomous drones, or vehicles in a commercial fleet? You really need to ensure that the trust relationships that you’re encoding into your implementation/deployment truly reflect the legal and IRL trust relationships that you have with the entities which are being represented in your system.
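The three block-proposer schemes mentioned above (rotation, random choice, a small über-trusted set) can be sketched as follows. All the entity names and function names are illustrative; a real permissioned ledger would wrap these in a proper consensus protocol.

```python
import itertools
import random

# Sketch of proposer-selection schemes for a permissioned ledger:
# rotate through entities, pick one at random, or restrict the choice
# to a small set of highly trusted entities. Illustrative only.

entities = ["manufacturer", "supplier-a", "supplier-b", "regulator"]

def rotating_validator(entities):
    """Round-robin: each entity takes a turn proposing the next block."""
    return itertools.cycle(entities)

def random_validator(entities, rng=random):
    """Any participant may be chosen for the next block."""
    return rng.choice(entities)

def trusted_subset_validator(entities, trusted, rng=random):
    """Only the uber-trusted entities may propose blocks."""
    return rng.choice([e for e in entities if e in trusted])

rotation = rotating_validator(entities)
print([next(rotation) for _ in range(5)])
# ['manufacturer', 'supplier-a', 'supplier-b', 'regulator', 'manufacturer']
print(trusted_subset_validator(entities, {"regulator"}))  # regulator
```

Notice that each scheme encodes a different trust relationship: rotation assumes all participants are equally trustworthy, while the trusted-subset scheme bakes a hierarchy into the system, which is precisely the kind of decision that is hard to unwind once deployed.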

And the problem is that once you’ve deployed that system, it’s likely to be very difficult to backtrack, adjust or reset the trust relationships that you’ve designed in. And if you don’t think about the questions I noted above about long-term bindings of identity, you’re going to have some serious problems when, for instance:

an entity is spoofed;

an entity goes bankrupt;

an entity is acquired by another entity (buy-outs, acquisitions, mergers, etc.);

an entity moves into a different jurisdiction;

legislation or regulation changes.

These are all issues that are well catered for within existing legal frameworks (with the possible exception of the first), but which are more difficult to manage within the sorts of systems with which we are generally concerned in this blog.

Please don’t confuse the issues noted above with the questions around how to map legal agreements to the so-called “smart contracts” in blockchain/distributed ledger systems. That’s another thorny (and, to be honest, not unconnected) issue, but this one goes right to the heart of what a system is, and it’s the reason that people need to think very hard about what they’re really trying to achieve when they adopt our latest buzz-word technology. Yet again, we need to understand how systems and the business work together, and be honest about the fit.

*if you come across one of these, please let me know. Put a picture in a comment or something.**

**even better – start one yourself. Make sure I get an invitation to the opening***.

***and free everything.

****there have been online spats about this. I’m not joining in.

*****there are others, but I’ll save those for another day.

******IRL == “In Real Life”. I’m so old-skool.

*******for me. If you’ve got this far into the article, I’m hoping there’s an evens chance that the same will go for you, too.

********I’ll leave this as an exercise for the reader. Watch it, though, and the TV series on which it’s based. Unless you don’t like swearing, in which case don’t watch either.