
Peter Bourgon on CRDTs and State at the Edge

Today on The InfoQ Podcast, Wes Reisz talks with Peter Bourgon. Peter is a distributed systems engineer working at Fastly, whose area of interest is coordination-free replicated systems. The two engineers talk about Conflict-Free Replicated Data Types (CRDTs), specifically in the context of edge compute. Topics covered on the podcast include edge compute, CRDTs, the CAP theorem, and the challenges of building distributed systems.

Key Takeaways

An easy way to think of a CRDT is as an associative, commutative, and idempotent data structure plus the operations needed to use it.

The edge is an overloaded term that people tend to define based on where they sit on a spectrum between the customer and the data center. Fastly’s edge is away from the data center, but not all the way out to the telephone pole or the handset.

Raft and gossip protocols are two alternatives to a coordination-free replication system like CRDTs.

To get the properties of a CRDT and have useful data types, you have to pay a cost in size and often bytes on the wire. These are challenges that continue to need solutions.

Modern distributed systems and data structures like CRDTs require you to start thinking about state in the system itself. It’s not unusual for a system today to give you back multiple results that the application will have to handle or merge.
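The takeaways above describe a CRDT as an associative, commutative, and idempotent data structure plus its operations. As a minimal sketch of that idea (an illustrative example, not Fastly's implementation), a grow-only counter in Go shows how a merge function with those properties lets any replica combine states in any order:

```go
package main

import "fmt"

// GCounter is a grow-only counter CRDT: each replica increments
// only its own slot, and merge takes the element-wise maximum.
type GCounter map[string]int

func (c GCounter) Increment(replica string) { c[replica]++ }

// Merge is commutative, associative, and idempotent: merging the
// same state twice, or merging in any order, yields the same result.
func (c GCounter) Merge(other GCounter) GCounter {
	out := GCounter{}
	for r, n := range c {
		out[r] = n
	}
	for r, n := range other {
		if n > out[r] {
			out[r] = n
		}
	}
	return out
}

// Value sums every replica's slot to give the logical count.
func (c GCounter) Value() int {
	total := 0
	for _, n := range c {
		total += n
	}
	return total
}

func main() {
	a := GCounter{"a": 2}
	b := GCounter{"b": 3}
	fmt.Println(a.Merge(b).Value())          // 5
	fmt.Println(b.Merge(a).Value())          // 5 (commutative)
	fmt.Println(a.Merge(b).Merge(b).Value()) // 5 (idempotent)
}
```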

09:15 The important property is that you have deterministic mergeability.

09:20 An example I like to use: if you have a set of operations on a key, the important thing is that you can put the operations in a bag, and whatever order they come out in, you get to the same end state.

14:10 If we want systems with these nice properties - low latency, no conflicts to deal with - we have to make the applications more complicated.

You've got all these POPs distributed, and you're using these CRDTs to build a distributed key value store?

14:50 When people design systems, they are typically designing for a single data centre.

15:00 You have pretty good technology, connections are fast and reliable - and everything is physically close, so latency is pretty low.

15:20 The main constraint in edge system design is that you can't ignore the speed of light any more.

15:25 The logical system that you're trying to build occupies a physical space that cannot cheat physical laws.

15:55 The speed of light is the core constraint in this space, and the faults that come with it can't be ignored.

16:05 CRDTs are the things that allow us to work productively at this scale.

16:10 We can't fall back on circuit breaker patterns, retry loops, or leader elections when we're not sure the messages are even going to get there within a day.

16:20 We're talking about Earth bound systems at the moment, but when we go interplanetary all this becomes more important.

With Raft, you can run into problems when there is latency caused by distance between POPs. Is that correct?

17:10 You could say that; you can tweak the timeouts in a Raft installation, and in theory you could make it work at a large scale, but you have to bear in mind every operation has to go through a quorum.

17:30 It has to go to the leader, go through a round, append to the log, and you have to get the response back to the client.

17:35 All of those things are subject to timeouts; if you have timeouts sufficient for operation at a global scale, then your client will be waiting seconds for each response.
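That write path can be put into rough numbers (the 150 ms one-way inter-continental latency below is an assumed figure for illustration, not from the podcast): every Raft write pays a quorum round trip on top of the client round trip.

```go
package main

import "fmt"

// writeLatencyMS is a back-of-the-envelope model of one Raft write
// as seen by a remote client: client -> leader, a leader -> quorum
// majority -> leader round, then leader -> client.
func writeLatencyMS(oneWayMS float64) float64 {
	clientToLeader := oneWayMS  // request reaches the leader
	quorumRound := 2 * oneWayMS // append replicated to a majority
	leaderToClient := oneWayMS  // response returns to the client
	return clientToLeader + quorumRound + leaderToClient
}

func main() {
	// Assume ~150 ms one-way latency between continents.
	fmt.Printf("~%.0f ms per write\n", writeLatencyMS(150)) // ~600 ms
}
```

With retries and conservatively tuned timeouts on top, that is how per-operation latency climbs toward the seconds the podcast mentions.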

Can you recap the CAP theorem?

18:00 Broadly speaking, it states that if you are designing a distributed system, you can choose either consistency or availability in the face of partitions.

18:05 A partition is a network fault, and they happen all the time - they are inevitable.

18:15 Raft's choice in this situation is consistency, and there's a whole set of protocols like Raft and Paxos that you can use to do this.

18:25 Because of those latency requirements, they don't usually work outside of a single data centre.

19:10 It depends on the type of system you're using, but in a key-value store every key you write will incur that latency - and even reads have to pay it if you want consistency.

19:40 Now we're into the formal language, but there are whole classes of consistency models, like serialisability and linearisability.

19:50 Raft gives you one of the stricter ones; AP systems instead choose availability in the face of partitions.

20:00 There are also eventually consistent systems, which is what CRDTs give you, so you can have things that are out-of-sync; you can do local operations quickly and sync later.

20:20 CRDTs make sure that overall process is still formally consistent and correct.
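One way to picture "local operations quickly, sync later" is a last-writer-wins register (an illustrative sketch, not from the podcast; real CRDT registers need more careful timestamps): each replica accepts writes immediately, and replicas converge whenever they merge.

```go
package main

import "fmt"

// LWWRegister converges by keeping the value with the highest
// timestamp; replicas accept writes locally and merge later,
// always reaching the same final value.
type LWWRegister struct {
	Value string
	TS    int64
}

// Set applies a write only if it is newer than the current state.
func (r *LWWRegister) Set(v string, ts int64) {
	if ts > r.TS {
		r.Value, r.TS = v, ts
	}
}

// Merge folds another replica's state in; it is just Set with
// the other replica's (value, timestamp) pair.
func (r *LWWRegister) Merge(other LWWRegister) {
	r.Set(other.Value, other.TS)
}

func main() {
	// Two replicas take writes independently, out of sync...
	a := &LWWRegister{}
	b := &LWWRegister{}
	a.Set("hello", 1)
	b.Set("world", 2)

	// ...then sync later and agree on the newest write.
	a.Merge(*b)
	fmt.Println(a.Value) // world
}
```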

20:35 Cassandra is an AP system, though I'm not sure about all the ways it communicates.

20:40 I guess there are ways you can configure it and have Cassandra clusters that are very large - but I've not used Cassandra in anger.

Why not Gossip for a problem at the edge?

21:00 It gets into a slightly different aspect of system design, which is data ownership or location.

21:15 In any sort of edge compute system, the whole point is that you have physical devices that are spread out across a large physical space.

21:30 What you want to do is establish a single logical state space, like a distributed key value store.

21:45 The other constraint is that these devices are often not huge, so you can't keep the entire data set on every device - you have to have a heuristic.

22:00 For example, you might store it all in a central space, or distribute it out based on use - it's a physical constraint on these kinds of systems.

22:10 If you set a key to a value in some place, it has to live on the device in that place to start, but where does it go eventually?

22:25 What's the protocol for getting that data if another device requests it, and what is the information hierarchy?

35:55 You can learn a lot from Kyle Kingsbury who writes the "Call Me Maybe" Jepsen testing series.

36:30 I wish there was a book, but the area is too nascent and moving too quickly to point at any specific book.

About QCon

QCon is a practitioner-driven conference designed for technical team leads, architects, and project managers who influence software innovation in their teams. QCon takes place 8 times per year in London, New York, Munich, San Francisco, São Paulo, Beijing, Guangzhou & Shanghai. QCon New York is at its 9th edition and will take place Jun 15-19, 2020. 140+ expert practitioner speakers, 1,000+ attendees, and 18 tracks will cover topics driving the evolution of software development today. Visit qconnewyork.com to get more details.

More about our podcasts

You can keep up-to-date with the podcasts via our RSS Feed, and they are available via
SoundCloud,
Apple Podcasts,
Spotify,
Overcast
and Google Podcasts.
From this page you also have access to our recorded show notes. They all have clickable links that will take you directly to that part of the audio.