Thursday, December 4, 2014

Supervisor trees

In my past few posts, I have focused on fault-tolerant distributed systems as implemented through cluster managers. Apache Mesos, Kubernetes, and many others all attempt to support fault tolerance through auto-restarting and other self-healing techniques at the cluster manager level. As such, they rightly claim to be the new operating systems of the cloud. It turns out, however, that cluster managers do not have a monopoly on fault tolerance features. Long before Mesos, Kubernetes, and possibly even the University of Wisconsin's Condor (a distributed processing system with considerably more pedigree), Erlang had supervisor trees and supervisor behaviors (a kind of language interface) in the runtime, and so was supporting large, highly fault-tolerant distributed systems decades ago.

A supervisor tree consists of supervisor and worker processes, where supervisors may themselves have supervisors (i.e., a supervisor can oversee both subordinate supervisors and workers).
Erlang supervisors support three process restart strategies:

one-for-one: when a child process terminates, only that process is restarted

one-for-all: when a child process terminates, all its sibling processes are terminated, and then all the children (including the one that terminated) are restarted

rest-for-one: when a child process terminates, all younger siblings (i.e., sibling processes that started after) are terminated and the original child process and its younger siblings are restarted
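
To make the three strategies concrete, here is a minimal sketch in Python (not Erlang, and not how the Erlang runtime actually implements supervision). It only models the decision a supervisor makes: given its children in start order and the one that terminated, which children get restarted. The function name `children_to_restart` and the child names `db`, `cache`, and `web` are hypothetical, chosen purely for illustration.

```python
from typing import List


def children_to_restart(children: List[str], failed: str, strategy: str) -> List[str]:
    """Return the children a supervisor would restart, oldest first.

    `children` must be listed in start order (oldest first).
    `strategy` uses the names from this post: "one-for-one",
    "one-for-all", or "rest-for-one".
    """
    if failed not in children:
        raise ValueError(f"unknown child: {failed}")

    if strategy == "one-for-one":
        # Only the terminated child is restarted.
        return [failed]
    if strategy == "one-for-all":
        # Every child is terminated and restarted.
        return list(children)
    if strategy == "rest-for-one":
        # The terminated child and everything started after it.
        return children[children.index(failed):]
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    kids = ["db", "cache", "web"]  # hypothetical children, in start order
    for strategy in ("one-for-one", "one-for-all", "rest-for-one"):
        print(strategy, children_to_restart(kids, "cache", strategy))
```

With `cache` as the terminated child, one-for-one restarts only `cache`, one-for-all restarts `db`, `cache`, and `web`, and rest-for-one restarts `cache` and `web` while leaving the older `db` alone. Rest-for-one is presumably most useful when later children depend on earlier ones but not the other way around.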

Supervised processes (children) may also be specified as one of three kinds:

