In last week’s post I hinted at the development of a feature that could help Minds.com evolve into a decentralised social network and bring P2P into the mainstream, thereby addressing the growing privacy and censorship concerns associated with a centralised social network. This feature is based on an application layer protocol known as ‘DAT’. There are reasons to believe it’s likely to succeed where previous ideas failed: since DAT works entirely at the application layer, and is implemented using Node.js, there’s very little effort or learning curve involved for developers and users of DAT applications. Web applications can be extended to support it, if the demand is there, using already published libraries that are extensively documented.
For those who aren’t developers, there is a working browser that anyone, without technical skills or knowledge, can use to browse and publish sites on the DAT Web.

What is DAT?

DAT started life in the scientific community, which needed a more effective method of distributing, tracking and versioning data. On the conventional Web, data objects are moved, Web pages are deleted and domains expire – this is referred to as ‘content drift’. We’ve all come across an example of this in the form of ‘dead links’: when a hyperlink is used to reference a data object or Web page, there is no guarantee the link will still be valid at some point in the future. DAT was proposed as a solution to this.
But what does this have to do with censorship and privacy, you’re probably asking? The answer lies in how data is distributed, discovered and encrypted.

Merkle Trees, Hashing Algorithms and Public Key Encryption

The DAT protocol is essentially a real-world implementation of the Merkle Tree data structure, using the BLAKE2b and Ed25519 algorithms for identification, encryption and verification (other docs state that SHA256 is used as the hashing algorithm). It’s not necessary to understand this concept in order to develop DAT applications, since there are already libraries that implement it, but for the curious, I recommend reading Tara Vancil’s explanation first before moving on to the whitepaper.

An important point Vancil made is that DAT is about the addressing and discovery of data objects, not the addressing of the servers hosting those objects. Data objects are not bound to IP addresses or domains either. Each data object has its own address, determined by its cryptographic hash value – a file’s hash digest will be the same regardless of where it’s hosted. This matters because we’re accustomed to thinking of the Internet/Web in terms of the client/server model, and proposed solutions for privacy and anti-censorship typically try to deal with the problem of decentralised host discovery in a peer-to-peer (P2P) network.

Some form of data structure is required to make the data objects addressable and to enable their integrity to be verified. A DAT peer-to-peer network uses Merkle Trees for this, where the ‘leaves’ contain the hash values of the data objects they represent, and the root node contains the hash digest of all its child nodes. In other words, as the whitepaper puts it, ‘each non-leaf node is the hash of all child nodes‘.
Not only does this provide a way of verifying the integrity of the data objects – the root node’s digest will change if any data object represented in the tree is modified – it also provides the basis for an efficient lookup system, as the root hash digest becomes the identifier for a dataset.

Obviously, this means clients need to fetch the root node’s value for a given dataset from a trusted source, which might be one of many designated lookup peers on the network. If the client wants a given data object, it doesn’t need to fetch everything referenced under the root node – just the root value, the nodes along the path to the requested object, and the hashes of the sibling nodes needed to recompute the root.
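
As a toy illustration of the structure (not the DAT wire format – plain SHA-256 is used here just to keep the sketch self-contained), here’s how a root digest can be computed from a few data chunks in Node.js:

var crypto = require("crypto");

function hash(buf) {
  return crypto.createHash("sha256").update(buf).digest();
}

// Leaf nodes are hashes of the data chunks; each parent is the hash of its children,
// and the single digest left at the top identifies the whole dataset.
function merkleRoot(chunks) {
  var level = chunks.map(hash);
  while (level.length > 1) {
    var next = [];
    for (var i = 0; i < level.length; i += 2) {
      next.push(hash(Buffer.concat(level.slice(i, i + 2))));
    }
    level = next;
  }
  return level[0].toString("hex");
}

console.log(merkleRoot([Buffer.from("chunk one"), Buffer.from("chunk two"), Buffer.from("chunk three")]));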

Addressing, References and Security

Now, let’s get into the more specific aspects of how Merkle Trees are implemented in the context of DAT. All the ‘leaf’ nodes in the DAT Merkle Tree contain a BLAKE2b or SHA256 (depending on which docs you read) hash digest of the referenced object. Parent nodes contain the hash digest and a cryptographic signature; the signature is generated with Ed25519 keys, which are used to sign the hash digest.

When sharing a locally-created site in the Beaker browser, or viewing one already shared on the network, you might notice the URI following ‘DAT://’ is a long hexadecimal string. This is actually the Ed25519 public key of the archive containing the referenced object being shared, and it’s used to encrypt and decrypt the content. The corresponding private key is required to write changes to the DAT archive.
The public key is, in turn, hashed to generate a discovery key, which is used to find the data objects on the network. This ensures no third party can determine the public key of a private data object that hasn’t been publicly shared.
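
As a quick illustration – assuming the hypercore-crypto module and its keyPair() and discoveryKey() helpers – generating a key pair and deriving the discovery key looks something like this:

var crypto = require("hypercore-crypto");

// Ed25519 key pair for a new archive; the public key is the dat:// address
var keys = crypto.keyPair();
console.log("public key:    " + keys.publicKey.toString("hex"));

// Peers advertise and look up the discovery key, a hash of the public key,
// so the public key itself never has to be revealed to the network
console.log("discovery key: " + crypto.discoveryKey(keys.publicKey).toString("hex"));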

Beaker

The Beaker browser looks very much like the standard Firefox browser on the surface, and it can be used to browse both DAT:// and HTTP:// addresses. DAT sites render just as well as those on the conventional Web. The only problem is that, as with Tor and I2P, sites are hosted on machines that aren’t online 24/7, so many of them are unreachable at any given time.

From the Welcome dialogue, we can get straight to setting up a personal Web site for publishing on the DAT Web. A default index page, script.js and styles.css are included, ready for us to customise. In addition, Beaker allows us to share the contents of an arbitrary directory on the machine it’s running on.

Previously-created sites are available under the ‘Library‘ tab in the main menu. Sites that already exist will be listed under the ‘Your archives‘ section, and can be modified and/or published.

What happens to a published site when the local machine is offline? A site can be kept accessible by having another person or machine ‘seed’ the data – shorthand for saying someone else fetches a copy of the site and re-shares it over the network. Seeding also happens automatically while a user is actively browsing a DAT site.

The Node.js Modules

Several Node.js modules provide libraries that developers can use to implement DAT features in their applications.

hypercore: A component for creating and appending feeds, and verifying the integrity of data objects. The API exposes a number of methods under the ‘feed’ namespace for reading, writing and querying feeds.

hyperdrive: This is a distributed filesystem for P2P networks. One of its design principles is to reproduce, as closely as possible, the API of the core Node.js filesystem module, thereby making it transparent to application developers. This module enables a local file system to be replicated on other machines.

dat-node: A high-level component that developers could use to bring together other DAT modules and build DAT-capable applications.

hyperdiscovery: Module for network discovery and joining. Running two hyperdiscovery instances with the same archive key will result in the archive being replicated between them.

dat-storage: The DAT storage provider. Used for storing secret keys, among other things, using the hyperdrive filesystem.

In conjunction with Electron.js and Node.js, the above modules can be used to develop a DAT-enabled desktop application, of which Beaker is just one example.
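
To make that a little more concrete, here’s a minimal sketch of a hyperdrive archive being shared over hyperdiscovery. It’s based on the modules’ published examples – the exact APIs vary between versions, so treat it as illustrative rather than definitive:

var hyperdrive = require("hyperdrive");
var hyperdiscovery = require("hyperdiscovery");

// A local archive, stored under ./my-site
var archive = hyperdrive("./my-site");

archive.on("ready", function () {
  // The API mirrors the core fs module: writeFile/readFile rather than HTTP verbs
  archive.writeFile("/index.html", "<h1>Hello DAT</h1>", function (err) {
    if (err) throw err;
    console.log("share this address: dat://" + archive.key.toString("hex"));
  });

  // Advertise the archive's discovery key and replicate with any peers found
  var swarm = hyperdiscovery(archive);
  swarm.on("connection", function (peer, info) {
    console.log("replicating with a peer");
  });
});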

Node Discovery in Practice

Two components are used for this: discovery-channel and discovery-swarm. The discovery-channel component searches BitTorrent, DNS and Multicast DNS servers for peers, and advertises the address/port of the local node; under the hood it is based on the bittorrent-dht and dns-discovery modules. Using discovery-channel, the client can join channels, terminate sessions, call handlers on session initiation and fetch a list of relevant channels. The discovery-swarm module builds on discovery-channel to connect with DAT peers and manage the session.

Like this:
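
(a minimal sketch, based on discovery-swarm’s documented listen/join/connection API)

var swarm = require("discovery-swarm");

var sw = swarm();
sw.listen(1000);                 // accept incoming connections on this port
sw.join("my-test-channel");      // any name or hash identifying the channel

sw.on("connection", function (connection, info) {
  console.log("found and connected to a peer");
});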

Using Node.js, JSON and jQuery, I’ve managed to develop something much like an MVC application that’s considerably more lightweight than a .NET project, and anyone can use this as a template or basis for their own Web application project. Node.js enables the creation of Web servers and handles the communication between client-side JavaScript and the server. The following sections walk through the main parts of such an application.

Creating a Simple Node.js Server
The server-side code for this is fairly simple:
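
Something along these lines, using only the core http module:

// nodeserver.js - a bare process listening on port 8090
var http = require("http");

http.createServer(function (request, response) {
  response.writeHead(200, { "Content-Type": "text/plain" });   // the response header
  response.end("Hello World\n");                               // the response body
}).listen(8090);

console.log("Server running at http://localhost:8090/");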

Note that the server created by this method is just a process listening on port 8090 (or whichever port is specified), and doesn’t host Web pages at this stage. Instead it returns an HTTP response, using response.writeHead() to set the header and response.end() to set the body. When this code executes, console.log() prints the ‘Server running’ message on the command line. A browser sending a request to localhost:8090 will display ‘Hello World’ as the response. I saved this file as ‘nodeserver.js’.

To start the server using the Node.js interpreter, simply navigate the command line to the directory where the .js file is stored, and enter the following:
node nodeserver.js

File Operations
Perhaps the main reason we want server-side code, rather than something entirely client-based, is data persistence. An application isn’t much use if it can’t store and retrieve data. Here I have two files: the server-side script file-op.js, and the data file serverdata.txt. The latter simply contains two lines of text.

This time we import both the http and filesystem (fs) modules:

var http = require("http");
var fs = require("fs");

And specify the file to read:
var data = fs.readFileSync('serverdata.txt');

And this time, the HTTP response is defined as the contents of serverdata.txt:
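
Putting those pieces together, file-op.js ends up looking something like this:

// file-op.js - serve the contents of serverdata.txt
var http = require("http");
var fs = require("fs");

var data = fs.readFileSync("serverdata.txt");    // read the data file once, at startup

http.createServer(function (request, response) {
  response.writeHead(200, { "Content-Type": "text/plain" });
  response.end(data);                            // the response body is the file's contents
}).listen(8090);

console.log("Server running at http://localhost:8090/");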

Streaming Data and Writing to File
The filesystem module enables JavaScript to perform file I/O using createReadStream() and createWriteStream(). As before, we import the http and filesystem modules, but don’t populate the data variable up front. Another variable is needed to hold the read stream: the object returned by fs.createReadStream() is assigned to readerStream, and its ‘data’ events fill in the data variable chunk by chunk.
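
A sketch of the read-stream version:

var http = require("http");
var fs = require("fs");

var data = null;

// Read serverdata.txt as a stream, appending each chunk to the data variable
var readerStream = fs.createReadStream("serverdata.txt");
readerStream.setEncoding("utf8");

readerStream.on("data", function (chunk) {
  data = data === null ? chunk : data + chunk;
});

// Once the whole file has been read, serve the accumulated data over HTTP
readerStream.on("end", function () {
  http.createServer(function (request, response) {
    response.writeHead(200, { "Content-Type": "text/plain" });
    response.end(data);
  }).listen(8090);
  console.log("Server running at http://localhost:8090/");
});

readerStream.on("error", function (err) {
  console.log(err.stack);
});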

And to write to file using createWriteStream:
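
Something like this, with ‘output.txt’ as an arbitrary destination file:

var fs = require("fs");

var data = "Some text to persist on the server";

// Open a write stream to the output file and push the data into it
var writerStream = fs.createWriteStream("output.txt");
writerStream.write(data, "utf8");
writerStream.end();

writerStream.on("finish", function () {
  console.log("Write completed");
});

writerStream.on("error", function (err) {
  console.log(err.stack);
});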

Although this isn’t much at this point, it demonstrates that we can use persistent storage with a bit of JavaScript.

Node.js Express
Express can be used to achieve the same thing as ASP.NET MVC, as it handles routing, REST requests and other server-side operations. First we need to use npm to install Express.js:
npm install express --save

We’ll use the following simple Express server to understand routing:
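
A minimal sketch, with a single route matching the description below:

var express = require("express");
var path = require("path");
var app = express();

// GET /listusers returns the users.html view
app.get("/listusers", function (req, res) {
  res.sendFile(path.join(__dirname, "users.html"));
});

var server = app.listen(8090, function () {
  console.log("Express server listening on port 8090");
});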

As with ASP.NET MVC, the controllers here determine the actions to be performed when the server receives a given request in the form of a URI. To initiate an action, we only need to send its name as part of the URI in the browser. For example, ‘http://localhost:8090/listusers’ will cause Express to return the response for that app.get() method. It responds by calling the sendFile() function, which returns users.html. This is the equivalent of MVC’s ‘return View()‘.

Reading and Writing JSON Files
Of course, most Web applications function as an interface to some data source. Here I’ll try using a JSON-based source to store and retrieve the data, with data being sent between the HTML and the JavaScript controllers. The Express.js site lists the database integrations it supports.

For the following, the body-parser module is required, again installed through npm:
npm install body-parser --save

In the HTML file we have a simple form with four fields. To the JavaScript file we add another method for handling the data submitted from the HTML.
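
A sketch of that extra method; the route name and the four field names (firstName, lastName, email, phone) are placeholders here, since they depend on the actual form:

var bodyParser = require("body-parser");
app.use(bodyParser.urlencoded({ extended: false }));   // parse URL-encoded form submissions

// Hypothetical route and field names - adjust to match the form's inputs
app.post("/adduser", function (req, res) {
  var user = {
    firstName: req.body.firstName,
    lastName: req.body.lastName,
    email: req.body.email,
    phone: req.body.phone
  };
  console.log(JSON.stringify(user));
  res.end(JSON.stringify(user));
});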

If the fields are populated and submitted, the following JSON output is generated:
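
With the placeholder field names above, it would look something like:

{"firstName":"Joe","lastName":"Bloggs","email":"joe@example.com","phone":"01234 567890"}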

Now we need a .json file for the application to append to, for example ‘users.json’. Here’s the solution I hacked together, by trial and error:
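
Something along these lines, assuming users.json holds a single JSON array of user objects:

var fs = require("fs");

// Read the existing array, push the new user, and write the file back out
function appendUser(user) {
  fs.readFile("users.json", "utf8", function (err, contents) {
    var users = err ? [] : JSON.parse(contents);   // start a new array if the file doesn't exist yet
    users.push(user);
    fs.writeFile("users.json", JSON.stringify(users, null, 2), function (err) {
      if (err) console.log(err);
    });
  });
}

// Called from the app.post() handler above, e.g. appendUser(user);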

This can also be extended to MongoDB, which is also JSON-based, if a data access layer needs to be added to the application.

To do the reverse – rendering JSON data in an HTML page – we’ll need jQuery and a script that fetches the data returned by the Node.js controller. In the HTML I have two elements, ‘get-data‘ and ‘show-data‘. The first is a link that triggers the JSON reader JavaScript.

The handler function will read the JSON file and return the output to the ‘show-data’ element, placing the read values in an HTML list.
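
A sketch of that handler, using jQuery’s getJSON() and the placeholder field names from earlier, assuming the Express server also exposes users.json (e.g. via express.static or another route):

// When the 'get-data' link is clicked, fetch users.json and render
// each user as a list item inside the 'show-data' element
$(document).ready(function () {
  $("#get-data").click(function (event) {
    event.preventDefault();
    $.getJSON("users.json", function (users) {
      var list = $("<ul/>");
      $.each(users, function (index, user) {
        list.append("<li>" + user.firstName + " " + user.lastName + " (" + user.email + ")</li>");
      });
      $("#show-data").empty().append(list);
    });
  });
});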

Noticeably fewer people at last night’s meetup, on account of the crap weather.

Introduction to Node.js
Created using C++ and JavaScript, Node.js has gained the support of some major companies. It’s a platform for developing server-side Web applications. If I recall correctly, Cloud9 even managed to create an Integrated Development Environment using it.
There are a number of features that make the platform popular. Firstly, it’s incredibly fast in terms of development and runtime, according to a couple of developers I spoke to. Applications can spawn new processes and run them concurrently. Node.js also has APIs that allow code to interact with hardware, ports and sockets.

Deploying and Changing Code
Slightly different from the subject I posted about on my site yesterday, as I thought this was going to be about automated deployment. It was actually Matthew MacDonald Wallace’s little rant on the importance of testing code properly before deploying it on production systems, and thereby not causing problems for the sysadmins. Several useful tips were given here:

* Every change to the system should be documented. This provides an audit trail and enables any problems to be traced more efficiently later.
* Testing code fully before deployment is possible by simulating the production environment using virtual machines.
* Make sure the system is scalable, or at least capable of handling unexpected loads quickly.
* There are also open source solutions for change and configuration management: Puppet and Chef. I’ll look more into those when I get the time.

Finally, there was a quick explanation of the ‘DevOps’ concept: having developers and system administrators working together in the same team, which apparently makes life so much easier.

AAA Game Engines
Ian Thomas gave an overview of AAA game engines, their architecture, and how they’re used. Basically the game engine is a platform, in the sense that developers can keep re-using it.
According to Ian, almost every games development firm has its own engine, and almost all engines are coded in C++, the reason being that memory management is critical to making the game run efficiently and to avoiding memory fragmentation.

The game engine’s architecture includes a platform abstraction layer, which translates between the engine’s system calls and the APIs of whichever games console is targeted. Porting an entire game to another console should then be a simple matter of modifying that one layer, rather than the rest of the code.

Interestingly, almost all games with graphics since the 1980s are created with the following:
* Game Objects: The characters in the game, and perhaps a few other elements in the environment. From the programming perspective these are simply aggregators of smaller data objects provided by the game engine.
* Static Geometry: The environment in the game, which is loaded and unloaded during runtime as required.
* Heads Up Display (HUD): A static layer on the interface, which displays variables, or representations of the variables during runtime.

Profile

My name is Michael, and I’m a software developer specialising in clinical systems integration and messaging (API creation, SQL Server, Windows Server, secure comms, HL7/DICOM messaging, Service Broker, etc.), using a toolkit based primarily around .NET and SQL Server, though my natural habitat is the Linux/UNIX command line interface.
Before that, I studied computer security (a lot of networking, operating system internals and reverse engineering) at the University of South Wales, and somehow managed to earn a Master’s degree. My rackmount kit includes an old Dell Proliant, an HP ProCurve Layer 3 switch, two Cisco 2600s and a couple of UNIX systems.
Apart from all that, I’m a martial artist (Aikido and Aiki-jutsu), a practising Catholic, a prolific author of half-completed software, and a volunteer social worker.