Log Files

Apache2 provides an access log and an error log in the /var/log/apache2/ directory. access.log records the details of every request processed by the server, including the IP address the request originated from, the timestamp and the user agent. All messages and errors from the server are stored in error.log, for example:

AH00558: apache2: Could not reliably determine the server's
fully qualified domain name, using 127.0.1.1. Set the
'ServerName' directive globally to suppress this message
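
To make the access.log fields described above concrete, here is a minimal Python sketch that pulls the IP address, timestamp and user agent out of one line. It assumes the server writes the common "combined" log format; the sample line and the names COMBINED_LOG and parse_access_line are made up for illustration, so adjust the pattern if your LogFormat differs.

```python
import re

# Combined Log Format: host, identity, user, [timestamp], "request",
# status, size, "referer", "user agent".
COMBINED_LOG = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_access_line(line):
    """Return a dict of fields for one access.log line, or None if it does not match."""
    match = COMBINED_LOG.match(line)
    return match.groupdict() if match else None


# Illustrative sample line, not taken from a real server.
SAMPLE = (
    '192.0.2.10 - - [10/Oct/2016:13:55:36 +0100] '
    '"GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"'
)

fields = parse_access_line(SAMPLE)
print(fields["ip"], fields["timestamp"], fields["user_agent"])
```

Returning None for lines that do not match lets a caller skip anything unexpected instead of crashing halfway through a large log.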

Virtual Hosts

As soon as you want to host multiple sites from one server, you need to work with the configuration files. Apache2 loads configuration data from individual files, as an alternative to editing one long configuration file, which may introduce errors.

Virtual Network Computing (VNC) is a widely used remote desktop technology, particularly in heterogeneous networks. For this reason, several open source VNC projects exist, including RealVNC and TightVNC. VNC enables one to control a number of different computers from one keyboard. It differs from a remote terminal session such as SSH in that you do not log in to a server. Anything you do in the VNC session is done as if by the user currently logged into the remote desktop.

VNC requires a client and a server to create a session. The server runs on the remote desktop, and you open a viewer (vncviewer) on the client. VNC establishes remote access over either a local area network or the internet using TCP/IP. It uses the Remote Frame Buffer (RFB) protocol, which grabs the screen image and sends it to the client. The client displays the remote screen in a window on the client desktop and transmits mouse and keyboard data back to the server.

VNC creates stateless sessions, enabling the user to disconnect and reconnect from different machines irrespective of which operating system is installed on the client or server.

Security

However, RFB traffic is not encrypted when it travels over a normal connection. Consider using OpenSSH and running VNC through an encrypted tunnel.
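
As a rough illustration of the tunnel idea, the Python sketch below builds an ssh command that forwards a local port to the VNC server. It relies on the convention that VNC display :N listens on TCP port 5900 + N. The function name vnc_tunnel_command and the target pi@raspberrypi.local are hypothetical, and the sketch only builds the command rather than running it.

```python
def vnc_tunnel_command(ssh_target, display=1, local_port=None):
    """Build an ssh argv list that forwards a local port to a VNC display.

    VNC display :N conventionally listens on TCP port 5900 + N.
    """
    remote_port = 5900 + display
    if local_port is None:
        local_port = remote_port
    return [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-L", f"{local_port}:localhost:{remote_port}",
        ssh_target,
    ]


# "pi@raspberrypi.local" is a hypothetical user and host.
print(" ".join(vnc_tunnel_command("pi@raspberrypi.local", display=1)))
```

With the tunnel running, the viewer is pointed at localhost:1 instead of the remote hostname, so the RFB traffic travels inside the SSH connection.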

Raspberry Pi Specifics

The first time you run a VNC server, it will prompt for a system password. The server will appear as hostname:1, hostname:2, etc. The server chooses the first available display number and tells you what it is. Setting the DISPLAY environment variable makes applications use a specified display.
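
The effect of DISPLAY can be sketched in Python as follows. The helper env_for_display is a hypothetical name, and the child process is only a stand-in that reports the value it received; a real use would start a graphical application with the same environment.

```python
import os
import subprocess
import sys


def env_for_display(display_number):
    """Return a copy of the current environment with DISPLAY pointing at :N."""
    env = os.environ.copy()
    env["DISPLAY"] = f":{display_number}"
    return env


# Stand-in child process that just reports the DISPLAY it was given;
# a real use would launch a graphical application instead.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['DISPLAY'])"],
    env=env_for_display(1),
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```

Passing a modified copy of the environment affects only the child process, so other programs on the same machine keep their own display.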

RealVNC

As of September 2016, RealVNC software is available for free for academic and non-commercial use on Raspberry Pis. Note that only one VNC server can be installed at a time, so if you have previously installed TightVNC, you will need to remove it.

I discovered a blog entry describing a Raspberry Pi cluster built by Nvidia High Performance Computing engineer Adam DeConinck. Due to public demand, some technical details about the cluster were made available, including a readme file on GitHub with more information.

There are many different ways to do this. One is to split services (web server, MySQL, FTP) over different Pis. Another is to split the files across Pis (images on one, HTML on another, etc.). But in my opinion, the best (and most complicated) way to achieve load balancing is to set up one Pi as an nginx proxy that distributes the load to two Pis hosting the same files. You can even set how much load is distributed to each Pi!
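
To show what "how much load is distributed to each Pi" means in practice, here is a minimal Python sketch of weighted round-robin selection. It is not nginx's own implementation (nginx expresses the same idea with a weight setting on each upstream server), and the backend names pi-web-1 and pi-web-2 are hypothetical.

```python
import itertools
from collections import Counter


def weighted_round_robin(weights):
    """Return an endless iterator of backend names, each repeated in proportion to its integer weight."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


# Hypothetical backends: pi-web-1 takes three requests for every one sent to pi-web-2.
backends = {"pi-web-1": 3, "pi-web-2": 1}
picker = weighted_round_robin(backends)
first_eight = [next(picker) for _ in range(8)]
print(Counter(first_eight))
```

A heavier weight on one Pi is a simple way to favour a faster board or one that is doing less other work.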

“The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.”

Hadoop Distributed File System

Distributed, scalable and portable file system written in Java for the Hadoop framework. Includes a secondary NameNode, which connects to the primary NameNode and builds snapshots of the directory structure.

MapReduce Engine

Different ways to submit and track jobs. Overhauled in Hadoop 0.23 as MRv2, referred to as YARN. YARN splits the major functions of the job tracker into a separate resource manager and scheduler. YARN is compatible with MapReduce.
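
For readers new to the MapReduce model itself, the following is a minimal single-machine sketch in plain Python of the map, shuffle and reduce steps, using a word count. It does not use any Hadoop API; the function names and the input lines are illustrative only, and a real cluster would run each phase across many nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, the step that sits between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all counts for one word."""
    return key, sum(values)


lines = ["pi cluster", "hadoop on a pi cluster"]  # illustrative input
pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)
```

Because each map call sees only one line and each reduce call sees only one key, the work can be spread over many machines without the functions needing to know about each other.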

Started as Google Bigtable and MapReduce / Google File System (GFS). Wanted to be able to access the data with a SQL-style language. Facebook, LinkedIn and Yahoo all had similar products in their stacks.

Rounded up all the Raspberry Pis and Linux books from around the house.

Most of the time with Raspberry Pis, I look at them and think it would be nice to do something really cool... but then real life takes over and the moment is gone. I’m not alone in this, but now with refreshed determination... and the fact that I have an Operating Systems and Network project to complete on anything to do with Linux... my plan is to build a cluster of Raspberry Pis and experiment with Hadoop and see where that takes me, initially operating the environment within a dedicated local wireless network.