Recently I bought my first house, and that means no more silly tenancy agreement rules, e.g. no picture hooks on the walls or screwing with the wiring, aka ultimate smart home freedom :-D. To complement my CurrentCost SDR logging and temperature and humidity sensors I decided it was time to buy a cheap hackable doorbell. The general idea is that on each doorbell press I'd capture the date, time and a quick photo of who was at the door.

To get started I checked out what sort of doorbells rtl_433 could intercept and hunted down a compatible doorbell and a cheap outdoor wireless IP camera on Amazon. In the end I settled on an Elro DB286A doorbell and a Knewmart IP camera. Once they arrived I got to work configuring the IP camera and placing it in a suitable location on my porch so I could see who was pressing the doorbell, and stuck the doorbell's push button to the door frame. The doorbell communicates wirelessly over 433MHz and runs on batteries, so no cabling there. I had to drill a few small holes for the IP camera's power cable, but the installation was pretty tidy and the camera is somewhat obscured by the guttering, making it less obvious.

Next the fun could begin. I opened rtl_433 to check out the packets sent from the doorbell…

Now I had the doorbell's unique ID, I could extend my existing 433MHz ingestion script to raise an HTTP request to my home automation system every time the doorbell is pressed, while continuing to log CurrentCost data to my Graphite server. Using the code from the rtl_433 payload is particularly useful for distinguishing your doorbell from your neighbours' devices. In the end I settled on something like the following…
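A sketch of the dispatch logic, rather than my exact script: the doorbell code, home automation URL and API key below are placeholders, and the JSON field names emitted by rtl_433 vary between versions and decoders.

```python
import json
import subprocess

import requests

DOORBELL_CODE = "0x1234"  # placeholder: the code field captured from rtl_433
HOME_API = "http://192.168.1.5:5000/api/doorbell?api_key=changeme"  # placeholder

def send_power_to_graphite(watts):
    """Stand-in for the existing Carbon socket write."""
    print(f"graphite <- {watts}W")

def handle(data):
    """Dispatch one decoded rtl_433 payload."""
    if data.get("code") == DOORBELL_CODE:
        # Somebody's at the door: poke the home automation API
        requests.get(HOME_API, timeout=5)
    elif "power0" in data:
        # Existing CurrentCost path
        send_power_to_graphite(data["power0"])

def main():
    # Wrap rtl_433 and read its JSON output line by line
    proc = subprocess.Popen(["rtl_433", "-F", "json"],
                            stdout=subprocess.PIPE, text=True)
    for line in proc.stdout:
        try:
            handle(json.loads(line))
        except ValueError:
            continue  # rtl_433 also prints non-JSON status lines

if __name__ == "__main__":
    pass  # main() requires an SDR dongle attached
```

Filtering on the code field up front means a neighbour's doorbell on the same frequency won't trigger the camera.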

The updated script still sends my CurrentCost power usage data into Graphite, and any doorbell pushes raise a GET request to my home automation system. As the home automation system is written in Flask, I added a route to my API blueprint to handle the doorbell pushes. Every time this route receives an API key authenticated request it takes a snapshot from the IP camera, base64 encodes the image and dumps it into the DB as a blob, which can then be retrieved in the automation system's UI. Later on my intention is to add a callback to a notifications queue to push a notification to our phones letting us know somebody is at the door. Let's have a look at the route. Yep, it's bad that the camera credentials and URL are hard coded, but it'll do the job for now.
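A minimal sketch of such a route (the route path, API key and camera URL are placeholders, and the real app stores the blob via a SQLAlchemy model rather than a print):

```python
import base64
from datetime import datetime

import requests
from flask import Flask, abort, request

app = Flask(__name__)

# Hard coded for now, as noted above
API_KEY = "changeme"
CAMERA_SNAPSHOT_URL = "http://192.168.1.50/snapshot.jpg"  # placeholder
CAMERA_AUTH = ("admin", "password")                        # placeholder

def fetch_snapshot():
    """Grab a JPEG frame from the IP camera over HTTP."""
    resp = requests.get(CAMERA_SNAPSHOT_URL, auth=CAMERA_AUTH, timeout=5)
    resp.raise_for_status()
    return resp.content

def store_reading(sensor, value, timestamp):
    """Stand-in for the SQLAlchemy insert into sensor_readings."""
    print(f"{timestamp} {sensor}: {len(value)} bytes of base64")

@app.route("/api/doorbell")
def doorbell():
    # Reject requests without a valid API key
    if request.args.get("api_key") != API_KEY:
        abort(401)
    image_b64 = base64.b64encode(fetch_snapshot()).decode("ascii")
    store_reading("doorbell", image_b64, datetime.utcnow())
    return "OK"
```

Because the snapshot fetch is synchronous, a slow camera delays the response to the ingestion script, so a timeout is worth keeping.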

Let's test it by pressing the doorbell and checking out what is in the sensor_readings DB table. As you can see it dumps a large base64 payload into the value column with our complete image. Probably not the best way to store the image, but it'll do for now. Long term it would be good to upload the image to something like Amazon S3 or Rackspace Cloud Files and just store the image's URL in the DB.

This is great, but reading doorbell activations from the DB using the MySQL client is not very friendly, so I added an HTML template to the UI of my home automation system to display doorbell events. It simply reads the last 10 activations from the DB and dumps them out to a page; later I should probably add pagination and a date/time filter so I can search historic doorbell pushes.

Notice there is nobody at the door? Well, I had already locked up for the night when I was writing the code for this, so I opted to simulate a doorbell press each time I wanted to check it was working by manually sending a request to the home automation API, hence the human-less photos. I appreciate I have been particularly vague about my home automation system here, but it's just a Flask app with some SQLAlchemy models for storing events in a MySQL DB, plus a Graphite server for storing time series data. It's pretty rough around the edges so I am not keen to release it on GitHub yet, but maybe later after some more development. I hope you enjoyed the post and maybe I have inspired you to hack your own doorbell :-).

A few posts ago I wrote a simple guide on how to read power usage stats from CurrentCost CT clamps using Software Defined Radio; you can check it out here. This is great, but how can I view historic power consumption or integrate this data with other stuff?

First we need somewhere to store the data produced by the sensor. In my last post I covered configuring a Graphite server; Graphite is a time series database, a bit like RRDTool but with a modern API and more flexibility. I built this with my growing collection of Internet of Things sensors in mind and the time series data they churn out on a daily basis. Let's write a script to take the power stats output by rtl_433 and insert them into Graphite.

The script is very simplistic: it wraps the rtl_433 command, filtering out any invalid lines and any device whose dev_id does not match the device ID defined in the configuration segment at the top of the script. This is useful if your neighbour also has a CurrentCost and you don't want to record their readings; it also ensures we do not record erroneous values from other sensors which operate in the 433MHz range, like doorbells and tyre pressure monitors. Obviously swap out the graphite_server IP address for the IP of your own Graphite installation. Now run the script, go and make a coffee or something, and give it a chance to dump some power usage data into Graphite.
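The script looks something like this (a sketch under a few assumptions: the Graphite address and device ID are placeholders, and the `dev_id`/`power0` JSON field names from rtl_433's CurrentCost decoder vary between rtl_433 versions):

```python
import json
import socket
import subprocess
import time

# --- configuration segment ---
GRAPHITE_SERVER = ("192.168.1.10", 2003)  # your Carbon plaintext port
DEVICE_ID = 1234                          # your CT clamp's dev_id

def parse_reading(line, device_id=DEVICE_ID):
    """Return watts from one rtl_433 JSON line, or None if invalid/foreign."""
    try:
        data = json.loads(line)
    except ValueError:
        return None  # not a JSON payload line
    if data.get("dev_id") != device_id:
        return None  # the neighbour's CurrentCost, a doorbell, etc.
    return data.get("power0")

def send_to_graphite(watts):
    """Push one datapoint over Carbon's plaintext protocol."""
    msg = f"sensors.{DEVICE_ID}.power {watts} {int(time.time())}\n"
    with socket.create_connection(GRAPHITE_SERVER) as sock:
        sock.sendall(msg.encode())

def main():
    # Wrap rtl_433 and forward matching readings as they arrive
    proc = subprocess.Popen(["rtl_433", "-F", "json"],
                            stdout=subprocess.PIPE, text=True)
    for line in proc.stdout:
        watts = parse_reading(line)
        if watts is not None:
            send_to_graphite(watts)

if __name__ == "__main__":
    pass  # main() requires an SDR dongle attached
```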

After a few minutes you can check Graphite to see how the data looks. Open up your Graphite web console and expand the tree to reveal "sensors.<device_id>". Select the power series by clicking on it and change the date range for the drawn graph to the last hour; you should see that the time series has started to be populated with your current power usage in Watts.

As you can see my Graphite instance now has some data for the power series from my CurrentCost CT clamp. Ignore the unusual straight line, that's a few minutes where I stopped the script to make some tweaks. If you want your power consumption recorded at all times then feel free to add this to your SysVInit or SystemD scripts to start at boot and run continually as a service.

As you can see, you now have a decent way of storing your historic power usage, and Graphite offers a render API so you can integrate the graphs or raw data with other services or scripts. In my case I thought it would be cool to be able to ask my Amazon Echo "Alexa, ask Home App for my current power usage." Home App is a small API and mobile app I have been developing to support the custom home automation stuff in my house; the API is written in Python using the Flask framework. Let's get Amazon Alexa to query the API and fetch the current power usage.

First you'll need an Amazon account; log in to the Amazon developer portal at https://developer.amazon.com, navigate to the Alexa tab and click Get Started under the Alexa Skills Kit. Obviously I have already registered and started working on my skill, but you will probably be presented with an empty list if you haven't worked with Alexa before. Click Add a New Skill to start building your own skill. You should create a skill with the following characteristics…

Custom Interaction Model

A reasonable invocation name, in my case “Home App”

Set the language to your preferred language

Say no to the skill being an audio player or video app.

This will give us the basics to create a plain speech skill with no fancy stuff like sound effects.

Click next and move on to the Interaction Model. Luckily for us the intent schema is super basic for this skill and we do not need to configure any slots or anything, just one intent and the utterances for calling it. Intents are defined using a simple JSON data structure; I added an intent called GetCurrentPowerUsageIntent. Remember this, it'll be used later in our API.

{
    "intents": [
        {
            "intent": "GetCurrentPowerUsageIntent"
        }
    ]
}

Then in the sample utterances field define what you'd like to say to call this intent. In my case I wanted to say "Alexa, ask Home App for my current power usage." The trigger word is already handled for you, and so is the skill name, so we only care about the end of the phrase: "for my current power usage". Utterances are defined as follows; you can have multiple utterances per intent.

GetCurrentPowerUsageIntent for my current power usage.

Once you have entered all of the utterances click save and move onto the next tab.

Here you need to configure where Alexa will call to execute your intent. This can be either an AWS Lambda function or a custom HTTPS URL; if you use an SSL enabled URL you must have a valid certificate, and hence your own domain name. I tend to use LetsEncrypt to get a free certificate. You can also use self signed certificates if you wish, although you must upload your cert to the Alexa Developer console so Alexa can verify your endpoint. In my case, as my Graphite server and my Home App API are hosted on my HP Microserver in my house and it's not linked to an AWS VPC in any way, I have pointed Alexa to my API endpoint.

Ok, once this page is completed the Alexa skill is configured and you can run it in test mode. Of course it has no API to reach out to yet, so the skill will always fail. Let's move on to writing a small Flask App to serve up the response to Alexa with the current power usage.

Either create a new VirtualEnv or install the following prerequisites from PyPI to your system's site-packages. I personally prefer to have each project in a VirtualEnv, but I'll leave this up to you.

pip install flask flask-ask requests

Flask-Ask is a Python package which allows you to easily create Alexa skills within a Flask App; it offers a bunch of decorators and functions which handle all of the Alexa requests and responses without you having to know the Alexa API intricately. Once the requirements are installed you can make a super simple one module Flask App to begin working with Alexa…

Once again you'll need to configure your Graphite server IP address and its Web UI port number, plus the CurrentCost device ID, in the configuration segment of the Flask App.

The App is very simple: it hosts a small web server which, when requested by Alexa on the /api/alexa endpoint, responds with the current power usage providing the GetCurrentPowerUsageIntent intent is called. This is fetched on the fly from your Graphite server using requests, with some basic manipulation to make the read back more human friendly. You can also call the launch intent by saying "Alexa, start Home App" or whatever your skill invocation phrase is; in this case Alexa will ask "Welcome to Home App, what would you like to do?" and wait for a response, to which you can say "get current power usage."
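The heart of the app is the Graphite lookup and the read-back formatting; a sketch might look like the following. The server address, metric path and function names are placeholders, and in the real app the sentence is returned via Flask-Ask's `statement()` inside an `@ask.intent("GetCurrentPowerUsageIntent")` handler.

```python
import requests

# --- configuration segment (placeholders) ---
GRAPHITE = "http://192.168.1.10:8000"
DEVICE_ID = 1234

def current_power_usage():
    """Return the most recent non-null power reading, in watts."""
    resp = requests.get(
        f"{GRAPHITE}/render",
        params={"target": f"sensors.{DEVICE_ID}.power",
                "from": "-5min", "format": "json"},
        timeout=5,
    )
    resp.raise_for_status()
    # render API returns [{"target": ..., "datapoints": [[value, ts], ...]}]
    datapoints = resp.json()[0]["datapoints"]
    values = [v for v, ts in datapoints if v is not None]
    return values[-1]

def speech_response():
    """Build the human friendly sentence Alexa reads back."""
    return f"Current power usage is {round(current_power_usage())} watts."
```

Nulls are filtered out because Graphite pads the requested window with empty intervals, and rounding keeps the read back from sounding robotic.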

To use this with Alexa you’ll need to host the Flask App publicly on the internet with a valid SSL certificate. I will leave it to you to setup your preferred web server and configure your preferred certificate, although if there is reasonable demand in the comments following this post I may provide a follow up tutorial on configuring the app using UWSGI, NGINX and LetsEncrypt.

Once the API is online and you can hit it by visiting https://yourhostname.yourdomain.com/api/alexa in your browser you can continue to use the tools in the Alexa developer console to test your skill. Navigate to the test tab in the developer console and in the enter utterance box type something like “Alexa, ask <skill invocation phrase> for my current power usage.” If everything went to plan you should see something like the following returned…

You can also click the listen button to hear what it sounds like with Alexa performing the read back. As you can see in my case Alexa replied with “Current power usage is 240 watts.”. If it failed to retrieve your power usage troubleshoot as necessary.

Now your skill is enabled in test mode, you should be able to ask your Amazon Echo the same phrase as entered into the test tools and it should provide the required response. In test mode it will only work on your Amazon account, which is great for my use case. Of course, if you chose to, you could develop the skill out to allow people to send their own CurrentCost readings to your Graphite server, pair their Amazon account to their CurrentCost device ID, and offer a public skill. If you do not have a real Amazon Echo but want to test your skill with speech recognition and response then check out https://www.echosim.io for a virtual in browser Echo, however be warned its feature set is limited.

I hope this provided a sample of how to integrate Amazon Alexa with one of your IoT sensors, of course you can expand this to offer a lot more functionality and use slots to allow inputs via voice if you have any IoT devices you’d like to control via your Echo.

Following on from my LoRa temperature and humidity sensor and CurrentCost SDR man in the middle, it became apparent a time series database was needed to store all the data being collected. RRDTool seemed the obvious choice, and although it has a good track record at dealing with time series data, previous interactions with the software left scars. Particularly the fact that data cannot be inserted retrospectively (data must be inserted in chronological order), and also the lack of a modern API.

To overcome the concerns with RRDTool two other tools were considered: Graphite and InfluxDB. InfluxDB has a clear advantage in terms of implementation, although it seems pretty heavy and involved; on the other hand Graphite offers familiarity, as I have used it before, and hence administration and maintenance would be a lot simpler with less of a learning curve due to knowledge of the API.

Graphite Components

Let's get on with the Graphite server installation. Graphite comes in several components…

Whisper – the time series data store, file based like RRDTool but without the limitations.

Carbon – an API layer exposed by raw socket which indexes, caches, inserts and retrieves data from the whisper files.

Graphite-Web – a web layer that provides an HTTP accessible API, including rendering of graphs, and a web front end for exploring the data. The API itself is not RESTful but is pretty easy to use and learn.

There are also a few other carbon components which can be used to cluster Graphite deployments, although in this case installation will be to a single server with backups on cron to a NAS.

Installation

Installation on Debian is mega easy as packages exist for Graphite in the distribution’s repo and are actively maintained.

apt-get install graphite-carbon graphite-web

This will install and start the Carbon service; the web service is not configured to run and hence needs to be set up with SystemD to run properly. By default Carbon is configured to store data at an interval of 60 seconds for a maximum period of 24 hours; after this the space in the whisper file is overwritten with new data, just like a round robin database file in RRDTool. In my case I wanted to store the data at high resolution essentially forever, so I updated the default Carbon storage schema to a 10 second interval with data retained for 30 years. I expect this server will be replaced by then and we can worry about the data migration at a later date. Of course this comes at a large increase in storage cost: each metric at this resolution and retention eats 1.1GB of disk, although if you do not have many metrics (in my case probably a maximum of 10 or so in the end) it isn't a concern at today's hard drive prices. You should edit the storage schema to fit your use case…
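For reference, my updated schema in /etc/carbon/storage-schemas.conf looks something like this (the section name and pattern are down to how you name your metrics, so treat them as examples):

```ini
# /etc/carbon/storage-schemas.conf
# 10 second resolution, retained for 30 years
[sensors]
pattern = ^sensors\.
retentions = 10s:30y
```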

After the storage schema has been configured you should restart the Carbon service; following this you can start writing data to your Graphite server.

service carbon-cache restart

Verifying Functionality

Writing data to your Graphite server is easy using a raw socket, but be warned: this is unauthenticated and unencrypted, so you should only run it within your local network and not out on the internet. If you want to expose the ability to write to your Graphite server to external devices, it is recommended to write a wrapper API which implements authentication and SSL. However, because the input to Carbon is a raw socket, we can simply write data to the time series database using netcat. Let's add some random stuff to the database to ensure it's working properly.
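Here is the same idea as a Python sketch (the original loop used netcat and the date command from the shell; Carbon's plaintext protocol is simply `metric value unix_timestamp` per line, and localhost:2003 is Carbon's default plaintext port):

```python
import socket
import time

def carbon_line(metric, value, ts=None):
    """Format one plaintext-protocol line: 'metric value unix_timestamp'."""
    return f"{metric} {value} {int(ts if ts is not None else time.time())}\n"

def feed_demo_data(host="localhost", port=2003, interval=10):
    """Send ten incrementing values, one per 10 second storage interval."""
    for i in range(1, 11):
        with socket.create_connection((host, port)) as sock:
            sock.sendall(carbon_line("demo.increments", i).encode())
        time.sleep(interval)  # keep each value in its own storage slot

# feed_demo_data()  # run against a live Carbon instance
```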

The sleep in here should be sufficient to write the values into separate intervals as defined in your storage schema; if you do not sleep for a sufficient period of time the values will be averaged. Of course, instead of sleeping you could pass in pre-decided timestamps rather than the current time. After the inserts have run, check out /var/lib/graphite/whisper/demo and you should see a file called increments.

Here we can see that a whisper file has been created for our metric "increments". Metrics can be nested, and the folders within /var/lib/graphite/whisper are arranged accordingly. Whisper files can be interrogated using the Whisper CLI tools…

Using the whisper-dump command we can see all the time series data stored in the file.

whisper-dump increments.wsp
--very long output mostly made up of 0 values--

Ok, this proves Carbon is writing to the Whisper files as required. Now let's look at the Web UI and graphing the data. For now let's run graphite-web by hand; we can create a SystemD service for this once we are happy everything is functional. When you first run the web server you will need to run graphite-manage syncdb to run the database migrations for the Web UI. The database by default is an SQLite file in /var/lib/graphite, although for larger installations you can swap this out for MySQL or PostgreSQL by editing /etc/graphite/local_settings.py.

graphite-manage syncdb
graphite-manage runserver 0.0.0.0:8000

You should now be able to open the Graphite Web UI in your browser on the port specified in the runserver command. The left hand side of the UI is a tree containing all of your data, and the right hand pane is used for drawing graphs. Expand the tree and click on increments to load in the data from your whisper file; this will ask Carbon to fetch the data either from its cache or from disk.

You'll notice the data isn't really displayed that well. By default the Graphite Composer draws graphs with a 24 hour time window; obviously we only entered the data over 100 seconds, so it shows as a very small vertical line on the graph. Let's change the graph's time window so the data is actually readable. Click the select recent data button (5th from the left in the composer window) and change the time range to the past 30 minutes. The graph will now update and we should see the incrementing data we previously submitted to Carbon. Of course, as our submission times do not exactly match the intervals of the whisper file, some averaging has occurred; we can see this as the data ends around 9, rather than 10 as submitted.

The basic functionality seems to be working great. Exit the graphite-web server by pressing ctrl-c and let's configure backup and SystemD to start the graphite-web service. Of course, if you prefer, you can use Apache with mod_wsgi, or UWSGI and NGINX; after all, the graphite-web server is only a Django application. However, as it's only to be used internally by myself, I'll just run it standalone.

Configuring Graphite-Web

To configure Graphite-Web to run under SystemD create a new service file in /lib/systemd/system such as…
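Something like the following works, saved as /lib/systemd/system/graphite-web.service (the paths and service user are Debian specific assumptions; adjust to your installation):

```ini
[Unit]
Description=Graphite Web UI
After=network.target carbon-cache.service

[Service]
User=_graphite
ExecStart=/usr/bin/graphite-manage runserver 0.0.0.0:8000
Restart=on-failure

[Install]
WantedBy=multi-user.target
```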

Save the file and then run systemctl daemon-reload so SystemD picks up the new unit; following this you can test the service using the service command.

systemctl daemon-reload
service graphite-web start

Now check it’s up and running in your browser!

If the UI loads as per the above then we know SystemD has the service configured ok. Let's make it run at boot time.

systemctl enable graphite-web

Configuring Backup to NAS

If you do not have a clustered Graphite server you obviously have a huge single point of failure; not only that, but as the data is only stored on one node, data loss in case of failure is somewhat likely. To combat this it is advisable to back up your /var/lib/graphite folder to another host, preferably offsite. In my case I have installed Graphite on my existing web server, which already mounts an NFS share from my ZFS powered NAS. This NAS snapshots changes periodically on the shares and replicates them offsite to another ZFS based NAS, so I will just abuse this for backing up my Graphite data.

The share from the NAS is mounted on /mnt/web_backups via AutoFS and the backups are started by cron at 12 every day. The existing backup script will be altered to tar up the Graphite data and dump it into the share; the configuration is also copied for easy restoration at a later date. Most of the file has been redacted.
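The Graphite-related addition boils down to something like this (the paths are specific to my setup and shown as examples):

```shell
# Tar up the whisper data and the Carbon/Graphite config into the
# NFS-mounted share for offsite replication by the NAS.
backup_graphite() {
    dest=/mnt/web_backups
    stamp=$(date +%Y%m%d)
    tar -czf "$dest/graphite-data-$stamp.tar.gz" /var/lib/graphite
    tar -czf "$dest/graphite-config-$stamp.tar.gz" /etc/carbon /etc/graphite
}
# called towards the end of the nightly cron backup script:
# backup_graphite
```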

Now the backup script has run, and the resulting tar.gz file has been checked for completeness. If you'd like more information on mounting the NFS share from the NAS, or on AutoFS, check out the Debian Wiki.

Fetch data via the render API

Now everything is up and running it is possible to fetch data via the Graphite render API. By default Carbon automatically adds some metrics to the time series database about the local machine and statistics about the Carbon cache. Let's query the machine's CPU usage via the render API, first fetching the last hour of data as JSON output.
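For example, building the render API URLs in Python (this assumes graphite-web is running on localhost:8000 as started earlier, and that `carbon.agents.*.cpuUsage` is one of Carbon's self-reported metrics; the exact metric path can vary by version):

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "http://localhost:8000/render"

def render_url(target, frm="-1h", fmt=None):
    """Build a render API URL for a target and time window."""
    params = {"target": target, "from": frm}
    if fmt:
        params["format"] = fmt
    return f"{BASE}?{urlencode(params)}"

# Last hour of Carbon's own CPU usage metric as JSON
json_url = render_url("carbon.agents.*.cpuUsage", fmt="json")
# The same series rendered as a PNG (PNG is the default format)
png_url = render_url("carbon.agents.*.cpuUsage")

# Against a live server:
# data = urlopen(json_url).read()
# open("test.png", "wb").write(urlopen(png_url).read())
print(json_url)
```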

Hopefully you should have a CPU usage graph similar to the one below output into test.png.

Conclusion

Installing Graphite on my existing Debian webserver was pretty quick and painless and it works well. Hopefully the above gives you a leg up compared to storing your IoT sensor submissions in RRDTool or a MySQL DB. Unfortunately it's probably out of scope to do clustering in my configuration, as I don't really want several servers running in my house due to the heat, noise, power usage etc. The Graphite web UI is useful for quickly viewing the data and allows saving favourites, so you can probably get away without having Grafana running. Next I'll work on integrating my sensors and making some dashboards using the render API.

As you have probably guessed from my previous blog posts, I am slowly pimping out (or in the words of the missus, making a mess of) my house with a variety of Internet of Things devices such as Philips Hue, various sensors and so on. After discussing IoT devices with a work colleague the other day, he showed me screenshots of his fancy power monitoring solution and gave me a link to the people who make the device he was using, Current Cost, so I promptly ordered a refurbished unit from the Current Cost shop on eBay. I went for the ENVI, a slightly older model, but at the bargain price of £20 including delivery.

Today the ENVI arrived and I was eager to get it up and running. Inside the box is a CT clamp with an attached battery operated transmitter box, the display base unit and a power adapter for powering the base unit. The base unit has USB serial support for getting an XML feed from the device; however, I was curious to see if I could get the power readings without the USB cable, as it uses a custom connector and the only cables I could find were selling for £12 on eBay. According to the compliance documentation for the product, the CT clamp unit transmits over 433MHz. Sounds like an SDR job to me. Installation was easy: I attached the CT clamp to the positive inlet cable in the meter cupboard, after the meter and main supply fuse, paired it with the display unit using the included instructions, and within a minute I had a frequently updating display of the current power draw in my house.

Next, let's see if we can intercept some messages from the CT clamp to the display unit using the SDR. I used my trusty NooElec NESDR Mini 2+ with the supplied out of the box antenna for the job. I knew from the Current Cost documentation that the display updates every 6 seconds, so let's have a look around the 433MHz frequency for something with a 6 second interval. Here on the waterfall at 433.92MHz we see lines which appear roughly every 6 seconds, with a small red segment which we can assume is the packet from the CT clamp, especially as the display updates each time one of these packets appears on the waterfall.

Now we have found the assumed frequency, let's try passing it to rtl_433 on the off chance that someone has already reverse engineered the protocol and made a decoder in rtl_433 for Current Cost devices. First I had to boot up my Linux VM and pass the RTL USB device through to the VM, as I have never managed to get any librtlsdr stuff to compile on my Mac. I then installed rtl_433 as follows; you'll need cmake, gcc and librtlsdr-dev etc. installed prior to compiling rtl_433.
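The build is roughly the usual CMake out-of-source dance against the upstream repo (check the project README for the current steps):

```shell
git clone https://github.com/merbanan/rtl_433.git
cd rtl_433 && mkdir build && cd build
cmake .. && make && sudo make install
```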

It seems to have the necessary decoding support implemented and does the job great with no further configuration. Interestingly, even when unplugging the paired base station the CT clamp unit appears to keep sending out readings. I guess the pairing is very simplistic: the base station just remembers the clamp's Device ID and listens out only for packets from that clamp. Great, I can unplug the base station and put it back in the box… Next step is to write a simple Python wrapper around rtl_433 to submit the power consumption metrics back to my sensors API.

A few weeks ago I decided I'd like to compare the results of Radio Mobile Online with my actual LoRa coverage using TTN Mapper, and although I have not walked all the roads within the coverage prediction, I have walked the roads around my house and the results are quite underwhelming so far. Here is a very primitive overlay of my TTN Mapper results vs the shading from Radio Mobile Online.

You can quite clearly see the difference between the Radio Mobile Online and TTN Mapper map types, and the area where the coverage stops in real life is far inferior to the Radio Mobile Online prediction. There are a number of reasons for this…

a) Radio Mobile Online only takes into account terrain, not buildings, and unfortunately my house is walled in by other housing and apartment buildings.

b) Radio Mobile Online probably assumes you are using a proper antenna on a proper mount outside of the building; unfortunately my LoRa gateway uses a bad home made antenna inside a window on the ground floor.

c) I am unsure of the resolution of the terrain data in Radio Mobile Online. It is safe to say I live in a small dip in the landscape, very slight but significant enough to affect coverage in my opinion; if the resolution of the Radio Mobile Online terrain data is not fairly high it'll probably miss the fact my house is lower than the surrounding terrain.

d) My tests were only using SF7 as I only have an el cheapo single channel, single spread factor receive only gateway.

All in all quite disappointing compared to the results from others who have had antennas on top of hills with clear line of sight over many miles, but not really a bad thing as all of my LoRa devices are in my house anyway. Hopefully after I move soon we should see some much nicer results (my new house is on top of a hill). For sure though, unless you have perfect conditions, I would not take the Radio Mobile Online predictions too seriously when working out your LoRa coverage.

Out of curiosity I have been trying to establish the range at which my LoRa gateway can receive packets. First off I used the tools from Radio Mobile Online, which allow you to determine coverage estimates based on the antenna positioning details and the height profile of the area around your antenna; you can give it a go yourself over at http://www.cplus.org/rmw/rmonline.html.

I was quite surprised to see a range of a kilometre or so from my house, considering I live in a small dip in the landscape, and thought the estimate was unlikely. Therefore I wanted to try and validate the coverage for myself. However, I have no GPS module for Arduino, so I either had to buy a GPS sensor, get it to periodically send its location over LoRa and go for a walk with the device powered on, or use a complex arrangement of tracking my position on my smartphone and trying to match my position to the timestamps of the packets received by the gateway. This obviously would take some time to set up, so I searched to see if anyone else had tried mapping LoRa networks, and discovered ttnmapper.org, a community effort to map The Things Network coverage.

To use TTN Mapper you need a LoRa node which periodically sends a packet to The Things Network via a gateway. You then walk with this device powered up and your smartphone running the TTN Mapper app; the app listens for packets from your node via MQTT and, when it receives one, submits the received signal strength, receiving gateway and GPS location to the TTN Mapper API, which publishes the coverage data on the TTN Mapper website for everyone to view. Looking at the Netherlands and Switzerland on these maps shows how popular LoRa is in those countries, and some cities have very good coverage indeed.

Here is the coverage map for Zurich for example…

All in all pretty good mapping for a community driven LoRa “war driving” effort so to speak.

Anyway, what about the coverage for my own gateway? Well, I realised I need to somehow power the LoRa node on the move, and due to the laws surrounding duty cycle I will have to rate limit my mapping, so it may take some time and walking slowly around the local area to get decent resolution. I am off to charge up the largest USB power bank I own, and tomorrow I will hit the road in a brief stroll around the neighbourhood to test out how TTN Mapper performs and get an idea of what my coverage looks like. I will then compare the results with the mapping from Radio Mobile Online to see if there are any similarities. Keep an eye on my blog over the next few days for the results.

Recently I have been working on a project to provide a File Intelligence API within the OpenStack ecosystem (OpenStack Nemesis) which can be used to find out information about a given file based on its hashes. The plan is to support multiple pluggable backends for processing files which are unknown to the system, providing a verdict on whether the file is malicious plus other data about the file, such as its type, the contents of the image if it's an image, a summary of the document if it's a document, and so on.

As part of the process it has been imperative to choose the first pluggable backend for providing file analysis. Whilst long term it is hoped there will be multiple backends developed for a range of software, including commercial sandboxes and scanners, it struck me as important for the first available backend to be open source and freely available, so people can experiment with the API without having to commit financially by more than a few virtual machines. Initial experimentation with ClamAV, Cuckoo Sandbox and a selection of Python libraries for discovering MIME types, header details etc. showed that a combination of these approaches can be used to construct meaningful metadata about a given file. Cuckoo in particular is very impressive: with some minor modification it was relatively easy to get Cuckoo to run its executions inside of Nova and communicate over a Neutron tenant network. Post execution, the software analyses the collected artifacts against a bunch of community contributed signatures, suggests a score for how malicious a file is, and provides enough information to categorise it somewhat based on its characteristics.

Let's take a look at Cuckoo. First, Cuckoo was installed on a Nova instance as per the documentation at http://docs.cuckoosandbox.org/en/latest/. The Cuckoo server was configured with 2 interfaces, one on the public network and another on a tenant network where the executions would happen. The Cuckoo server acted as NAT for the tenant network configured on the second interface so the execution slaves could access the internet (via Cuckoo rooter); none of the execution slaves had direct public network access. The execution slave was built as per the Cuckoo documentation and then imaged so further instances could be built, and rebuilt to a known state at the end of each execution. Out of the box Cuckoo does not support Nova for providing virtual machines for execution, so for initial experimentation a custom machine type was monkey patched in as a proof of concept. However, long term this needs a more major rework to enable Cuckoo to autoscale executions; in its current form only one image can be used on a fixed number of Nova instances.

You can see my monkey patched code in my fork of Cuckoo here – https://github.com/robputt796/cuckoo/tree/nova_machinery. If you'd like to give this a go, I would highly suggest comparing the changes to the upstream master branch of Cuckoo and rebasing the machinery addition onto the current master.

Now that the Nova machinery is patched into Cuckoo and the Cuckoo server and execution slave instances are up and running, let's try running some file analysis and see what comes up. It should be mentioned that the results may be highly variable, and this is by no means a sure way to identify malware with high accuracy. Malware commonly uses various anti-sandbox techniques, such as refusing to run in a virtual machine, refusing to run if the host has unusual usage patterns (e.g. being very new, not having commonly installed applications, or showing no signs of a physical user doing stuff on the machine), or simply sleeping for longer than a typical analysis window. Some steps can be taken to alleviate these, such as ensuring the host looks used, installing common software such as Microsoft Office and Adobe Reader (you'll probably want to execute those file types anyway to see if there are any nasty payloads hidden in them, so your execution slaves will need them regardless), or running the execution slave on a physical machine (Ironic can come in handy here). Long term it will be up to the deployer of Nemesis which pluggable backends they would like to use and how those backends are configured.

For the test, samples were submitted via the Cuckoo CLI tool and the metadata and artifacts reviewed in the web UI. Long term, the plan is to upload the artifacts to Swift for storage and pass the metadata, along with metadata from other scanners and file analysis tools, to the Nemesis API so it can be queried. Let's submit some samples; first up, let's send it a benign executable like PuTTY.
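Alongside the CLI tool, Cuckoo also ships a small REST API (api.py) which is the more natural integration point for something like Nemesis. Here is a minimal sketch of submitting a sample over HTTP; the host name is a made-up placeholder and I'm assuming the API is on its default port, so adjust for your own deployment.

```python
# Sketch only: submit a file to Cuckoo's REST API (api.py).
# The host below is a hypothetical placeholder, not a real server.
import json
import urllib.request

CUCKOO_API = "http://cuckoo.example.local:8090"  # assumed default api.py port


def submit_url(base):
    """Cuckoo exposes file submission at /tasks/create/file."""
    return base.rstrip("/") + "/tasks/create/file"


def submit_sample(path, base=CUCKOO_API):
    """POST a file as multipart/form-data and return the queued task id."""
    boundary = "----cuckoo-sample"
    with open(path, "rb") as fh:
        sample = fh.read()
    body = (
        ("--%s\r\n" % boundary).encode("ascii")
        + b'Content-Disposition: form-data; name="file"; filename="sample.bin"\r\n'
        + b"Content-Type: application/octet-stream\r\n\r\n"
        + sample
        + ("\r\n--%s--\r\n" % boundary).encode("ascii")
    )
    request = urllib.request.Request(
        submit_url(base),
        data=body,
        headers={"Content-Type": "multipart/form-data; boundary=%s" % boundary},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["task_id"]
```

The API replies with a JSON body containing the task id, which you could then poll for the report once the execution finishes.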

Ok, so what results came back for PuTTY? Well, it scored 1.2/10 on the maliciousness signature score in Cuckoo, suggesting it is “potentially malicious”. Let's have a look at the signatures which were picked up from the execution artifacts…

Here we can see there are a few signatures for PuTTY and some screenshots of the software running in the execution environment. The signatures are as follows:
(information): The executable is signed.
(warning): Potentially malicious URLs were found in the process memory dump.
(critical): PuTTY Files, registry keys or mutexes detected.

These signatures may surprise you: of course PuTTY contains stuff related to PuTTY, but why is this dangerous? It can only be assumed a fair bit of malware out there uses PuTTY for communication of some kind, and who can blame them? SSH is a good protocol; it's encrypted and allows tunneling of traffic without complicated setup. Overall, though, the signatures appear to be fairly accurate and the score seems justified. Now let's try executing something far more dangerous via Cuckoo: malware flavour of the week, WannaCrypt.

Now to check out what Cuckoo thought of WannaCrypt. It came in with a maliciousness score of 9/10 and the following signatures…

Ok, this has a whole load of warnings and critical signatures including installing Tor, deleting shadow copy, changing over 500 files on the system (as it encrypts them), listening on multiple ports, delaying execution, poking WMI and network interfaces and having a high level of entropy. It screams of malicious behaviour, and now for the screenshots of the execution environment running WannaCrypt.

As you can see, you can learn a lot from the execution of a file, and this sort of analysis would be very advantageous to a file intelligence API for catching emergent threats which are not yet identified by typical AV scanners. Of course, with this approach there is a chance of high false positive and false negative rates, so one should be wary and keep this in mind.

Continuing with my mission to convert my ESP8266 based sensors into LoRaWAN enabled sensors, I pushed on with getting a DHT22 on a breadboard and writing the code to submit temperature and humidity over LoRa.

If you’d like to replicate this mini project you’ll need the following:

First I hooked up the SX1276 module to my Arduino Uno as per my previous post, and connected the DHT22 to 5v VCC, ground and pin 7 via the relevant DHT22 pins. The pins on the DHT22 can be identified by looking at the meshed plastic side of the device: the pin on the left is VCC, followed by data, a not-connected pin, and ground.

Next I moved onto the code. I started by merging the DHT22 and LMIC samples together, however upon uploading the sketch it became apparent it was too big for the 32kb of flash on the ATMega328-PU. I continued by shrinking the code, removing some of the serial output and unused remnants of the LMIC example. This allowed me to fit the sketch on the Arduino (just!) but also made debugging super hard, so I turned to TheThingsNetwork Slack channel to discuss an alternative strategy.

As a result Matthijs Kooijman came forward with his fork of LMIC, which includes the ability to use a more lightweight AES implementation that, although slower, is 8kb smaller. To replace the original LMIC from IBM installed by the Arduino IDE, download Matthijs's fork from https://github.com/matthijskooijman/arduino-lmic and copy and overwrite the files in your Arduino library's LMIC folder. The sketch then compiled to a much more acceptable 23kb, leaving plenty of room to add back in the debug serial prints.

Here are the results of my efforts… a fully working DHT22 and LMIC combo which periodically transmits temperature and humidity readings over LoRa. Remember to set your network key, application key and device ID in this code before uploading it to your board.

Now to see it in action! With everything connected up I placed the sensor in range of a TTN LoRa Gateway and powered it up. Now to check TheThingsNetwork and see what kind of data is coming from the device…

Ok, so you are probably thinking wtf is that payload. As you know, with LoRa you need to keep the payload as small as possible to reduce airtime and hence duty cycle usage (a legal restriction in the EU). I decided on just one decimal place for my measurements, so what you see here is the hex of the temperature and humidity floats appended to each other: 31 34 2E 30 is the temperature and 35 34 2E 37 is the humidity.

OK, looks like the temperature is 14.0°C and relative humidity is 54.7%, which sounds about right considering the device is next to an open window in the shade, and given the current weather today.
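The decoding above is easy to script. A minimal sketch in Python, assuming the payload is always two four-character ASCII floats packed back-to-back as described:

```python
# Decode the ASCII-float payload described above: hex -> text -> two floats.
def decode_payload(hex_payload):
    text = bytes.fromhex(hex_payload).decode("ascii")  # e.g. "14.054.7"
    temperature = float(text[:4])  # first four characters are the temperature
    humidity = float(text[4:])     # the remainder is the relative humidity
    return temperature, humidity


print(decode_payload("31342E3035342E37"))  # -> (14.0, 54.7)
```

A fixed-width encoding like this keeps the decoder trivial, at the cost of wasting a few bytes compared to packing the readings as binary integers.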

Next steps will involve moving the electronics onto a more permanent board (probably veroboard / stripboard), mounting the device in a small enclosure, and subscribing to TTN's MQTT broker to publish the results to a fancy dashboard.
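For the MQTT step, TTN delivers uplinks as JSON with the raw LoRa payload base64-encoded (in the `payload_raw` field in the console's data format). A sketch of the handler logic, using a hand-built example message rather than a live broker connection, and assuming the same four-characters-each encoding as above:

```python
# Sketch: handling a TTN-style uplink message. The message below is a
# hand-made example in the shape TTN publishes; payload_raw is the base64
# of the raw LoRa payload ("14.054.7" in this case).
import base64
import json

uplink = json.loads('{"dev_id": "dht22-node", "payload_raw": "MTQuMDU0Ljc="}')

raw = base64.b64decode(uplink["payload_raw"]).decode("ascii")
temperature, humidity = float(raw[:4]), float(raw[4:])
print(uplink["dev_id"], temperature, humidity)  # -> dht22-node 14.0 54.7
```

The same parsing would sit inside the on-message callback of whatever MQTT client library ends up feeding the dashboard.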

Another LoRa transceiver module finally arrived from China today. Now that I have two of them, I decided to crack on with building a LoRa push button using an Arduino Uno and the LoRa module. Eventually this will be developed into a small circuit on stripboard consisting of the LoRa module, an ATMega328-PU and a DHT22, to make my first LoRa powered temperature and humidity sensor and replace the ESP8266 based sensors I have today.

First steps, as per all the other LoRa modules, was to solder on some cables with appropriate headers at the other end. For this purpose I used a ribbon of male to male dupont style cables, chopped the connectors off one end, tinned the wires and soldered them to the VCC, GND, MISO, MOSI, NSS, SCK, RESET, DIO0 and DIO1 pads of the module. This module is an unbranded SX1276 module I found on eBay, which is more cost effective than the RFM95W used for the single channel gateway; the downside is it has smaller pads than the RFM95W, so soldering is a little more tricky, but with practice and a fine tip soldering iron you can achieve an acceptable result. Here is a chance for you to laugh at my soldering again…

Next I hooked the module up to an Arduino Uno as per the following pins… You shouldn't really do this, because the logic pins of the Uno are 5v and could damage the module, but I figured for a quick test it would be ok, and if I fried it they're cheap enough. When mounting this permanently on a DHT22 sensor board I will use voltage dividers to bring the 5v ATMega328-PU logic down to 3.3v for the module.

SX1276 Pin → Arduino Uno Pin
VCC → 3.3v
GND → GND
SCK → 13
MISO → 12
MOSI → 11
NSS → 10
RESET → 9
DIO0 → 2
DIO1 → 3
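As a quick sanity check on the level shifting mentioned above, the output of a two-resistor divider is just Vout = Vin × R2 / (R1 + R2). The resistor values below are my own illustrative choice, not taken from any particular wiring:

```python
# Voltage divider: R1 from the 5 V logic line to the junction, R2 from the
# junction to ground; the module's input connects at the junction.
def divider_out(vin, r1, r2):
    """Voltage at the junction of a two-resistor divider."""
    return vin * r2 / (r1 + r2)


# Example values: 1k on top, 2k to ground drops 5 V logic to ~3.33 V.
print(round(divider_out(5.0, 1000, 2000), 2))  # -> 3.33
```

Any pair with the same 1:2 ratio gives the same output; larger values waste less current, at the cost of a slower edge into the pin capacitance.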

Next I plugged the Uno in to my laptop and launched the Arduino IDE. The intention is to send the packets to The Things Network via my single channel gateway, hence we need to find a LoRaWAN library. The most common appears to be the IBM LMIC Framework, go to Sketch -> Include Library -> Manage Libraries to download LMIC to your Arduino IDE.

Once the library is installed, open the IBM LMIC Framework TTN example from File -> Examples. Don't upload it to your board yet; there are a few mods we need to make first. Open up your browser and visit https://console.thethingsnetwork.org, and sign up if you do not already have an account. Once in the console go to the Applications tab and create a new application. Then go to the Devices tab of the application and register a new device; as over the air activation is not available for these sorts of modules via a single channel gateway, just fill out the form with dummy data and click register.

Once you get to the device information page shown above, click on Settings to begin the device personalisation. On the settings page change the device activation method from OTAA to ABP and click save; the page will refresh and the device's address, network key and application key will be generated for you. Copy these into your sketch in MSB format; you can get the keys in MSB format by clicking the ‘<>’ button next to their text box. You should end up with something that looks like this…
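The ‘<>’ button simply reformats the hex key as the C byte-array initialiser the sketch expects. For illustration, the same conversion in a few lines of Python (the key below is a dummy value, not a real device address):

```python
# Format a TTN key/address (hex string, MSB order) as the C initialiser
# the LMIC example sketch expects, e.g. { 0x26, 0x01, ... }.
def to_c_array(hex_key):
    octets = ["0x" + hex_key[i:i + 2].upper() for i in range(0, len(hex_key), 2)]
    return "{ " + ", ".join(octets) + " }"


# Dummy 4-byte device address, purely for illustration.
print(to_c_array("26011a2b"))  # -> { 0x26, 0x01, 0x1A, 0x2B }
```

Handy if you ever need to re-key a batch of devices without clicking through the console for each one.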

Scroll down the code a bit further until you get to the pin mapping comment. Modify the NSS pin to 10, the RST pin to 9, and the DIO pins to 2 and 3, matching the wiring above.

Next keep scrolling until you find the channel setup statements, and comment out all the channels other than the one served by your single channel gateway; in my case, all channels except 868.1MHz. If you are in range of a full 8 channel gateway you can skip this step.

Now you are ready to upload the sketch to the board. Once it boots up it should transmit a packet saying “Hello, world!” every 60 seconds; however, the LMIC library does observe duty cycle limitations, so transmissions may be delayed as appropriate. Before you hit the upload button, open TheThingsNetwork console, go to your application, select the device you created and open the Data tab. Now upload your sketch. Providing you are in range of a TheThingsNetwork LoRaWAN gateway you should see packets arriving in the console; if not, you can use the Arduino IDE serial monitor to look for debug messages from the Arduino (you'll need to connect at a baud rate of 115200). Once you see the packets in your TheThingsNetwork console you know everything is working as expected, and you can expand the functionality to initiate the do_send function upon some other trigger, e.g. a button press, rather than the timed callback set on line 105.

You can expand each packet to show metadata about the packet, for example which gateway forwarded it and the signal strength. The payload is presented as hex, but if you convert it back to ASCII the original contents of mydata in the Arduino sketch can be read.

echo "48656C6C6F2C20776F726C6421" | xxd -r -p
Hello, world!

Great, all seems to be working just fine. Next I will move on to combining this with periodic polling of a DHT22 and moving the circuit to veroboard with a standalone ATMega328-PU rather than having the bulk of the Uno. I may also do some range tests to see how far away I can position a sensor; however, as my area has a lot of buildings and not many hills, I think the range will be quite small compared to some of the multi-kilometre range tests we see on YouTube for LoRa. Stay tuned for updates.

Proof it works at least. Of course we do not know what the data contains, as it is encrypted with another The Things Network user's application key, but at least somebody nearby is being heard by my gateway, and hopefully their packets are being forwarded on to their application via The Things Network. Oh well, hopefully only another couple of weeks before my LoRa transceiver shipment from China arrives, so we can start converting my current ESP8266 based sensors into LoRa equivalents and see the end to end process.