Friday, December 30, 2016

Automatically recognising digital modes with machine learning

My favourite digital modes are PSK31 and WSPR, both on 20m, but there are a large number of other modes. Recently, while tuning around, I saw the mode pictured on the right. Despite reviewing some excellent sites that show all the modes, with both waterfall images and audio recordings, I was unable to decode the signal using fldigi.

Machine learning has improved enormously in the past few years, and the ability of trained models to recognise new images as being things like a cat or a sunset is amazing.

It might be possible to train a neural network with a collection of screen shots from the waterfall of each digital mode so that a new screen shot could be automatically identified.

An internet search for existing software that does this turned up something that looked hopeful: a Windows application called Artemis: Free Signal Identification Software. However (after navigating through a truly evil free-hosting Windows malware attempt), the downloaded utility is just a GUI for searching the collection of waterfall images, so the user must still decide.

Google has open sourced TensorFlow, a system that can be trained with sample images and will then classify a new image for you. They ship a pre-trained model called Inception v3 that has been trained on 1,000 different classes of images from ImageNet.

The tutorial shows how to re-train this model with additional flowers that it doesn't know, including daisy, dandelion, roses, sunflowers and tulips. Here are some of the sample daisy images.

Thanks to docker, it's very easy to get TensorFlow running. The sample images are in a directory that is mounted as a volume in the docker container.

After getting all this to work, and recognising flowers very reliably, I captured and hunted down sample images of two digital modes, BPSK and RTTY. I chose these two because they are common and also rather similar to the eye. Here are some of my PSK sample images.
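As I understand the retraining tutorial, it expects one sub-folder per label, with the folder name used as the label. A quick way to see what the training script will find is to count the images under each label before starting. This is only a sketch: the helper name `count_images_per_label` is made up for illustration, and it assumes the samples are saved as JPEG files.

```python
import os

# Assumption: samples are JPEG files, one sub-folder per label (e.g. psk/, rtty/).
IMAGE_EXTENSIONS = (".jpg", ".jpeg")


def count_images_per_label(image_dir):
    """Return {label: number of image files}, one label per sub-folder."""
    counts = {}
    for label in sorted(os.listdir(image_dir)):
        label_dir = os.path.join(image_dir, label)
        if not os.path.isdir(label_dir):
            continue  # ignore stray files next to the label folders
        counts[label] = sum(
            1
            for name in os.listdir(label_dir)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_images.py /path/to/sample_images
    if len(sys.argv) > 1:
        for label, number in count_images_per_label(sys.argv[1]).items():
            print(label, number)
```

Running it against the folder that is mounted into the docker container gives one line per mode with its image count.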

One trap to note is that you need a decent number of sample images, 30 to 40 or more, or you'll get this mysterious error during training:

CRITICAL:tensorflow:Label rtty has no images in the category validation.
Traceback (most recent call last):
  File "tensorflow/examples/image_retraining/retrain.py", line 1012, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "tensorflow/examples/image_retraining/retrain.py", line 839, in main
    bottleneck_tensor))
  File "tensorflow/examples/image_retraining/retrain.py", line 480, in get_random_cached_bottlenecks
    bottleneck_tensor)
  File "tensorflow/examples/image_retraining/retrain.py", line 388, in get_or_create_bottleneck
    bottleneck_dir, category)
  File "tensorflow/examples/image_retraining/retrain.py", line 245, in get_bottleneck_path
    category) + '.txt'
  File "tensorflow/examples/image_retraining/retrain.py", line 221, in get_image_path
    mod_index = index % len(category_list)
ZeroDivisionError: integer division or modulo by zero

In the end I got past this by simply duplicating my sample images, which of course doesn't improve recognition but does get past the fatal error. It is quite hard to find 40 sample images of a digital mode.
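The CRITICAL line and the failing `index % len(category_list)` suggest that the script splits each label's images into training, testing and validation lists, and that the validation list for rtty ended up empty. The sketch below shows how that can happen, assuming each file is assigned to a split by hashing its file name, with roughly 10% each going to testing and validation. Those details are assumptions about retrain.py rather than something confirmed here, and all function names are made up for illustration.

```python
import hashlib


def assign_split(file_name, validation_pct=10, testing_pct=10):
    """Place a file into a split based on a hash of its name (assumed scheme)."""
    digest = hashlib.sha1(file_name.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable value in 0..99 for this name
    if bucket < validation_pct:
        return "validation"
    if bucket < validation_pct + testing_pct:
        return "testing"
    return "training"


def split_files(file_names, validation_pct=10, testing_pct=10):
    """Group file names into training, testing and validation lists."""
    splits = {"training": [], "testing": [], "validation": []}
    for name in file_names:
        splits[assign_split(name, validation_pct, testing_pct)].append(name)
    return splits


def empty_splits(splits):
    """Return the names of any splits that received no files."""
    return [name for name, files in sorted(splits.items()) if not files]


def pick(files, index):
    """Same shape as the failing line: modulo by the length of the list."""
    return files[index % len(files)]  # ZeroDivisionError when files is empty


if __name__ == "__main__":
    # Hypothetical file names; with only a handful, a split can easily be empty.
    names = ["rtty_%d.jpg" % i for i in range(8)]
    print(empty_splits(split_files(names)))
```

With only about one file in ten expected to land in validation, a small collection can easily leave that list empty. Under the same assumption, a duplicate saved under a new file name gets its own hash and so its own chance of landing in validation, which would explain why duplicating images gets past the error.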

Identifying images of modes

As a first test I fed the system two images which were part of the training images.

I think there is a good prospect of using machine learning image recognition for guessing digital modes. Ideally this would be built into clients, but it might also make a good app (using the phone camera to capture the unidentified signal) or a web site where you upload a screen shot.

The main thing I need to expand this is lots of sample waterfall images.

There's some interesting discussion in a thread in the Reddit amateur radio subreddit.

4 comments:

Annoyingly, there is a nice 'standard' for identifying what mode you're about to transmit: RSID. Fldigi, along with a few other data mode programs, supports it. Unfortunately, it's not widely used, leading to the situation you find yourself in.

I suspect a machine learning algorithm like you're talking about may help to distinguish between the 'base' modulation schemes (PSK, QPSK, FSK, MFSK), but won't help with the sub-modes of them.

For example, how would you distinguish something like DominoEX from THOR? They look exactly the same on a waterfall, but one has FEC and interleaving...

Let's make RSID "ON" the default for most modes in HRD, fldigi, Multipsk, etc.

Also, a brute-force approach to computer decoding is perhaps easier to code, since it is easy for an ASCII system to "recognize" clear text and Q codes in the decoded stream. Either all modes are tried, or a reduced set is chosen for trial from unique features such as bandwidth, modulation and baud rate. Clear-text decodes from many modes can then be easily defined and scripted...
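The scoring half of that brute-force idea might look something like the sketch below. It assumes some other program has already tried to decode the same audio in each candidate mode and produced a text string per mode; that decoding step is not shown. The token list, the weighting and the function names are all illustrative guesses, and N0CALL is a placeholder callsign.

```python
import string

# Assumption: a short, illustrative list of tokens common in amateur radio text.
COMMON_TOKENS = {"CQ", "DE", "QTH", "QSL", "QRZ", "QSO", "RST", "73"}
# Characters expected in a clear-text contact.
ALLOWED = set(string.ascii_uppercase + string.digits + " .,?/-\r\n")


def plausibility(text):
    """Share of expected characters, plus a small bonus per known token."""
    if not text:
        return 0.0
    upper = text.upper()
    clean = sum(ch in ALLOWED for ch in upper) / len(upper)
    hits = sum(word.strip(".,?") in COMMON_TOKENS for word in upper.split())
    return clean + 0.1 * hits


def best_mode(decodes):
    """decodes maps mode name -> decoded text; return the most plausible mode."""
    return max(decodes, key=lambda mode: plausibility(decodes[mode]))


if __name__ == "__main__":
    # Hypothetical decoder output for one recording tried in two modes.
    candidates = {
        "BPSK31": "CQ CQ DE N0CALL",
        "RTTY": "x#~@!{}|^%",
    }
    print(best_mode(candidates))
```

A wrong-mode decode tends to produce punctuation and control characters, so even a crude score like this separates it from readable text. It does nothing for the sub-mode problem raised above, where the right decoder is needed before any text appears.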