Author: alok760

Handling Planned Actions for the SUSI Smart Speaker

Planned actions are the latest feature added to the SUSI Smart Speaker. The user can now set timed actions such as alarms and countdown timers. For example, the user can say “SUSI, set an alarm in one minute”, and after one minute the smart speaker will notify them.

The following flowchart represents the workflow for the working of a planned action:

Planned Action Response

The SUSI Server accepts the planned action query and sends a multi-action response which looks like this:

Here we can see that the server response contains two actions. The first action is of the type “answer” and is executed by the SUSI Linux client immediately. The second action carries the `plan_date` and `plan_delay` keys, which tell the busy state of the SUSI Linux client that this is a planned action; it is then sent to the scheduler.

Parsing Planned Action Response From The Server

The SUSI Python wrapper is responsible for parsing the response from the server and making it understandable to the SUSI Linux client. SUSI Python has classes which represent the different types of actions possible. It takes all the actions sent by the server and parses them into objects of the corresponding action types. To enable handling of planned actions, we add two more attributes to the base action class: `planned_date` and `planned_delay`.

If the action object has a non-None value for the planned attributes, the action object’s values are added to a planned actions list.
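The parsing step described above can be sketched roughly as follows. This is a minimal illustration, not the actual SUSI Python code: the `BaseAction` class, the `split_actions` helper and the sample payload are all hypothetical, and `plan_delay` is assumed to be in milliseconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BaseAction:
    """Hypothetical base action carrying the two planned attributes."""
    type: str
    expression: str = ""
    planned_date: Optional[str] = None
    planned_delay: Optional[int] = None  # assumed to be milliseconds


def split_actions(raw_actions):
    """Parse raw action dicts and separate immediate from planned ones."""
    immediate, planned = [], []
    for raw in raw_actions:
        action = BaseAction(
            type=raw.get("type", ""),
            expression=raw.get("expression", ""),
            planned_date=raw.get("plan_date"),
            planned_delay=raw.get("plan_delay"),
        )
        # A non-None planned attribute marks the action as planned.
        if action.planned_date is not None or action.planned_delay is not None:
            planned.append(action)
        else:
            immediate.append(action)
    return immediate, planned


# Made-up payload for illustration only, not a real server response.
sample_actions = [
    {"type": "answer", "expression": "alarm set for in one minute"},
    {"type": "answer", "expression": "alarm", "plan_delay": 60000,
     "plan_date": "2019-08-19T22:28:44.283Z"},
]
immediate, planned = split_actions(sample_actions)
```

Keeping the planned actions in a separate list lets the client execute the immediate answer right away and hand the rest to the scheduler.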

Listening to Planned Actions in the SUSI Linux Client

In the busy state, we check whether the payload coming from the idle state is a query or a planned response coming from the scheduler. If the payload is a query, it is sent to the server; otherwise the payload is executed directly.

If the payload was a query and the server replies with a planned action response, the server response is sent to the scheduler:

    if 'planned_actions' in reply.keys():
        for plan in reply['planned_actions']:
            self.components.action_schduler.add_event(int(plan['plan_delay']) / 1000, plan)

The scheduler then schedules the event and sends the payload to the idle state after the required delay. To trigger planned actions, we implemented an event-based observer using RxPy. The listener resides in the idle state of the SUSI state machine.

    if self.components.action_schduler is not None:
        self.components.action_schduler.subject.subscribe(
            on_next=lambda x: self.transition_busy(x))

On receiving an event, the observer in the idle state sends the payload to the busy state, where it is processed. This is done by the `transition_busy` method, which uses the `allowedStateTransitions` method.
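To make the scheduler and observer interplay concrete, here is a small self-contained sketch. It deliberately avoids RxPy and uses a tiny hand-written `Subject` with `threading.Timer` instead, so it is a stand-in for the idea rather than the project's implementation. The `ActionScheduler` class and the free-standing `transition_busy` function are hypothetical.

```python
import threading


class Subject:
    """Tiny stand-in for an Rx subject: fans each event out to subscribers."""

    def __init__(self):
        self._callbacks = []

    def subscribe(self, on_next):
        self._callbacks.append(on_next)

    def on_next(self, value):
        for callback in list(self._callbacks):
            callback(value)


class ActionScheduler:
    """Hypothetical scheduler: emits each payload after its delay."""

    def __init__(self):
        self.subject = Subject()

    def add_event(self, delay_seconds, payload):
        timer = threading.Timer(delay_seconds, self.subject.on_next, args=(payload,))
        timer.daemon = True  # do not keep the process alive for pending events
        timer.start()
        return timer


received = []
done = threading.Event()


def transition_busy(payload):
    # Stand-in for the idle state's handler that moves to the busy state.
    received.append(payload)
    done.set()


scheduler = ActionScheduler()
scheduler.subject.subscribe(on_next=transition_busy)

# plan_delay is in milliseconds, so it is divided by 1000 as in the client.
plan = {"plan_delay": 50, "expression": "alarm"}
scheduler.add_event(int(plan["plan_delay"]) / 1000, plan)
done.wait(timeout=2)
```

The subscriber only sees the payload once the timer fires, which mirrors how the idle state stays passive until the scheduler emits an event.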

In previous versions of the smart speaker firmware, we had one Flask server for serving the configuration page and another Flask server for serving the control page. This led to inconsistencies in the frontend. To make the frontend and the user experience the same across all platforms, the SUSI.AI Web Client is now integrated into the smart speaker.

Now whenever the device is in Access Point mode (Setup mode), and the user accesses the web client running on the smart speaker, the device configuration page is shown.

If the device is not in Access Point mode, the normal Web Client is shown with a green bar on top saying “You are currently accessing the local version of your SUSI.AI running on your smart device. Configure now”. Clicking on the Configure now link redirects the user to the control page, where the user can change the settings for their smart speaker and control media playback.

To integrate both the control web page and the configure web page in the web client, we needed to combine the Flask servers for the control and configure pages. This was done by adding all the endpoints of the configure server to the sound server and then renaming the sound server to control server (Merge Servers).

Serving the WebClient through the control server

The next step is to add the Web Client static code to the control server and make it available on the root URL of the server. To do this, we first have to clone the web client during the installation and add the required symbolic links for the server.

When the SUSI Smart Speaker is set up for the first time it needs to be configured. After successful configuration, the smart speaker is registered with the associated account so that the user can see their smart speaker device information from the settings of their susi.ai account. There are two ways to configure the smart speaker:

After the configuration is done, the smart speaker reboots, connects to your WiFi, and registers the device with the given account using the login information provided during the setup.

Figure: Device Details are shown in the susi.ai account settings after successful configuration.

Working

The Auth Endpoint

Whenever the speaker is configured via the Android app or manually via the web interface, it uses various endpoints of the access-point-server. The /auth endpoint is used for storing login information: it writes the login details to the config.json file at /home/pi/SUSI.AI/config.json.
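What the /auth endpoint does with the file can be sketched as a plain function. This is only an illustration under assumptions: the `store_login` helper and the key names `login_email` and `login_password` are hypothetical, and the demo writes to a temporary directory rather than to /home/pi/SUSI.AI/config.json.

```python
import json
import os
import tempfile


def store_login(config_path, email, password):
    """Merge login details into the JSON config file at config_path."""
    config = {}
    if os.path.exists(config_path):
        with open(config_path, "r", encoding="utf-8") as handle:
            config = json.load(handle)
    # Key names are assumptions made for this sketch.
    config["login_email"] = email
    config["login_password"] = password
    with open(config_path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
    return config


# Use a temporary directory here instead of the real config location.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "config.json")
store_login(demo_path, "user@example.com", "secret")
with open(demo_path, encoding="utf-8") as handle:
    saved = json.load(handle)
```

Reading the existing file first means other settings already stored in the config are preserved when the login details are written.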

The ss-susi-register service is then enabled, so that it runs on the next startup and registers the device online once the device is connected to WiFi.

The SYSTEMD Registration Service

This is the service which registers the device on bootup after the configuration phase. The service waits for the network services to start, so that the registration script runs only once the device is connected to a network. This service uses register.py to register the device online.

The SUSI smart speaker supports playing local music from any USB device connected to the smart speaker. To play your favourite music directly from files, just put them in a thumb drive and plug it into any one of the four USB ports on the smart speaker. SUSI can either play all songs from the USB device or songs from a specific artist, genre or album.

Working

The first step is to automount the thumb drive on the smart speaker; the usbmount package is used for this. Once the drive is mounted, local skills are created, which the SUSI server then uses to interpret voice commands related to offline music playback.

The server is enabled to work with offline skills stored on the device. The new skills related to offline music playback are stored in /susi_server_data/generic_skills/media_discovery in a file named custom_skill.txt.

    echo "Preparing USB automount"
    # systemd-udevd creates its own filesystem namespace, so mount is done, but it is not visible in the principal namespace.
    sudo mkdir /etc/systemd/system/systemd-udevd.service.d/
    echo -e "[Service]\nPrivateMounts=no" | sudo tee /etc/systemd/system/systemd-udevd.service.d/udev-service-override.conf

First, an override is added which sets `PrivateMounts` to `no` for the systemd-udevd service, overriding the value in /lib/systemd/system/systemd-udevd.service. If `PrivateMounts` is set to `yes`, the processes of the unit run in their own private file system (mount) namespace, with all mount propagation from the processes towards the host’s main file system namespace turned off. This means any file system mount points established or removed by the unit’s processes are private to them and not visible to the host. To learn more about mount namespaces, read http://man7.org/linux/man-pages/man7/mount_namespaces.7.html

Next, whenever a device is mounted the 01_create_skill file is executed which contains the following instruction:

The SUSI Smart Speaker is an AI assistant device which runs SUSI.AI. To learn how to set up your own smart speaker, head over to SUSI Installer. One of the new features of the smart speaker is the ability to control it via a webpage. The smart speaker now allows the user to control various playback features, such as playing and pausing music, directly from mobile phones or laptops on the same network. The web page is served via the sound server running locally on the Raspberry Pi. The sound server provides various methods of the VLC player as endpoints, and the webpage uses these endpoints to control the smart speaker. An external application such as an Android/iOS app can also use these endpoints (or the webpage) to control the music playback on the device.

Making the Front-end

The front end is served by the Flask server on the ‘/’ endpoint on port 7070. Currently, the front end contains the volume control slider and various buttons to control the audio playback of the device. The user's input is sent to the server via JavaScript. Bootstrap is used as the CSS framework and Font Awesome is used for icons. Since the smart speaker should be able to run offline, CDN links for Bootstrap and Font Awesome are not used, and the required files are served by the Flask server on /static.

Sending Response to Server

Since this is a control webpage, the page should not reload when a response is sent. To accomplish this, all the buttons point to a JavaScript function which sends an HTTP POST request to the server. The XMLHttpRequest object is used for this purpose; it exchanges data with a web server behind the scenes.

Here the SetVolume function is used to send a request to the /volume endpoint, which controls the volume of the device. The control function is used to send a POST request to audio control endpoints such as /pause, /stop and /shuffle.

Features

The endpoints on the server provide the different audio control features via the vlc player. The endpoints used are listed below –

Play

The play functionality is currently only used directly by the busy state. It supports playback via YouTube URLs or MRLs.

To play using an MRL (Media Resource Locator), the request URL should have an argument called MRL with the needed MRL value. Multiple semicolon-separated (‘;’) MRLs are also supported in a single request.

Pause and Resume are implemented in the same way, but both use the same `pause` method of the MediaListPlayer class, since that method acts as a toggle.

Shuffle

The shuffle endpoint shuffles the currently playing song list. It uses the shuffle method of the random library to shuffle the list containing the MRLs of all the songs and then creates a new MediaListPlayer object for playback. The request URL doesn’t need any arguments.
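The shuffling step on its own can be sketched as below. The `shuffled_playlist` helper and the example MRLs are hypothetical, and creating the new MediaListPlayer from the result is left out because it depends on the VLC bindings.

```python
import random


def shuffled_playlist(mrls, rng=random):
    """Return a shuffled copy of the MRL list, leaving the input untouched."""
    playlist = list(mrls)
    rng.shuffle(playlist)
    return playlist


# Example MRLs are made up for illustration.
current = [
    "file:///media/usb0/a.mp3",
    "file:///media/usb0/b.mp3",
    "file:///media/usb0/c.mp3",
]
new_order = shuffled_playlist(current)
```

Shuffling a copy keeps the original order available, which may or may not match what the sound server does internally.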

Volume

The volume endpoint is used to set the volume of the device. The volume control slider uses this endpoint. A single argument, `val`, is needed in the URL of the POST request. `val` can range from 0 to 100, where 0 means mute and 100 means full volume.
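An external client could build such a request along these lines. This is a sketch under assumptions: the `build_volume_request` helper and the host address are hypothetical, port 7070 is taken from the front-end description, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_volume_request(host, val, port=7070):
    """Build (but do not send) a POST request for the /volume endpoint."""
    val = max(0, min(100, int(val)))  # keep val inside the 0 to 100 range
    url = "http://{}:{}/volume?{}".format(host, port, urlencode({"val": val}))
    return Request(url, data=b"", method="POST")


# The address is a placeholder for the speaker's IP on the local network.
request = build_volume_request("192.168.1.50", 130)
```

Passing the request to `urllib.request.urlopen` would actually send it. Clamping on the client side is a choice made for this sketch, not something the server is known to require.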

Previously, whenever a sound had to be played via the smart speaker, the subprocess Python module was used to call the CVLC process and play audio through it. This posed a number of challenges when implementing music features such as queuing songs, shuffling songs or handling the volume of the music. The audio structure was therefore remade in the SUSI Smart Speaker project. The audio playing structure resides mainly in the susi_installer and susi_linux repositories. The above flow chart describes how audio is now handled in the project.

Sound Server

The sound server provides various methods of the VLC Player as endpoints. These endpoints are used by the busy state directly, by the remote access webpage, or by an external application such as an Android/iOS app, to control the music playback on the device.

VLC Player

The VLC Player class is responsible for playing and handling the music. It uses the Python VLC module for audio playback and various other functionalities. We use the Media List Player class found in the VLC module to play music; using the media list player instead of the media player gives us the advantage of queuing files and essentially making a playlist.

The play method receives a single MRL or multiple MRLs. If multiple MRLs are sent, they are separated by a semicolon ‘;’. The list player method of the VLC Player class takes a list of MRLs as input, so if the received string has more than one MRL, it is broken down into a list of MRLs via Python’s built-in split method, and the list is then added to the list player.

Finally, the play method of the MediaListPlayer class is used to play the music/audio.
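The splitting step described above can be sketched without VLC at all. The `split_mrls` helper and the example MRLs are hypothetical; stripping whitespace and dropping empty parts are additions made for this sketch.

```python
def split_mrls(mrl_string):
    """Split a semicolon-separated MRL string into a clean list of MRLs."""
    return [part.strip() for part in mrl_string.split(";") if part.strip()]


# Example MRLs are made up for illustration.
single = split_mrls("file:///media/usb0/a.mp3")
several = split_mrls("file:///media/usb0/a.mp3;file:///media/usb0/b.mp3")
```

A single MRL comes back as a one-element list, so the same code path can feed the list player in both cases.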

If you’re worried about privacy when using smart assistants, or just want to build your own with complete freedom, then this guide will help you. SUSI.AI provides Artificial Intelligence for Smart Speakers, Personal Assistants, Robots, Help Desks and Chatbots. SUSI.AI is completely free and open source software.

In this guide, we will be building our own smart speaker assistant which will talk to the user just like Alexa or Google Home. The keyword will be “SUSI”.