History

I’ve been taking a break to do a number of things, one being to learn about modern Frontend. I haven’t properly done Frontend since doing LAMP gigs in college in 2005. This has been quite a journey. Backend experience helps, but the amount of solutions in the frontend space appears several times greater than in the backend (on any given language). And then multiply by some factor: whatever is a “good solution” at any given time, still requires some knowledge of the previous stack since the web hasn’t fully moved on and lots of content or solutions need to be adapted to be used in the new stack.

A long uninterrupted break was what I needed to go through breadth-first-search style across the long list of technologies used these days. To understand what did what, before investing further on a key list of minimal yet powerful enough technologies needed for web application development. And of course, to do the animation above as a cathartic experience.

Laundry list to learn Frontend in 2018

So this is what I recommend learning, fresh from the experience of (re)learning most myself.

Concepts:

I like to start at a high level. Some of these are overlapping (Flux and MVC) or historical, but worth knowing:

This leaves us with only a bazillion other Frontend technologies. But the above are the key ones to learn. Kamran open-sourced a beautiful roadmap that sheds some light on the taxonomy and relevance of all the JavaScript buzzwords you’ve read about Hacker News, I quote the Frontend bit:

First project

Ultimately you have to start somewhere and you want to be productive asap, without having to learn it all. The core technologies above suffice. In addition, in my case as I needed to make some mostly static sites I decided to adopt Gatsby.

Gatsby helps a developer to work with React and GraphQL to render the content on the server side pre-deployment, to ship static content to a CDN, resulting in super fast and cheap websites and luckily no servers if external services like Typeform or Disqus suffice for stateful logic. Rather than trying to adapt a specific static site generator (e.g. Jekyll, in Ruby) for these sites, I wanted to invest in JavaScript, React and GraphQL. As a one man shop I prefer to use powerful technologies that can be leveraged in many use cases.

Gatsby can also be a good choice for some types of dynamic applications, used together with a server-side API or cloud functions. Inside it, Gatsby manages a number of key technologies like Webpack for out-of-the box results, that I can defer to learn later. Additionally, it comes with a large number of plugins and starters to quickly adopt or learn by example.

Intro

This post is about automating the provisioning of your local machines via code. That could be your {work, personal} {laptop, desktop, VM}.

It requires an upfront investment but ultimately adds value by reducing time on subsequent configurations of new environments.

In this post I will show you how I achieve this via Ansible and the benefits of this approach.

Demo

For motivation, this is how I setup some core applications and preferences automatically when on a new environment:

Installing and personalising git, i3, direnv and pipenv via Ansible.

With a few lines I configured a window manager, git, direnv and pipenv. My actual setup does almost all my entire local provisioning from scratch. But before going further in this example, let’s revisit the problem and the solution in more detail.

Problem

As a developer I end up using a number of machines (at home and at work), with various OS/distributions. Examples:

personal laptop (Fedora)

personal desktop/server (Fedora and Windows)

work laptop (Ubuntu)

work servers (any)

temporary setups for each of the above whenever one breaks

temporary VMs (e.g. using Linux in Virtualbox in a Windows laptop)

Additionally:

Each of the environments above gets (re)provisioned periodically.

Each of these require a number of things set up so that I’m in my most productive environment.

OS package managers today are highly reliable for the installation part, but work isn’t finished just when a package gets installed. And the periodic manual provisioning of each of the machines/VMs above results in a substantial amount of time. If you have a similar situation, the initial investment in automating the provisioning (as well as the maintenance of it) requires less time than the manual provisioning over a few years time.

Furthermore, sometimes you want to start from a clean image or test big changes on your setup without being concerned with breaking the workstation you rely so much upon.

If having the best setup in your environments isn’t automated, you will compromise. In my tight work schedule I failed for some time to setup things that my personal laptop never failed to have, which accelerate or improve the way I work. You will fix something quickly in one environment but it won’t impact other environments, you might make notes but you’ll forget to update, and you’ll defer problems until they get just enough in your way. And the setup of your environments can get messy.

Backups

Backups solve part of this need for a single environment but aren’t ideal to reuse elsewhere (not to mention the difficulties in arranging for a decent full-system backup in Linux laptops).

Some people - like myself months ago - rely on tools like Dropbox for sharing a subset of data in the user’s home such as config folders and files via soft links to the shared Dropbox folders/files, while manually installing system packages. Some problems with this approach:

Critical config is not version controlled

Did you make a mistake in a config? Hopefully it was within the retention period of Dropbox deleted files and you won’t have to deal with customer support for files/directories not handled properly by their web app.

Generates a number of file conflicts between environments.

Requires setups and application versions to be very similar across different machines/VMs (generally not the case between work and home environments, or even in machines within each of these).

The setup of backups isn’t automated itself within the scope of this solution.

You still can’t test changes.

Either you will move almost everything to Dropbox (generating constant syncs for files you wouldn’t normally care about, and did I mention file conflicts?) or it will be a partial move still requiring a manual configuration.

It would be great to abstract what’s common and what’s specific to the OS and share the common bits, facilitating reusability across OS’s (either for switching or parallel use, like work and personal laptops). And then also share reusable common patterns within the community.

Enter Infrastructure as Code

Years ago a new movement started that brought development and infrastructure together to make Infrastructure as Code (or IaaC). Tools were developed to manage fleets of machines via a master machine with inventories and tasks defined in code to do things like install and configure databases, reverse proxies, LDAP, web servers, or any sysadmin task really - making infrastructure finally reproducible and automated.

In the current world of managed services and with a competitive cloud environment, there is rarely now the need for small to medium sized companies and individuals to manage so much infrastructure directly themselves, with tools like Terraform sufficing to just manage cloud services instead.

The improvement of online IDEs and platforms planning to manage the entire SDLC and a wider adoption of machines like Chromebooks might change this, but at least one machine remains on our hands to coordinate everything else - and that is the laptop or desktop in front of you.

Ansible

One of these IaaC tools is Ansible. Simple yet powerful, written in Python and whose inventories and tasks are written in readable YAML with an extensive range of native modules to change the state of your applications, services, files, networks, etc.

A module for instance is apt, so you can request packages to be present in Ubuntu/Debian - or not, with ansible’s job being to make sure your desired list of packages is in the state you instruct via your code.

A task uses modules, in the example above a task would be:

# if the git package is already installed this action does nothing# otherwise this installs gitapt:name:git# name of packagestate:present# instructs said package to be installedbecome:true# ansible needs to run sudo

Ansible modules are generally idempotent. Ansible is faster and easier than doing the equivalent in say a bash script to handle all the setup (although this isn’t apparent from the example above alone).

Roles and Ansible Galaxy

One big promise of Ansible (albeit not matching expectations) was the advent of reusable sets of tasks - aka roles - that could be developed and shared within the community on Ansible Galaxy, not too differently from PyPI in the Python world. Needing only a few user defined variables, a role would take over all aspects related a certain outcome. You’d be able to say download and install a role for setting up say nginx, grafana or Apache in a few commands, without having to code much yourself.

I have made available I few roles I use for myself for git, i3 and others. They are no longer on Ansible Galaxy but you can still find them further below.

Ansible walkthrough

Let’s setup a few of my roles as an example, from which you can build your entire Ansible setup. Note that this will make changes to your system, review the roles actions before starting.

Because pipenv might not yet be installed, we just assume Python 3 is:

Now we define a playbook that uses these roles, defining a few user variables for git. Note that any local setup you have might be changed, be sure to adapt the below to your preferences before running:

My personal Playbook includes the above in n-batalha/ansible-roles (reusable roles I’m sharing with the world), a number of external ones (see references below), as well as some less reusable roles.

Testing

An advantage of this setup which is easy to overlook is the ability to test system configurations before actually running them on your machine. Either via Docker or Vagrant. I won’t cover this here but you’ll be able to find how this works on my roles project.

Conclusion

If you have more than a single machine and you’re an advanced Linux user, it might be worth to automate its provisioning. Use the above as a starter point, and explore Ansible Galaxy for more reusable roles.

Perhaps I will share later my entire setup and the playbooks to invoke on a ready to use
repo for newcomers. Meanwhile I hope this helps.

Mine, used above

Recommended roles

Intro

Before I was even working in the fields of data and engineering, I have been intrigued as to how bots and automated personal assistants (the likes of Siri, Cortana and Google Now) work.

I now know enough to make a machine learning powered bot from scratch. But actually the best answer I have came later: does it matter?

In this post, I’ll dwelve into this answer and leave the interesting details of bots (the Machine Learning) for a future one.

Bots are not meeting expectations

First things first, let’s just acknowledge that the whole bots revolution has failed to live up to the high expectations many had.

The appeal was understandable for non-experts, on a superficial level. Unlike other machine learning applications, Siri and text bots are as close to everyone’s concept of AI (aka HAL 9000) as possible. Before using them, it was hard not to get excited, all skepticism aside. Even if covering only a fraction of what general AI can cover, it would still be quite useful.

WeChat seemed to be having enormous success in Asia and it relied on a conversational interface that was yet to be applied elsewhere, so it was easy to assume that it would be a matter of time before it did. So the bot gold rush began and by 2015 the hype was still at its highest. Dan Grover, then a Product Manager at WeChat, recalls in his brilliant article:

In practice, any interaction with most bots would quickly reveal just how far away we are from this reality. Alex Hern writes:

The problems with existing chatbots begin with how they actually work. Almost uniformly, the initial examples of Messenger bots are disastrous: unable to parse any instruction that doesn’t fit their (entirely undocumented) expectations, slow to respond when they are given the correct command, and ultimately useful only for tasks which are trivial to perform through the old apps or websites.

The Information reported last week that Facebook is rethinking its development efforts on all automation fronts after research showed bots failed to adequately address 70 percent of customer questions and requests. (…)

A recent Forrester survey found that only four percent of digital business professionals use any sort of chatbot. The research firm concluded in its most recent report on the state of the practice that its hype as a “world-changing technology” has led to a “peak of inflated expectations” that have gone largely unmet.

Whilst one can attribute exagerated expectations alone, these likely resulted from not realizing that:

a conversational interface is only one of many interfaces, and it has a much greater friction than a GUI in many situations (regardless of the bots intelligence)

limitations in Natural Language Processing further constrain the number of successful conversational interfaces we can develop. For now, they need to be simple and limited in scope to be accurate (more on this on the next post).

To these fundamental issues, some temporary ones might have worsened the situation:

lack of experience in the field (design practices are/were not as established)

a low barrier of entry that encourages numerous poor submissions

Whilst improvements on NLP are hard to predict, an answer to the first fundamental issue is just reaching the market. Or better yet, we are back to the starting point.

AI UI

While WeChat is popular and supports bots, it also supports web views mixed with a conversational model to provide a much better user experience. And that is claimed to be one of the key reasons of its success, not the conversational aspect.

Dan Grover compares the experience of ordering a pizza in Microsoft’s demo bot versus Pizza Hut’s account on WeChat. In WeChat the user starts in Pizza Hut’s account (a bot), but the interface really becomes a web like interface.

Perhaps more importantly, the most understated advantages of the WeChat model have less to do with chat than with the fact that users can immediately access a number of services in a central platform, not requiring installation of separate apps, accounts to be made and logged in, payment information to be entered, and more friends lists to manage.

Apple/Android Pay in mobile browsers, the evergrowing Facebook walled garden, if not improvements at OS level, might all evolve to provide the frictionless experience of WeChat, bots not required.

As Dan puts it:

This maybe a bit disheartening to hear since creating bots powered by AI sounds super cool and cutting edge, while making mobile optimized websites definitely does not.

Commoditization

Between starting this review a few years ago, and finally doing a post, there were enormous changes in the field. The most significant might just be the development ecosystem that sprung up and makes writing simple bots almost trivial.

From not requiring development skills, to doing only the machine learning bits (open-source or not), some service appears to cover it. See for instance Rasa (open-source), Wit.AI and Microsoft’s Bot Framework.

I won’t list all the good services I found just yet, but other people have covered most of them:

Conclusion

TL;DR: text interfaces are not enough, these serve a niche purpose but they can be combined with other interfaces providing more possibilities for developers to apply where appropriate. WeChat’s success was not likely due to the conversational aspect alone, but perhaps in great part due to their frictionless user experience to reach out to services (rather than needing separate apps with separate accounts and onboardings). Also, you don’t really need Machine Learning skills anymore to make bots.

Still, aren’t you curious as to how Siri finds the weather? Now that I’ve made clear it’s not that important, I’ll soon make a post about it, guilt free.

Flask

If introductions are needed, Flask is a great Python microframework (which can also be seen as a “framework” of frameworks). With it, one can create a simple API or web application in no time.

I most recently started using it for a personal project and at work, having been using Django exclusively for over 2 years. “Great” needs to be defined better of course, where it excels or would be better replaced by Django or others.

Testing

One popular measure of the quality of a codebase is the test coverage of such a codebase. Some go as far as assuming that “no unit tests mean that the code does not work and cannot be expanded upon”. Not everyone likes to write tests, but not doing comprehensive tests is not an option. As such, tests should be easy to write. They should also be fast. In general, they should not introduce friction or get in your way.

As with so many things in software development, if it’s easy to use, it will be used. If it’s tough to use, it won’t be used as much as it should be.

So how do we get this in Flask, if we want to use a relational database?

I won’t cover all of this, but only what concerns Flask and SQL-Alchemy, but the others are easier. Several articles and gists have been written regarding this section I’m covering, but of those that I found, many are outdated, have suggestions I disagree with (you shouldn’t use sqlite for tests if you run a different database in production) or cover specific subtopics separately, so I wrote yet another post.

Flask and SQLAlchemy

If you haven’t heard about SQL-Alchemy and you’re starting in Flask, read this. Django has a ORM, but in Flask it’s a separate framework.

In Django one might take for granted some of the nice features that run in the background. In testing, one is:

No such thing exists in the Flask side. There is of course a compromise between having very lightweight and specialized frameworks or “batteries included” frameworks, opiniated or not. But perhaps extensions here can be improved to save some of the boilerplate. In hindsight such boilerplate is small and I’ll easily port it to new projects, but it was the result of time consuming API reviewing, testing and tinkering.

App config

In Flask, we take care of the details of starting the app. A recommended path is to use a factory for apps. This factory takes as argument a config, for instance:

Then somewhere in your __init__.py or wherever you start your app object, you detect the environment and load the respective config object. The tests would create a separate app, with the config that is best suited for tests only (although the less you deviate from production, the better).

One of the extensions you’ll want to register above is Flask-SQLAlchemy as it takes care of the boilerplate integration bit between Flask and SQLAlchemy.

With the Flask-SQLAlchemy extension registered, in your config, you’ll need to have specific variables such as SQLALCHEMY_DATABASE_URI, to point to the database each environment should be using.

PyTest

I made the jump to Flask and PyTest at the same time. I’m not yet sure whether it’s better than unittest.

Instead of classes, in PyTest we make extensive use of the advanced fixtures mechanism. Fixtures here take a more general meaning than usual, it’s anything consumed by a test. The test app will be a fixture, along with the database.

One thing we don’t want though is to generate a database per test, as that would render tests very slow. For this, PyTest allows us to define a scope per fixture. Scopes can be function and session for instance - meaning that the fixture will be used and run per function or session respectively.

However if we scope the database fixture per session, how to do we make sure that the database is clean and at the same state when the different tests are run? At the end of each test, we rollback any changes made to the database during the test.

Now I went a little further and set the session fixture at autouse, but you might not want that. If True it’s automatically used per each test, otherwise you need to invoke it each test function by adding session as an argument. conftest needs to be set as a parent to all the test files that would use it, and PyTest takes care of discoverability.

Somewhere in your app you save in this table each POST request that is received. You can now test this with:

deftest_save_request(testapp):request_data=# some example request paramsresp=testapp.post('/',content_type='application/json',data=request_data)# the test should also filter by the request_data content, this is a simplified exampleassertRequest.query.filter_by(method='POST').count()==1

And what is SurrogatePK you ask? This is taken from sloria/cookiecutter-flask where you can take on more good ideas on how to structure your Flask project.

If I find more time, I’ll try and write an extension of this cookiecutter to include the setup above, more helpers and the goals Patrick covered above, meanwhile I hope this helps!

Amazon Echo

If you haven’t heard about Echo, it’s a new device that can be described as Siri in your living room, but actually getting used. And it does just that, listening to your voice and handling simple queries.

It’s supposed to have a seven-piece microphone array (that your phone doesn’t have) making it more accurate in the speech recognition, it’s always on and listening (making the interaction more frictionless) and has a great speaker. And perhaps being in the comfort of your own home - where talking to machines is somewhat less awkward - is another big reason behind it’s popularity.

Despite not having a great experience with it, I couldn’t resist the fun of developing for it. For now, I’m hoping that my unusual accent is the reason why we don’t get along that great, and next generations will improve.

Alexa skills

Also an advantage over Siri (although that’s changing), it that it has its own app store. Apps here are called “skills”. But you don’t need machine learning expertise to develop most skills, Amazon does the heavylifting for you.

The way it works is:

the user will speak to an Alexa device (like Amazon Echo)

Amazon does the speech recognition, intent classification and entity extraction, and calls our service while providing the processed speech (eg. the user says “what’s the weather in London”, and the service gets {'intent':'KnowWeather','entities':{'city':'London'}

the skill will receive the nicely parsed data, and do its logic (eg. query a weather API for London) and send a response via an HTTP POST request to Amazon