Elements of Machine Learning I: Regression as function estimation

Sharp Sight Labs’ “one concept machine learning” post provides an excellent illustration of why, instead of “just diving in and building something”, it pays to first spend time understanding what is really going on under the hood. In the case of machine learning, the author suggests it comes down to one core concept:

The essential problem of machine learning is function estimation. When you’re doing machine learning (specifically, supervised learning), you’re essentially using computational techniques to reverse engineer the underlying function from the data points alone. You’re estimating an underlying function based only on training observations in a dataset.

The author illustrates his point by fitting a progression of regression models, from linear through polynomial, to a 2D data set. A regression model in this context means an equation that provides a continuous prediction of the output value y for any given input value x. In other words, it produces a curve y=f(x). The article provides some reference code in R. I thought it would be a useful exercise to re-implement it in Python using numpy, scikit-learn, pandas and matplotlib. The diagrams below show my results – the top graph is the raw data, the second shows a linear regression and the third shows polynomial regressions of various degrees (e.g. quadratic, cubic and higher). The models are generated using scikit-learn (sklearn) and plotted using matplotlib. Here’s a snippet showing how the polynomial regression curves are created using a Ridge (regularised least squares) regression model:
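The original snippet is not reproduced in this text, so what follows is only a minimal sketch of the general technique, not the code from the post. It assumes synthetic noisy data in place of the article's data set, and the helper name `fit_polynomial_curves`, the chosen degrees and the `alpha` value are all hypothetical. Each model chains scikit-learn's `PolynomialFeatures` with `Ridge` in a pipeline:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures


def fit_polynomial_curves(x, y, degrees=(1, 2, 3, 5), alpha=1e-3):
    """Fit one ridge-regularised polynomial model per degree (hypothetical helper)."""
    X = np.asarray(x, dtype=float).reshape(-1, 1)  # sklearn expects a 2D feature matrix
    models = {}
    for degree in degrees:
        # Expand x into [1, x, x^2, ..., x^degree], then fit a ridge regression on it
        model = make_pipeline(PolynomialFeatures(degree), Ridge(alpha=alpha))
        models[degree] = model.fit(X, y)
    return models


# Synthetic stand-in data (an assumption): a noisy sine wave
rng = np.random.default_rng(0)
x = np.linspace(0.0, 3.0, 60)
y = np.sin(2.0 * x) + rng.normal(scale=0.1, size=x.size)

models = fit_polynomial_curves(x, y)

# Evaluate each fitted model on a dense grid to obtain a smooth curve y = f(x)
x_grid = np.linspace(x.min(), x.max(), 200).reshape(-1, 1)
curves = {degree: model.predict(x_grid) for degree, model in models.items()}

# Mean squared error on the training points, per degree
train_mse = {
    degree: float(np.mean((model.predict(x.reshape(-1, 1)) - y) ** 2))
    for degree, model in models.items()
}
```

Each entry in `curves` can then be drawn over the raw data with matplotlib, for example `plt.plot(x_grid.ravel(), curves[3])`. Comparing the values in `train_mse` gives a rough numerical sense of how much closer the higher-degree curves get to the data.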

The key point here is that a number of progressively more sophisticated models can be used to generate graphs to see how close we can get to the “true function” that best represents the data distribution. The code used to generate the corresponding graphs below is freely available for inspection and modification on Bitbucket:

Useful post explaining the concept of autoencoders (neural networks designed for data compression) and how they relate to word embeddings (vector representations of words in a text that ‘compress’ their relationships and meanings). The latter are a core element of many text processing machine learning models:

There are currently more than 1.6 million Americans working as truck drivers, making it the most common job in 29 states. … The loss of jobs representing 1 percent of the U.S. workforce will be a devastating blow to the economy. And the adverse consequences won’t end there. Gas stations, highway diners, rest stops, motels and other businesses catering to drivers will struggle to survive without them.

Douglas Rushkoff suggests “replacing labor with algorithms” is dangerous and ultimately bad for your business, though it’s hard to see companies resisting the tide on this:

If you’re using algorithms and big data to figure out your next product line rather than designers, what’s your competitive advantage? The other company is using that same data and probably hiring the same big data analytics company to figure out the future trend. So now you’ve been turned into a commodity.

Apps and Services

we will tell chatbots our secrets. We will share information with them that we would never share with our friends. We will use them as repositories for important data that we know we need to remember. .. So data needs to be stored. And stored data can be hacked. It can be snooped on. It can be surveilled. It can be used for nefarious purposes. And it will. It’s only a matter of when.

WhatsApp is planning later this year to offer banks, airlines and other businesses the ability to send one-way messages to customers, people familiar with the plans said. It will be WhatsApp’s first official move to lure businesses onto the messaging app.

Will.i.am has a new smartwatch called the Dial available for pre-order. It is designed to be entirely standalone and comes complete with a 3 SIM and its own digital assistant. The UI looks strange and unpromising – it is, after all, the successor to the widely panned Puls released a year ago.

The Dial is being touted as a “voice-first” device with its own virtual assistant, called AneedA. Like Siri, Cortana and Alexa, the idea is that you can ask for information and execute tasks while your hands are full. How AneedA compares to those alternatives is, for now, a mystery.

The first rule of pricing is that you don’t talk about pricing, you feel it. This and other great insights into the GoodBetterBest model, and how to use it to fix SaaS subscription pricing, are in this excellent Medium post.

One place I worked as an architect had a project I estimated at 2 person-weeks of coding. Six months and dozens of meetings later, writing a 120-page requirements document resulted in the same estimate. We could have built the app 10 times over. But what would all the people at those meetings have done?

Security

the Hacking Team is among the world’s few dozen private contractors feeding a clandestine, multibillion-dollar industry that arms the world’s law enforcement and intelligence agencies with spyware. Comprised of around 40 engineers and salespeople who peddle its goods to more than 40 nations, the Hacking Team epitomizes what Reporters Without Borders, the international anti-censorship group, dubs the “era of digital mercenaries.”

Two bytes to $951 million. The fascinating inside story of one of the biggest bank heists in history, executed by compromising a SWIFT server in Bangladesh:

Dave Eggers’ dystopian novel details a utopian-sounding tech corporation whose ambitions extend to every aspect of people’s lives, anticipating, fulfilling and creating their every desire, to the extent that people never need to step outside the closed loop of control. Then they find they can’t, even if they want to. Apple has done its best to dispel such comparisons by building a massive new headquarters – in the shape of a circle.

A more inchoate sense of bewilderment and disillusion with the direction of travel of computing technology is apparent in a devastating short post called Drifting by Tariq Krim. In it he calls time on the lack of ownership, the obsession with algorithmic choice and the inability to slow anything down in the tech world, suggesting these are collectively leading us to calamity as a species. He’d like to see an alternative emerge with our collective help – an “organic” technology movement if you like:

The uncomfortable truth is that I fell out of love with the technology world and that I am not excited by the future anymore. At least the future that is being built today. … In the world of technology, we are taught to build things fast. Sometimes too fast. And we spend so little time studying the consequences of what we build. … We need to give people access to other choices, other life narratives, other tools, and other ideologies. A sort of “organic sustainable slow technology” that fights this commoditization of everything online and offline. I feel it’s time to build this and for that I want to stop drifting and get back to building products that make me love the future again.

“About three decades of research in neuroscience have identified a robust link between aerobic exercise and subsequent cognitive clarity, and to many in this field the most exciting recent finding in this area is that of neurogenesis.”

“If you take the man at his word,” said Michael Breen, the president of the Truman National Security Project and a decorated former Army officer, “we have a presidential candidate who seems to have committed himself to triggering what would probably be the greatest crisis in civil-military relations since the American Civil War.”

Society and Culture

As Lenny Henry reminisced about the time he “sang with Prince and Kate Bush”, Quartz asked why people grieve for celebrities they’ve never met. The answer lies in a confusing mix: part memory of a lost younger self, part keeping up appearances and part sharp reminder of mortality:

You can die in an elevator alone no matter how rich you are and no matter how talented you are

27 years on from Hillsborough and the momentous revelation of the truth of the terrible events that day, another reminder of the abject failure of media objectivity. Billy Bragg performing Never Buy the Sun in 2011:

Hipsters

Under the hipsters’ watch, dance music has become tedious and diluted. A monstrous cabal of overpaid circuit DJs titillating a precious and unimaginative bunch of wimpy pseudo-hedonists at a carefully designed ‘safe space’. In broad daylight. If that’s your idea of raving, you can keep it. I’m out.

Presumably the author would apply the same withering tone to Further Future, a “Burning Man for the 1%”, featuring Alphabet Chairman Eric Schmidt in his party hat. Form an orderly queue now please:

“This is top-league networking and business folks are all here in the guise of having fun. It’s designed around the music, but it’s about the business. A ton of business will get done here. Entrepreneurs will get funded, investors will find their trajectories, service companies will meet and mix it up.”

Mr Clark’s coffin was decorated with coloured ribbons for the “journey that has no beginning and no end”, which saw more than 150 colourfully dressed mourners cross Finchley Road for a ritual burial and “g-RAVE-side ceremony” in Hampstead Cemetery. … Prayers were made aloud for Mr Fraser’s spirit, with one woman repeatedly asking for sexual energy.

Buddhism

Leicester City’s triumph in the Premier League deserves its own section next time around. Following this weekend’s dramatic finale, one wonders how many other teams will resort to Buddhism in a desperate bid to emulate the Fearless Foxes: