Monday, June 30, 2014

This post is a complementary write up about my WebPerfDays talk given at Google HQ last Friday.

Slides And More

During the talk I've live demoed an Arduino Yun board and its performance and I've also showed through a documents camera live performance tests on a wide range of devices including:

Android 2.3

Bada OS

Blackberry 10

first ZTE FirefoxOS phone

Windows Phone 8

... probably others too ...

While it's easy for me to point at those slides, you'll miss most of the live demoing content so here a better walk through.

Embedded Systems Performance

While Intel's Edison can be considered just a promise, there is already a huge variety of boards based on Atheros AR9331 board which is a single core MIPS architecture based System on a module @ 400MHz almost as small as the promised Edison.
Arduino Yun is only one of those boards featuring such beauty, and here something interesting about performance.

nodejs and npm work but npm is freaking slow

I've also opened a ticket about this non critical issue but the long story short is that npm --version or just npm --info take more than 7 seconds to show anything at all plus if you don't disable node flags npm will crash/fail to install anything with the global flag.
The lesson here in a nutshell: if your program is capable of smaller tasks, isolate these and make these executable/available a part instead of putting any software behavior after loading, parsing, and analyzing the entire logic. --version, as example, is not even something to even think about. Make basic operations available ASAP and lazy load complex operations and related logic only when needed.

nodejs FileSystem can go faster

It does not matter if node fs is asynchronous and non blocking, it still ensures somehow atomic reads and writes and its performance are usually based on assumptions that today Hard Drives have a lot of cache, are very fast ... etc etc ... then you face embedded:
SD cards are actually pretty fast but if you have concurrent reads you won't have any cache to take advantage.
Little tricks like those used in some fspeedexperimental module might help to reach better concurrent performance without needing much more RAM or CPU power from the tiny board.

nody little server

The entire talk has been live demoed over my Arduino Yun and nodejs running nody, a tiny little server focused on doing one thing: serving (little) files if presents to as many clients as possible.
nody handled up to 20 devices at the same time, asking for same or different content, without a glitch.
The project with same name already exists in npm so I might think about a new name and make it a proper package ideal for embedded systems. It's KISS and YAGNI at its best so far :-)

Mobile Web Has Embedded Performance

It would be so simple if everyone had the latest iPhone or latest Android Hardware available, unfortunately the reality is way different and specially in emerging markets where cheap phones are the target to reach.

2010 Best Practice still valid

Everything I could do to make this map experiment move smooth even on Bada or Android 2 phones is still valid these days:

you might want to degrade to 30 FPS instead of 60 and go smoother in older HW

you don't want to cache all the things because you have a limited amount of RAM

you might even prefer canvas to draw tiles instead of CSS because canvas is a single surface to upload and draw, CSS squares can be many impacting FPS smoothness/linearity

Benchmark For Real

The Tesla Experiment is a good benchmark to see how good is the GPU, if used at all, the CPU that will calculate all lightning, and how many touch input we can have on a single screen. A cheap old Bada OS perform actually very well in there, why aren't we targeting these devices too?

Do Not Polyfill Old Hardware!

When you realize that simply touch or mouse move events are very heavy and triggered like 15 times per second in old Androids phones, you must be think again why on earth you would use touch events to simulate Microsoft Pointer Events, a kind of event that also won't bring us much on touch devices ...
Windows 8 and WP 8 Phones have a very good hardware and the cheapest WP8 phone you can try will trigger at 60FPS any simulated touch event ... deal with it Microsoft, we need to support old Android and other browsers with its old Hardware underneath, you are the only out there without Touch Events support ... please fix this!

SimpleKinetic to calculate asynchronously directions and movements through deltas

Performance That Matter For Real

As mentioned during my talk, if you are still using jsperf to benchmark ++i VS i++ you really have no idea what is the problem in your app ... focus on real bottlenecks and try to avoid holding on RAM all that data, all that DOM, all those images, etc etc ... use storages, cache callbacks instead of entire network or i/o results, find the problem and solve it reasonably.
Last, but not least, spend few bucks for some cheap second hand phone and use it for tests: if it will go reasonably well in that hardware, it will FLY in any other modern browser! Don't trust your emulator when it comes to real raw performance.

Get ready for the future: it will not necessarily be more performant, rather smaller, using less power, and everywhere, as the Internet Of Things is already these days!