The economic value of a Youtube video

The Youtube platform has been designed to make all of its videos available to
anyone at any time. Given its huge amount of content, its suggestion algorithm
and its ability to automatically play the next suggested video, a user of
Youtube can consider the stream of videos he wants to watch to be limitless.
This makes the stream a non-scarce resource, i.e., a free good which, from an
economic point of view, has no value.

Asking users to pay for watching videos would indeed be a bad business move
for Youtube, as many users would refuse to pay for content they currently get
at no cost. Even the Youtube Red subscription, according to an Ars Technica
article and related user comments, seems to be valuable not for its exclusive
content, but for the extra features provided by the mobile player (download
for offline viewing and background playing). Interestingly, these features are
already provided by freely available non-standard software. The exclusive
content being apparently of little interest according to the article, the
subscription rather looks like a “tax” on people who don't know they could
legitimately get the same service for free.

Youtube being a business, it needs revenue, at least to support the cost of
its large IT infrastructure. Since users are not charged any money, Youtube's
revenue comes (exclusively?) from advertising (I assume that the behavioural
data collected on users and the resulting profiles are also sold to
advertisers). Youtube's content therefore has value only insofar as it
attracts potential consumers and exposes them to advertising. The content
itself does not matter to the platform, as long as users are drawn to it.

This is the foundation of the Web platforms and of the Attention Economy,
where the users' attention is the actual product sold by the platforms. User
profiling and Big Data are only tools for maximising the amount of attention
being captured.

Advertisers are gamblers, you don't need to let them win

In 2004, Patrick Le Lay, then CEO of the French TV channel TF1, said: “what we
sell to Coca-Cola is available human brain-time.” One aspect of this is that
TV channels sell advertisers the opportunity to reach the channel's viewers
and attempt to influence them into buying the advertised products.

From the advertisers' point of view, it is a gamble: they bet money (the cost
of producing the TV commercial and the price paid to the TV channel to air it)
and hope to gain from it (when viewers buy their products because they have
seen the commercial and have been influenced by it). This gamble is twofold:
the viewers may or may not see the commercial, and they may or may not be
receptive to the influence techniques it uses.

In social media, advertising can be considered from the same angle: the website
sells advertisers display space on its pages, and the advertisers gamble that
users of the website will see the commercial and be influenced by it.

While everybody agrees that you have no moral obligation to buy a product
after you have seen a commercial, it seems less obvious that you have no moral
obligation to view the commercial either. For example, you could close your
eyes and plug your ears to ignore the commercial; you could even use tools
that automatically hide the commercial from you.

Putting the advertisers' money to uses I approve of

As I want to protect myself from the influence of advertisers, I normally use
automated tools that prevent me from seeing advertisements. I record television
programs and skip the commercials (automatically when the tool works as
intended, manually otherwise), and I use ad blockers in my Web browser to
prevent commercials from being displayed on my screen.

The difference between TV and the Web is that the TV channel gets paid for
broadcasting the commercials and cannot control whether or not I skip them,
while advertisers on the Web pay the websites only if the commercial is
fetched, i.e., only if I allow the Web browser under my control to display it.
In that case, my preferred solution would be to fetch the commercial as if it
were going to be displayed, without actually displaying it. And since
advertisers not only display commercials but also track users across websites,
each commercial must be isolated so that this tracking becomes impossible.

I would not normally care about websites not being paid by their advertisers,
but in the particular case of Youtube, I use tools that let me watch content
without viewing any commercial, meaning that the content's creators cannot
hope to get payment from Youtube. I therefore dream of a tool that would
channel the advertisers' money to the content creators without my having to
view any commercial, thus letting the advertisers gamble, but strongly
shifting the odds of that gamble in my favor. I consider this fair payback for
the advertisers' attempts at influencing me for their profit.

The Solarized color scheme redefines
some of the standard basic ANSI colors, making some color combinations
unsuitable for display. In particular, bright green, bright yellow, bright
blue and bright cyan are tones of grey instead of the expected colors.

Also, some terminals render bold text in bright colors, turning e.g. bold
green into a shade of grey instead of the expected green. At least in
URxvt, setting intensityStyles: False will prevent bold text from being
displayed in bright colors (it will still be displayed in a bold font).

When redefining color schemes for terminal applications using ANSI colors,
these are possible combinations, using the usual ANSI color names. Note that
bright colors are usually not available as background colors.
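
These combinations are easiest to check directly in a terminal; here is a
small Python sketch (the color indexes are the standard ANSI ones) that prints
every normal foreground on every normal background:

#!/usr/bin/env python3
# Print each normal ANSI foreground (30-37) on each normal background (40-47).
names = ['black', 'red', 'green', 'yellow', 'blue', 'magenta', 'cyan', 'white']
for bg in range(8):
    cells = ''.join(f'\033[{30 + fg};{40 + bg}m {names[fg][:3]} \033[0m'
                    for fg in range(8))
    print(f'on {names[bg]:<8}' + cells)
# Bright foregrounds use codes 90-97; under Solarized, 92 (bright green),
# 93 (bright yellow), 94 (bright blue) and 96 (bright cyan) come out as greys.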

The distance l from the TV depends on the desired horizontal viewing angle
a, the screen's diagonal d and the number N of pixels on a row.
Additionally, we will assume the screen's aspect ratio r to be 16:9 and the
smallest angle the human eye can distinguish, e, to be 31.5 arcseconds.

Let R be the ratio between the diagonal and the width of the screen:

R = √(1 + 1/r²)

We can then write a relationship between a, N and e:

tan(a / 2) = N R tan(e / 2)    (1)

From (1) we can deduce that for any given e, there is a maximum horizontal
viewing angle a_max above which pixels can theoretically be distinguished.

For N = 1920 (FullHD), a_max = 19.1°. With a 4K screen, a_max = 37.2°.

We can also write a relationship between horizontal viewing angle, screen
diagonal and distance:

d / l = 2 R tan(a / 2)    (2)

The ideal value of a is a matter of debate,
but THX requires a horizontal viewing angle of at least 36° (the screen viewed
from the rear seat of a THX theatre), while SMPTE suggests 30°. A value of 20°
is also mentioned.

With a_max = 37.2° for a 4K screen, (2) gives an ideal distance of 1.30 times
the screen's diagonal. For example:

132 cm for a 40" screen

165 cm for a 50" screen

197 cm for a 60" screen

With a = 30°, the ideal screen distance is 1.63 times the
screen's diagonal. For example:

132 cm for a 32" screen

166 cm for a 40" screen

207 cm for a 50" screen

248 cm for a 60" screen

With a FullHD screen and a compromise angle a = 20°, the ideal distance is
2.47 times the screen diagonal. For example:

201 cm for a 32" screen

251 cm for a 40" screen

314 cm for a 50" screen
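
The multipliers and example distances above can be reproduced with a short
Python sketch of equation (2):

import math

def distance_over_diagonal(a_deg, r=16/9):
    R = math.sqrt(1 + 1/r**2)
    # from (2): l/d = 1 / (2 R tan(a/2))
    return 1 / (2 * R * math.tan(math.radians(a_deg) / 2))

for a in (37.2, 30.0, 20.0):
    k = distance_over_diagonal(a)
    examples = ', '.join(f'{k * d * 2.54:.0f} cm for {d}"' for d in (32, 40, 50, 60))
    print(f'a = {a}°: l = {k:.2f} d ({examples})')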

EDIT: The value of e above holds for a high contrast between two pixels. Most
images do not have such a high contrast, so a value of e = 1 arcminute is a
reasonable assumption in practice.

From this it follows that for N = 1920 (FullHD), a_max = 35.5°
(1.36 times the screen diagonal). With a 4K screen, a_max = 65.3°
(0.68 times the screen diagonal). This also gives a reasonable value for
standard-definition PAL TV with N = 1024: a_max = 19.4° (2.55
times the screen diagonal).

That would allow for larger horizontal viewing angles, such as 45° (1.05 times
the screen diagonal) or 60° (0.75 times the screen diagonal) when viewing a 4K
screen. At such short distances one must however take into account the
possible discomfort caused by the physical closeness of smaller screens.

Third part of my DNS setup notes: changing the DNSSEC config from NSEC to
NSEC3. This has been on my TODO list for over a year now, and despite the
tutorial at the ISC Knowledge Base, the ride was a bit bumpy.

Generating new keys

The previous keys were using the default RSASHA1 algorithm (number 5), and we
need new keys using RSASHA256 (number 8).

Generating those keys was easy. On a machine with enough available entropy in
/dev/random (such as a Raspberry Pi with its hardware random number generator),
run something like:
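
# one zone-signing key and one key-signing key with algorithm 8;
# the key sizes are a matter of choice
dnssec-keygen -a RSASHA256 -b 1024 example.com
dnssec-keygen -a RSASHA256 -b 2048 -f KSK example.com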

Transfer the keys to the server where Bind is running, into the directory
where Bind is looking for them.

Loading the keys

The documentation says to load the keys with

rndc loadkeys example.com

but that ended with a cryptic message in the logs:

NSEC only DNSKEYs and NSEC3 chains not allowed

Apparently, the algorithm of the old keys cannot be used with NSEC3 (which I
knew), so Bind refuses to load the new keys alongside them (which I didn't
anticipate). I eventually resorted to stopping Bind completely, moving the old
keys away, deleting the *.signed and *.signed.jnl files in /var/cache/bind/
and restarting Bind. The new keys were then automatically loaded, and the zone
was re-signed, still using NSEC.

NSEC3 at last

I could then resume with the tutorial.

First, generate a random salt:

openssl rand -hex 4

(let's assume the result of that operation was “d8add234”).
Then tell Bind the parameters it needs to create NSEC3 records; the arguments
are the hash algorithm (1, for SHA-1, the only one defined), the opt-out flag
(0), the number of additional hash iterations (10) and the salt:

rndc signing -nsec3param 1 0 10 d8add234 example.com.

Then check that the zone is signed with

rndc signing -list example.com
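
As an aside, the hashed owner names that now appear in the zone are computed
as described in RFC 5155; a small Python sketch of the hashing (assuming
Python 3.10+ for b32hexencode):

import base64
import hashlib

def nsec3_hash(name, salt_hex, iterations):
    # DNS wire format of the lowercased name: length-prefixed labels plus root byte
    wire = b''.join(bytes([len(label)]) + label
                    for label in name.lower().rstrip('.').encode().split(b'.')) + b'\x00'
    salt = bytes.fromhex(salt_hex)
    digest = hashlib.sha1(wire + salt).digest()
    for _ in range(iterations):  # the iterations parameter counts extra rounds
        digest = hashlib.sha1(digest + salt).digest()
    return base64.b32hexencode(digest).decode().lower()

print(nsec3_hash('example.com', 'd8add234', 10))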

Linking the zones

Since the keys have changed, you need to update your domain's DS record in
your parent domain's DNS, using the tool provided by your registrar. This step
is the same as in the “Linking the zones” section of the previous part of
this tutorial.

My old NAS that I use for backups is now over 10 years old, and while it
still works and faithfully backs up my files every night, its probability of
failing keeps increasing.

I decided to replace it with a Buffalo Linkstation 210, which offers 2 TB of
space for 140 EUR, making it cheaper than building my own device, at the risk
of not being able to use it the way I want, since it is a commercial device
that wasn't designed with my needs in mind.

The way I want to use the NAS is that it boots automatically at a given
time, after which the backup script on the desktop starts, transfers the
needed files, and puts the NAS back into sleep mode. That last feature was
available on my previous device, but is missing from the LS210. Hence the need
to make it do my bidding.

Moreover, the Web UI for administering the LS210 is horribly slow on my
desktop due to bad Javascript code, so the less I have to use it, the better.

The device

The way to gain SSH access seems to vary depending on the exact version of the
device and its firmware. Mine is an LS210D0201-EU device with firmware
version 1.63-0.04, bought in January 2017.

Initial setup

I found instructions on the nas-central.com forum. They rely on a Java tool
called ACP_COMMANDER that exploits a backdoor of the device, normally used for
firmware updates and the like, but which can apparently run any kind of shell
command on the device, as root, given the device's admin user's password.

Let's assume $IP is the IP address of the device and "password" is the
password of the admin user on the device (it's the default password).

You can test that ACP_COMMANDER works with the following command that runs
uname -a on the device:

java -jar acp_commander.jar -t $IP -ip $IP -pw password -c "uname -a"

It will output a fair amount of information (including a weird message about
changing the IP and a wrong password), but if you find the following in the
middle of it, it means that it worked:

One nasty feature of the device is that the /etc/nas_feature file gets
rewritten on each boot through the initrd. One last step is therefore to edit
/etc/init.d/sshd.sh and to comment out, near the beginning of the file, the
few lines that check for SSH/SFTP support and exit if SSH is not supported:

Second part of my DNS setup notes, this time about DNSSEC. The following notes
assume there is already a running instance of Bind 9 on a Debian Jessie system
for an imaginary domain example.com, served by a name server named
ns.example.com.

The version of Bind 9 (9.9.5) on Debian Jessie supports "inline signing" of
the zones, meaning that the setup is much easier than in the tutorials
mentioning dnssec-tools or opendnssec.

Note that the db file must point to a file in /var/cache/bind, not in
/etc/bind. This is because Bind will create a db.example.com.signed file
(among other related journal files) at a path constructed from the "file"
entry in the zone declaration, and it would fail to do so if the file were in
/etc/bind, because Bind would attempt to create the .signed file in that
read-only directory.
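
For reference, the zone declaration for inline signing then looks something
like this (a sketch; the key-directory location is my assumption, adapt it to
where your keys live):

zone "example.com" {
        type master;
        file "/var/cache/bind/db.example.com";
        key-directory "/etc/bind/keys";
        inline-signing yes;
        auto-dnssec maintain;
};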

Generate the DS records from the key-signing key (dnssec-dsfromkey does this)
and place these lines in db.example.com (i.e., the db file for the parent
zone). Change the serial number of the zone in the same file and run

rndc reload

You should then be able to query the DS record with

dig @localhost -t ds home.example.com

You can use Verisign's DNS debugging
tool to check that the signatures
are valid and DNSViz to view the chain of signatures
from the TLD DNS down to your DNS. This also helped me figure out that my zone
delegation was incorrect and caused discrepancies between my primary DNS
server and the secondary server.

Now that I have my own server, I can finally have my own DNS server and my own
domain name for my home computer that has a (single) dynamic IP address.

The following notes assume there is already a running instance of Bind 9 on a
Debian Jessie system for an imaginary domain example.com, served by a name
server named ns.example.com, and that you want to dynamically update the DNS
records for home.example.com. This is largely based on the Debian
tutorial on the subject, working around the problem that Bind cannot modify
files in /etc/bind.

On the server

Create a shared key that will allow remote updates of the dynamic zone:

dnssec-keygen -a HMAC-MD5 -b 128 -r /dev/urandom -n USER DDNS_UPDATE

This creates a pair of files (.key and .private) with names starting with
Kddns_update.+157+. Look for the value of the Key: entry in the .private
file and put that value in a file named /etc/bind/ddns.key with the
following content (surrounding the value with double quotes):
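
# the key name below is my assumption; the secret is the Key: value
# copied from the .private file
key ddns_update. {
        algorithm hmac-md5;
        secret "<the value of the Key: entry>";
};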

In /var/cache/bind, create the file db.home.example.com by copying
/etc/bind/db.empty and adapting it to your needs. For convenience, create a
db.home.example.com symbolic link in /etc/bind pointing to
/var/cache/bind/db.home.example.com.
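
The dynamic zone also needs a declaration in Bind's configuration that allows
updates signed with the key; presumably something along these lines (a sketch,
the exact statements may differ):

include "/etc/bind/ddns.key";

zone "home.example.com" {
        type master;
        file "/var/cache/bind/db.home.example.com";
        allow-update { key ddns_update.; };
};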

In db.example.com (that is, the parent zone), add an NS entry to delegate
the name home.example.com to the DNS server of the parent zone (note the
trailing dots):

home.example.com. NS ns.example.com.

You can now reload the bind service to apply the configuration changes.

On the home computer

I decided to use ddclient 3.8.3 because it supports dynamic DNS updates
using the nsupdate tool. I backported that version of ddclient manually
from a Debian Testing package; it's written in Perl, so the backporting is
trivial.

Copy /etc/bind/ddns.key from the server to /etc/ddns.key on the home
computer (the one running ddclient), ensuring only root can read it. Then add
the following to /etc/ddclient.conf (be careful with the commas, there is no
comma at the end of the second last line):

In recent (post-2007) Debian (and probably other) Linux distributions, the
passwords are stored in /etc/shadow using the sha512crypt algorithm.
According to Per Thorsheim, with 2012 hardware, a single Nvidia GTX 580 could
make 11,400 attempts per second at brute-force cracking such a password. This
means that a log2 11,400 = 13.5 bit password could be cracked in 1 second.

To have a password that would resist such a brute-force attack for a year, one
must multiply the password complexity by 86,400×365 (the number of seconds in
a year), i.e., add log2(86,400×365) = 24.9 bits to the password, for a total
of about 38 bits.

But this password is guaranteed to be cracked within a year. To make the
probability of cracking it much lower, say less than 0.01, one must increase
the password's complexity a hundredfold, i.e., add 6.7 bits. We now have a
minimum of 44.7 bits.

If one does not want to change the password for the next 10 years (because one
is lazy), one must again increase the complexity tenfold (that's another
3.3 bits, for a total of 48 bits) and account for the increase in processing
power in the coming years. Between 2002 and 2011, CPU and GPU computing power
were multiplied by 10 and 100 respectively, i.e., +0.37 and +0.74 bits/year.
That means that the password's complexity must be increased by another
0.74 × 10 = 7.4 bits. We have now reached 55.4 bits.

Now we need to guess who the password crackers are. How many such GPUs will
they put together? Titan has 18,688 GPUs (add another 14.2 bits to stay ahead
of it), while the (more affordable) machine that cracked the leaked LinkedIn
passwords had 25 GPUs (requiring only an extra 4.6 bits).

Assuming the crackers have a 25-GPU setup and not a gigantic cluster, 60 bits
should be perfectly safe. If they are a government agency with huge resources
and your data is worth spending the entirety of that cluster's energy for 10
years, 70 bits is still enough.

The same article also mentions that an Intel i7 6-core CPU would make 1,800
attempts per second, i.e., 10.8 bits. For a password that must resist for 10
years, that would mean 49 bits. Titan has 300,000 CPU cores (50,000 times
more than the i7), which adds an extra 15.6 bits for a total of 64.6 bits.
The Tianhe-2 has 3,120,000 cores, adding 19 bits to the original 49 bits,
leading to 68 bits total.

In summary, 70 bits is enough. If you are lazy and not paranoid, 60 bits are
still enough. And if you think the crackers will not use more than 32 i7 CPUs
for a month to try and break your password (adding 2.4 + 21.3 bits to the
original 10.8 bits), 48.5 bits are still enough.
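
The arithmetic above is just adding logarithms; a small Python sketch
reproducing the 60-bit estimate:

from math import log2

gpu_2012 = log2(11_400)          # ≈ 13.5 bits: one GTX 580, one second
one_year = log2(86_400 * 365)    # ≈ 24.9 bits: resist a whole year
p_crack = log2(100)              # ≈ 6.6 bits: cracking probability < 0.01
ten_years = log2(10)             # ≈ 3.3 bits: keep the password for 10 years
gpu_growth = 10 * log2(100) / 9  # ≈ 7.4 bits: GPU progress over 10 years
cluster_25 = log2(25)            # ≈ 4.6 bits: a 25-GPU machine

total = gpu_2012 + one_year + p_crack + ten_years + gpu_growth + cluster_25
print(round(total, 1))           # ≈ 60 bits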

I just switched from using Xterm to using evilvte, but then I noticed that
evilvte windows cannot be resized smaller. They can become bigger, but there
is no way back. Then I learned that URxvt does everything I want (it even uses
the same font as Xterm by default) with a bit of configuration. And it's much
more lightweight than evilvte (it doesn't use GTK, which helps).

This is my .Xresources (everything you need to know is in the man page).

Then, choose a place to put your repository (I chose my $HOME).
Origin, Label and Description are free-form fields. Codename is the
same as my current Debian version, and Architectures matches the
architectures I'm using. Then run:

Since I started with Linux, back in 1997, my xterms have always used the same
font: a fixed bitmap font which produces 6x13 pixel glyphs. I'm convinced that
a bitmap font is the best possible choice for not-so-high resolution LCD
monitors (I have a 17" 1280x1024 monitor, which amounts to a 96 dpi
resolution), where any vector font would inevitably produce aliased or fuzzy
glyphs. My bitmap font is crisp and has no rainbow edges (who in his right
mind could imagine that subpixel antialiasing is a good idea?).

With xterm, I could simply specify the font as 6x13 and it would use it. That
was simple, because xterm was meant for such fonts.

Today I switched from the pure-X11 xterm to the GTK-based evilvte, and while
evilvte is apparently a great tool, it didn't want to use my beloved 6x13
bitmap font. It would use 6x12 or 7x13, but not the one in the middle. The
font is however available on the system through fontconfig, since I could
find it with fc-match:

But evilvte, while showing "SemiCondensed" as an option in its font dialog,
just seemed to ignore it. The fontconfig documentation mentions that one can
trigger debug output by setting the environment variable FC_DEBUG=1. With it,
I could see how Pango (GTK's font management system) was interacting with
fontconfig:

Notice the important difference: fc-match asks for a weight of 100 (and style
SemiCondensed) while Pango asks for weight 80 and width 87 (which is
apparently equivalent to semi-condensed). Since my font had a weight of 100,
it was never selected. However, when requesting a bold version (fc-match
Fixed-10:semicondensed:bold or python mygtk.py "Fixed SemiCondensed Bold
10") the same font is found (6x13B-ISO8859-1.pcf.gz, which is the bold
counterpart of my font). That took me several hours to find out.

Since the root of the problem seemed to be the weight, I needed to find out
how to make Pango tell fontconfig to use a different weight, since there is
apparently nothing between “Regular” (Pango 400, fontconfig 80) and “Bold”
(Pango 700, fontconfig 200). And then, completely by accident, I found out
there is actually a middle value: “Medium” (Pango 500, fontconfig 100),
which is exactly what I needed. But the outdated PyGTK documentation and the
well-hidden man page (and very little help from Google and DuckDuckGo in
finding a decent documentation for Pango, I must say) didn't make this any
easier.
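
To check what Pango makes of a font description string, one can ask it
directly; a quick Python sketch (assuming PyGObject is installed):

import gi
gi.require_version('Pango', '1.0')
from gi.repository import Pango

desc = Pango.font_description_from_string('Fixed Medium SemiCondensed 10')
print(desc.get_weight())   # PANGO_WEIGHT_MEDIUM, i.e. 500
print(desc.get_stretch())  # PANGO_STRETCH_SEMI_CONDENSED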

So finally, the magic font description I put in evilvte's config is “Fixed
Medium SemiCondensed 10”. With it, Pango selects the font I want:

Today I switched from using xterm (which I had been using for the past 15
years at least) to using evilvte. The reason is that evilvte allows clicking
on URLs to open a new tab in Firefox, while xterm does not. Since Firefox
removed the --remote option, wmnetselect no longer allowed me to open a copied
URL. Since wmnetselect has not been updated since forever and has even been
removed from Debian, I thought it was time for a radical change (yes, I
changed my terminal emulator because of the Web browser, I know).

Evilvte is one of those simplistic tools that you configure by editing the
source code (the config.h, really), so I thought that after having done
that, I may as well make my own custom Debian package. It wasn't too hard, but
since I don't plan to do this regularly, here's the process.

Get the Debianized sources:

apt-get source evilvte

Enter the directory

cd evilvte-0.5.1

Edit the config file (or whatever you want to do for your own package), save
it in the right place. In my case, the package contained a debian/config.h
customized by the package's maintainer, so I needed to modify this one rather
than the src/config.h one. During the building of the package,
src/config.h is overwritten by debian/config.h.

Then edit debian/changelog and add a new entry. In doing so, you need to
choose a new version number. I wanted to keep the original version number of
the package (0.5.1-1) but make it known that my package was slightly newer
than 0.5.1-1: I decided to go for 0.5.1-1+custom (after discovering that my
first choice, 0.5.1-1~custom, means that the package is slightly older than
0.5.1-1 and would therefore have been replaced by 0.5.1-1 during the next
apt-get dist-upgrade). The description of the change is simply “Custom
configuration”. For the rest, follow the example of the existing entries in
the changelog. Be careful, there are two spaces between the author and the
date.
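
You can check how dpkg orders two version strings before committing to one;
for instance, as a sanity check:

dpkg --compare-versions "0.5.1-1~custom" lt "0.5.1-1" && echo older
dpkg --compare-versions "0.5.1-1+custom" gt "0.5.1-1" && echo newer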

If you have changed the upstream source code instead of only Debian-specific
files, the package building helpers will record a patch for you and let you
write some comments in the patch file, based on the new entry in the
changelog.

Then you just need to build the package:

dpkg-buildpackage

It will probably ask you for your GPG passphrase (when signing the package),
and after that, you're done. The newly created package is in the parent
directory, and ready to be installed.

On the radio the other day, I heard a mathematician warning about too hastily
interpreting probability results. Here's the example he gave.

Imagine a population of 100,000 people, 100 of whom have a rare disease
(without knowing it) and 99,900 of whom are therefore not sick. Imagine
moreover there is a test that can detect this disease with 99% accuracy:
this means that the test will on average give a positive result for
100 × 0.99 = 99 of the 100 sick persons and a negative result for
100 × (1 - 0.99) = 1 of them. It will also give a negative result for
99,900 × 0.99 = 98,901 non-sick persons and a (false) positive result for
99,900 × (1 - 0.99) = 999 non-sick persons.

This means that out of the 99 + 999 = 1,098 persons (in the whole population)
who got a positive result, only 99 actually have the disease. In other words,
for a random person taken from the whole population, a positive test indicates
only a 99 / 1,098 ≈ 9% probability of being sick: even with a positive test,
there is still a 91% chance of not being sick! This result should be put into
perspective with the probability of being sick before taking the test (0.1%)
and after getting a positive result (9%, i.e., a 90 times higher chance of
being sick). But it also means that, because of the imbalance between the sick
and non-sick populations, the test's 1% failure rate will yield far more false
positives among the non-sick population than correct positives among the sick
population.
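
The whole computation fits in a few lines of Python:

population = 100_000
sick = 100
accuracy = 0.99

true_positives = sick * accuracy                        # 99
false_positives = (population - sick) * (1 - accuracy)  # 999
print(true_positives / (true_positives + false_positives))  # ≈ 0.09, i.e. 9%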

I finally built the timer for the new Leffakone. It is based on an Arduino
Uno, which controls two reed relays and one LED. The reed relays can be
activated with a very low current (10 mA), meaning that the Arduino can drive
them directly from any I/O pin. The relays' contacts are connected in parallel
to the power button and the reset button. The Arduino's serial-over-USB port
is connected to one of the USB headers of the motherboard with a home-made
cable, and the timer is set by software through this serial connection. All
the wires coming from the computer case's front panel are connected to the
circuit (to the 8-pin header protruding from the protoboard), and wires go
from there to the motherboard's front-panel header (2 white wires for the
power button, 2 grey wires for the reset button, and 2 blue+black wires for
the power-on LED). The two boards are screwed onto the bottom plate of the
case of an old CD drive; for installation, I closed the case and put it into
the computer like a regular CD drive.

While the timer is counting down, it blinks the computer case's HDD LED (which
therefore no longer indicates HDD activity).

When the timer expires, it closes the power button's relay for 500 ms. An
optional watchdog timer would close the reset button's relay if the machine
does not boot correctly, i.e., if the timer is not reset within 30 s. This
watchdog timer is currently disabled in the code, since the problems I have
had with GRUB freezing on startup seem to be related to manually powering on
the device and switching the TV on shortly after. I'll enable it if it proves
necessary. Here is the code for the Arduino.

The software client for the timer is written in Python and is very
straightforward: it sends ASCII digits to the serial port, ending with a
newline character. The Arduino interprets this number as a number of seconds
and starts counting down. When disconnecting the client from the serial port,
the Arduino resets and forgets all about the timer value; I found out that
setting the DTR line to False in the Python Serial object prevents this from
happening. I haven't however found out how to prevent a reset when connecting
to the Arduino; this is less of a problem, since when I connect to it, I want
to reset the timer, and resetting the whole program does just that. It seems
to be the Linux driver that asserts the DTR line when opening the serial port;
I haven't investigated further. It is worth noting that when the machine
boots, it does not reset the Arduino.
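
The gist of the client is only a few lines (a sketch, assuming pyserial; the
device name and baud rate are examples):

import serial

port = serial.Serial('/dev/ttyACM0', 9600)  # opening the port resets the Arduino
port.write(b'86400\n')  # start a 24-hour countdown
port.dtr = False        # lower DTR so that closing does not reset the board
port.close()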

Finally, the crystal in the Arduino is only accurate to 99.5%, which is not
enough to guarantee that the timer will wake up the computer within a minute
after a countdown of several days. I therefore apply a corrective factor to
the time sent to the Arduino. The factor was estimated from a 15.5-hour
countdown, which lasted about 90 s longer than it should have. Over a 7-day
countdown, that drift would cause the timer to expire about 16 minutes too
late.
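
The correction itself is a simple proportion (a sketch of the computation):

nominal = 15.5 * 3600              # the test countdown, in seconds
factor = nominal / (nominal + 90)  # ≈ 0.9984: scale requested times by this

week = 7 * 24 * 3600
print((week - week * factor) / 60)  # ≈ 16 minutes of drift over 7 days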