*Press Any Key*
You have to walk before you run. You have to crawl before you walk. And
before you crawl, you have to press a single key. And literally, these
days some children do indeed do that before they crawl.
I jumped the gun in the previous tutorials. I expected you to be able to
press multiple keys on your keyboard in a coherent sequence. As if you
were some kind of seasoned pro or computer whiz! Well, let's rewind and
rectify that. Today, we are going to press a single key. We are not going
to press just any key (some people have trouble finding the "any" key
because there isn't one). We are going to press the J key.
*The J Key*
I don't just like the J key because my name starts with J. I like the J
key because it is the home row key on your dominant hand on your dominant
finger. It reminds you that to use a computer effectively, you should try
to keep your hands positioned on the home row as much as possible instead
of reaching for the arrow keys, the mouse, or a can of Dr. Pepper. You
should learn how to touch type as fast as you can. Try to minimize your
mouse use. This will force you to learn shortcut keys and other techniques
that will make you far more efficient than someone who is constantly
switching between hunt and peck typing and mousing through a maze of menus,
moving like molasses. It can be positively painful to perpetually perceive
these plodding, ponderous peasants pecking. (Okay, again, there are no
peasants.) Please persist in perfecting your performance or perish
pathetically.
*Press the J Key*
Fire up your Cygwin terminal (or whatever terminal emulator you prefer),
press the J key, and promptly release it. Some people get confused and
hold it down forever. A lowercase j should appear in your terminal window
just after the command prompt. Don't press Enter. Don't press any other
keys. Stop and ponder what just happened. What the heck did just happen?
On my screen, the letter h appeared. WTF, man! What kind of teacher are
you, Jeff? You suck. I want a refund. This is bogus.
Apparently, when you press the letter J on your keyboard, magical little
elves do not deliver a letter J to your active window. Something far more
treacherous and mind-boggling is happening.
*Scan Codes, UARTs, and Bits--Oh My!*
You are not in Kansas anymore, Dorothy. Your simplistic view of the world
is a lie. This is your last chance. After this, there is no turning
back. You take the blue pill - the story ends, you wake up in your bed and
believe whatever you want to believe. You take the red pill - you stay in
Wonderland and I show you how deep the rabbit-hole goes. (Yes, being a
programmer whose last name is Anderson, people say "Mr. Anderson ..." to me
now and then. Sometimes they call me Neo. It's kind of like being that
guy named Michael Bolton in Office Space. Except Neo doesn't suck like
Michael Bolton.)
The rabbit hole is DEEP! I am just going to give you a high level tour of
it. Some of what you read may be lies. I don't really know what's
happening. I don't think any single person does anymore.
Okay, when you press a key, two contact points close completing an
electrical circuit. This causes a sequence of values (called scan codes)
to be transmitted to your computer. On a desktop, you probably have a USB
keyboard. USB means "universal serial bus". A bus is a set of
communication lines, and serial means that signals (information) travel
over them 1 bit at a time. Universal means that the same communication
mechanism can be used for a wide variety of devices: keyboards, mice,
external hard disks, web cams, printers, game controllers--just about
anything. However, clearly, this is the real reason USB was invented:
http://www.amazon.com/Doctor-Who-Tardis-USB-Hub/dp/B000F46CQM/.
So, the scan codes are information that gets transmitted over USB one bit
at a time. What's a bit? A bit is a single digit of base 2. That is, a
binary digit. There are two possible values for each digit: 0 and 1. On
expensive Apple hardware, you also get the elite value 2 (just kidding).
We are used to talking about values in base 10 (aka, decimal), where there
are 10 possible values per digit: 0 to 9. Let's count to decimal 10 in
binary.
0: 0
1: 1
2: 10
3: 11
4: 100
5: 101
6: 110
7: 111
8: 1000
9: 1001
10: 1010
Suppose a scan code is 8 bits long. It can represent 2^8 = 256 distinct
values. A sequence of 8 bits is often referred to as a byte (although
bytes can be of different lengths than that).
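If you'd rather let the computer do the counting, here is a minimal Python sketch of the same table. The function name to_binary is made up for this example (Python's built-in bin() does the same job); it just repeats the divide-by-2 trick by hand so you can see where each bit comes from.

```python
def to_binary(n):
    """Convert a non-negative integer to a string of binary digits."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # the remainder is the next least significant bit
        n //= 2
    return "".join(reversed(digits))  # we collected the bits backwards, so flip them

# Count from 0 to decimal 10, showing each value in binary.
for value in range(11):
    print(f"{value}: {to_binary(value)}")

# Every extra bit doubles the number of patterns, so 8 bits give 2**8 values.
print(2 ** 8)  # 256
```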
How do these bits travel across the wire in the USB cable to the computer?
The bits travel across the wire as changes in voltage over time. Let's say
we hold a wire at +12V for a second. Maybe this is interpreted as a bit
with value 1 when the receiving end sees it.
Before USB and PCs, people connected to mainframes via real terminals (a
keyboard/display combination--but in fact in early terminals, the "display"
was essentially a remote controlled typewriter and was thus called a
teletype). These real terminals were connected to said mainframes at a
distance by serial cables. At each end of the serial cable was a UART
(universal asynchronous receiver/transmitter). Computers were so expensive
back then that it made sense for multiple terminals to attach to a single
central computing resource. These days, we are used to everyone having
their own cheap computer where the keyboard, processor, and display are
combined into one unit. This was not always so.
Over a UART (RS-232) serial connection, -12V => 1 and +12V => 0. The
values are inverted from what you'd assume they'd be. Each byte is either
7 or 8 bits and is sent 1 bit at a time at some bit rate (bits/second).
Each byte is framed by a start bit (a 0) and 1 or 2 stop bits (which are 1s),
usually just one. When the wire is not in use, it is held at -12V
(1). An error check, the parity bit, may also be transmitted before the
stop bit(s). The UARTs can be set to even, odd, or no parity. If set to
odd parity, the sum of the data bits and the parity bit should be odd. For
even parity, the sum should be even. If you've ever downloaded large files
from the internet, this is the byte-level equivalent of a checksum. If one
side of the connection is transmitting bytes faster than the other side can
deal with them, that side can tell the other to stop or continue. This is
known as flow control. It can be implemented directly as hardware signals
(on a different wire) or in software (via special control bytes). The UART
on each side of a connection must be configured with the same settings: bit
rate, parity, stop bits, and flow control or else they cannot communicate
with each other. Just like if someone showed up at your house speaking
Chinese, you probably would not understand them. You both must agree on a
language (a protocol). Consider that C3PO in Star Wars is called a
protocol droid. That means he speaks all the languages.
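The parity rule is easier to see in code than in a sentence. Below is a rough Python sketch, not how any real UART is wired. The names parity_bit and parity_ok are invented for illustration, and the 7 data bits are the ASCII value of a lowercase j.

```python
def parity_bit(data_bits, mode):
    """Return the parity bit that makes the total count of 1s even or odd."""
    ones = sum(data_bits)
    if mode == "even":
        return ones % 2        # send a 1 only if the count is currently odd
    if mode == "odd":
        return 1 - (ones % 2)  # send a 1 only if the count is currently even
    raise ValueError("mode must be 'even' or 'odd'")

def parity_ok(data_bits, received_parity, mode):
    """What the receiving side checks: does the parity bit match the data?"""
    return parity_bit(data_bits, mode) == received_parity

# The 7 data bits of ASCII 'j' (106), which contain four 1s.
j_bits = [(ord("j") >> i) & 1 for i in range(7)]
print(parity_bit(j_bits, "even"))  # 0
print(parity_bit(j_bits, "odd"))   # 1

# Flip one bit "on the wire" and the check fails.
damaged = j_bits.copy()
damaged[0] ^= 1
print(parity_ok(damaged, 0, "even"))  # False
```

Note that parity only catches an odd number of flipped bits. Flip two and it sails right through, which is why bigger transfers use beefier checksums.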
Note that even at this basic level, we have already made the distinction
between bytes that represent data, and bytes that represent commands (flow
control). Also, we have seen that there is non-data overhead to
communication (start/stop/parity bits). These same concepts exist in other
communication protocols like Ethernet and TCP/IP which are the foundation
of the Internet that you waste most of your time on.
The concepts in a UART are so fundamental to computing, they actually made
us build one as part of my Computer Science degree. It's kind of like
building your own light saber, but more frustrating. I never want to do it
again.
Okay, so the Universal, Receive, and Transmit parts of UART make sense, but
what about this Asynchronous thing? Well, even though the individual bits
travel at a certain rate (according to a clock), and can thus be said to be
synchronous, the UART is transmitting bytes, not bits. The timing between
the bytes is totally arbitrary. Consider yourself typing at a keyboard.
You don't press each key one after the other at a constant unwavering speed
in an endless stream, like your boss wants you to. No, you get halfway
through a line of code, get distracted by a cat hanging onto a ceiling fan
on YouTube, go to lunch, come back an hour later, can't remember what you
were typing because of the food coma you are experiencing, erase the line,
and start over. That's what asynchronous means.
If the clock on the receiving UART is running at a constant speed, but you
don't know when the sender will begin transmitting, how the heck does the
receiver know when the first bit begins? The way the receiver knows is
that it oversamples the line voltages at a rate higher than the
transmission rate. Since the non-transmitting voltage is always -12V, it
can detect approximately when the line changes to +12V (for the
start bit) and begin processing bits with that in mind. This also explains
why you need a start bit.
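Here is a toy Python model of that oversampling idea. It is a sketch under loud assumptions: the line is a list of logic levels (1 for the idle -12V, 0 for +12V) instead of voltages, the 16 samples per bit is a common choice I picked rather than anything stated above, and both function names are hypothetical.

```python
OVERSAMPLE = 16  # assumed number of samples the receiver takes per bit

def find_start_bit(samples):
    """Return the index of the first sample where the idle line (1) drops to 0."""
    for i in range(1, len(samples)):
        if samples[i - 1] == 1 and samples[i] == 0:
            return i
    return None  # the line stayed idle, nobody is talking

def data_bit_sample_points(start_index, data_bits=8):
    """Sample indices near the middle of each data bit after the start bit."""
    middle_of_start = start_index + OVERSAMPLE // 2
    return [middle_of_start + OVERSAMPLE * (k + 1) for k in range(data_bits)]

# Idle for 5 samples, then a start bit lasting one bit time, then more 1s.
line = [1] * 5 + [0] * OVERSAMPLE + [1] * OVERSAMPLE
start = find_start_bit(line)
print(start)                              # 5
print(data_bit_sample_points(start)[:2])  # [29, 45]
```

Sampling in the middle of each bit gives the receiver some slack: its guess about where the start bit began can be off by a few samples and it will still read the right values.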
Also of note is the order of transmission of bits is from least significant
to most significant. This is the opposite of how we write numbers.
Assuming we were sending base 10 digits on the serial line, to send the
number 123, I'd have to send 3, 2, 1.
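To make the bit order concrete, here is a small Python sketch that frames one byte the way it would go out on the wire and then puts it back together. The names frame_byte and unframe_byte are made up, and I'm assuming 8 data bits, no parity, and one stop bit.

```python
def frame_byte(value, data_bits=8):
    """Return the bits in wire order: start bit, data bits LSB first, stop bit."""
    data = [(value >> i) & 1 for i in range(data_bits)]  # bit 0 goes out first
    return [0] + data + [1]  # start bit is 0; stop bit is 1, same as the idle line

def unframe_byte(bits, data_bits=8):
    """Rebuild the value from a frame produced by frame_byte."""
    if bits[0] != 0 or bits[-1] != 1:
        raise ValueError("bad start or stop bit")
    data = bits[1:1 + data_bits]
    return sum(bit << i for i, bit in enumerate(data))

frame = frame_byte(ord("j"))  # 'j' is 106, or 01101010 as we'd write it
print(frame)                  # [0, 0, 1, 0, 1, 0, 1, 1, 0, 1]
print(chr(unframe_byte(frame)))  # j
```

Ten bits on the wire to move eight bits of data: that's the overhead in action.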
*Buffers, Interrupts, and Drivers*
All that to get one lousy byte! I am exhausted. This is harder than
mining Dogecoin. Jeez man, I am not immortal, tell me why the freakin' h
appeared on your screen!!!
We're still a ways off. I told you the rabbit hole was deep. Not my fault
your whole generation has ADD.
Okay, so now you have a byte on the receiving UART. Where is it? It's in
a buffer on the UART. In the early days, UARTs really did have a single
byte buffer. Memory was so expensive back then people really couldn't
afford more. And they walked uphill both ways 2 miles through 10 feet of
snow to get to school every day, where they had things called "books"
instead of iPads. Lucky you with 16GB of RAM in a computer you don't even
really know how to press a key on. Speaking of that computer, it needs to
see this super-expensive byte that just arrived. What to do? It turns out
your CPU has a number of interrupt request lines attached to it. When the
UART gets a byte, it uses one of the interrupt lines to signal the CPU that
it needs to do something. When this signal arrives, the computer stops
whatever it was doing (probably updating its Facebook profile) and jumps to
a specific instruction in its main memory. The machine code at that
instruction is the interrupt handler. Now, whoever made your UART probably
wrote some code to do whatever it is that should happen when a byte shows up.
Either that or it was surely Linus Torvalds or Richard Stallman, because
they alone wrote all the software. This bit of code is known as a device
driver. Your operating system loads these drivers early on as it boots.
In doing so, it makes it so the interrupt handler on that particular interrupt
request (IRQ) line will call the appropriate device driver code. Wow,
suddenly you have a modular way to extend the capabilities of your OS to
deal with a variety of hardware (possibly unknown future hardware). Pretty
spiffy.
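If that sounds abstract, picture a lookup table. The Python sketch below is a cartoon of the idea, not any real operating system's API: every name in it (interrupt_table, register_driver, raise_interrupt, uart_driver) is hypothetical, and the IRQ number is an arbitrary pick.

```python
interrupt_table = {}  # IRQ number -> handler function

def register_driver(irq, handler):
    """What the OS does at boot: attach a driver's handler to an IRQ line."""
    interrupt_table[irq] = handler

def raise_interrupt(irq, data):
    """What happens when a device signals the CPU on a given IRQ line."""
    handler = interrupt_table.get(irq)
    if handler is None:
        return None  # no driver is loaded for this line
    return handler(data)

received = []  # stand-in for wherever the OS keeps incoming bytes

def uart_driver(byte):
    """Toy driver: copy the byte out of the UART's tiny buffer."""
    received.append(byte)
    return byte

register_driver(4, uart_driver)  # IRQ 4 is an arbitrary choice for this example
raise_interrupt(4, 0x6A)         # the UART says "I've got a byte!"
print(received)                  # [106]
```

Supporting a new gadget means writing one new handler and registering it. Nothing else in the table has to change, which is the modular part.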
In the old days, you used to have to configure your physical device and
driver/OS settings so that they agreed on what IRQ each device would use. I
was alive in those old days and did this. Given that humans were involved,
chaos ensued. Thankfully, now the OS just handles these settings for you.
This is known as "plug and play". At the time it was known as "plug and
pray" because it didn't always work as advertised.
Okay, sweet, the computer has a byte. How does that end up as a j
appearing on my screen? Well, let's remember that in the past the keyboard
and display (which combined are the terminal) were both connected to the
remote computer via one serial cable. Also, the keyboard did not talk
directly to the display. So, no letter j has shown up on your screen
yet. What actually happens is the computer echoes this same value that you
just typed at the terminal (keyboard) back to the terminal (specifically,
the display part of it). That is, the serial cable is bidirectional: the
computer can talk back to the terminal. Unlike in this tutorial, where I
just talk and you listen because I don't want to hear any of your sass.
Now you have a hint as to why the Bash echo command is called echo and what
it does.
We kind of missed a step. The computer does not echo the raw scan codes
back to your terminal. It first figures out what they mean. The keyboard
driver maps the scan code to a key code. The key code is then passed
through whatever key map is active. The key map turns the key code into an
ASCII value (or maybe a Unicode value, or EBCDIC, or something else). An
ASCII value is a small integer that represents an abstract symbol like the
letter J. The key map allows you to have different keyboard languages or
layouts. Such as Programmer Dvorak, which I use. So, when I press the J
key on my keyboard, it translates to the letter H. When you press it, it
probably translates to the letter J. A rose by any other name is still a
rose, and the letter H is still H even if it is called J on my keyboard.
Labels can be deceiving. Even the label on your keyboard falsely
advertises a capital J when it produces a lowercase j. Key-presser beware.
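The whole chain fits in a couple of dictionaries. This Python sketch is illustrative only: the scan code value and the key-code name are assumptions for the example, and a real key map has a hundred-odd entries plus modifiers like Shift.

```python
# Illustrative values: the scan code and the key-code name are assumptions.
SCAN_TO_KEY = {0x24: "KEY_J"}  # scan code -> key code (which physical key)

KEY_MAPS = {
    "qwerty": {"KEY_J": "j"},
    "dvorak": {"KEY_J": "h"},  # same physical key, different letter
}

def translate(scan_code, layout):
    """Follow a key press from scan code to key code to character."""
    key_code = SCAN_TO_KEY[scan_code]
    char = KEY_MAPS[layout][key_code]
    return char, ord(char)  # ord() gives the ASCII value of the character

print(translate(0x24, "qwerty"))  # ('j', 106)
print(translate(0x24, "dvorak"))  # ('h', 104)
```

Same key, same scan code, different letter. The only thing that changed was which key map was active.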
It gets worse. If you are some kind of deranged psychopath, you might
have other layers of key mapping in your setup. Like AutoHotkey or a text
expander. Maybe your keyboard itself has onboard macros. Pressing the J
key could cause your garage door to open. All these transformation layers
need to be correct for whatever you want to happen to happen when you press
a single key. With great power comes great responsibility.
*Display the J*
Your keystroke's journey is not over. Aren't you glad you followed my
advice and did not hit a second key? Consider the absolute mayhem that
would have unfolded then!
Your terminal has now received an ASCII value. To display this, it needs a
font. A default font will already be preloaded into the terminal. A font
is a set of particular visual representations of a particular set of
abstract symbols. What does the letter J look like anyway? It can look
like a miniature dump truck or an upside down smiley face if you have the
right font installed. We just associate the concept of J with a certain
visual representation. So, the ASCII value gets translated to a code point
in the font which yields a glyph. That value is stored at some location in
the display buffer (memory). Early display buffers had maybe only 80
columns and 24 rows. Maybe there were 2 bytes at each position. We're
talking at most 4K of memory. This explains why the earliest terminals
(teletypes) did not even have video displays. 4K of RAM would have been
insanely expensive back then. Whereas a graphics card today will have 2GB
of onboard memory. This glyph is then rendered to the display at its
location. The current state of the terminal affects how this glyph is
rendered on the screen. Certain terminal settings could cause it to blink,
be underlined, be a different color, etc. The glyph is translated into
individual pixels and the display driver of the terminal makes them light
up on the screen for 1/60th of a second, 60 times a second. Photons
interact with rods/cones in your eye transmitting electrochemical signals
to your brain, etc. Yada, yada: http://dilbert.com/2010-11-17/. Without a
display buffer, each character would disappear after 1/60th of a second.
That would not be useful.
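You can check the 4K arithmetic yourself. The Python sketch below models a display buffer as a flat run of bytes using the 80 by 24 by 2 figures from above. The put_char and get_char names are invented, and using the second byte of each cell for attributes like blink or underline is my assumption, not a claim about any particular terminal.

```python
COLS, ROWS, BYTES_PER_CELL = 80, 24, 2  # the figures mentioned above

buffer = bytearray(COLS * ROWS * BYTES_PER_CELL)
print(len(buffer))  # 3840, a bit under 4K (4096)

def cell_offset(row, col):
    """Where a given screen position lives inside the flat buffer."""
    return (row * COLS + col) * BYTES_PER_CELL

def put_char(row, col, char, attribute=0):
    """Store a character code and an attribute byte at a screen position."""
    offset = cell_offset(row, col)
    buffer[offset] = ord(char)
    buffer[offset + 1] = attribute  # assumed use: blink, underline, color

def get_char(row, col):
    """Read back the character stored at a screen position."""
    return chr(buffer[cell_offset(row, col)])

put_char(0, 0, "j")
print(get_char(0, 0))  # j
```

The display hardware just keeps re-reading this memory on every refresh, which is why the j stays put instead of vanishing.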
No one except Methuselah (and maybe my dad) uses real terminals these
days. You probably have to go to a museum to actually see one. We all use
terminal emulators like Cygwin Terminal, xterm, PuTTY, and the like.
Still, a lot of the concepts from real terminals apply. A lot of the
existing Linux terminology is based on this terminalology, so it is useful
to know. Even this explanation is a gross oversimplification of everything
that is going on when you type a character into a computer, but hopefully
it is enough to help make Bash easier to understand.
*Pro Tip*
Now that you've worked up a sweat by pressing the J key, it's all smudged up
by your greasy, sweaty fingerprint. Don't leave your filth everywhere like
you do at home. Because the electronics in most keyboards are fairly
simple, you can actually run them through the dishwasher when they get
dirty (desecrated by your slime). Don't use detergent and maybe avoid the
heated drying cycle. Give them at least a day to air dry afterwards. I've
done this at least a dozen times, and it works great. Do not do this with
your super-fancy Logitech gaming keyboard with a fancy built-in display.
Actually, do because I'm jealous. Also, probably don't do this on wireless
keyboards. At least take the batteries out before you do. This method of
keyboard cleaning is actually a Dell-recommended procedure (my brother did
Dell tech support). So, I'm not just making this up (unlike the rest of
the information in this tutorial).
*Stay Tuned*
Even though we've talked about terminals and (key)strokes, this is not the
end--unless the information overload has killed you. In the next episode,
we will press a bunch of keys to compose an entire command and then hit
Enter. We'll likely see an error message.