On Unix-like OSs, you can use the column command to format a CSV file into aligned columns; for example:

> column -t -s',' people-example.csv.txt
First Name  Last Name   Country          age
"Bob"       "Smith"     "United States"  24
"Alice"     "Williams"  "Canada"         23
"Malcolm"   "Jone"      "England"        22
"Felix"     "Brown"     "USA"            23
"Alex"      "Cooper"    "Poland"         23
"Tod"       "Campbell"  "United States"  22
"Derek"     "Ward"      "Switzerland"    25

This post is one in a series of stuff formally trained programmers know – the rest of the series can be found in the series index.

Binary Search Tree

In the previous post, we covered the Binary Tree, which defines the shape in which the data is stored. The Binary Search Tree (BST) is a further enhancement to that structure.

The first important change is that the data we are storing needs a key; if we have a basic type like a string or number then the value itself can be the key and if we have a more complex class, then we need to define a key in that structure or we need to build a unique key for each item.

The second change is a way to compare those keys which is crucial for the performance of the data structure. Numbers are easiest since we can easily compare which is larger and smaller.

The third and final change is the way we store the items; the left node's key will always be smaller than the parent node's key and the right node's key will always be larger than the parent node's key.

As an example, here is a BST using just numbers as keys:

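Note that the rule applies to whole subtrees, not just direct children: every node in a node's left subtree has a smaller key than that node, and every node in its right subtree has a larger key.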

Why?

So, why should we care about a BST? We should care because searching it is really fast: in a balanced tree, each time you move down a level you eliminate approximately 50% of the remaining nodes.

So, for example, if we wanted to find the item in our example with the key 66, we would start at the root (50) and move right, which immediately eliminates 8 possible nodes. Next we move left from the node with the key 70 (12 possible nodes eliminated in total), then right from the node with the key 65, and finally left from 67 to reach 66. We found the node in 5 steps.

In Big O notation, this means we achieved a performance close to O(log n). The worst case is O(n), which happens when the tree is not optimal, i.e. it is unbalanced.
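To make the search concrete, here is a minimal sketch in Python. The Node class, the insert and search helpers, and the exact set of keys are assumptions for illustration; the keys were chosen only so that the path to 66 matches the walk described above (50, 70, 65, 67, 66), not to reproduce the full example tree.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Insert key into the BST rooted at root and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # an equal key is ignored, so keys stay unique


def search(root: Optional[Node], key: int) -> Tuple[Optional[Node], int]:
    """Return the matching node (or None) and how many nodes were visited."""
    visited = 0
    current = root
    while current is not None:
        visited += 1
        if key == current.key:
            return current, visited
        # Smaller keys live in the left subtree, larger keys in the right.
        current = current.left if key < current.key else current.right
    return None, visited


# Hypothetical keys; the insertion order decides the shape of the tree.
root = None
for k in [50, 30, 70, 20, 40, 65, 80, 67, 66]:
    root = insert(root, k)

found, visited = search(root, 66)
print(found.key, visited)  # 66 5
```

Each comparison discards one whole subtree, which is where the roughly O(log n) behaviour of a balanced tree comes from.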

Balanced versus Unbalanced

In the above example we have a binary search tree which is optimal, i.e. it has the lowest depth needed. Below we can see a totally valid BST; each child node is to the right of its parent because it is bigger than the parent.

This, however, will result in an O(n) search performance, which is not ideal. The solution is to rebalance the BST, and for our example above we can end up with multiple end states (I'll show two below). The key takeaway is that we go from a depth of 5 to a depth of 3.
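The effect of rebalancing can be sketched in a few lines of Python. This is an illustration under assumptions: the keys 1 to 5 are hypothetical, and the rebuild-from-sorted-keys approach is a simple stand-in that is not how self-balancing trees (which use rotations) normally do it. Depth here counts levels, so a single node has a depth of 1.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Plain BST insert with no rebalancing."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root


def depth(root: Optional[Node]) -> int:
    """Number of levels in the tree; an empty tree has depth 0."""
    if root is None:
        return 0
    return 1 + max(depth(root.left), depth(root.right))


def in_order(root: Optional[Node]) -> List[int]:
    """Keys in ascending order (left subtree, node, right subtree)."""
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)


def build_balanced(keys: List[int]) -> Optional[Node]:
    """Build a minimal-depth BST from sorted keys, using the middle key as root."""
    if not keys:
        return None
    mid = len(keys) // 2
    return Node(keys[mid], build_balanced(keys[:mid]), build_balanced(keys[mid + 1:]))


# Inserting already-sorted keys produces a chain leaning entirely to the right.
chain = None
for k in [1, 2, 3, 4, 5]:
    chain = insert(chain, k)

balanced = build_balanced(in_order(chain))
print(depth(chain), depth(balanced))  # 5 3
```

Both trees hold the same keys and are valid BSTs; only the shape, and therefore the number of steps a search takes, differs.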

Implementations

.NET has a built-in implementation with SortedDictionary, and Java has one with TreeMap, which is backed by a red-black tree (a self-balancing BST). Unfortunately, nothing exists out of the box for this in JavaScript.

This quick tip is about two small features of Git I wish I had known about earlier, as they make searching through a repository way easier.

git-grep

git-grep is a way to search through your tracked files for whatever you provide. For example, if we want all files with the word index in them: git grep index

We can limit the search to specific files; for example, if we want to filter the above example to just JSON files: git grep index -- '*.json'

We can search for multiple items in a single file; for example, if we want to find all files with both index and model in them: git grep --all-match -e index -e model

git-log grep

git-log has a grep function too which is awesome for finding commit messages with a specific word or words in it. For example, if I want to find all commits about Speakers for DevConf I could do: git log --all --grep "Speaker"

Since redoing this blog, I switched out the syntax highlighting to use the Drupal Geshi Module.

For the love of everything I can't remember the tricks for using it, so here is a cheatsheet; mostly for myself but maybe you get value too. These are all HTML attributes you can add to your code block.

- language: controls the language used for rendering.
- line_numbering: controls whether line numbering is off, on or fancy, with the values off, normal and fancy respectively.
- interval: with fancy line numbers, controls how often the line numbers are shown.
- title: adds a title to the code block.
- special_lines: takes a comma-separated list of line numbers and highlights them.

Unix-like OSs

Unix-like OSs, including MacOS, WSL & Linux, include an awesome calculator called bc. From the man page:

bc is a language that supports arbitrary precision numbers with interactive execution of statements. There are some similarities in the syntax to the C programming language. A standard math library is available by command line option. If requested, the math library is defined before processing any files. bc starts by processing code from all the files listed on the command line in the order listed. After all the files have been processed, bc reads from the standard input. All code is executed as it is read.

The only caveat is that it reads its input from files or standard input; you can't just pass in the equation as a parameter... but you can use echo to pipe the equation in. For example:

> echo '1 + 2 + 3 + 4' | bc
10

bc can also work with different number bases, for example:

> echo "obase=2; ibase=10; 42" | bc
101010

obase stands for output base & ibase stands for input base. So in the example, we are turning 42 (base 10) to binary.

Division is a weirdness with bc. By default bc uses a scale of 0, meaning no digits after the decimal point, so the result is truncated. For example, you would expect the answer to be 0.4 below but it is 0:

> echo "2/5" | bc
0

The solution is to use the math library switch -l, which sets the scale to 20:

> echo "2/5" | bc -l
.40000000000000000000

and if 20 decimal places is too many, you can use scale to control the precision:

> echo "scale=3; 2/5" | bc -l
.400

Windows Command Prompt

Command Prompt has a similar tool in the set command, using its /a switch:

> set /a 3+3
6
> set /a (3+3)*3
18
> set /a "203>>3"
25

PowerShell

PowerShell natively supports basic arithmetic, and if you want more advanced functionality you can use the static methods of the System.Math class.

If you are getting the too many open files error on MacOS, it could be VSCode trying to open too many files (by default, more than 10240 files).

You can confirm that with the following:

lsof | awk '{ print $2 " " $1; }' | sort -rn | uniq -c | sort -rn | head -20

So, what can you do about it? If the files are not important, say it is your output folder, then you can use VSCode settings to exclude them. In the example below, I configure VSCode to ignore build folders. I would encourage this as a workspace setting, so everyone in the team gets it.

After using a MacBook Pro for two years I thought it was time to share what utilities I found really useful to have. These are obviously weighted towards being a software developer, so your mileage might vary.

Brew

It is the missing package manager for MacOS; as with NPM, Chocolatey, or Composer, you can install what you need via the command line.

It may seem weird, like what is wrong with just downloading and installing what you need?! The advantage is that you can write this stuff down so that if you need to reinstall it is easier (and also easier to share to help others get up and running).

A second advantage is updating: it takes one command to update all the tools I use.

Aerial

The AppleTV has the best screensaver I've ever seen, and some smart person ported it to MacOS with the name Aerial.

A word of warning: these videos are massive and will destroy your bandwidth. One tip to solve that is that under the settings is a Cache section; make sure you have Cache Aerials As They Play checked so that each video is only downloaded once. If you are on an uncapped connection, then there is also a download now option, which is a must to use.

Fish

Fish Node Manager

Part of my job has involved working with multiple projects, and that means multiple versions of Node, and that was a pain. Thankfully there is a Node Manager for Fish that lets you easily change what version of Node you are using.

Unfortunately, this isn't as easy to set up: to install it you first need Fisherman, which is like Brew but for Fish, which leads to this 3 step process to install and configure it.

Amphetamine

Amphetamine is a massively useful tool for MacOS, especially in a DevOps culture where you might get up in the night and just need your machine to behave the exact way you want it. Its core use is to not let your Mac go to sleep and you can control what triggers that, automatically or manually.

This post is one in a series about stuff formally trained programmers know – the rest of the series can be found here.

Binary Tree

In the previous post we looked at the tree pattern, which is a theoretical way of structuring data with many advantages. A tree is just a theory though, so what does an actual implementation of it look like? A common data structure implementation is a binary tree.

The name binary tree gives us a hint as to how it is structured: each node can have at most 2 child nodes.

Classifications

As a binary tree has some flexibility in it, a number of classifications have come up to provide a consistent way to discuss a binary tree. Common classifications are:
- Full binary tree: Each node in a binary tree can have zero, one or two child nodes. In a full binary tree each node can only have zero or two child nodes.
- Perfect binary tree: This is a full binary tree with the additional condition that all leaf nodes (i.e. nodes with no children) are at the same level/depth.
- Complete binary tree: A complete binary tree is one where every level, except possibly the last, is completely filled and the nodes in the last level are as far left as possible.
- Balanced binary tree: A balanced binary tree is a tree where the height of the tree is as small as possible for the number of nodes it holds. A common, more precise definition is that the heights of the left and right subtrees of every node differ by at most one.
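The first three classifications can be checked mechanically. The following is a minimal Python sketch; the Node class and the function names are assumptions made for illustration and do not come from any library.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def height(node: Optional[Node]) -> int:
    """Number of levels; an empty tree has height 0."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def count(node: Optional[Node]) -> int:
    """Total number of nodes in the tree."""
    if node is None:
        return 0
    return 1 + count(node.left) + count(node.right)


def is_full(node: Optional[Node]) -> bool:
    """Full: every node has either zero or two children."""
    if node is None:
        return True
    if (node.left is None) != (node.right is None):
        return False  # exactly one child
    return is_full(node.left) and is_full(node.right)


def is_perfect(node: Optional[Node]) -> bool:
    """Perfect: a tree with h levels holds the maximum of 2**h - 1 nodes."""
    return count(node) == 2 ** height(node) - 1


def is_complete(node: Optional[Node]) -> bool:
    """Complete: walking level by level, no node appears after a gap."""
    if node is None:
        return True
    queue = deque([node])
    seen_gap = False
    while queue:
        current = queue.popleft()
        if current is None:
            seen_gap = True
        else:
            if seen_gap:
                return False
            queue.append(current.left)
            queue.append(current.right)
    return True


perfect = Node(1, Node(2), Node(3))
full_only = Node(1, Node(2, Node(4), Node(5)), Node(3))
right_only = Node(1, None, Node(2))

print(is_full(perfect), is_perfect(perfect), is_complete(perfect))  # True True True
print(is_full(full_only), is_perfect(full_only))  # True False
print(is_full(right_only), is_complete(right_only))  # False False
```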

Implementations

While a binary tree is more than just a pattern, there are no out of the box implementations of it in C#, Java or JavaScript. One reason is that it is a very simple data structure, so if you need just the data structure you could implement it yourself. More importantly, you likely want more than the simple structure: you want a structure that optimises for traversal or data management.

This post is one in a series about stuff formally trained programmers know – the rest of the series can be found here.

Trees

This post will look at the mighty tree, which is more a pattern than a specific data structure. The reason to understand the pattern is that so many of the data structures we will look at in the future use it that a good understanding of it provides a strong basis to work from.

As a computer user though, you already have seen and used a tree structure - you may have just not known it. The most common form of it is the file system, where you have a root (i.e. / or C:\) and that has various folders under it. Each folder itself can have folders, until you end at an empty folder or a file.

This is the way a tree structure works too: you start with a root, then move to nodes and finally end with leaves.

In the basic concept of a tree there are no rules on the nodes and the values they contain, so a node may contain zero, one, two, three or a hundred other nodes.

What makes a tree really powerful is that it is really a collection of trees, i.e. if you take any node it is in itself a tree, and so the algorithms used to work with a tree work with each node too. This enables you to work with a powerful computer science concept: recursion.

Recursion

Recursion is a concept that lacks a real world equivalent and so can be difficult to grasp initially. At its simplest for these posts, it is a method or function which calls itself, until instructed to stop. For example, you might write a function called getFiles which takes in a path to a folder and returns an array of filenames. Inside getFiles it loops over all the files in the folder and adds them to a variable to return. Then it loops over all the folders in that folder and for each folder it finds, it calls getFiles again.
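The getFiles idea described above could be sketched in Python as follows. This is only one possible shape for it: the function is named get_files to follow Python conventions, it returns full paths rather than bare filenames, and it uses os.scandir from the standard library to read each folder.

```python
import os
from typing import List


def get_files(path: str) -> List[str]:
    """Return the paths of all files in path and in every folder below it."""
    files: List[str] = []
    folders: List[str] = []

    # First pass: collect the files in this folder and remember its sub-folders.
    for entry in os.scandir(path):
        if entry.is_file():
            files.append(entry.path)
        elif entry.is_dir(follow_symlinks=False):
            folders.append(entry.path)

    # Second pass: the function calls itself once per sub-folder.
    for folder in folders:
        files.extend(get_files(folder))

    # A folder with no sub-folders makes no further calls, which ends the recursion.
    return files
```

Calling get_files(".") would list every file under the current directory. The instruction to stop is implicit here: a folder without sub-folders simply has nothing left to recurse into.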

Implementations

It doesn't make sense to talk about coding implementations at this point, since this is more a pattern than a structure and we would need a lot more information on what we want to achieve to actually go through a code implementation. That said, it is interesting to see where trees are used:
- File systems
- Document Object Models (like HTML or XML)