Ramblings from Jessie
https://blog.jessfraz.com/index.xml
Recent content on Ramblings from Jessie
Tales from Firmware Camp
https://blog.jessfraz.com/post/tales-from-firmware-camp/
Tue, 10 Sep 2019 08:09:26 -0700
<p>Last week I attended the <a href="https://osfc.io/">Open Source Firmware Conference</a>.
It was amazing!
The talks, people, and overall feel of the conference really left me feeling
inspired and lucky to attend.</p>
<p>Having been pushed to attend vendor conferences and trade shows through my
career for various jobs, it was so refreshing to have the chance to hang out
with folks from such a genuine community that really just want to help one
another.</p>
<p>When the talks hit YouTube you should be sure to check them
out (<a href="https://twitter.com/jessfraz/status/1169361763680210944">I also</a>
<a href="https://twitter.com/jessfraz/status/1168925785211772929">tweeted</a>
<a href="https://twitter.com/jessfraz/status/1168934537415593987">about</a>
<a href="https://twitter.com/jessfraz/status/1168958435288915970">a few</a>
<a href="https://twitter.com/jessfraz/status/1169030969535488002">of them</a>). What
I will focus on in this post was the last two days of the conference that were
devoted to the hackathon.</p>
<p>I had bought an <a href="https://www.supermicro.com/en/products/motherboard/X10SLM-F">X10SLM-F Supermicro board</a>
off eBay a few months ago that I wanted to run coreboot on. If you are
interested in finding a board that will work with coreboot, you should check
its <a href="https://coreboot.org/status/board-status.html">status on the status page</a>.
I had
been talking to <a href="https://twitter.com/_zaolin_">Zaolin</a> about wanting a board
to hack on and he recommended this one.</p>
<p>At the hackathon, we decided to start with the BMC instead of the CPU BIOS.
This made for some
fun problems and definitely a lot of lessons learned. I had a
<a href="https://www.dediprog.com/product/SF100">Dediprog SF100</a> flash programmer I brought to the
hackathon as well. Some people use Raspberry Pis as their flash programmer but
the Dediprog was recommended to me and definitely came in handy. However, if
you want a cheaper alternative there are a bunch of ways you can skin that cat.</p>
<p>To get started, we read the original binary off the SPI flash&hellip; this was fairly
straightforward. We used the open source <a href="https://github.com/DediProgSW/SF100Linux"><code>dpcmd</code></a> tool
from Dediprog to do it, but you could also use <a href="https://github.com/flashrom/flashrom"><code>flashrom</code></a>.</p>
<p>While inspecting the original binary, we found the string <code>linux</code> a few
times&hellip; as well as a MAC address, boot commands, an IP address, and some other
interesting strings.</p>
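<p>If you want to poke at a dump like this yourself, here is a minimal sketch in Python that pulls printable strings out of a binary, similar in spirit to the <code>strings</code> utility. The <code>printable_strings</code> helper and the sample bytes are made up for illustration; they are not the contents of our board&rsquo;s flash.</p>

```python
import re
from typing import List


def printable_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return every run of printable ASCII that is at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Hypothetical stand-in for a firmware dump; a real one would come from
# something like open("dump.bin", "rb").read().
sample = (
    b"\x00\xff\x00bootcmd=bootm 20080000\x00\x01"
    b"linux\x00\x7fethaddr=00:11:22:33:44:55\xff"
)

for text in printable_strings(sample):
    print(text)
```

<p>Raising <code>min_len</code> cuts down on noise from random byte sequences that happen to look like text.</p>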
<p>Before flashing new firmware we also made sure the board actually booted the
BMC. We didn&rsquo;t have access to any console so we made do with an IPMI LAN port
and dnsmasq to work with DHCP. It worked and we got into the BMC user interface
over the web. If you&rsquo;ve ever used a Supermicro server I probably don&rsquo;t need to
tell you that it&rsquo;s a piece of shit running a 2.6 Linux kernel on the BMC.
Getting to the UI proved the board actually booted with the original BMC firmware
so we began to break it by trying to run <a href="https://github.com/openbmc/openbmc">OpenBMC</a>.</p>
<p>Our board has an ASPEED 2400 BMC. We chose an OpenBMC configuration that would
give us a kernel supporting that chip. Thanks <a href="https://github.com/shenki">Joel Stanley</a> for all your work on the kernel patches for all the BMCs. We flashed the SPI flash with our new BMC firmware image and attempted to power on the board.</p>
<p>I am going to interrupt the story here for a second to explain the pain
involved with this development cycle of writing firmware to SPI flash.
The SPI flash is 16MB <em>but</em> requires erasing the
previous contents (4KB per sector) before you can even write.
An erase cycle takes up to
120ms per sector in the worst case. That adds up quickly, so anything you can do to
make this faster is worth doing. Most flash programmers will not rewrite
a sector if its contents have not changed, which helps, but it is still super
painful coming from the workflow of a software developer.</p>
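<p>To put rough numbers on that, here is a back-of-the-envelope sketch in Python using the figures above. It assumes 16MB means 16 &times; 1024 &times; 1024 bytes and that every erase hits the 120ms worst case; the helper names are made up for illustration, and this only accounts for erase time, not the writes themselves.</p>

```python
FLASH_BYTES = 16 * 1024 * 1024   # 16MB part (assumed to mean MiB)
SECTOR_BYTES = 4 * 1024          # 4KB erase sector
WORST_ERASE_MS = 120             # worst-case erase time per sector


def changed_sectors(old: bytes, new: bytes, sector: int = SECTOR_BYTES) -> int:
    """Count the sectors whose contents differ between two images."""
    length = max(len(old), len(new))
    return sum(
        old[start:start + sector] != new[start:start + sector]
        for start in range(0, length, sector)
    )


def worst_case_erase_ms(sectors: int) -> int:
    """Worst-case time spent erasing the given number of sectors."""
    return sectors * WORST_ERASE_MS


total_sectors = FLASH_BYTES // SECTOR_BYTES
print(total_sectors)                       # 4096
print(worst_case_erase_ms(total_sectors))  # 491520 ms, a bit over 8 minutes
```

<p>The <code>changed_sectors</code> helper shows why skipping unchanged sectors matters: if only a handful of sectors differ between two images, the worst-case erase time drops from minutes to around a second.</p>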
<p>Back to our board&hellip; our OpenBMC image we flashed didn&rsquo;t work. Again, a lot of this would have been easier to debug with
a serial console but we didn&rsquo;t have one and we didn&rsquo;t have the spec to get a UART.
Our assumption from this failure was that the IPMI LAN port we
were using was not the port enabled in that specific
configuration.</p>
<p>So we went to build a custom kernel&hellip;</p>
<p>With the help of Joel we built a custom kernel completely separate from OpenBMC.
However, we flashed the kernel directly to the SPI
flash without even U-Boot, LOL&hellip; obviously this didn&rsquo;t work.</p>
<p>Then we decided to try something easier and had a hunch a different
configuration in the OpenBMC project would have the right port enabled. We built
the image for that and flashed it onto the SPI flash. This was arguably faster
than making our own OpenBMC configuration with our new kernel.</p>
<p>It also didn&rsquo;t work, but here we got into a bit more trouble. After this point we
could no longer write to the SPI flash. The problem was the BMC was interfacing
with the SPI flash and we couldn&rsquo;t take over the ability to write to it. The
SPI flash only allows one device to interact with it at a time. We also could
not flash the SPI flash without the board powered on because the entire board
was pulling power which was too much for our flash programmer to handle. <em>This</em>
is a huge pain in the ass. It turns out it is <em>such a pain in the ass</em> that people
have made solutions for it.</p>
<p>Fortunately for us, <a href="https://github.com/felixheld">Felix Held</a> had just
given a talk on this pain the day before and he was also in the room. He had one
more prototype of his tool, <a href="https://github.com/felixheld/qspimux">qspimux</a>,
and we got to use it on our board.</p>
<p>qspimux allows access to a real SPI flash chip to be multiplexed
between the target and a programmer that also controls the multiplexer. This
way we could flash the SPI flash with the board powered off.</p>
<p>To get his tool installed we had to de-solder the SPI flash and
solder it back on after getting the qspimux parts attached. Props to <a href="https://github.com/edwin-peer">Edwin
Peer</a> for his awesome soldering skills here.
Here is a live action shot&hellip;</p>
<p><blockquote class="twitter-tweet"><p lang="en" dir="ltr">It&#39;s been a journey, desoldered the flash for the BMC now using Felix Held&#39;s qspimux&hellip; so the BMC doesn&#39;t interfere with the flash, so we can actually flash it! <a href="https://t.co/M2mezEMeLa">https://t.co/M2mezEMeLa</a> <a href="https://t.co/iL1xBQzAwh">pic.twitter.com/iL1xBQzAwh</a></p>&mdash; jessie frazelle 👩🏼‍🚀 (@jessfraz) <a href="https://twitter.com/jessfraz/status/1170074325895925760?ref_src=twsrc%5Etfw">September 6, 2019</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>After finishing this, we could write to the SPI flash again. At this point we
were trying to re-flash the original Supermicro firmware onto the board, just to
make sure we didn&rsquo;t mess anything up along the way. This
proved to be more difficult than we thought. We got the firmware to write to
the flash but the board still wasn&rsquo;t working. We verified with the oscilloscope
that data was indeed leaving the MOSI (master-out-slave-in) pin and the clock
was working on the flash.</p>
<p>Then I tried to read the firmware back from the SPI flash chip to make sure it was
indeed our original firmware. We suspected that maybe we were writing to the
device too quickly. This was indeed the case. The two firmware images did not
match. I then wrote the firmware to the SPI flash on the slowest setting just
to be sure. Then I could actually verify the image we wrote and the image we
read back matched our original firmware image. At this point everything was
kosher and we knew the image on the SPI flash chip was indeed the same as the
original we pulled off the board the day before.</p>
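<p>The read-back check itself is simple to script. Here is a minimal sketch in Python, assuming you already have both images as bytes (for example, read from the dump files your flash programmer produced). The helper names and the sample data are made up for illustration.</p>

```python
import hashlib
from typing import Optional


def sha256_hex(image: bytes) -> str:
    """Hex digest of an image, handy for a quick equality check."""
    return hashlib.sha256(image).hexdigest()


def first_mismatch(expected: bytes, actual: bytes) -> Optional[int]:
    """Offset of the first differing byte, or None if the images are identical."""
    if expected == actual:
        return None
    for offset, (want, got) in enumerate(zip(expected, actual)):
        if want != got:
            return offset
    # One image is a prefix of the other, so they differ where the shorter ends.
    return min(len(expected), len(actual))


# Hypothetical images standing in for the original dump and the read-back.
original = bytes(range(256)) * 16
corrupted = bytearray(original)
corrupted[1000] ^= 0xFF  # simulate one byte that was written incorrectly
read_back = bytes(corrupted)

print(sha256_hex(original) == sha256_hex(read_back))  # False
print(first_mismatch(original, read_back))            # 1000
print(first_mismatch(original, original))             # None
```

<p>Knowing the offset of the first mismatch can hint at whether a write went wrong from the start or partway through.</p>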
<p>At this point the board was still not booting the original firmware image. This
is when we had to go home and firmware camp was over. Overall, this was a great
learning experience. I would have been sad had everything gone smoothly because
we would not have learned as much about how to debug all the components of the
SPI flash and board. I definitely have not given up on this board and will
continue down this rabbit hole until it has open source firmware on the BMC and
open source BIOS for the CPU.</p>
<p>I would like to thank everyone at the Open Source Firmware Conference for
making this a truly amazing week and specifically those who helped with the
crazy hackathon project: <a href="https://github.com/kc8apf">Rick Altherr</a>,
<a href="https://github.com/edwin-peer">Edwin Peer</a>,
<a href="https://github.com/shenki">Joel Stanley</a>,
<a href="https://github.com/felixheld/">Felix Held</a>,
<a href="https://github.com/bcantrill">Bryan Cantrill</a>,
Jacob Yundt (who I can&rsquo;t seem to find online),
<a href="https://github.com/jclulow">Joshua M. Clulow</a>, and everyone else I am
forgetting who gave us wires, clips, cords, and whatever else we
needed to get this thing going! It truly takes a village.</p>
<p>I cannot wait for the next OSFC, but until then I will work on playing with
a logic analyzer to see if what the BMC is reading from the SPI flash is even
the right data ;)</p>
Transactional Memory and Tech Hype Waves
https://blog.jessfraz.com/post/transactional-memory-and-tech-hype-waves/
Wed, 14 Aug 2019 08:09:26 -0700
<p>At lunch today I learned about Transactional Synchronization Extensions (TSX)
which is an implementation of transactional memory. The conversation started as a rant
about why transactional memory is bad but then it evolved into how this concept
even came to be and how it even got implemented if it&rsquo;s such a terrible idea.</p>
<h2 id="what-is-transactional-memory">What is transactional memory?</h2>
<p>First let&rsquo;s start by going over what transactional memory is.</p>
<p>You might be familiar with a deadlock. A deadlock occurs when a process or thread is waiting
for a resource held by another process, which is in turn waiting on a resource
held by the first. You can think of this as P1 needs R1
and has R2, while in turn P2 needs R2 and has R1. That is a deadlock.</p>
<p>Transactional memory removes the possibility of getting a deadlock and replaces
it with what is known as a livelock. A livelock happens when processes are constantly
changing state in response to one another but neither of them moves forward or
progresses in any way. Imagine you are walking down the street while another
person is heading towards you. You move to the right to avoid running into them
as they also move in that direction to avoid running into you. You both then
move to the other side so as to not run into each other. This repeats over and
over again with no progress forward since both people are moving in the
same direction. That is a livelock. With transactional memory you no longer
have deadlocks but livelocks.</p>
<p>Why is this? Well, transactional memory works very similarly to database
transactions. A transaction is a group of operations that can execute and
commit changes as long as there are no conflicts. If there is a conflict, it
will start from state zero and try to run again until there are no conflicts.
Therefore, until there is a successful commit of a run, the outcome of any
operation is speculative.</p>
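<p>The retry-until-no-conflict idea can be sketched in software. The Python below is only an analogy for the pattern described above, not how TSX works in hardware: <code>VersionedCell</code> and <code>run_transaction</code> are hypothetical names, and a version counter stands in for conflict detection.</p>

```python
import threading


class VersionedCell:
    """A value plus a version counter; a commit succeeds only if nobody else committed first."""

    def __init__(self, value):
        self._lock = threading.Lock()  # guards only the short read and commit steps
        self.value = value
        self.version = 0

    def read(self):
        with self._lock:
            return self.value, self.version

    def try_commit(self, expected_version, new_value):
        with self._lock:
            if self.version != expected_version:
                return False  # conflict: someone else committed since our read
            self.value = new_value
            self.version += 1
            return True


def run_transaction(cell, update):
    """Retry the update until it commits without conflict; return the attempt count."""
    attempts = 0
    while True:
        attempts += 1
        value, version = cell.read()
        result = update(value)  # speculative until the commit succeeds
        if cell.try_commit(version, result):
            return attempts


counter = VersionedCell(0)


def worker():
    for _ in range(500):
        run_transaction(counter, lambda v: v + 1)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 2000: every increment eventually commits exactly once
```

<p>Notice that nothing bounds the number of retries in <code>run_transaction</code>. Under heavy contention a transaction can keep redoing work without committing, which is the kind of behavior the livelock discussion above is about.</p>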
<p>Intel&rsquo;s implementation of TSX behaves in such a way that when a transaction
aborts due to a hardware exception, it
does not fire typical exceptions. Instead, it invokes a user-specified abort handler
without informing the underlying OS. This seems like it might lead to some
really bad behavior&hellip; we should probably know wtf is going on
in our system at any given point in time.</p>
<h2 id="side-channel-attacks">Side-Channel Attacks</h2>
<p>So we know the outcome of any operation in a transaction is speculative.
Hmmm speculative you say&hellip; I am reminded of Spectre and Meltdown.
The kernel&rsquo;s defense against Meltdown
was Kernel Page Table Isolation (KPTI). Instead let&rsquo;s focus on what you can break with Spectre and Meltdown, which is Kernel Address Space Layout Randomization (KASLR). KASLR randomizes
the address layout on each boot. This raises the bar for an exploit, forcing
an attacker to guess where the code and data are located in the address space.
The probability of a successful attack then becomes the probability of an information
leak multiplied by the probability of a memory corruption vulnerability.</p>
<p>However, this can be exploited without an information leak but instead using
a translation lookaside buffer (TLB) and a timing attack. A TLB
is a memory cache that reduces the time taken to access a user memory location.
It keeps recent translations of virtual memory to physical memory.</p>
<p>In the <a href="https://gts3.org/assets/papers/2016/jang:drk-ccs.pdf">DrK paper</a>, the
authors describe an attack that uses the behavior of TSX as a <em>feature</em> of the
exploit. As described above, TSX has the behavior of aborting a commit without leaving any trace as
to why it was aborted. So in DrK, the
authors use TSX to create a bunch of access violations of the privileged
address space inside transactions and turn that into knowledge of mapping and executable status
of the address space
<em>without</em> even generating a page fault.</p>
<p>The point I am making with this example is that transactional memory and its
implementation TSX are a bad idea.</p>
<p>But who could have possibly seen this as a bad idea?</p>
<h2 id="rewind-to-2008">Rewind to 2008</h2>
<p>Concurrency is the biggest hype in town. This comes from a lot of different
things but can be found in an article, <a href="https://dl.acm.org/citation.cfm?id=1378724">Technical perspective: Transactions are
tomorrow&rsquo;s loads and stores</a>,
in Communications of the ACM (CACM). It seems at the time, this craze was
started out of academia. Some practitioners, Bryan Cantrill and Jeff
Bonwick, wrote rebuttals in the name of &ldquo;please dear god do not make
transactional memory A Thing&rdquo;.
That can be seen in Bryan&rsquo;s blog post,
<a href="http://dtrace.org/blogs/bmc/2008/11/03/concurrencys-shysters/">Concurrency’s Shysters</a>,
and the follow-up ACM Queue article, <a href="https://queue.acm.org/detail.cfm?id=1454462">Real-world Concurrency</a>.</p>
<p>Clearly, in 2008 there was a division between academia and
practitioners.</p>
<h2 id="fastforward-to-2012">Fast-forward to 2012</h2>
<p>Intel announced TSX in <a href="https://software.intel.com/en-us/blogs/2012/02/07/transactional-synchronization-in-haswell">February 2012</a>, and it shipped with the Haswell processors in 2013.</p>
<p><strong>EDIT:</strong> It was pointed out that <a href="https://hydraconf.com/2019/talks/2jix5mst7iduyp9linqhfj/">Azul shipped transactional memory in 2006</a>. Thanks <a href="https://twitter.com/davidcrawshaw/status/1161827880608735232">@davidcrawshaw</a>!</p>
<h2 id="why-is-this-interesting">Why is this interesting?</h2>
<p>Hype cycles come and go and if you spend any time in our industry you tend to
become pretty numb to them. Seeing through the hype has always been a joy of
mine and I find it interesting how the vectors through which hype travels have
changed drastically over time.</p>
<p>With transactional memory, the hype began in academia through academic
conferences and articles in journals. Before the 2000s even, hype might have
spread through magazines like Byte. Today, we have multiple channels for hype
through social networks: Twitter, Reddit, blogging, YouTube, GitHub,
Hacker News (Slashdot before
that), and others.</p>
<p>Hype seems to travel through the unconscious need of people to connect to
others. Being a part of movements, like open source projects and a shared sense
of need, allows people to be a part of something bigger than just themselves.</p>
<p>Twitter is fascinating due to the way it hosts so many subcultures. One of my
favorite examples of this is Canadian Twitter where everyone is polite and nice
to each other. There are also vehement subcultures around the latest technology
trends. The way technology can spread has turned from a place where very few
people have a voice (through getting papers accepted at conferences and in
journals) to social networks where everyone has a voice. My hope is that the
loudest of the voices are the ones used to build technology for the best
causes.</p>
<p>I&rsquo;ll leave you with that, hope you enjoyed and learned something from
my rather weird example of a technology hype wave.</p>
The Business Executive's Guide to Kubernetes
https://blog.jessfraz.com/post/the-business-executives-guide-to-kubernetes/
Tue, 23 Jul 2019 08:09:26 -0700
<p>Hello!</p>
<p>I thought it would be fun to write a post aimed towards business leaders making technology decisions for their
organizations. There is a lot of hype in our field and little truth behind the hype.</p>
<p>Like most things I write about, this started from an idea I had on Twitter:</p>
<p><blockquote class="twitter-tweet"><p lang="en" dir="ltr">has anyone ever done technical breakdowns of these products in Gartner reports that are actually just trash, is this something you&#39;d read..?</p>&mdash; jessie frazelle 👩🏼‍🚀 (@jessfraz) <a href="https://twitter.com/jessfraz/status/1153866738452221952?ref_src=twsrc%5Etfw">July 24, 2019</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>This post will cover some hard truths of Kubernetes and what it means for your organization and business.
You might have heard the term &ldquo;Kubernetes&rdquo; and you might have been led to believe that this will solve all the
infrastructure pain for your organization. There is some truth to that, which will not be the focus of this post. To get
to the state of enlightenment with Kubernetes, you need to first go through some hard challenges.
Let&rsquo;s dive in to some of these hard truths.</p>
<h2 id="stateful-data-is-hard">Stateful Data is Hard</h2>
<p>Kubernetes is not to be used for stateful data. There has been a lot of work done in this area
but it is still not sufficient. For the more technical members of our audience I direct you to
<a href="https://github.com/kubernetes/kubernetes/issues/67250">exhibit A</a>. The linked issue goes over
problems when a &ldquo;StatefulSet&rdquo; hits an error while deploying or upgrading. This can lead to data
loss or corruption since Kubernetes will need manual intervention
to fix the state of the deployment. This could even lead to the point where the only recommended fix is you <em>delete the state</em>.
What does this mean for your business? Well, if you lose or corrupt your data it could mean a lot of different things depending
on what the data was. If the data was your customer database of new account signups, well you might have just lost the data for
your new customers. If you are an ecommerce site, it might have been your latest sale. If you are in banking or investments,
it might have been data accounting for the movement of capital.</p>
<p>Databases holding valuable information like the examples above should always have mechanisms for replication which is not
something Kubernetes is going to solve for you. While you might choose to use Kubernetes for stateful data, you should always remember
to handle replicating that data in case there is a failure.</p>
<h2 id="exposed-dashboards">Exposed Dashboards</h2>
<p>A lot of organizations are dipping their toes into Kubernetes but forgetting to disable or secure the dashboard for the control plane
from the rest of the internet. The control plane dashboard is a website you can navigate to that controls your cluster.
Leaving the dashboard exposed to the public can have huge implications on your business. If your dashboard is exposed, <em>anyone</em>
could find your dashboard and then control it. Finding an exposed dashboard is not that difficult if you know what you are looking for
and have access to a site like <a href="https://www.shodan.io/">shodan</a>.</p>
<p>What would the finder of the dashboard control? Everything running in Kubernetes. If your website is running in Kubernetes, it
means someone else could make your website go offline, someone else could replicate your website but send all sales and monetary
transactions to their own bank account, someone else can breach your customers&rsquo; data, or someone else could hold your
infrastructure up for ransom and not give you back control of your website
unless you pay what they demand. These are just a few things I thought of off the top of my head but you could probably think of more.</p>
<p>There is a whole other aspect of this in that if this breach goes public, then you have a huge public relations
problem on your hands. Which for a public company might even have implications on your stock price
if shareholders end up losing trust from the news of your company&rsquo;s technical incompetence and they decide to sell their shares.</p>
<p>If it&rsquo;s not the dashboard being exposed it might be your API server or another service. There are a few
options for this particular failure mode.</p>
<h2 id="upgrading-your-kubernetes-version-seems-to-always-break-something">Upgrading your Kubernetes version seems to always break something</h2>
<p>I&rsquo;ve heard from a bunch of people that whenever they need to upgrade their production environment of Kubernetes it always leads to something breaking.
It&rsquo;s recommended that you have <a href="https://twitter.com/kelseyhightower/status/1138586423978672129">more than one cluster in production</a> for this very reason.
Then, if one cluster in production is broken from being upgraded, the other cluster that has not been upgraded is still running the technical parts of
your business. This is very good from a reliability point of view.
It means reaching your website has a &ldquo;plan B&rdquo; where if the &ldquo;plan A&rdquo; infrastructure has a problem, everything
will be redirected to &ldquo;plan B&rdquo; and your customers will not even know the difference. As a downside, your operations teams
now have to figure out ways for managing and maintaining two clusters (more work for them) but your business is
in a better place for it.</p>
<p>The other option is you just don&rsquo;t upgrade. However, if you don&rsquo;t upgrade, your infrastructure might be vulnerable to
security threats and then we are back in the situation above where you might have data breached by hackers, a hostile takeover of your
website, and then a huge public relations scandal leading to investors and shareholders selling their stock.</p>
<h2 id="steep-learning-curve-complexity-is-king-and-operational-pain">Steep learning curve, complexity is king, and operational pain</h2>
<p>A lot of the criticism I hear about Kubernetes is how complex it is. For your organization, this means
your staff are going to have to surmount this very steep learning curve. As with learning anything, things only
get worse before they get better. So get ready for a lot of production outages and failovers as your team starts to
learn the ins and outs of this overly complex system. What does this mean for your website and customers? Availability will
be spotty for a while but we hope <em>eventually</em> it will even out. Lastly, to quote someone very wise (send a pull request if you know who!), &ldquo;Hope is not a strategy.&rdquo;</p>
<h2 id="managed-kubernetes">Managed Kubernetes</h2>
<p>Now you are probably thinking, &ldquo;my cloud provider said they&rsquo;d take away all the pain you just described by selling
me their managed Kubernetes.&rdquo; That is indeed the dream. However, it is not reality. Having worked for some cloud providers,
I have seen the pain customers still go through trying to learn the patterns Kubernetes implements and applying
those patterns to their existing applications. This means your teams will still have to handle the steep learning curve. Just
because it&rsquo;s managed does not mean that your application&rsquo;s uptime and availability are covered. That is still on <em>your</em> team.
Customers being able to use your website on the internet is your team&rsquo;s responsibility and understanding
Kubernetes is still required for that. Every line of YAML written and debugged to get your website running is time
taken away from building on what your business actually does. Unless of course you are in the business
of selling Kubernetes, in which case, carry on.</p>
<p>You will also want to be sure your cloud provider did not fall prey to the pitfalls I outlined above as well.
You should make sure your cluster is fully isolated from other customers&rsquo; clusters. The way the managed Kubernetes offerings
work is by the cloud provider managing the &ldquo;master&rdquo; for your cluster. This means all the data for your cluster is managed by
your cloud provider. If your data is not properly isolated from all the other customers&rsquo; data, it means that
if the cloud provider gets breached by means of a different customer&rsquo;s cluster then your data has been breached as well.
Then, we are in the scenario where a hacker owns your website, can hold it for ransom, or cause a very public incident
for your company that you will need to handle.</p>
<p>This was just a brief overview and I am not trying to throw shade. I merely wanted to phrase some of these prevalent problems
in a way that people running a business might be more aware of the impact adopting this technology might have. It should not be understated: if your organization does tackle these difficulties (and others I didn&rsquo;t mention), then you will possibly see
great impact on developer productivity, faster feature releases and deployments (among all the other wins Kubernetes can provide).
Just be aware that with the good comes some bad.</p>
Linux Observability with BPF
https://blog.jessfraz.com/post/linux-observability-with-bpf/
Wed, 10 Jul 2019 11:25:24 -0400
<p>Below is the foreword for the new book on
<a href="http://shop.oreilly.com/product/0636920242581.do">Linux Observability with BPF</a>
by two of my favorite programmers,
<a href="https://twitter.com/calavera">David Calavera</a> and <a href="https://twitter.com/fntlnz">Lorenzo Fontana</a>!
I was pretty stoked about getting to write the foreword, so I asked
O&rsquo;Reilly if I could publish it on my blog as well and they said yes. I hope you all check out this
book and share what you&rsquo;ve built after!</p>
<p>As a programmer (and a self-confessed dweeb) I like to stay up to date on the latest additions
to various kernels and research in computing. When I first played around with Berkeley Packet
Filters (BPF) and Express Data Path (XDP) in Linux I was in love. This is such a NICE THING
and I am glad this book is putting BPF and XDP on the center stage so more people can start
using them in their projects.</p>
<p>Let me go into detail about my background and why I fell in love with these kernel interfaces&hellip;
I worked as a Docker core maintainer, along with David (one of the brilliant authors of this book).
Docker, if you are not familiar, shells out to iptables for a lot of the filtering and routing logic for containers.
The first patch I ever made to Docker was fixing a problem where a version of iptables on CentOS didn’t have the same
command-line flags so writing to iptables was failing. There were a lot of weird issues like this and anyone
who has ever shelled out to a tool in their software can likely commiserate. Not only that, having
thousands of rules on a host is not what iptables was built for and has performance side effects because of it.</p>
<p>Then I heard about BPF and XDP. This was like music to my ears.
No longer would my scars from iptables bleed with another bug! The kernel community
is even working on
<a href="https://cilium.io/blog/2018/04/17/why-is-the-kernel-community-replacing-iptables/">replacing iptables with BPF</a>!
Hallelujah! <a href="https://cilium.io/">Cilium</a>, a container networking project,
is using BPF and XDP for its internals as well.</p>
<p>But that’s not all! BPF can do so much more than just fulfilling the iptables use case.
With BPF, you can trace any syscall or kernel function as well as any user-space program.
<a href="https://github.com/iovisor/bpftrace">bpftrace</a> gives users DTrace-like abilities in Linux from their command line.
You can trace all the files that are being opened and the process calling the open,
count the syscalls by the program calling them, trace the OOM killer, and more… the world is your oyster!
XDP and BPF are also used in <a href="https://blog.cloudflare.com/l4drop-xdp-ebpf-based-ddos-mitigations/">Cloudflare</a> and
<a href="https://cilium.io/blog/2018/11/20/fb-bpf-firewall/">Facebook’s</a> load balancer to prevent DDoS attacks. I won’t spoil why
XDP is so great at dropping packets because you will learn about that in the XDP and networking chapters of this book
(<em>cough</em> you don&rsquo;t even allocate a kernel struct <em>cough</em>)!</p>
<p>I have had the privilege of getting to know Lorenzo, another of the authors, through the
Kubernetes community. His tool, <a href="https://github.com/iovisor/kubectl-trace">kubectl-trace</a>, allows users to run their custom tracing programs
easily inside their Kubernetes clusters.</p>
<p>Personally, my favorite use case for BPF has been writing custom tracers to prove to other
folks that the performance of their software was not up to par or that it was making a really expensive
number of syscalls. Never underestimate the power of proving someone wrong with hard data.
Don’t fret, this book will walk you through writing your first tracing program so you can do the same ;).
The beauty of BPF lies in the fact that, before now, other tools used lossy queues to send sample sets to user
space for aggregation, whereas BPF is great for production since it allows for constructing histograms and filtering
right at the source of events.</p>
<p>I have spent half of my career working on tools for developers. The best tools allow autonomy in their interfaces
for developers like you to use them for things even the authors never imagined. To quote Richard Feynman,
“I learned very early the difference between knowing the name of something and knowing something.”
Until now you might have only known the name BPF and that it might be useful to you. What I love about this book is
that it gives you the knowledge you need to be able to create all new tools using BPF.</p>
<p>The best books don’t confine readers into a box and that is why I love this one in particular.
After reading and following the exercises, you will be empowered to use BPF like a superpower.
You can keep it in your toolkit to use on demand when it’s most needed and most useful.
You won’t just learn BPF, you will understand it. This book is a path to open your mind
to the possibilities of what you can build with BPF.</p>
<p>This developing ecosystem is very exciting! I hope it will grow even larger
as more people start wielding BPF&rsquo;s power. I am excited to learn about what the readers of
this book end up building, whether it&rsquo;s a script to track down a crazy software bug or a
custom firewall or even <a href="https://lwn.net/Articles/759188/">infrared decoding</a>! Be sure to let us all know what you built!</p>
Corollary to the Hard Thing about Hard Thingshttps://blog.jessfraz.com/post/corollary-to-the-hard-thing-about-hard-things/
Wed, 15 May 2019 08:09:26 -0700https://blog.jessfraz.com/post/corollary-to-the-hard-thing-about-hard-things/<blockquote>
<p>&ldquo;Can I get an encore, do you want more&rdquo; - Jay-Z</p>
</blockquote>
<p>I recently read Ben Horowitz’s book, <a href="https://www.amazon.com/Hard-Thing-About-Things-Building-ebook/dp/B00DQ845EA/ref=sr_1_1">The Hard Thing about Hard Things</a>. It’s really eye-opening and creates
a level of empathy in the reader for leaders that make hard decisions every day. It covers everything from how to know
your company is toxic to how to do layoffs. Ben starts each chapter with a rap quote, so I did the same above ;) Obviously I chose Jay-Z, but I also love <a href="https://blog.jessfraz.com/post/what-would-2pac-do/">Tupac, as is shown by my first blog post ever</a>.</p>
<p>I have a corollary to this: power dynamics. I, personally, have seen and experienced what it is like being a
leader when no one really has a full view of who you are as a person. I try to always be authentic and personable,
but the fact of the matter is: we are all humans and we all have off days.</p>
<p>Most people only get a view of who I am through Twitter, but that is not fully who I am. I think that is the case
for most people on that website. For executives of companies or leaders of large teams, the same holds true: you only see a small subset,
through very limited communication, of who they really are.</p>
<p>At work, I like to move fast and get things done. This may result in abrupt communication, which is not typical
of how I am on the internet. Even more so, if I were to give feedback or an opinion on something, someone might
feel it with the heat of a thousand suns and think it is aggressive, even if that is not how I intended it.
The best we can do is apologize and grow when we fuck up.</p>
<p>Another example would be if someone in a position of power asks someone to do something.
The person without the power might think they have to do it a certain way and can&rsquo;t push back.
We can try to solve this by always making an effort to ask for others&rsquo; opinions and feedback.</p>
<p>I really do not enjoy when people hero worship me and I do not think people should hero worship anyone.
We are all humans and we are all flawed in our own ways. Anyone who believes someone to be perfect will
soon find that they are not. This holds true for anyone: executives of companies, senior engineers,
tennis champions, and Hollywood stars.</p>
<p>Leave room for people to make mistakes, because they will. What
truly matters is how a person grows after making a mistake. It helps to make it very clear that you will make mistakes
and welcome feedback. When someone discovers a mistake you&rsquo;ve made, try to treat it as a gift. Allow for
failure and growth from failure in others and they will do the same for you as well.</p>
<p>If you are a leader and you empathize with this, I think this problem can also be solved with time.
You need time for people to understand how you work and time to grow trust. As long as you continue
to be transparent about mistakes over time and grow from them, trust will follow.</p>
<p>It’s hard to see a power dynamic at
play if you are in it and hold the power. Power dynamics are in the eye of the beholder. We can all
try to be conscious of this and patient as the vines of trust grow around us.</p>
Why open source firmware is important for securityhttps://blog.jessfraz.com/post/why-open-source-firmware-is-important-for-security/
Wed, 08 May 2019 08:09:26 -0700https://blog.jessfraz.com/post/why-open-source-firmware-is-important-for-security/
<p>I gave a talk recently at GoTo Chicago on <a href="https://docs.google.com/presentation/d/1Qees556dT9LNoooEdf6En8V82L3V-_N8LbPuyGihZeI/edit?usp=sharing">Why open source firmware is important</a> and I thought it would be nice to also write a blog post with my findings. This post will focus on why open source firmware is important for security.</p>
<h2 id="privilege-levels">Privilege Levels</h2>
<p>In your typical “stack” today you have the various levels of privileges.</p>
<ul>
<li><strong>Ring 3 - Userspace:</strong> Has the fewest privileges, short of there being a sandbox in userspace that is restricted further.</li>
<li><strong>Ring 0 - Kernel:</strong> The operating system kernel. For open source operating systems you get visibility into the code behind this.</li>
<li><strong>Ring -1 - Hypervisor:</strong> The virtual machine monitor (VMM) that creates and runs virtual machines. For open source hypervisors like Xen, KVM, bhyve, etc., you have visibility into the code behind this.</li>
<li><strong>Ring -2 -</strong> <strong>System Management Mode (SMM), UEFI kernel:</strong> Proprietary code, more on this <a href="#ring-2-smm-uefi-kernel">below</a>.</li>
<li><strong>Ring -3 - Management Engine:</strong> Proprietary code, more on this <a href="#ring-3-management-engine">below</a>.</li>
</ul>
<p>The negative rings were made up because there was no other way to express something with more privileges.</p>
<p>From the above, it’s pretty clear that for Rings -1 to 3, we have the option to use open source software and have a large amount of visibility and control over the software we run. For the privilege levels under Ring -1, we have less control but it is getting better with the open source firmware community and projects.</p>
<p><strong>It’s counter-intuitive that the code that we have the least visibility into has the most privileges. This is what open source firmware is aiming to fix.</strong></p>
<h3 id="ring-2-smm-uefi-kernel">Ring -2: SMM, UEFI kernel</h3>
<p>This ring controls all CPU resources.</p>
<p><strong>System management mode (SMM)</strong> is invisible to the rest of the stack on top of it. It has half a kernel. It was originally used for power management and system hardware control. It holds a lot of the proprietary code and is a place for vendors to add new proprietary features. It handles system events like memory or chipset errors as well as a bunch of other logic.</p>
<p>The <strong>UEFI Kernel</strong> is extremely complex. It has millions of lines of code. UEFI applications are active after boot. It was built with security through obscurity. The <a href="https://uefi.org/specifications">specification</a> is absolutely insane if you want to dig in.</p>
<h3 id="ring-3-management-engine">Ring -3: Management Engine</h3>
<p>This is the most privileged ring. In the case of Intel (x86) this is the Intel Management Engine. It can turn on nodes and re-image disks invisibly. It has a kernel that runs <a href="https://itsfoss.com/fact-intel-minix-case/">Minix 3</a> as well as a web server and entire networking stack. It turns out Minix is the most widely used operating system because of this. There is a lot of functionality in the Management Engine, it would probably take me all day to list it off but there are <a href="https://www.intel.com/content/www/us/en/support/articles/000008927/software/chipset-software.html">many</a> <a href="https://files.bitkeks.eu/docs/intelme-report.pdf">resources</a> for digging into more detail, should you want to.</p>
<p>Between Ring -2 and Ring -3 we have at least 2 and a half other kernels in our stack as well as a bunch of proprietary and unnecessary complexity. Each of these kernels has its own networking stack and web server. The code can also modify itself and persist across power cycles and re-installs. <strong>We have very little visibility into what the code in these rings is actually doing, which is horrifying considering these rings have the most privileges.</strong></p>
<h3 id="they-all-have-exploits">They all have exploits</h3>
<p>It should be of no surprise to anyone that Rings -2 and -3 have their fair share of vulnerabilities. They are horrifying when they happen though. Just to use one as an example although I will let you find others on your own, <a href="https://www.wired.com/2017/05/hack-brief-intel-fixes-critical-bug-lingered-7-dang-years/">there was a bug in the web server of the Intel Management Engine that was there for seven years</a> without them realizing.</p>
<h2 id="how-can-we-make-it-better">How can we make it better?</h2>
<h3 id="nerf-non-extensible-reduced-firmware">NERF: Non-Extensible Reduced Firmware</h3>
<p>NERF is what the open source firmware community is working towards. The goals are to make firmware less capable of doing harm and make its actions more visible. They aim to remove all runtime components. With the Intel Management Engine they currently cannot remove everything, but they can take away the web server and IP stack. They also remove the UEFI IP stack and other drivers, as well as the Intel Management Engine/UEFI self-reflash capability.</p>
<h3 id="me-cleaner">me_cleaner</h3>
<p>This is the project used to clean the Intel Management Engine to the smallest necessary capabilities. You can check it out on GitHub: <a href="https://github.com/corna/me_cleaner">github.com/corna/me_cleaner</a>.</p>
<h3 id="u-boot-and-coreboot">u-boot and coreboot</h3>
<p><a href="https://www.chromium.org/developers/u-boot">u-boot</a> and <a href="https://www.coreboot.org/">coreboot</a> are open source firmware. They handle silicon and DRAM initialization. Chromebooks use both, coreboot on x86, and u-boot for the rest. This is one part of how they <a href="https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42038.pdf">verify boot</a>.</p>
<p>Coreboot’s design philosophy is to <a href="https://doc.coreboot.org/">“do the bare minimum necessary to ensure that hardware is usable and then pass control to a different program called the</a> <a href="https://doc.coreboot.org/"><em>payload</em></a><a href="https://doc.coreboot.org/">.”</a> The payload in this case is linuxboot.</p>
<h3 id="linuxboot">linuxboot</h3>
<p><a href="https://www.linuxboot.org/">Linuxboot</a> handles device drivers, network stack, and gives the user a multi-user, multi-tasking environment. It is built with Linux so that a single kernel can work for several boards. Linux is already quite vetted and has a lot of eyes on it since it is used quite extensively. Better to use an open kernel with a lot of eyes on it than the 2½ other kernels that were all different and closed off. This means that we are lessening the attack surface by using fewer variations of code and we are making an effort to rely on code that is open source. Linux improves boot reliability by replacing lightly-tested firmware drivers with hardened Linux drivers.</p>
<p>By using a kernel we already have tooling around, firmware devs can build with tools they already know. When they need to write logic for signature verification, disk decryption, etc., it’s in a language that is modern, easily auditable, maintainable, and readable.</p>
<h3 id="u-root">u-root</h3>
<p><a href="https://github.com/u-root/u-root">u-root</a> is a set of Go userspace tools and a bootloader. It is then used as the initramfs for the Linux kernel from linuxboot.</p>
<p>Through using the NERF stack they saw boot times were 20x faster. But this blog post is on security so let’s get back to that….</p>
<p>The NERF stack helps improve the visibility into a lot of the components that were previously very proprietary. There is still a lot of other firmware on devices.</p>
<h2 id="what-about-all-the-other-firmware">What about all the other firmware?</h2>
<p>We need open source firmware for the network interface controller (NIC), solid state drives (SSD), and baseboard management controller (BMC).</p>
<p>For the NIC, there is some work being done in the open compute project on <a href="https://www.opencompute.org/documents/ocp-nic-3-0-draft-0v85b-20181213b-tn-temp-no-cb-pdf">NIC 3.0</a>. It should be interesting to see where that goes.</p>
<p>For the BMC, there is both <a href="https://github.com/openbmc/openbmc">OpenBMC</a> and <a href="https://github.com/u-root/u-bmc">u-bmc</a>. I had written a little about them in <a href="https://blog.jessfraz.com/post/the-firmware-rabbit-hole/">a previous blog post</a>.</p>
<p>We need all of the firmware to be open source, both to have full visibility into the stack and to actually verify the state of software on a machine.</p>
<h2 id="roots-of-trust">Roots of Trust</h2>
<p>The goal of the root of trust should be to verify that the software installed in every component of the hardware is the software that was intended. This way you can know without a doubt and verify if hardware has been hacked. Since we have very little to no visibility into the code running in a lot of places in our hardware, it is hard to do this. How do we really know that the firmware in a component is not vulnerable or that it doesn’t have any backdoors? Well, we can’t. Not unless it was all open source.</p>
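<p>As a rough illustration of what "verify that the software installed is the software that was intended" means at its simplest, here is a minimal Node.js sketch that measures an image with SHA-256 and compares it to an expected digest. This is not how Cerberus, Titan, Nitro, or any real root of trust is implemented. The image bytes and function names are made up, and the sketch assumes you already trust both the code doing the measuring and the way the image was read, which is exactly the hard part.</p>

```javascript
const crypto = require("node:crypto");

// Compute the SHA-256 digest of a firmware image as a hex string.
function sha256Hex(image) {
  return crypto.createHash("sha256").update(image).digest("hex");
}

// Return true only when the measured digest matches the expected one.
function matchesExpected(image, expectedHex) {
  const measured = Buffer.from(sha256Hex(image), "hex");
  const expected = Buffer.from(expectedHex, "hex");
  // timingSafeEqual throws on a length mismatch, so check lengths first.
  return (
    measured.length === expected.length &&
    crypto.timingSafeEqual(measured, expected)
  );
}

// Stand-in bytes. A real check would read the image out of the component.
const knownGoodImage = Buffer.from("firmware build 1.0");
const tamperedImage = Buffer.from("firmware build 1.0 + implant");

// With open source firmware you could build the image yourself and derive
// this digest from your own build instead of taking a vendor's word for it.
const expectedDigest = sha256Hex(knownGoodImage);

console.log(matchesExpected(knownGoodImage, expectedDigest));
console.log(matchesExpected(tamperedImage, expectedDigest));
```

<p>The comparison itself is trivial. What open source firmware adds is the ability to produce the expected digest from code you can read and build.</p>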
<p>Every cloud and vendor seems to have their own way of doing a root of trust. Microsoft has <a href="https://github.com/opencomputeproject/Project_Olympus/tree/master/Project_Cerberus">Cerberus</a>, Google has <a href="https://cloud.google.com/blog/products/gcp/titan-in-depth-security-in-plaintext">Titan</a>, and Amazon has <a href="https://perspectives.mvdirona.com/2019/02/aws-nitro-system/">Nitro</a>. These seem to assume an explicit amount of trust in the proprietary code (the code we cannot see). This leaves me with not a great feeling. <strong>Wouldn’t it be better to be able to use all open source code? Then we could verify without a doubt that the code you can read and build yourself is the same code running on hardware for all the various places we have firmware. We could then verify that a machine was in a correct state without a doubt of it being vulnerable or with a backdoor.</strong></p>
<p>It makes me wonder what the smaller cloud providers like DigitalOcean or Packet have for a root of trust. Oftentimes we only hear of these projects from the big three or five. I asked this on Twitter and didn&rsquo;t get any good answers&hellip;</p>
<p><blockquote class="twitter-tweet"><p lang="en" dir="ltr">I’m surprised how many people are responding that they love DigitalOcean but seem entirely unconcerned there’s no answer here. You should be concerned.</p>&mdash; jessie frazelle 👩🏼‍🚀 (@jessfraz) <a href="https://twitter.com/jessfraz/status/1126131424095100929?ref_src=twsrc%5Etfw">May 8, 2019</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>There is a great talk by <a href="https://twitter.com/PaulM">Paul McMillan</a> and Matt
King on <a href="https://www.youtube.com/watch?v=PEVVRkd-wPM">Securing Hardware at Scale</a>. It covers in great detail
how to secure bare metal while also giving customers access to the bare
metal. When they get back the hardware from customers they need to ensure with
consistency and reliability that there is nothing from the customer hiding in
any component of the hardware.</p>
<p>All clouds need to ensure that the
hardware they are running has not been compromised after a customer has run
compute on it.</p>
<h2 id="platform-firmware-resiliency">Platform Firmware Resiliency</h2>
<p>As far as chip vendors go, they seem to have a different offering. Intel has <a href="https://www.intel.com/content/dam/www/public/us/en/documents/solution-briefs/firmware-resilience-blocks-solution-brief.pdf">Platform Firmware Resilience</a> and Lattice has <a href="http://www.latticesemi.com/en/Solutions/Solutions/SolutionsDetails02/PFR">Platform Firmware Resiliency</a>. These seem to be more focused on the NIST guidelines for <a href="https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-193.pdf">Platform Firmware Resiliency</a>.</p>
<p>I tried to ask the internet who was using this and heard very little back, so if you are using Platform Firmware Resiliency, please let me know!</p>
<p><blockquote class="twitter-tweet"><p lang="en" dir="ltr">It seems that Intel has some effort called Platform Firmware Resiliency (anyone using this one?!) <a href="https://t.co/fQq2gdLNOm">https://t.co/fQq2gdLNOm</a></p>&mdash; jessie frazelle 👩🏼‍🚀 (@jessfraz) <a href="https://twitter.com/jessfraz/status/1126121264819712000?ref_src=twsrc%5Etfw">May 8, 2019</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>From the <a href="https://www.opencompute.org/files/Intel-System-Firmware-InnovationsMohanKumar-OCP18.pdf">OCP talk on Intel&rsquo;s firmware innovations</a>, it seems Intel&rsquo;s Platform Firmware Resilience (PFR) and Cerberus
go hand in hand. Intel is using PFR to deliver Cerberus&rsquo; attestation principles.
Thanks <a href="https://twitter.com/_msw_">@msw</a> for the clarification.</p>
<p>It would be
nice if there were not so many tools to do this job. I also wish the code was
open source so we could verify for ourselves.</p>
<h2 id="how-to-help">How to help</h2>
<p>I hope this gave you some insight into what’s being built with open source firmware and how making firmware open source is important! If you would like to help with this effort, please help spread the word. Please try and use platforms that value open source firmware components. Chromebooks are a great example of this, as well as <a href="https://puri.sm/">Purism</a> computers. You can ask your providers what they are doing for open source firmware or ensuring hardware security with roots of trust. Happy nerding! :)</p>
<p>Huge thanks to the open source firmware community for helping me along this
journey! Shout out to Ron Minnich, <a href="https://twitter.com/qrs">Trammell Hudson</a>, <a href="https://twitter.com/hugelgupf">Chris Koch</a>,
<a href="https://twitter.com/kc8apf">Rick Altherr</a>, and
<a href="https://twitter.com/_zaolin_">Zaolin</a>. And shout out to <a href="https://twitter.com/bridgetkromhout">Bridget Kromhout</a> for always
finding time to review my posts!</p>
Challenge Accepted: Transposithttps://blog.jessfraz.com/post/challenge-accepted-transposit/
Tue, 23 Apr 2019 00:09:26 -0700https://blog.jessfraz.com/post/challenge-accepted-transposit/
<p>Last week, I had the pleasure of meeting with the <a href="https://www.transposit.com/">Transposit</a>
team in San Francisco. Tech is a super small world and it turns out the two
founders and I are separated by one degree through several different people
we know. In meeting them I closed many loops without even realizing it, but
I digress&hellip;</p>
<p>Their product is really cool: it exposes a SQL interface for interacting with
numerous APIs at once. For someone like myself who deploys a lot of bots, this
is great. Usually when I have a complex bot I end up writing a lot of
&ldquo;glue code&rdquo; to combine a few different APIs and get the information I want.
Most of my bots have some sort of pagination logic and all have the <code>N+1</code> problem where
I don&rsquo;t really optimize my queries or use anything fancy like GraphQL. Many
APIs don&rsquo;t even have GraphQL interfaces but also I am old school and I don&rsquo;t
really want to learn something new. This is why I was super intrigued by
Transposit&rsquo;s SQL interface, because hey, I know SQL!</p>
<p>Adam, the CEO, challenged me to try it out, give them feedback, and see if
I could break it with something complex. I am not one to back down from
a challenge and I have some super weird ass bots, so I decided to start with
the weirdest.</p>
<h2 id="gitable">Gitable</h2>
<p><a href="https://github.com/jessfraz/gitable">Gitable</a> is a bot I made for sending all
my open issues and PRs on GitHub to a table in <a href="https://airtable.com/">Airtable</a>.
I fucking love Airtable. Its design just feels right and works the way my
brain works.</p>
<p>I set out to make this bot work in Transposit because I know it has some
super weird loops and has the <code>N+1</code> problem where I loop over all my repos,
then make another API call after.</p>
<p>To reiterate, the goal of the bot is to iterate through all my repos on GitHub
and sync the list of issues and PRs with a table in Airtable.</p>
<h3 id="query-all-the-user-s-repos">Query all the user&rsquo;s repos</h3>
<p>First, I need to get all my repos that are not forks. So I need
a SQL query for this, in Transposit it looks like this:</p>
<pre><code class="language-sql">SELECT name, full_name FROM github.list_repos_for_user
WHERE username=@owner
AND type='owner'
AND fork=false
</code></pre>
<p>The <code>github.list_repos_for_user</code> table is built in to Transposit and they
handle all your API keys and authorizations when you choose &ldquo;Github&rdquo; as a data
connection in the UI. It also caches the response which is a huge win because
I am the queen of being rate limited.</p>
<p>I named that query: <code>list_repos_for_user</code> so when I want to use it elsewhere in
another query, I can call it by <code>this.list_repos_for_user</code>.</p>
<h3 id="query-all-the-issues-in-all-the-user-s-repos">Query all the issues in all the user&rsquo;s repos</h3>
<p>To get all the issues in all my repos I can use a join on that table I just
created. It ends up looking like this:</p>
<pre><code class="language-sql">SELECT
A.created_at AS created,
A.updated_at AS updated,
B.full_name, A.number,
A.html_url AS url,
A.state, A.title,
A.user.login AS author,
A.labels, B.name,
A.closed_at AS completed,
A.comments
FROM github.list_issues_for_repo
AS A
JOIN this.list_repos_for_user
AS B
ON A.repo = B.name
WHERE A.owner=@owner
AND B.owner=@owner
</code></pre>
<p>Okay so I didn&rsquo;t break anything yet and I just joined my table with all my
repos, <code>this.list_repos_for_user</code>, with the built-in table in Transposit
<code>github.list_issues_for_repo</code>. This has now replaced my <code>N+1</code> code with just this
one SQL query and Transposit does all the optimizations on their end.</p>
<p>I called this table <code>list_issues_for_user</code> and <code>@owner</code> is a parameter, so
anyone else can fork this app and change it to their own username.</p>
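<p>For contrast, this is roughly the shape of the hand-written <code>N+1</code> glue code that the join replaces. It is only a sketch: the in-memory data, the call counter, and the function names are made up for illustration, and it is neither the actual Gitable code nor anything from Transposit.</p>

```javascript
// Hypothetical in-memory stand-ins for the two API "tables".
const repos = [
  { name: ".vim", full_name: "jessfraz/.vim" },
  { name: "gitable", full_name: "jessfraz/gitable" },
];
const issuesByRepo = {
  ".vim": [{ number: 1, title: "Example issue A" }],
  gitable: [
    { number: 7, title: "Example issue B" },
    { number: 9, title: "Example issue C" },
  ],
};

// Counts how many pretend API requests were made.
let apiCalls = 0;

// Stand-in for one "list issues for repo" request.
function listIssuesForRepo(repoName) {
  apiCalls++;
  return issuesByRepo[repoName] || [];
}

// The N+1 shape: one call to list the repos, then one more call per repo.
function listIssuesForUser() {
  apiCalls++; // the single "list repos" call
  const rows = [];
  for (const repo of repos) {
    for (const issue of listIssuesForRepo(repo.name)) {
      rows.push({ full_name: repo.full_name, name: repo.name, ...issue });
    }
  }
  return rows;
}

const rows = listIssuesForUser();
console.log(rows.length, apiCalls);
```

<p>The number of calls grows with the number of repos, and in real glue code each of those calls also needs its own pagination and rate-limit handling. Expressing it as a join hands that bookkeeping to whatever runs the query.</p>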
<h3 id="query-all-the-records-in-an-airtable-table">Query all the records in an Airtable table</h3>
<p>Now I need to get all the existing Airtable records in my table so I can know
later on down the road if I need to create a row or update a row with the new
information from the GitHub API.</p>
<p>In my Airtable table I have a column called &ldquo;reference&rdquo; which stores information
about the issue or PR as <code>owner/repo#num</code> so for example it looks like
<code>jessfraz/.vim#1</code>. This is a column defined by me, but I also know it to be
unique. So I want to get the reference of every row and its Airtable record
ID so I can use that to update the record.</p>
<pre><code class="language-sql">SELECT id, fields.Reference as reference FROM airtable.get_records
WHERE baseId=@baseID
AND table=@table
</code></pre>
<p>That winds up looking like the query above. <code>@baseID</code> and <code>@table</code> are
parameters so anyone can replace those with their own for their table in
Airtable.</p>
<p>I named this query <code>get_airtable_records</code> so when I call it later I can do so
with <code>this.get_airtable_records</code>.</p>
<h3 id="update-and-create-rows-in-airtable-for-each-of-the-issues-in-user-s-repos">Update and create rows in Airtable for each of the issues in user&rsquo;s repos</h3>
<p>Okay so now&rsquo;s the part where I am thinking&hellip; I&rsquo;m going to break this thing.
(Narrator: I didn&rsquo;t.)</p>
<p>Transposit has both SQL and Javascript operations and since the next part was
where a lot of the logic was I used Javascript. I haven&rsquo;t written Javascript in
a long time so mind my shitty code. Honestly, SQL is Turing complete so
I considered using SQL but I wanted to get this done in an hour. (I will leave
it as an exercise for the reader to fork my app and make it all in SQL.)</p>
<p>What I needed to do was take our earlier table, <code>list_issues_for_user</code>,
iterate over them, and update or create an Airtable record for each of them.
This ends up looking like the following:</p>
<pre><code class="language-js">function run(params) {
  var results = api.run(&quot;this.list_issues_for_user&quot;, {owner: params.owner});
  for (var i = 0; i &lt; results.length; i++) {
    // Build the reference for the issue with the full name and number.
    // Winds up looking like &quot;jessfraz/.vim#1&quot;
    var reference = results[i].full_name + &quot;#&quot; + results[i].number;
    // Get the Airtable recordID for the reference if it exists.
    var id = api.query(&quot;select id from this.get_airtable_records where reference='&quot;+reference+&quot;'&quot;, {baseID: params.baseID, table: params.table});
    // Define the object params for create and update.
    var obj = {
      baseID: params.baseID,
      table: params.table,
      reference: reference,
      title: results[i].title,
      state: results[i].state,
      author: results[i].author,
      type: 'issue',
      comments: results[i].comments,
      url: results[i].url,
      updated: results[i].updated,
      created: results[i].created,
      completed: results[i].completed,
      repo: results[i].name,
    };
    if (id.length &gt; 0) {
      results[i].airtable_id = id[0].id;
      obj.recordID = id[0].id;
      // Update the result in the table.
      var r = api.run(&quot;this.update_record&quot;, obj);
      api.log(r);
    } else {
      // Create record in the table.
      results[i].airtable_id = 0;
      var r = api.run(&quot;this.create_record&quot;, obj);
      api.log(r);
    }
    results[i].reference = reference;
  }
  return {
    results
  };
}
</code></pre>
<p>You might be wondering what <code>this.create_record</code> and <code>this.update_record</code> look
like. These are just helper operations so I can use all the fields for the
records as parameters.</p>
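<p>The loop above runs one lookup query per issue to find its Airtable record. One possible variation, sketched here in plain JavaScript with made-up data rather than the Transposit API, is to index the existing records by reference once and then decide between create and update from that index. Whether this is actually faster inside Transposit is an assumption I have not tested, since it caches responses on its end.</p>

```javascript
// Hypothetical rows shaped like the results of get_airtable_records.
const airtableRecords = [
  { id: "recA", reference: "jessfraz/.vim#1" },
  { id: "recB", reference: "jessfraz/gitable#7" },
];

// Hypothetical rows shaped like the results of list_issues_for_user.
const issues = [
  { full_name: "jessfraz/.vim", number: 1 },
  { full_name: "jessfraz/gitable", number: 9 },
];

// Index the existing records by reference once, up front.
const idByReference = new Map(
  airtableRecords.map((record) => [record.reference, record.id])
);

// Decide, for each issue, whether it needs an update or a create.
function planSync(issues, idByReference) {
  return issues.map((issue) => {
    // Same reference format as before: "owner/repo#num".
    const reference = issue.full_name + "#" + issue.number;
    const recordID = idByReference.get(reference);
    return recordID === undefined
      ? { action: "create", reference }
      : { action: "update", reference, recordID };
  });
}

const plan = planSync(issues, idByReference);
console.log(plan);
```

<p>Each entry in the plan would then be handed to the create or update helper, so the branching logic stays the same and only the lookup changes.</p>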
<h3 id="create-an-airtable-record">Create an Airtable record</h3>
<p><code>create_record</code> calls the built-in <code>airtable.create_record</code> which looks like
the following:</p>
<pre><code class="language-sql">SELECT * FROM airtable.create_record
WHERE baseId=@baseID
AND table=@table
AND $body=(SELECT {
'fields' : {
'Reference': @reference,
'Title': @title,
'State': @state,
'Author': @author,
'Type': @type,
'Comments': @comments,
'URL': @url,
'Updated': @updated,
'Created': @created,
'Completed': @completed,
'Repository': @repo,
}
})
</code></pre>
<p>Everything starting with an <code>@</code> is a parameter we can change on the fly in our
Javascript function like you saw above.</p>
<h3 id="update-an-airtable-record">Update an Airtable record</h3>
<p><code>update_record</code> is very similar, it calls the Transposit built-in
<code>airtable.update_record</code>:</p>
<pre><code class="language-sql">SELECT * FROM airtable.update_record
WHERE recordId=@recordID
AND baseId=@baseID
AND table=@table
AND $body=(SELECT {
'fields' : {
'Reference': @reference,
'Title': @title,
'State': @state,
'Author': @author,
'Type': @type,
'Comments': @comments,
'URL': @url,
'Updated': @updated,
'Created': @created,
'Completed': @completed,
'Repository': @repo,
}
})
</code></pre>
<p>Doing the above with pull requests rather than issues is the exact same code
but you swap out the query for issues with pull requests.
You can schedule your operations to run at certain times like cron or when you call an API endpoint.</p>
<p>Sadly, I failed at breaking the thing with one of my most complex bots. But
maybe you will have better luck trying ;) You can fork my app or look at the
queries here:
<a href="https://console.transposit.com/t/jessfraz/gitable">console.transposit.com/t/jessfraz/gitable</a>.</p>
Questions I'd Ask My Cloud Providerhttps://blog.jessfraz.com/post/questions-id-ask-my-cloud-provider/
Mon, 15 Apr 2019 08:09:26 -0700https://blog.jessfraz.com/post/questions-id-ask-my-cloud-provider/
<p>I came up with a list of questions I would ask my cloud provider if I was
buying a product. They are as follows:</p>
<h3 id="1-what-problem-is-this-solving">1. What problem is this solving?</h3>
<p>I would ask this to make sure I even need this product. So many people tend to
buy into the hype for &ldquo;shiny&rdquo; that they miss whether they even needed the thing in the
first place.</p>
<h3 id="2-how-did-you-implement-this-what-is-your-threat-model">2. How did <em>you</em> implement this? What is <em>your</em> threat model?</h3>
<p>So much of the cloud is built on popsicle sticks and glue. Does that make you
feel safe at night knowing your customer data is being stored in a proof of
concept that was shipped before it should have been? Best to get your security
team to assess whether the product is actually built up to standard on the
<em>provider&rsquo;s side</em>. This does not mean what you see as a customer, it means the
proprietary bits you cannot see.</p>
<p>What does the service license agreement say about what happens if the provider
itself is hacked? Do they have to tell you or can they just sweep it under
the rug? What if a vulnerability comes out in the open source project they are
using, do they have to give you a risk assessment as to whether you were hacked?</p>
<p>What if they don&rsquo;t know if they were hacked after a vulnerability is public?
Red flag&hellip;</p>
<p>If they themselves do not know their own threat model, that should be a huge
warning sign.</p>
<p>Bonus points if their implementation is open source; but I will let you in on
a secret, most aren&rsquo;t. The exception is Joyent :)</p>
<h3 id="3-what-customers-did-you-speak-to-before-building-this-feature">3. What customers did you speak to before building this feature?</h3>
<p>Ties back to number one, what problem is this solving? So often these features
seem to be built <em>for fun</em> or based off a <em>feeling</em> a product manager had.</p>
<p>Hope this helps! I will probably update over time. :)</p>
Leadership CIhttps://blog.jessfraz.com/post/leadership-ci/
Tue, 09 Apr 2019 08:09:26 -0700https://blog.jessfraz.com/post/leadership-ci/<p>This post is co-authored by <a href="https://github.com/simpsoka">Kathy Simpson</a>.</p>
<blockquote>
<p>“understanding the true nature of instinctive decision making requires us to be forgiving of those people trapped in circumstances where good judgment is imperiled.”
― Malcolm Gladwell, Blink: The Power of Thinking Without Thinking</p>
</blockquote>
<p>As leaders, setting up a structure that helps us navigate decisions under
pressure is of the utmost importance.
When writing and delivering software we rely on our
continuous integration (CI) infrastructure and test suites to tell us when a test is
failing and code should not be merged.</p>
<p>As leaders, before acting or making decisions it would be nice to have a set of
tests and checks, established ahead of time, to make sure we are in the
right headspace to think, behave and make decisions that are in the best
interest of everyone and our company. There are devastating consequences to
taking actions based on fear and pride;
we hope this set of questions enables taking action based on growth, humility, inclusion,
and soulful reflection.</p>
<p>The following are the sets of questions we brainstormed, but expect them to
change over time as we experience and deal with new problems. These were
started in <a href="https://gist.github.com/simpsoka/14da775a63e22e5083141da5c48e6410">a gist</a>
and are copied below. The diff of this post and the gist will serve as the
evolution of this thought process.</p>
<p>It&rsquo;s important to note that in some instances answering all the questions might
take too much time. Perhaps prioritizing the most important ones in the moment
would be more effective.</p>
<p>Answering all the questions may be a luxury at times, so we suggest breaking
them down based on the situation you find yourself in: prioritize the most
important ones to your role, have a few ‘go to’ questions, or categorize them
based on the situations you find yourself in more often. The important part
of this list is to help us navigate a difficult situation while still
maintaining the integrity we intend for ourselves as leaders.</p>
<ol>
<li><p>Do I want to die on this hill?</p>
<p><strong>Pass:</strong> This is morally good and if not handled has long term consequences.</p>
<p><strong>Fail:</strong> This is self serving.</p></li>
<li><p>Am I including everyone?</p>
<p><strong>Pass:</strong> My ego is not driving this conversation.</p>
<p><strong>Fail:</strong> The people in this conversation will only tell me I&rsquo;m right and not push back.</p></li>
<li><p>Am I hiding something?</p>
<p><strong>Pass:</strong> The information, though painful, is known to all.</p>
<p><strong>Fail:</strong> Yes.</p></li>
<li><p>Is there transparency here?</p>
<p><strong>Pass:</strong> The team agrees on context and can repeat it back to me.</p>
<p><strong>Fail:</strong> Hidden misalignment (test: what do we align on).</p></li>
<li><p>Am I being curious?</p>
<p><strong>Pass:</strong> I&rsquo;m asking questions that make me uncomfortable, and I&rsquo;m comfortable being wrong.</p>
<p><strong>Fail:</strong> I want my way.</p></li>
<li><p>Is my team afraid to tell me things?</p>
<p><strong>Pass:</strong> They freely and continually come to me with answers and information that they know I will not like.</p>
<p><strong>Fail:</strong> They go to each other or people outside the team with the information, and tell me what they think I want to hear.</p></li>
<li><p>Am I only communicating with the same people over and over?</p>
<p><strong>Pass:</strong> My sphere of influence is diverse. I feel comfortable talking with anyone on the team.</p>
<p><strong>Fail:</strong> I continually consult the same individuals (test: do I have an entourage?).</p></li>
<li><p>Do I feel insecure?</p>
<p><strong>Pass:</strong> I feel empowered and am willing to take feedback and risks regardless of the outcome as it&rsquo;s good for the company and the customer.</p>
<p><strong>Fail:</strong> I retreat, I am not comfortable, I am not giving up the information because I am scared of what people will think.</p></li>
<li><p>Can my team do the job I hired them to do? Is the job they are hired to do
the job that needs to be done?</p>
<p><strong>Pass:</strong> The team ships outcomes efficiently.</p>
<p><strong>Fail:</strong> The team is not empowered and often stalls (test: do I often have to intervene?).</p></li>
<li><p>Am I scratching an itch?</p>
<p><strong>Pass:</strong> This is a problem that&rsquo;s bigger than myself.</p>
<p><strong>Fail:</strong> It may feel good to solve this problem but only for myself and temporarily.</p></li>
<li><p>Am I being judgmental?</p>
<p><strong>Pass:</strong> Do I trust my team and their decisions?</p>
<p><strong>Fail:</strong> Is someone speaking up and telling me that I’m being judgmental?</p></li>
<li><p>Am I taking risks?</p>
<p><strong>Pass:</strong> I feel comfortable and confident that this decision will lead to positive and fruitful outcomes.</p>
<p><strong>Fail:</strong> I am being a pushover, and I am compromising in the wrong ways.</p></li>
<li><p>Am I being manipulative?</p>
<p><strong>Pass:</strong> I’m being honest, real, straightforward and I’m OK with the outcome and hearing ‘no’.</p>
<p><strong>Fail:</strong> I’m intentionally using words that aren’t representative of what I’m trying to communicate.</p></li>
<li><p>Am I speaking for people or letting them speak for themselves?</p>
<p><strong>Pass:</strong> I am doing the minority of the speaking and people are disagreeing
with my opinions.</p>
<p><strong>Fail:</strong> I am being quoted back to myself. I am talking the majority of the
time.</p></li>
</ol>
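<p>Since the post leans on the CI analogy, here is a minimal sketch of the list treated as a small test suite. The <code>Check</code> class, the <code>go_to</code> flag, and the <code>run_checks</code> helper are hypothetical names made up for illustration, and the pass/fail answers are placeholders you would fill in honestly yourself.</p>

```python
from dataclasses import dataclass


@dataclass
class Check:
    question: str
    passed: bool         # your own honest answer in the moment
    go_to: bool = False  # True for the few questions you always ask first


def run_checks(checks, quick_only=False):
    """Return the questions that failed.

    With quick_only=True only the 'go to' questions are considered,
    mirroring the advice to prioritize when time is short.
    """
    considered = [c for c in checks if c.go_to or not quick_only]
    return [c.question for c in considered if not c.passed]


# Placeholder answers, purely for illustration.
checks = [
    Check("Do I want to die on this hill?", passed=True, go_to=True),
    Check("Am I hiding something?", passed=False, go_to=True),
    Check("Am I being curious?", passed=False),
]

print(run_checks(checks, quick_only=True))
print(run_checks(checks))
```

<p>Under pressure the quick run surfaces only the failing &lsquo;go to&rsquo; questions; the full run is the luxury version.</p>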
<p>Be sure to keep up with the <a href="https://gist.github.com/simpsoka/14da775a63e22e5083141da5c48e6410">original gist</a>
as well to see how this list evolves!</p>
The Truth Seekershttps://blog.jessfraz.com/post/the-truth-seekers/
Mon, 08 Apr 2019 08:09:26 -0700https://blog.jessfraz.com/post/the-truth-seekers/<p>Last week I got to see what it was like to be an investigative journalist for
a day. It was thrilling. I will get into what I learned but first I wanted to
give some background on why I was doing this.</p>
<p>I have a general curiosity for people. It&rsquo;s interesting to me to uncover what
people are motivated by. Humans are individual snowflakes and no one is exactly
like the next. It is our unique experiences that form the way we think and
behave, as well as what drives us.</p>
<p>It is in my nature to learn and absorb information. I also recently learned,
although I should have realized this throughout my life, I am well attuned to
absorbing others&rsquo; emotions. I think my deep drive to understand others and
the value I place on the truth are well suited to the role of &ldquo;investigative
journalist&rdquo;.</p>
<p>Research for investigative journalism is very similar to research for
academia. Investigative journalism seems to be driven by
intuition, while academia might be more driven by novel research.</p>
<p>I got to see what <a href="https://twitter.com/jeffykao">Jeff Kao</a>&rsquo;s job was like for
a few hours and I learned a lot.</p>
<p>One of the more interesting things we discussed was diffs. I asked whether
diffs (as in those used by a source control tool) could work as a line of
truth. With a diff, the history of a document is fully transparent: anyone
can see any and all changes to it (provided force pushes are tracked as
well).</p>
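<p>As a rough illustration of what a diff makes visible, here is a minimal sketch using Python&rsquo;s standard <code>difflib</code> module. The two &ldquo;versions&rdquo; of the document are made-up text, not taken from any real bill or article.</p>

```python
import difflib

# Two made-up versions of the same short document.
old = [
    "Section 1. The filing fee is 10 dollars.",
    "Section 2. This act takes effect on July 1.",
]
new = [
    "Section 1. The filing fee is 25 dollars.",
    "Section 2. This act takes effect on July 1.",
]

# unified_diff yields the same style of output a source control tool shows.
diff = list(
    difflib.unified_diff(old, new, fromfile="v1", tofile="v2", lineterm="")
)
print("\n".join(diff))
```

<p>Lines starting with <code>-</code> were removed and lines starting with <code>+</code> were added, so the one changed sentence stands out while the untouched one is shown only as context.</p>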
<p>Jeff pointed out that there is a history of journalism using &ldquo;diffs&rdquo;. One
example was from <a href="https://www.usatoday.com/in-depth/news/investigations/2019/04/03/abortion-gun-laws-stand-your-ground-model-bills-conservatives-liberal-corporate-influence-lobbyists/3162173002/">an article</a>
that uncovered bills and laws being copied and influenced by corporations. They
compared the text of the bills to others and showed the changes,
similarities, and motivations
behind them.</p>
<p>I then realized that Jeff was the author of the <a href="https://hackernoon.com/more-than-a-million-pro-repeal-net-neutrality-comments-were-likely-faked-e9f0e3ed36a6">amazing article</a> from a couple
years ago on how net neutrality comments were likely faked. He used natural
language processing to find the similarities in the comments.</p>
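<p>To get a feel for how near-duplicate text can be surfaced, here is a minimal sketch. It is not the method from Jeff&rsquo;s article, which used natural language processing; it is a much simpler character-level comparison from Python&rsquo;s standard <code>difflib</code>. The comments and the <code>rank_similar_pairs</code> helper are made up for illustration.</p>

```python
import difflib
from itertools import combinations


def rank_similar_pairs(comments):
    """Score every pair of comments by textual similarity, most similar first."""
    scored = []
    for (i, a), (j, b) in combinations(enumerate(comments), 2):
        # ratio() is a float between 0 and 1; higher means more alike.
        ratio = difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()
        scored.append((ratio, i, j))
    return sorted(scored, reverse=True)


# Made-up comments: the first two differ by a single phrase.
comments = [
    "I urge the commission to repeal these burdensome regulations.",
    "I urge the commission to repeal these heavy-handed regulations.",
    "Please keep the current rules in place, they protect my small business.",
]

ranked = rank_similar_pairs(comments)
for ratio, i, j in ranked:
    print(f"{ratio:.2f}  comment {i} vs comment {j}")
```

<p>The pair that shares a template floats to the top, which is the same intuition as a diff: most of the text is identical and only a small span changes.</p>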
<p>Both these articles use comparisons of text to uncover falsifications or
motivations. This is super similar to diffs, which are also comparisons of text! I also started thinking about how
in <a href="https://blog.jessfraz.com/post/government-medicine-capitalism/">my previous article</a>
I mentioned it would be cool if laws were versioned with git. By doing
that, we would get the diff and history of changes to the laws. Changes to laws
or language used over time could be visualized quite easily with the tools for
source control.</p>
<p>Overall, the day was fascinating. Investigative journalism was really aligned
with my joy of learning new things from a variety of different perspectives and
using intuition and research to try to find truth.</p>
<p>Another question I have been thinking about is: how can we separate emotion from
the truth? So much of the news today is trying to trigger an emotional response
for clicks. Or in the worst case, it is trying to trigger an emotional response
for influencing an election. How can we promote the news sources that focus on
the truth versus triggering a reaction? The truth itself should be enough of
a trigger.</p>
Thoughts on Conway's Law and the software stackhttps://blog.jessfraz.com/post/thoughts-on-conways-law-and-the-software-stack/
Mon, 25 Mar 2019 08:09:26 -0700https://blog.jessfraz.com/post/thoughts-on-conways-law-and-the-software-stack/<p>I’ve been talking to a lot of people in different layers of the stack during my
funemployment. I wanted to share one of the problems I’ve been thinking about
and maybe you can think of some clever solutions to solve it.</p>
<p>Conway&rsquo;s Law states &ldquo;organizations which design systems &hellip; are constrained
to produce designs which are copies of the communication structures of these
organizations.&rdquo;</p>
<p>If you were to apply Conway&rsquo;s Law to all the layers of the software stack and
open source software you’d see a problem: <strong>There is not sufficient
communication between the various layers of software.</strong></p>
<p>Let’s dive in a bit to make the problem super clear.</p>
<p>I’ve met a bunch of hardware engineers and I’ve made a point of asking each
of them how they feel about using a single chip for multiple users. This
is, of course, the use case of the cloud. All of the hardware engineers either
laugh or are horrified and the resounding reaction is “you’d be crazy to think
hardware was ever intended to be used for isolating multiple users safely.”
Spectre and Meltdown proved this was true as well. Speculative execution was
a feature intended to make processors faster, but it was never considered as
an attack vector against multi-tenant compute, like a cloud provider. Seems
like the software and hardware layers should
better communicate&hellip;</p>
<p>That’s just one example. Let’s reverse the interaction. I’ve talked to a bunch
of firmware and kernel engineers and they’d all love it if the firmware from chip
vendors were less complex. For instance, it seems like a unanimous vote among
firmware and kernel engineers that CPU vendors should not include runtime
services or SMM with their firmware. Open source firmware and kernel developers
would rather handle those problems at their layer of the stack. All the complexity
in the firmware leads to overlooked bugs and odd behavior that can’t be
controlled or debugged from the kernel developer&rsquo;s layer and/or user space. Not to mention,
a lot of CPU vendors&rsquo; firmware is proprietary so it’s really hard to know if
a bug is truly a firmware bug.</p>
<p>Another example would be the <a href="https://arstechnica.com/information-technology/2019/02/supermicro-hardware-weaknesses-let-researchers-backdoor-an-ibm-cloud-server/">hack of SoftLayer</a>. Researchers modified the
firmware on the BMC of a bare metal host the cloud provider was offering.
This shows another mistake in having blinders on and not being conscious
of the other layers of the stack and the entire system.</p>
<p>Let’s move up the stack a bit to something I personally have experienced.
I worked a lot on container runtimes. I also have worked on kubernetes.
I was horrified to find people are running multi-tenant kubernetes clusters
with multiple customers&rsquo; processes, that is, relying on them to isolate untrusted processes. The architecture of kubernetes is
just <a href="https://blog.jessfraz.com/post/secret-design-docs-multi-tenant-orchestrator/#why-not-kubernetes">not designed for this</a>.</p>
<p>A common miscommunication is the &ldquo;window dressing.&rdquo; For example, there is a
feature in kubernetes that prevents exec-ing into
containers. This is implemented by merely preventing the
API call in kubernetes. If a person has access to a cluster there are about 4 dozen different
ways I can think of to exec into a container and bypass this &ldquo;feature&rdquo; and
kubernetes entirely. Using
said &ldquo;security feature&rdquo; in kubernetes alone is not sufficient for security in any respect.
This is a common pattern.</p>
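<p>The shape of the problem can be sketched in a few lines. This is a toy model only: <code>ApiServer</code> and <code>Runtime</code> are hypothetical classes invented for illustration and do not correspond to any real kubernetes API. The point is that a check enforced at one layer does nothing for callers who can reach the layer beneath it.</p>

```python
class Runtime:
    """Hypothetical stand-in for whatever actually runs commands on a node."""

    def exec(self, container, command):
        return f"ran {command!r} in {container}"


class ApiServer:
    """Hypothetical stand-in for an API front end with an 'exec disabled' policy."""

    def __init__(self, runtime, allow_exec):
        self.runtime = runtime
        self.allow_exec = allow_exec

    def exec(self, container, command):
        # The only enforcement lives here, at the API layer.
        if not self.allow_exec:
            raise PermissionError("exec is disabled by policy")
        return self.runtime.exec(container, command)


runtime = Runtime()
api = ApiServer(runtime, allow_exec=False)

try:
    api.exec("web", "sh")
except PermissionError as err:
    print("API path:", err)

# A caller with access to the layer below never reaches the check.
print("direct path:", runtime.exec("web", "sh"))
```

<p>The policy looks like a security boundary from above, but it only covers one of the paths to the runtime.</p>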
<p>All these problems are not small by any means. They are miscommunications
at various layers of the stack. They are people thinking an interface or
feature is secure when it is merely a window dressing that can be bypassed with
just a bit more knowledge about the stack. I really like the advice
<a href="https://twitter.com/LeaKissner/status/1109259338265165824">Lea Kissner</a> gave:
&ldquo;take the long view, not just the broad view.&rdquo; We should do this more often
when building systems.</p>
<p>The thought I’ve been noodling on is: how do we solve this? Is this something
a code hosting provider like GitHub should fix? But, that excludes all the
projects that are not on that platform. How do we promote better communication
between layers of the stack? How can we automate some of this away? Or is
the answer simply, own all the layers of the stack yourself?</p>
Digging into RISC-V and how I learn new thingshttps://blog.jessfraz.com/post/digging-into-risc-v-and-how-i-learn-new-things/
Sun, 24 Mar 2019 08:09:26 -0700https://blog.jessfraz.com/post/digging-into-risc-v-and-how-i-learn-new-things/
<p>I recently have started researching and playing around with RISC-V for fun. I thought it might be nice to combine some of what I’ve learned into a blog post. However, I don’t just want to highlight <em>what</em> I learned. I want to use this as an example of how to go about learning something new.</p>
<p>Recently, <a href="https://twitter.com/erikstmartin">Erik St. Martin</a>, <a href="https://twitter.com/ScribblingOn">Shubheksha Jalan</a>, and I were discussing how we learn new things and we all thought it might be beneficial to have a way to document this process for others. What better way to document this than by example with my recent research into RISC-V?</p>
<p>I’ve <a href="https://blog.jessfraz.com/post/defining-a-distinguished-engineer/">said it before</a> and I will say it again, I think anyone is capable of doing or learning anything, they just need the right motivation and to believe in themselves. I also made a point of including the book <a href="https://www.amazon.com/Super-Brain-Unleashing-Explosive-Well-Being/dp/0307956830">Super Brain</a> on <a href="https://blog.jessfraz.com/post/books/">my list of recommended books</a>, because it confirms with science that if you set your sights high you can accomplish great things, but if you set your expectations low it becomes a self-fulfilling prophecy. To put it more bluntly, believe in yourself!</p>
<p>I became fascinated by what is happening in the RISC-V space just by seeing it pop up every now and then in my Twitter feed. Since I am currently unemployed I have a lot of time and autonomy to dig into whatever I wish.</p>
<p>RISC-V is a new instruction set architecture. To understand RISC-V, we must first dig into what an instruction set architecture is. This is my learning technique. I bounce from one thing to another, recursively digging deeper as I learn more.</p>
<h2 id="what-is-an-instruction-set-architecture-isa">What is an instruction set architecture (ISA)?</h2>
<p>An instruction set architecture is the interface between the hardware and the software.</p>
<p>Models of processors can implement the same instruction set but have different <em>internal</em> designs for implementing the interface. This leads to various processors having the same instruction set but differing in performance, physical size, and monetary cost. For example, Intel and AMD have processors that both implement the same x86 instruction set but have very different internal designs.</p>
<p>In order to dig deeper, we should look into what some of the various types of instruction set architectures are.</p>
<h2 id="what-are-the-types-of-instruction-set-architectures">What are the types of instruction set architectures?</h2>
<p>Most commonly these are described and classified by their complexity.</p>
<h3 id="reduced-instruction-set-computer-risc">Reduced Instruction Set Computer (RISC)</h3>
<p>This only implements frequently used instructions; less common operations are implemented as subroutines. Using subroutines trades away some performance, but the cost only applies to the least common operations.</p>
<p>RISC uses a load/store architecture, meaning it divides instructions into ones that access memory and ones that perform arithmetic logic unit (ALU) operations.</p>
<p>RISC, the name, came out of Berkeley in the 1980s (from a project led by David Patterson) around the same time MIPS (a project led by John L. Hennessy) was going on at Stanford. RISC became commercialized as SPARC by Sun Microsystems and MIPS became commercialized by MIPS Computer Systems. Both are RISC architectures. You might also be familiar with more modern implementations like ARM or PowerPC which are commercialized as well. There are many RISC implementations other than just these; I implore you all to dig further if you so choose.</p>
<p>RISC architectures can also be traced back to before the name existed. Examples include Alan Turing&rsquo;s Automatic Computing Engine (ACE) from 1946 and the CDC 6600 designed by Seymour Cray in 1964.</p>
<h3 id="complex-instruction-set-computer-cisc">Complex Instruction Set Computer (CISC)</h3>
<p>This has many very specific, specialized instructions, some of which may never be used in most programs. In CISC, one instruction can denote an execution of several low-level operations or one instruction is capable of multi-step operations and/or addressing modes.</p>
<p>The term was coined after RISC, so everything that is not RISC tends to get lumped here. It’s become somewhat of a contentious point since some modern CISC designs are in fact less complex than some RISC designs. The main difference is that CISC architectures have arithmetic/computation instructions that also perform memory accesses.</p>
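<p>The load/store distinction is easy to see in a toy model. The sketch below is not any real instruction set; the register names, the <code>memory</code> dictionary, and the <code>add_mem</code> instruction are all made up for illustration. It computes <code>c = a + b</code> twice: once with separate memory and ALU instructions, and once with a single instruction that does arithmetic directly on memory.</p>

```python
memory = {"a": 2, "b": 3, "c": 0}
registers = {}


# Load/store style: only load and store touch memory.
def load(reg, addr):
    registers[reg] = memory[addr]


def store(reg, addr):
    memory[addr] = registers[reg]


def add(dst, src1, src2):
    # ALU instruction: operates on registers only.
    registers[dst] = registers[src1] + registers[src2]


# Memory-operand style: one instruction reads memory, adds, and writes memory.
def add_mem(dst_addr, src1_addr, src2_addr):
    memory[dst_addr] = memory[src1_addr] + memory[src2_addr]


# Four simple instructions in the load/store style.
load("r1", "a")
load("r2", "b")
add("r3", "r1", "r2")
store("r3", "c")
load_store_result = memory["c"]

# One more complex instruction in the memory-operand style.
memory["c"] = 0
add_mem("c", "a", "b")
memory_operand_result = memory["c"]

print(load_store_result, memory_operand_result)
```

<p>Both paths produce the same answer; the difference is how much work a single instruction is allowed to do.</p>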
<p>Most architectures were classified after the fact since the term wasn’t around at the time of their birth. Some examples include IBM’s System/360 and System Z, the PDP-11, the VAX architecture, and Data General’s Nova.</p>
<h3 id="very-long-instruction-word-vliw-and-explicitly-parallel-instruction-computing-epic">Very Long Instruction Word (VLIW) and Explicitly Parallel Instruction Computing (EPIC)</h3>
<p>These were designed to exploit instruction level parallelism, executing multiple instructions in parallel. This requires less hardware than CISC or RISC and leaves the complexity for the compiler.</p>
<p>Traditionally, processors use a few different ways to improve performance. Let’s dig into these.</p>
<ul>
<li><strong>Pipelining</strong> divides instructions into substeps so the instructions can be executed partly at the same time.</li>
<li><strong>Superscalar architectures</strong> dispatch individual instructions to be executed independently in different parts of the processor.</li>
<li><strong>Out-of-order execution</strong> executes instructions in an order different from the program.</li>
</ul>
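<p>To put a rough number on what pipelining buys, here is a back-of-the-envelope sketch. It assumes an idealized pipeline where every stage takes one cycle and nothing ever stalls; real processors have hazards and branch mispredictions that this ignores. The function names are made up for illustration.</p>

```python
def unpipelined_cycles(instructions, stages):
    # Each instruction runs through all of its stages before the next starts.
    return instructions * stages


def pipelined_cycles(instructions, stages):
    # The first instruction needs `stages` cycles to get through the pipe;
    # after that, one instruction completes every cycle.
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


print(unpipelined_cycles(100, 5))  # 500
print(pipelined_cycles(100, 5))    # 104
```

<p>With five stages, the idealized pipeline approaches a five times speedup as the instruction count grows.</p>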
<p>The methods above all complicate hardware by requiring the hardware to perform all this logic. In contrast, VLIW leaves this complexity to the compiler. As a trade-off the compiler becomes a lot more complex while the hardware is simplified and still performs well computationally.</p>
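<p>Here is a minimal sketch of the kind of work that moves into the compiler. It greedily packs instructions into fixed-width groups so that no instruction shares a group with something it depends on. This is a simplification for illustration, not how any production compiler schedules code: it only tracks read-after-write dependencies and assumes every instruction writes a distinct register. The instruction tuples and the <code>pack_bundles</code> helper are made up.</p>

```python
def pack_bundles(instructions, width=2):
    """Pack (name, dest, sources) tuples into groups of at most `width`.

    Each instruction goes into the first group that comes after all of
    its producers and still has a free slot.
    """
    bundles = []   # list of lists of instruction names
    ready_at = {}  # register name mapped to the index of the group that writes it
    for name, dest, sources in instructions:
        earliest = 0
        for src in sources:
            if src in ready_at:
                earliest = max(earliest, ready_at[src] + 1)
        slot = earliest
        while slot != len(bundles) and len(bundles[slot]) == width:
            slot += 1
        if slot == len(bundles):
            bundles.append([])
        bundles[slot].append(name)
        ready_at[dest] = slot
    return bundles


program = [
    ("load r1", "r1", []),
    ("load r2", "r2", []),
    ("add r3", "r3", ["r1", "r2"]),
    ("load r4", "r4", []),
    ("mul r5", "r5", ["r3", "r4"]),
]

bundles = pack_bundles(program, width=2)
for i, bundle in enumerate(bundles):
    print(i, bundle)
```

<p>Five instructions fit into three groups, and the hardware can issue each group without checking dependencies itself because the packing already guarantees them.</p>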
<p>VLIW is most commonly found in embedded media processors and graphics processing units (GPU). However, Nvidia and AMD have moved to RISC architectures to improve performance for non-graphics workloads. You can also find VLIW in system-on-a-chip (SoC) designs where customizing a processor for an application is popular.</p>
<p>EPIC architecture was based on VLIW but made a few changes. One of these allows for groups of instructions, called bundles, to be executed in parallel if they do not depend on any subsequent group of instructions. You can often distinguish EPIC from VLIW because of EPIC&rsquo;s focus on full instruction predication. This is used to decrease the occurrence of branches and to increase the speculative execution of instructions. Speculative execution loads data before we know whether or not it will be used.</p>
<p>You might be familiar with speculative execution from the Spectre and Meltdown attacks. The Spectre and Meltdown attacks are a whole different rabbit hole I won’t go down in this post, but I hope you can understand how your own learning is almost like a choose your own adventure game. You can choose to go further down any path at any time.</p>
<h3 id="minimal-instruction-set-computer-misc">Minimal Instruction Set Computer (MISC)</h3>
<p>This is more minimal than RISC. It includes a very small number of basic operations and corresponding opcodes. Commonly these are categorized as MISC if they are stack based rather than register based, but can also be defined by the number of instructions (fewer than 32 but greater than one).</p>
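<p>To make &ldquo;stack based rather than register based&rdquo; concrete, here is a minimal sketch of a stack machine with three made-up operations. It is not modeled on any particular historical computer.</p>

```python
def run(program):
    """Execute a list of (op, *args) tuples on an operand stack."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            # Operands are implicit: they are whatever is on top of the stack.
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown op {op!r}")
    return stack


# (2 + 3) * 4
program = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
print(run(program))  # [20]
```

<p>Notice that <code>add</code> and <code>mul</code> name no registers at all, which is part of why the opcode set can stay so small.</p>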
<p>Quite a few of the first computers can be classified as MISC. These include (but are not limited to) the ORDVAC (1951) and the ILLIAC (1952) from the University of Illinois and the EDSAC (1949) from the University of Cambridge.</p>
<h3 id="one-instruction-set-computer-oisc">One Instruction Set Computer (OISC)</h3>
<p>This describes an abstract machine that uses only one instruction. It removes the necessity for a machine language opcode. For example, <a href="https://www.cl.cam.ac.uk/~sd601/papers/mov.pdf">“mov” is Turing complete</a>, which means it’s capable of being an OISC, as are some instructions built on subtract.</p>
<p>This has not been commercialized, as far as I know, but it is very popular for teaching computer science.</p>
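<p>The usual teaching example of a subtract-based OISC is <code>subleq</code> (&ldquo;subtract and branch if less than or equal to zero&rdquo;). Below is a minimal interpreter sketch. The memory layout, the convention that a negative jump target halts the machine, and the <code>max_steps</code> guard are choices made for this illustration.</p>

```python
def run_subleq(memory, pc=0, max_steps=1000):
    """Run a subleq program in place.

    Each instruction is three cells: a, b, c. The machine computes
    memory[b] -= memory[a], then jumps to c if the result is <= 0,
    otherwise falls through to the next instruction.
    """
    steps = 0
    while pc >= 0 and steps < max_steps:
        a, b, c = memory[pc], memory[pc + 1], memory[pc + 2]
        memory[b] -= memory[a]
        pc = c if memory[b] <= 0 else pc + 3
        steps += 1
    return memory


# Adds the value at address 9 into address 10, using address 11 as scratch.
memory = [
    9, 11, 3,     # scratch -= A        (scratch becomes -A)
    11, 10, 6,    # B -= scratch        (B becomes B + A)
    11, 11, -1,   # scratch -= scratch  (reset to 0), then halt
    7,            # address 9:  A
    5,            # address 10: B
    0,            # address 11: scratch
]

result = run_subleq(memory)
print(result[10])  # 12
```

<p>There is no opcode field anywhere in the program, only addresses, because there is only one thing the machine can do.</p>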
<p>This leads down a few paths; some get into all the nitty gritty details of each instruction set and their differences. For the sake of learning more about RISC-V, let&rsquo;s dig more into that specific design.</p>
<h2 id="risc-v-design">RISC-V Design</h2>
<p>There is a great paper on the <a href="https://people.eecs.berkeley.edu/~krste/papers/EECS-2016-1.pdf">RISC-V design from Berkeley</a>. Chapter 2, “Why Develop a New Instruction Set?”, is my favorite. It goes over the pros and cons of a lot of prior instruction sets, why the authors decided to create a new instruction set, and what lessons they learned and brought over from their knowledge of the past. I will summarize what I thought was interesting but I urge you to dig in for yourself and read the entire paper.</p>
<p>For one, the authors state the importance of the fact that RISC-V is a completely free and open instruction set architecture. In contrast, all the most widely adopted instruction set architectures are proprietary. They are all also immensely complex. For example, you cannot get a hard copy of the x86 manual anymore and even in PDF form it’s ~5,000 pages and that doesn’t include the extensions. Who has time to read all of that? Although there is no exact number, <a href="https://stefanheule.com/blog/how-many-x86-64-instructions-are-there-anyway/">it’s estimated there are around 2,500 instructions in x86</a>, which is just unwieldy.</p>
<p>Props to Sun Microsystems for the fact that SPARC V8 is an open standard, but the design decisions are highly reflective of the other instruction sets from that time, leaving it unsuitable as a modern instruction set. “It was designed to be implemented in a single-issue, in-order, five-stage pipeline, and the ISA reflects this assumption.”</p>
<p>Alpha came out of Digital Equipment Corporation (DEC) in the 1990s so it got to be built with some learning from the earlier eras. However it seems like they over-engineered it. Most interestingly, they also did not think to create any room for extra opcode space for extensions. The authors also point out that ISAs can die and Alpha is a great example of an ISA being pretty obsolete outside of owning an old DEC computer, other than the last implementation by HP in 2004 when the IP changed hands again.</p>
<p>ARMv7 is widely used and the authors seriously considered it because of its popularity and ubiquity. However, ARMv7 is a closed standard and cannot be extended, making it unsuitable for the authors. They also found some technical problems, but the biggest deterrent to me was the fact it has over 600 instructions, making it quite complex.</p>
<p>The authors go over a few more instruction sets but I think you get the point that none of them were suitable for their needs. Of course you are more than welcome to dig in further yourself, I am just not going to take the time to reiterate their work here.</p>
<h2 id="recapping-how-i-learn">Recapping how I learn</h2>
<p>The paper continues into the details of the design of the RISC-V architecture. Some of this I will cover in my DotGo EU talk. For the sake of showing how I learn things I urge you to read the paper yourself and when you hit a term or concept you don’t know: research that concept. Continue this until you get a general understanding then jump back up into the paper where you left off. This cycle is how I dive into new things.</p>
<p>At the beginning of this post I said I would take you down the path of how I dug into RISC-V, yet I have not even begun to describe the actual design or features of RISC-V. I did this to make a point (and because I was tired, maybe mostly because I was tired). Look how much I dug into the fundamentals of instruction sets before even digging into the thing I set out to learn. This is commonly what I find happens and I wanted to show an example of my process. Now you can go and continue the rest of the process yourself by continuing to read the <a href="https://people.eecs.berkeley.edu/~krste/papers/EECS-2016-1.pdf">RISC-V design paper</a>, watching other <a href="https://www.infoq.com/presentations/risc-v-future">RISC-V talks</a>, <a href="https://riscv.org/risc-v-books/">getting some RISC-V books</a>, or finding other RISC-V papers and learning from those.</p>
<p>Then, buy a board and start playing with it. I got the <a href="https://www.sifive.com/boards/hifive-unleashed">HiFive Unleashed</a> and it&rsquo;s awesome!</p>
<p>I hope this helps open your mind to learning and digging deeper on any topics that interest you. Happy learning!</p>
Defining a Distinguished Engineerhttps://blog.jessfraz.com/post/defining-a-distinguished-engineer/
Thu, 21 Mar 2019 08:09:26 -0700https://blog.jessfraz.com/post/defining-a-distinguished-engineer/
<p>I learned a lot about myself and the way big companies are organized over the past year or so. I had mentioned a bit in a <a href="https://blog.jessfraz.com/post/government-medicine-capitalism/">previous blog post</a> and <a href="https://weirdtrickmafia.fm/post/pilot/">podcast</a> about “the N + 1 shithead problem” (from <a href="https://www.youtube.com/watch?v=1KeYzjILqDo">Bryan Cantrill’s talk on leadership</a>). To reiterate, the “N + 1 shithead problem” occurs when you are demotivated by seeing people who are a level above you behave poorly, or more bluntly when they behave like a shithead. I know from experience what a huge demotivator this is and after talking to several other folks I realized this is quite common.</p>
<p>When faced with this demotivator, I found myself thinking “why would I want to be at their level, when once I get there I’ll just be one amongst the dipshits.” It’s a horrible feeling to have and I’d love to have a model that resembles what I think of as a distinguished engineer or technical fellow.</p>
<p>In this post I will define what it means to me to be a distinguished engineer or technical fellow and maybe others that agree will modify their ladders to incentivize people to resemble these qualities.</p>
<h2 id="technical-leader">Technical Leader</h2>
<p>The first thing people think of when they think of a distinguished engineer is that they are a technical leader. I fully agree. A technical leader can understand all parts of a system. They can also be dropped into a new system and pick up the way it is architected and designed with relative ease. I think this is an important distinction to make. It’s good to be an expert in a field, but only being an expert is limiting. It’s also important to understand the full picture and that takes general knowledge. I think having a general knowledge of things outside your area of expertise is key if you choose to gain expertise in something.</p>
<h3 id="value-learning">Value learning</h3>
<p>A technical leader should always realize that there is more to learn. One cannot be an expert in everything, but you can have a general knowledge of most things without fully understanding the details within. A technical leader should always strive to continue to learn and persuade others to continue to learn as well.</p>
<h3 id="empower-others">Empower others</h3>
<p>A technical leader can build up others and empower their colleagues to do things that are more challenging than what they might think they are capable of. This is key for growing other members of an organization. I personally believe you don’t need a high title to take on a hard task, you just need the support and faith that you are capable of handling it. That support can come from the distinguished engineer and be reflected in their behavior towards others.</p>
<p>A technical leader should also make time for growing and mentoring others.
They should communicate with their peers and colleagues in
a way that makes them approachable. They should welcome newcomers to the team
and treat them as peers from day one.</p>
<h3 id="give-constructive-technical-criticism">Give constructive technical criticism</h3>
<p>A distinguished engineer should never tear others down, but they should be capable of giving constructive criticism on technical work. This does not mean finding something wrong just to prove their brilliance; no, that would make them the brilliant jerk. Constructive criticism means teaching others to make their work better when there are problems, while also encouraging them to iterate and empowering them to succeed.</p>
<h3 id="have-opinions-loosely-held">Have opinions loosely held</h3>
<p>A technical leader should be able to hold opinions loosely on designs and architecture. I am making an active effort not to say &ldquo;strong opinions, loosely held&rdquo; because, with a power dynamic, that could overpower the rest of the voices. Technical leaders should make sure all voices are heard and should fully articulate the &ldquo;why&rdquo; of their opinion for others.</p>
<p>They do not need to have opinions on everything; that would be pedantic. Technical leaders should be able to use their experience to help others succeed, while also empowering others to own solutions. Technical leaders should not pass down solutions to problems but allow others to learn by letting them come up with solutions themselves. This is where good constructive criticism (from above) can come into play.</p>
<h3 id="great-communicator-and-bridge">Great communicator and bridge</h3>
<p>A technical leader should have strong communication skills and be able to articulate the “why” of a problem as well as articulate the technical details of designs. They should never communicate in a derogatory manner. They should always communicate with others as peers and colleagues.</p>
<p>At times, technical leaders will need to act as a bridge between teams. Clear communication is especially important then, as it is always.</p>
<h3 id="humility-and-empathy">Humility and empathy</h3>
<p>A technical leader should not be driven by ego but by a constant urge to learn
and grow both themselves and their colleagues. They should have empathy for
others and show kindness towards their peers and colleagues.</p>
<h3 id="prioritize-shipping-and-decisiveness">Prioritize shipping and decisiveness</h3>
<p>A technical leader should value shipping and decisiveness. They should not be susceptible to analysis paralysis. At the end of the day most people have jobs to get things out the door and this should be a priority. Of course, shipping should not come at the cost of burning out a team or setting the company on fire.</p>
<h3 id="customer-focused">Customer focused</h3>
<p>Technical leaders can always seek feedback from their customers. This might
be the internal customers of their infrastructure or external customers if they
are on a product team. The best technical leaders are capable of empathizing
with customers and iterating quickly on customer feedback.</p>
<h3 id="build-resilient-systems">Build resilient systems</h3>
<p>A part of being a technical leader is having the experience of building
multiple systems in the past. Distinguished engineers should be able to
anticipate various failures from their past experiences and build systems that
will not create the same failures. Of course no system is perfect, so they
should be able to learn from the failures they cannot anticipate as well. This
is a cycle that they can then use when building the next system.</p>
<h3 id="value-quality-performance-and-security">Value quality, performance, and security</h3>
<p>Great technical leaders value quality, performance, and security in what they build. They
stay up to date on advancements and research in technology so that they might be able to use
new techniques for bettering their solutions. Technical leaders should also build with respect for users and their privacy.</p>
<h3 id="value-maintainability">Value maintainability</h3>
<p>Technical leaders should value writing code that is easy to maintain and easy
to understand. They should value unit and integration tests as well as making
sure that when a bug is fixed it has a test to guard against a regression.
Technical leaders should use code comments, not as a garnish, but to denote
things a reader would need to know. This could be details of a code section
that fixes a specific bug or maybe reasoning behind why something is written
a certain way. Documenting context is super valuable and helpful for maintainability.</p>
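<p>As a minimal sketch of the two habits above (a regression test for a fixed bug, and a comment that records context), here is a small Python example. The <code>chunk</code> helper and the bug it mentions are hypothetical, invented purely for illustration; they do not come from any real codebase.</p>

```python
import unittest


def chunk(items, size):
    """Split items into lists of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    # Context for future readers: a hypothetical earlier version looped
    # len(items) // size times, which silently dropped the trailing
    # partial chunk. Stepping through the list by `size` and slicing
    # keeps that final chunk.
    return [items[i:i + size] for i in range(0, len(items), size)]


class ChunkRegressionTest(unittest.TestCase):
    def test_keeps_trailing_partial_chunk(self):
        # Regression test for the hypothetical bug described above.
        self.assertEqual(chunk([1, 2, 3, 4, 5], 2), [[1, 2], [3, 4], [5]])

    def test_exact_multiple(self):
        # The fix must not change the behavior that already worked.
        self.assertEqual(chunk([1, 2, 3, 4], 2), [[1, 2], [3, 4]])
```

<p>Saved as, say, <code>test_chunk.py</code>, the tests can be run with <code>python -m unittest test_chunk</code>. The comment explains <em>why</em> the code is written this way rather than restating what it does, and the test name says which failure it guards against.</p>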
<h2 id="community">Community</h2>
<p>Good technical leaders are also leaders in the outside communities. This can include giving talks on various things they have built as well as mentoring others in the community or the workplace.</p>
<h3 id="learn-from-external-community">Learn from external community</h3>
<p>If you silo yourself to only learning within your company, you are missing out on a world of experiences and expertise in the external community that differ from yours. Technical leaders realize this and place importance on learning from the larger world of computing rather than just their silo.</p>
<h3 id="value-listening-and-be-open-to-feedback">Value listening and be open to feedback</h3>
<p>By gaining feedback and making themselves visible to an external community, leaders avoid a Dunning-Kruger-like effect of only growing inside an echo chamber. It is always valuable to see where the rest of the industry is focusing and how technical leaders at other companies are solving problems. Technical leaders realize that there is much to learn from people with different experiences than their own. They should always be open to listening to others.</p>
<h3 id="humility">Humility</h3>
<p>Technical leaders should always remain humble and modest. The best technical leaders know that it’s not possible for them to know <em>everything</em> and will prioritize keeping an open mind so they are always learning.</p>
<h3 id="call-upon-other-experts">Call upon other experts</h3>
<p>The best technical leaders know when they need to call on experts in specific areas for help or feedback on certain designs or architecture. By participating in the external community, leaders form strong networks and bonds with fellow engineers they can call on when they need them. Technical leaders should always be eager to use these relationships when they need them, or to introduce others to these folks if they could use their expertise.</p>
<h3 id="value-research">Value research</h3>
<p>Along with being able to call upon other experts, technical leaders should
value well-researched solutions. They should strive to learn from prior art.</p>
<h2 id="have-fun">Have fun</h2>
<p>Always make sure to have fun and not take yourself too seriously!</p>
<p><a href="https://twitter.com/LeaKissner/status/1109259338265165824">Take the long view, not just the broad view.</a></p>
<p>These are just a few of the things I think define a strong technical leader and engineer. I am sure I will add to this list as I grow every day.</p>
<p>Most importantly you must actually <em>do</em> these things. Actions speak louder than
words.</p>
An Enigma, unikernels booting on RISC-V, a rack encased in liquid. OH MY.https://blog.jessfraz.com/post/enigma-unikernels-risc-v-oh-my/
Sun, 17 Mar 2019 11:25:24 -0400https://blog.jessfraz.com/post/enigma-unikernels-risc-v-oh-my/
<p>I have written a bit about how I am spending my time while being unemployed and
I thought I would continue.</p>
<p>There was one thing I had left out of my <a href="https://blog.jessfraz.com/post/government-medicine-capitalism/">previous post on my visit to the Pentagon</a>.
THEY HAVE A REAL ENIGMA MACHINE THERE. Okay, moving on&hellip;</p>
<h2 id="qcon-and-university-of-cambridge">QCon and University of Cambridge</h2>
<p>I gave a talk at QCon on SGX and ended up giving the same talk to some really
awesome folks at the University of Cambridge. Each time, the talk provoked
some really interesting conversations. One of the topics that came up a couple of
times was whether RISC-V was going to be supported by any major cloud provider anytime soon.
My honest opinion, which some might disagree with, is that this is years away BUT it would certainly help adoption and integration into projects if it was backed by a company with a lot of time to develop integrations. Also I got a bit nerd sniped by some ARM folks and researchers to look more into TrustZone (which is the ARM secure enclave). I haven’t dug in yet but it’s on my list.</p>
<p>It was awesome spending a day in Cambridge (thanks <a href="https://twitter.com/avsm">Anil</a> for the tour!) and learning about all the awesome things they are doing. The MirageOS team is booting unikernels on baremetal RISC-V!</p>
<p><blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">🎉OCaml boots on bare-metal <a href="https://twitter.com/ShaktiProcessor?ref_src=twsrc%5Etfw">@ShaktiProcessor</a> <a href="https://twitter.com/risc_v?ref_src=twsrc%5Etfw">@risc_v</a>! 🎉 An important milestone towards building safer apps using <a href="https://twitter.com/OpenMirage?ref_src=twsrc%5Etfw">@OpenMirage</a> on open source hardware. <a href="https://t.co/XFosAxPROR">pic.twitter.com/XFosAxPROR</a></p>&mdash; KC Sivaramakrishnan (@kc_srk) <a href="https://twitter.com/kc_srk/status/1101479406084583424?ref_src=twsrc%5Etfw">March 1, 2019</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>They use this on boards to power light bulbs (at the University!) super securely, since it removes the need for all the shitty firmware most other things ship with and has a super minimal environment. I’m sure you can think of a number of other use cases as well. Honestly, unikernels replacing all the crap firmware in the world would be a huge win.</p>
<h2 id="open-compute-summit">Open Compute Summit</h2>
<p>Just this past week I spent a day at the Open Compute Summit. What is happening there in the open firmware space is truly awesome. They had demos of hardware they are booting with LinuxBoot and Coreboot. Facebook runs this on their infrastructure, along with OpenBMC to replace the traditional, proprietary BMC firmware. Trammell Hudson has some <a href="https://trmm.net/LinuxBoot_34c3">great posts</a> on LinuxBoot, which include links to some really great talks by him and Ron Minnich.</p>
<p><blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">😍 the open systems firmware community is awesome <a href="https://t.co/DAqudm6M4Z">pic.twitter.com/DAqudm6M4Z</a></p>&mdash; jessie frazelle 👩🏼‍🚀 (@jessfraz) <a href="https://twitter.com/jessfraz/status/1106301027408465920?ref_src=twsrc%5Etfw">March 14, 2019</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>Facebook’s server racks are gorgeous. They have a power bus which runs down the center and everything gets power from that, with the main power coming out of the power unit towards the middle of the rack (in the first picture below).</p>
<p><blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">The Facebook rack and node designs are seriously gorgeous, simple. The power bar <em>chef kiss</em> <a href="https://t.co/pGphy9uLLl">pic.twitter.com/pGphy9uLLl</a></p>&mdash; jessie frazelle 👩🏼‍🚀 (@jessfraz) <a href="https://twitter.com/jessfraz/status/1106336080956018689?ref_src=twsrc%5Etfw">March 14, 2019</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<h3 id="boot-guard">Boot Guard</h3>
<p>One thing I learned that I found fascinating was about Boot Guard for Intel processors and the equivalents on ARM and AMD. Boot Guard is supposed to verify the firmware signatures for the processor. The problem with this, in Intel’s case, is only Intel has the keys for signing firmware packages. This makes it impossible for you to then use Coreboot and LinuxBoot or equivalents as firmware on those processors. If you tried, the firmware would not be signed with Intel’s key and would brick the board. Matthew Garrett wrote <a href="https://mjg59.dreamwidth.org/33981.html">a great post</a> about this as well.</p>
<p>If a person owns the hardware, they have a right to own the firmware as well. Boot Guard prevents this. In <a href="https://trmm.net/OSFC_2018_Security_keynote#Boot_Guard">another great talk</a>, Trammell describes a vulnerability he found to <a href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-12169">bypass BootGuard</a>.</p>
<p><blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">CVE-2018-12169 also potentially allows a developer to &quot;jailbreak&quot; their BootGuard protected laptop since the UEFI DXE volume can be replaced with a user provided LinuxBoot ROM image. <a href="https://t.co/yHwwMOTyx7">https://t.co/yHwwMOTyx7</a> <a href="https://t.co/MeWI0DGUBf">pic.twitter.com/MeWI0DGUBf</a></p>&mdash; Trammell Hudson ⚙ (@qrs) <a href="https://twitter.com/qrs/status/1044157473882591233?ref_src=twsrc%5Etfw">September 24, 2018</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>This &ldquo;feature&rdquo; from hardware vendors is preventing the innovation of this
community and preventing pushing technology to a safer place. If you are
in a position to push back on these hardware vendors, please do so. The open
firmware community needs all the help it can get.</p>
<h3 id="server-rack-encased-in-liquid">Server rack encased in liquid</h3>
<p>Lastly, I saw something bat shit crazy at Open Compute Summit. It was
something I saw in the Expo Hall. One vendor has encased an entire server rack
in liquid for liquid cooling. I&rsquo;m not sure I could sleep at night using this.
The funniest part about this though was the demo at their booth still had fans
in the rack! I mean&hellip; why would you need fans if you had liquid cooling&hellip;
they claimed it was just &ldquo;left over&rdquo; and you wouldn&rsquo;t need that.
But at a conference where everyone is showing off their custom hardware, you&rsquo;d
think they would have left the fans at home ;).</p>
<p>That&rsquo;s the end of this update of my adventures. Hope you all enjoyed it. I know
I enjoyed living it!</p>
Trust and Integrityhttps://blog.jessfraz.com/post/trust-and-integrity/
Fri, 01 Mar 2019 18:09:26 -0700https://blog.jessfraz.com/post/trust-and-integrity/<p>I stated in my first post on my <a href="https://blog.jessfraz.com/post/government-medicine-capitalism/">reflections of leadership in other
industries</a>
that I would write a follow up post after having hung out in the world of
finance for a day. This is pretty easy to do when you live in NYC.
Originally for college, I was a finance major at NYU Stern School of Business
before transferring out, so I have always had a bit of affinity for it.</p>
<p>I consider myself pretty good at reading people. This, of course, was not
always the case. I became better at reading people after having a few really
bad experiences where I should have known better than to trust someone. I&rsquo;ve
read a bunch of books on how to tell when people are lying and my favorite
I called out in my <a href="https://blog.jessfraz.com/post/books/">books post</a>. This is
not something I wish that I had to learn but it does protect you from people
who might not have the best intentions.</p>
<p>Most people will tell you to always assume good intentions, and this is true to
an extent. However, having been through some really bad experiences where I did
&ldquo;assume good intentions&rdquo; and should not have, I tend to be less and less willing
to do that.</p>
<p>I am saying this, not because I think people in finance are shady, they
aren&rsquo;t, but because I believe it is important in any field. I, personally, place a lot of value on trust and
integrity.</p>
<p>I&rsquo;m not really going to focus this post on what an investment banker&rsquo;s job is
like because honestly it wasn&rsquo;t really anything to write home about. What I did
find interesting was the lack of trust in the workplace. Trust is a huge thing
for me, like I said, and I think having transparency goes hand-in-hand with that.</p>
<p>To gain trust, I believe a leader must also have integrity and a track record
of doing the right thing. I liked this response to a tweet of mine about using &ldquo;trust
tokens&rdquo; in case leadership needs to keep something private.</p>
<p><blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">They are. It gets hard with legal things like SEC filings and acquisitions but that’s where an already good leadership team can use existing trust tokens.</p>&mdash; Silvia Botros (@dbsmasher) <a href="https://twitter.com/dbsmasher/status/1098602904838197253?ref_src=twsrc%5Etfw">February 21, 2019</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>I think people tend to underestimate how important it is to be transparent
about things that don&rsquo;t need to be private. I&rsquo;ve seen a lot of people in
positions of power use their power of keeping information private <em>against</em>
those under them. They don&rsquo;t fully disclose the &ldquo;why&rdquo; and it leads to people
they manage not fully being able to help solve the problem as well as not fully
understanding the problem. It also doesn&rsquo;t build trust.</p>
<p>Leaders should try to be cognisant of when something needs to be private and
when they can be transparent about information. I also really enjoyed this
insightful tweet:</p>
<p><blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">Unlike respect, which can start from a positive value and go up or down depending on behavior, trust starts at 0. You have to earn the trust of your colleagues and reports before you can take loans out on it. <a href="https://t.co/aWRpdjAtBR">https://t.co/aWRpdjAtBR</a></p>&mdash; julia ferraioli (@juliaferraioli) <a href="https://twitter.com/juliaferraioli/status/1101572682863296514?ref_src=twsrc%5Etfw">March 1, 2019</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>Just thought I would put my thoughts in writing since I said I would. This
experience seeing how other industries work has been super fun for me. I might
try to find some other jobs to check out as well in the future.</p>