Jordan Westhoff's Blog

I’m taking a brief break from a good deal of programming and figured I’d clear up some recent questions I’ve been getting about disparity in my web domain names. Currently, I hold the leases to this domain and also http://jordanwesthoff.me – why is this?

First off, I’ve had this domain for a long time. I first decided I needed to develop a web presence way back in high school when I was a wee lad. I’m glad I did. I was less busy than I am now and it gave me a great way to learn about the ins and outs of basic web hosting and all of that fun stuff. That being said, I was a less informed wee lad and decided to host directly with WordPress – it seemed like the perfect combo! For ~$40 yearly I get hosting and a domain and it’s borderline idiot proof! Just what high school me needed.

Over the years though, WordPress has been in steady decline. I don’t mean the WordPress platform itself – that’s alive and doing incredibly well! It’s the business side of things – the hosting practices on the WordPress.com side of the border. Shortly after buying the domain and getting everything set up and pretty, WordPress notified me I could do some cool new customizing. This was news to me, but I checked it out and sure enough, for bolt-on pricing I could do a little extra customizing. This was worrying, sure, but I didn’t need the customization so I let it slide. Two years later, WordPress graciously let me know that they were going to start putting ads on my site. MY site! The one I pay for! Unless, of course, I paid their $89 Premium fee, which would remove all of those ugly suckers from the domain and hosting I already pay for. Not cool and certainly not acceptable.

That being said, I started to realize earlier in the year that I love having a blog, and quite frankly, that is more of what this site has been doing anyway. But I still wanted an elegant way to pass information to other parties as well as generally represent myself on the internet. As a result, I set out to make a front page. In my opinion, a front page is essentially a simple, elegant page that has just the most important information about myself – especially now that I’m embroiled in career hunting. I wanted something I could manage very quickly, and I found a great HTML5 skeleton to build off of and manage through Git. I love it – and I think it looks really slick. I’m probably going to keep it, even after I find a career.

Finished title page on my new domain!

I’m still at a bit of a loss as to what to do though – I probably won’t touch this blog besides updates until I graduate. However, I’ve really come to resent the once-mighty WordPress and will probably set up my own hosting and a custom install of WordPress post-grad, once I settle into a job in a couple of months, so stay tuned for those updates! Once I have some static servers back up and my firewall and stack back online, hosting my own site locally will be no issue at all – something I’ve been looking at doing for a very long time.

Some more sample imagery from my new site.

Long story short though: Jordanwesthoff.com (the page you’re on now) is here to stay and is going to serve as a professional blog for my work and major projects!

Jordanwesthoff.me is the flashy front page with essential information, a new, slick design and links to all of my thesis work! Thesis work will be posted here too, but it can’t hurt to have some information bridge between sites! So go check out my new front page and let me know what you think and in the meantime I’ll keep blogging here!

Lately, I’ve been having issues correctly conveying some facets of the project to the general public that may not be as well versed in some of the technical concepts of the project. In an effort to make it a little easier to explain and the main concept of the project more accessible to all, I tried to find more elegant ways to express the information. Thus, the infographic was born!

Wow, 2015 was a busy year! Due to a lot of time commitments and several big projects, I wasn’t able to update as often as I wanted to. Luckily, I’m back in Rochester after a wonderful Christmas and New Year’s break, so I figured I’d catch y’all up!

First off, I’m working on updating all of my pages, on this page and abroad. It’s been a while on some of them, and a lot of information needs to be updated. I’ll post notifications as they get updated but for starters I’ve updated my About Me page. This is a great look at what I’m currently working on as well as some of my upcoming projects.

Cerberus – the mythical three headed beast.

The last several months have been filled with productivity for my thesis project. As well as making lots of imaging headway, I’ve been able to bring up several websites that point to the project. Each of these updates dynamically as the project is updated, thanks to the merits of rsync and GitHub Pages. The main page for the project can be found here, in the form of a wiki: Senior Thesis Wiki. Also, thanks to GitHub hosting, pages with information about the Cerberus cluster (the computational engine I designed to run my thesis software) have been posted that update as well: Cerberus Cluster Page. All of these pages have been written with simplicity in mind to make it easy to understand regardless of the computational awareness of the reader. I named the project Cerberus because it focuses on the collaboration of multiple ‘heads’ or computational assets to accomplish a singular goal.

That’s all I have for now, but check back soon! I’ve been crunching all of the data that I worked with in 2014 and I’m preparing a pretty cool stats page in order to make that data come alive through infographics. Thanks for reading!

In the past I’ve spent a bit of time talking about the merits of RAID and other high speed disk setups. Several years ago I also published posts about dual drive setups and did some surgery on my MacBook Pro in an effort to scrap the optical drive in favor of adding an SSD. (Psst, the old article is still available here)

All in all, the overall speed of a computer can be attributed to the combination of all of its internal parts working together to get more work done in less time. This is especially true of high performance machines, servers and drive array machines. While a wonderful utility, known as the BlackMagic Disk Speed Test, is available for Windows and OS X, it isn’t available for Linux. I am sure there are a plethora of GUI disk speed utilities for Linux, but I’m particularly drawn to a simpler approach that is easy to use straight from the terminal. Since Linux is focused on being minimalist in the pursuit of performance, installing a whole new utility just for spin testing seems a bit wasteful.

As a result, I’ve written a basic script that does a pretty accurate disk speed test via the command line. The utility should work with all flavors of Linux; I have been using it and deploying it across my fleet, all of which run Debian, Arch or CentOS 6.5.

The script is far from complex: it takes a user input for how large a block of data to write to the disk, writes and then reads a block of that size, and times each operation. It is, however, pretty handy and works as fast as your drives can spin. Here, you can see the output of the script. I ran it with an argument of 2048MB across a single Western Digital VelociRaptor 15K RPM drive in one of my servers here in the rack.

Not too shabby for a single 15K drive!

The code isn’t proprietary; you are free to use it however you like as an easy sysadmin tool, and it is easily modified to work however you please. Enjoy!
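For readers who want to try the idea without the original script, here is a minimal Python sketch of the same approach: write a block of a given size and time it, then read it back and time that too. This is not the script described above. The function name `disk_speed_test`, the 1 MB chunk size and the use of a temporary file are all assumptions made for illustration. Keep in mind that the read figure will usually be inflated by the operating system's page cache unless the test size is larger than the machine's available RAM.

```python
import os
import tempfile
import time

CHUNK = 1024 * 1024  # 1 MB per write/read call (assumed chunk size)


def disk_speed_test(size_mb, directory=None):
    """Write then read size_mb megabytes; return (write_MB_per_s, read_MB_per_s)."""
    data = os.urandom(CHUNK)
    # The temp file lands in `directory`, so point it at the disk under test.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(size_mb):
                f.write(data)
            f.flush()
            os.fsync(f.fileno())  # force the data out of OS buffers to the disk
        write_secs = time.perf_counter() - start

        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(CHUNK):
                pass
        read_secs = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the test file
    return size_mb / write_secs, size_mb / read_secs
```

Calling `disk_speed_test(2048, "/path/on/the/disk")` would mirror the 2048MB run mentioned above; the path is a placeholder for a directory on the drive you want to measure.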

Summer has been awesome here in Rochester so far, and I’ve gotten a lot done on different parts of my project!

Ultimately, I’m still in the stage of hardware and software testing, while conducting studies on 4K formats and compression schemes. All of this is valuable in appraising how much computational power I need to conduct all of the operations that are required to get the 4K footage processed as quickly as possible. Since I last posted, I’ve gotten some additional hardware to use and test, both virtually and locally. Here’s the round up of some stuff:

Last post, I was comparing smaller, underpowered machines to massive computing desktops to see what the differences were. They were, well, humongous. It turns out the small Micro-ITX board and setup I was using is indeed too slow for any kind of operations work. Hence, I re-purposed it for something that is different, but still useful: a Netflix box!

The Alienware machine is a powerhouse even though it is still pretty old. I stocked it with a powerful GPU and added a bit more RAM (going from 2 to 18 GB). Right now it is conducting CPU vs GPU testing as well as being used as a primary gaming machine in the evenings when I get back from work on campus. Overall, the Alienware definitely did win the battle between light and power efficient vs power hungry and high performance.

New Information:

Okay, here’s all the new stuff that I promised. Recently, the school granted me two more physical server machines for use on the project. Both are 64-bit SuperMicro 2U servers, and both take advantage of AMD’s Opteron processing technology, which I have to admit, is awesome. Both machines are powered by dual quad-core CPUs and 64GB of RAM.

One of my new SuperMicro machines waiting to be racked with two of my other, older Dell units.

One of the new devices is shown in the photo; it’s the machine on top. There is a second, identical one, but I already had it racked at the time the photo was taken. As you can see, both can hold considerably more drives. Each unit allows me to store 8 drives, and currently both have 15K Raptor drives included, which is awesome! The 15K server drives, which I have RAIDed (layman’s term for working in tandem to increase speed), are allowing me to exceed a standard hard drive’s read and write speeds by a factor of 3! This will be invaluable for parsing and spreading out frames for my project across the cluster. Right now, each of the drives is writing at a ballpark of 82 MB/s and reading at a rate of about 260 – 280 MB/s. This is excellent because, for the system I am building, read speeds are far more important than write speeds for these two units. Write speeds will increase as I RAID the devices.

On top of this, I have been developing a lot of the skeleton dev software for my project. The first stage of this has been individually configuring each server. I haven’t gone with a major configuration management solution like Puppet, Salt or Ansible, since I’m not sure that all of the configuration time is worth the slight boost I would get during only the configuration phase of each server. As a result, I’ve written a full suite of scripts that kick into effect once CentOS is installed on each machine. I decided to go with CentOS since it focuses on enterprise support, security and longevity (the current CentOS distro is supported for 7 years). Once the OS is installed, each machine can run totally autonomously once it connects to my authentication and has all of the account info it needs. The machines install all of the necessary programs and services, in addition to syncing other repositories and cloning them locally. Once each machine is all set up, it notifies me via log that it is ready to join the cluster and processing can begin. As I amass more and more hardware, local and virtual, easy deployment of each unit is increasingly important, because once the semester begins again it will be very difficult to find extra time to set up more efficient configurations and whatnot.
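The post-install flow described here (install packages, clone repositories, log that the node is ready) can be sketched roughly as follows. This is not the actual script suite: the package list, the repository URL and the `node-setup` log tag are placeholders invented for illustration, and the sketch only builds and prints the commands as a dry run rather than executing anything.

```python
import shlex

# Hypothetical values for illustration only; none of these come from the real setup.
PACKAGES = ["git", "rsync"]
REPOS = ["https://example.com/thesis/cluster-tools.git"]


def build_setup_commands(hostname, packages=PACKAGES, repos=REPOS):
    """Return the ordered commands a freshly installed CentOS node would run."""
    commands = [["yum", "install", "-y", *packages]]
    for repo in repos:
        commands.append(["git", "clone", repo])
    # Last step: write a syslog line saying the node is ready to join the cluster.
    commands.append(
        ["logger", "-t", "node-setup", f"{hostname} ready to join cluster"]
    )
    return commands


def dry_run(hostname):
    """Print each command instead of running it."""
    for cmd in build_setup_commands(hostname):
        print(shlex.join(cmd))
```

Keeping the steps as plain data makes it easy to review exactly what a node will do before anything runs, which is most of what a heavier tool would add at this scale.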

In the next week or so, I should be getting access to more hardware. I also have a lot of cool code to share with you all; most of it covers Linux-based deployment and disk testing, along with a variety of other things. Look for that in my Git and other repos, hosted here!

Over the weekend I spent a good deal of time looking at just how intensive a full RAW data workflow can be. I also wanted to compare the burden of 4K RAW vs Arri’s 2.8K RAW via a S.Two recording device and see which required the most data overhead to work with. This allows us to simply look at how much drive space is required and discount the physical CPU usage of the project since pretty much all of my machines were running at almost full bore whenever renders were required.

While storage is not really a problem for a lot of industry professionals, it can be quite the burden for independents or students. Not every student has several terabytes (a terabyte is 1,024 gigabytes) of unused, high speed storage. A lot of people wonder if they can get away with slower, basic desktop drives for data of this proportion, but it really comes down to how long you want to wait. Slow drives serve information, well, slowly. Waiting for 300GB of renders to load can take ages, and when deadlines are at stake, it really isn’t viable.
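To put rough numbers on that wait, here is a small worked calculation. The throughput figures are illustrative assumptions (a slow desktop drive versus a faster array), not measurements from this project, and 1 GB is taken as 1,024 MB to match the terabyte definition above.

```python
def load_time_minutes(size_gb, throughput_mb_per_s):
    """Minutes needed to read size_gb at a sustained throughput (1 GB = 1,024 MB)."""
    return size_gb * 1024 / throughput_mb_per_s / 60


# Assumed sustained read speeds in MB/s, for illustration only.
for label, speed in [("slow desktop drive", 80), ("fast RAID array", 400)]:
    print(f"{label}: {load_time_minutes(300, speed):.0f} minutes for 300GB")
```

At those assumed speeds, the same 300GB takes about an hour on the slow drive and under a quarter of an hour on the array.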

Below, I’ve compiled a good deal of raw statistics from our recent shootout project. Since I was in charge of managing the data, image processing and running the servers we worked off of, I have the entirety of the raw footage as well as a significant portion of the renders. This accounts for tons and tons of space, enough space in fact that I thought comprehensive statistics might be helpful to visualize where all of that information is going.

There are a couple of things to note up front, though, before we get started. In some of the statistics, I compared the total usage by camera, which encompasses all of the data used, start to finish, on each camera platform. In one or two other statistics, I broke up the information to reflect intermediate stages. For the Arri D-21, this required converting raw S.Two DPX files to .ARI files and then exporting them again from ARRI RAW Converter to DPX or .TIFF file sequences to color grade and then make a final export. For the Sony, it was simply a matter of taking the camera files from the onboard SD card and then grading and re-exporting. In 4K, however, there was significantly more to do. Dumping the card gave a nice, proprietary MXF wrapper with all of the files, which had to be opened with Sony RAW Viewer in order to convert them to 16-bit DPX files. These could then be graded and exported again to a DPX or TIFF sequence to be imported for analysis and editing. Each of these steps consumes storage, as you can imagine, and it presents quite a trend in the statistics.
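A back-of-the-envelope estimate shows why 16-bit DPX intermediates get so heavy. The sketch below assumes a 4096 × 2160 frame, three color channels and 24 frames per second, and it ignores file headers and padding. Those values are assumptions chosen for illustration, not figures taken from the shootout.

```python
def uncompressed_frame_bytes(width, height, channels=3, bits_per_channel=16):
    """Size of one uncompressed frame in bytes, ignoring headers and padding."""
    return width * height * channels * bits_per_channel // 8


def sequence_gigabytes(seconds, fps=24, **frame_kwargs):
    """Size of an uncompressed image sequence in GB (1 GB = 1,024^3 bytes)."""
    frame = uncompressed_frame_bytes(**frame_kwargs)
    return frame * fps * seconds / 1024**3


frame = uncompressed_frame_bytes(4096, 2160)
print(f"one frame: {frame / 1024**2:.1f} MB")
print(f"50 second shot: {sequence_gigabytes(50, width=4096, height=2160):.1f} GB")
```

Under those assumptions, a single 50 second shot comes to roughly 59GB for each uncompressed pass, so every extra intermediate export multiplies the total quickly.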

This accounts for all of the ‘mission critical’ information stored for the shootout.

Here, we can see just how much data there was overall. In total, the final aggregate size of all resources exceeded 1.6TB! This included all footage, start to finish, from the ARRI, Sony and Sony 4K, as well as renders, CC passes, graphics for our final video and any other data in between. Keep in mind that the actual camera footage (which comprised a significant portion of the overall data used, but more on that later) totaled only about 10 minutes per camera (and less for the Sony 4K). This is because most of the shots used in the shootout were of charts or color scenes – the longest scenes were barely over 50 seconds apiece. Therefore, shooting an entire film on any of these platforms would consume an incredible amount of data. Broken up above are four different categories, and each is perhaps a bit vague, so I’ll take a moment to explain them.

The first, and the largest, is the Footage Archive. This is an aggregate gathering of just the base footage captured from each camera. This also incorporates some intermediate files in the case of the ARI – essentially all of the footage classified here was footage that was ready to go into editing minus any major color correction. The Shootout Archive contains all of the intermediates of the pick scenes and the color corrected scenes. This means that any footage that was observed and chosen to be good enough for analysis went on to continue the chain of picture processing. The files contained in this directory are renders from the S.Two that were then processed in ARRI’s ARC, as well as the Sony HD and 4K clips that were chosen – those also underwent their respective processing steps. Shootout MAIN is the working directory for all of the analysis, as well as the video production portion of the project. Here are all of the final renders, color correction finals, stock footage, B-roll, preliminary video screening renders and narration, as well as all of the graphics that our team generated. Finally, there is a Web Access Point directory. This was a separate directory created on a network server in order to provide each member of the team with fast, reliable intermediary storage for their own assets in production. These could be screen captures, editing files, project files, you name it. This is the miscellaneous working storage that helped make the workflow so efficient – each member had a fast directory to work from and could then contribute to the final project being assembled in real time.

Each day of shooting generated different amounts of storage requirement based on scene.

Since the shootout was spread over three (technically four, when you look at 4K) days, it was useful to look at how usage varied by day. Some of the graph information was cut off, but the four largest portions were indicative of the longest shots. Day 1 files came close to taking the lead in storage, but our Day 3 files took the lead with 19.6% of total data usage – these stats merely incorporate the files coming from the ARRI and the Sony in HD video mode. The third largest, at 17%, was from our fourth day of shooting, and this comprises all of the Sony 4K raw shoot files. Each of the much smaller portions is broken up by shot – some scenes took many shots and some took far fewer.

This shows a better look at how production workflow can impact your data needs for each project.

Here is a final, final look at how much information each step of production contributes to the total. This specific figure ties directly into the final, cleaned up and organized storage stats of the shootout in its entirety. Of the approximately 1.6TB required, the most costly stage of production was generating all of the intermediate files. This was especially true of the 4K tests, which accounted for almost half of this information despite shooting for only about 20% as long as the ARRI and Sony HD tests. Both RAW tests required multiple intermediate steps which chewed through tons of space because of their respective resolutions. We chose to work with DPX and TIFF files since those are lossless formats and overall exhibited the best quality.

All in all, shooting RAW is a very exhausting process, both from a processing and storage perspective. Your storage needs will depend on the camera and the codec/format you choose to edit in, but it’s always safe to budget one to two terabytes for shooting a short and always, always remember to BACK UP your information! All of the statistics here leave out the backups that were set in place to safeguard our information. At any one point, our information was backed up in two additional places – one in a hardware RAID attached to a workstation on another end of campus and a full minute-to-minute backup stored on a NAS. This NAS also pulled all of the web assets from each member in order to keep their assets online and safe at the same time. Feel free to contact me if you have questions!
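One simple way to check that a backup copy really matches the original is to compare checksums file by file. The sketch below is a generic illustration, not the tooling used for the shootout; the function names and the choice of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of a file, read in chunks so large frames never sit in memory whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def find_mismatches(source_dir, backup_dir):
    """Return relative paths that are missing from, or differ in, the backup."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    mismatches = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            mismatches.append(str(rel))
    return mismatches
```

An empty list from `find_mismatches` means every source file has an identical copy in the backup. It does not flag extra files that exist only on the backup side.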

In the future I’ll be making a post dedicated to the labyrinth of storage and why different types are better than others, as well as a look into what I’m using to manage all of this information! Thanks for reading!

Recently, as part of the MPS Shootout we just finished, my shootout team and I had a great opportunity to shoot with some interesting Sony hardware since our main objective was to shoot and compare the RAW cinema capabilities of the Arri D-21 and the Sony NEX-FS700.

Natively, the Sony FS700 can’t shoot 4K. However, with a gracious software update from SONY that was implemented and installed by the RIT SoFA cage, the feature is unlocked. While the sensor and the camera’s onboard hardware can handle the capture of 4K, the camera itself has no reliable method to record it. Without any hardware upgrade, the Sony FS700 only has an SD card slot, which is neither fast enough nor high enough in capacity to begin to think about recording 4K content. Hence, enter the Sony AXS-R5 + HXR-IFR5 4K bundle. The school didn’t have these units available, but with a grant we were able to rent the equipment for a night in order to conduct our tests.

The most expensive hard drive toaster you ever will buy (for now until…8K?)

It was actually a pretty difficult feat getting our paws on this particular setup. The physical recording unit, the AXS-R5, is built and engineered for Sony’s PMW-F5 and CineAlta F55 cameras – not natively meant for the NEX-FS line. SONY solved this problem by engineering an “interface” unit – the HXR-IFR5. This unit takes in the 4K signal over an SDI cable and then pushes the signal to the recorder to be saved. Overall, the two units together cost just over $10,500 and that doesn’t include mounting, storage or other accessories. For our test, we used a single 512GB SSD, also manufactured by SONY, and it really did the trick! As a result of the difficulty in acquiring the devices, we couldn’t shoot for all of our test days, but a small rental company out of Tennessee pulled through for us! Enter LensRentals.com! With the unit acquired, I could then proceed to unbox it and start recording!

Initial Vanguard package containing our SONY gear.

All of our SONY gear nestled inside of its shipping case.

All unboxed and joined together – just need a camera!

After the unit was unboxed, we were able to test it out in an actual scene! We proceeded to set up SoFA’s Studio B for our tests, which gave us plenty of space to work, as well as plenty of lights, tables, and surfaces to set up our gear and mount our wall test targets. We shot a variety of scenes, mostly charts, but we also got a couple more shots featuring aesthetic objects for style.

Studio B setup for 4K RAW

This was our go-to setup. The camera (SONY FS700) was linked to the onboard SD media and the 4K unit via SDI which was also being monitored via the onboard signal feed. Since our 4K and HD were the same aspect ratio, the framing did not change, which meant we could safely use the Panasonic HD monitor to see what the camera was seeing from the DIT station. On set we had an Apple MacBook Pro to monitor files once they were recorded and ingested. All in all, the setup was far less complicated than some other setups, like the ARRI D-21 setup which was a spaghetti nightmare.

S.Two recording setup for the Arri D-21

Mostly, all of our testing went well. We were able to gather all of the shots we wanted and several others. One snag did occur though, and I think that it is best described by the beautifully composed SnapChat that one of my partners, Carly Cerquone, sent to detail the issue.

Yup, that’s right. We made the ol’ rookie mistake.

In the end though, the project was a ton of fun, and the entire team and I learned a whole lot about the process of shooting and working with 4K. It is significantly different from (and far more time consuming than) any other workflow currently around, and you can find all of our findings and video information at the Shootout page on my blog here as well. Thanks for reading! As one final note, we decided to engineer our own dolly for pure creativity’s sake to capture the opening scene of the MPS SHOOTOUT Video – here was our super innovative approach. Below are some other photos from on set as well.