A couple of years ago, I open-sourced a project that allows you to count how many markers are within a circle using Leaflet. As part of the template, a user needs to enter an address to get a marker to show up on the map.

Two weeks ago, Jefferson Guerrón reached out to me because he was creating something similar that didn’t require a user to type in their address. He was using the plugin Leaflet.draw to get the marker on the map. But he was having a hard time getting a circle to show up around the marker and count the other markers within that circle.

In the data journalism world, data dumps are pretty common occurrences. And often times these dumps are released periodically — every month, every six months, every year, etc.

Despite their periodic nature, these dumps can catch you off guard. Especially if you are working on a million things at once (but nobody does that, right?).

Case in point: Every year the Centers for Medicare & Medicaid Services releases data on how much money doctors received from medical companies. If I’m not mistaken, the centers release the data now because of the great investigative work of ProPublica. The data they release is important and something our readers care about.

Last year, I spent a ton of time parsing the data to put together a site that shows our readers the top paid doctors in Iowa. We decided to limit the number of doctors shown to 100, mostly because we ran out of time.

In all, I spent at least a week on the project last year. But this year, I was not afforded that luxury because the data dump caught us off guard. Instead of a few weeks, we had a few days to put together the site and story.

Fortunately, I’ve spent quite a bit of time in the last year moving away from editing spreadsheets in Excel or Google Docs to editing them using shell scripts.

The problem with GUI editors is they aren’t easily reproducible. If you make 20 changes to a spreadsheet in Excel one year and then the next year you get an updated version of that same spreadsheet, you have to make those exactly same 20 changes again.

That sucks.

Fortunately with tools like CSVKit and Agate, you can write A LOT of Excel-like functions in a shell script or a Python script. This allows you to document what you are doing so you don’t forget. I can’t tell you how many times I’ve done something in Excel, saved the spreadsheet, then came back to it a few days later and completely forgotten what the hell I was doing.

It also makes it so you can automate your data analysis by allowing you to re-run the functions over and over again. Let’s say you perform five functions in Excel and trim 20 columns from your spreadsheet. Now let’s say you want to do that again, either because you did it wrong the first time or because you received new data. Now would you rather run a shell script that takes seconds to perform this task or do everything over in Excel?

Another nice thing about shell scripts is you can hook your spreadsheet into any number of data processing programs. Want to port your data into a SQLite database so you can perform queries? No problem. You can also create separate files for those SQL queries and run them in your shell script as well.

All of this came in handy when this year’s most paid doctors data was released. Sure, I still spent a stressful day editing the new spreadsheets. But if I hadn’t been using shell scripts, I doubt I would have gotten done by our deadline.

Another thing I was able to do was increase the number of doctors listed online. We went from 100 total doctors to every doctor who received $500 or more (totaling more than 2,000 doctors). This means it’s much more likely that readers will find their doctors this year.

The project is static and doesn’t have a database backend so I didn’t necessarily need to limit the number of doctors last year. We just ran out of time. This year, we were able to not only update the app but give our readers much more information about Iowa doctors. And we did it in a day instead of a week.

Earlier this year, I had the privilege of teaching a class on building your first Leaflet.js map at the NICAR conference. I just realized I forgot to post the code in this blog so I figured I’d post it now. Better later than never.

If you’re interested in building your first map, check out my Github repo for the presentation.

One of the first places I started on this project was building a graph that will take an icon and divide it into buckets. I didn’t see any openly available code that replicated what I was trying to do, so I’d figure I’d post my code online for anyone to use/replicate/steal.

For this project, I put the icons into three columns. In each column, icons are placed side by side until we get four icons in a row, then another row is created. You can see this in action by clicking the button several times. And all of this can be adjusted in the code with the “row_length and “column_length” variables.

To move the icons, I overlaid three icons on top of each other. When the button is clicked, each of the icons gets sent to one of the columns. The icons shrink as they reach their column. After this transition is finished, three more icons are placed on the DOM. And then the whole process starts over.

A bunch of math is used to determine where exactly on the DOM each icon needs to go. Also, we have to keep track of how many icons are on the DOM already, so we can break the icons into new rows if need be.

Data is required to make D3 run so my data is a simple array of [0,1,2]. The array has three values because we have three columns on the page. We do some calculations on the values themselves and then use SVG’s transform attribute to properly place the icons on the DOM. We use D3’s transition function to make it appear like the icons are moving into the icons, not just being placed there.

Hopefully this helps others facing similar problems in D3. If you have any questions, just leave them in the comments.

We are a few weeks into the new year, but I wanted to look back at the biggest project I worked on in 2015: The redesign of KCRG.com.

While most of my blog posts are full of links, I can’t link to that site. Why? Because it’s gone.

What?

In a series of very unfortunate events, the site we spent many, many months planning and developing is already gone.

The timeline: We started preparing for the redesign, which was BADLY needed, in early 2015. We then built it over the course of several months. Finally, it was launched in July. Then, in a move that surprised every one, KCRG was bought by another company in September.

At the time, I was optimistic that the code could be ported over to their CMS. And the site wouldn’t die.

My optimism was short lived. Gray has a standard website template for all its news sites, and they wanted that template on KCRG.

Obviously, this was a big shock for our team. Even worse, the code we wrote was proprietary and requires Newscycle Solutions to parse and display. So even if I wanted to put it on Github, it wouldn’t do anyone any good.

I’m not used to the impermanence of web. When I had my first reporting job in Galesburg, I saved all the newspapers where my stories appeared. And unless my parents’ house catches on fire, those newspapers will last for a long time. They are permanent.

Not so online. Websites disappear all the time. And those who build them have barely any record of their existence.

Projects like PastPages and the Wayback Machinekeep screenshots of old websites, which is better than nothing. But their archives are far cries from the living, breathing websites we build. A screenshot can’t show you nifty Javascript.

It’s an eery feeling. What happens in five years? Ten years? Twenty years? Will any of our projects still be online? Even worse: Will technology have changed so much that these projects won’t even be capable of being viewed online? Will online even exist?

Think about websites from 1996. They are long gone. Hell, many sites from two years ago have vanished.

I don’t have good answers. Jacob Harris has mulled this topic and offered some good tips for making your projects last.

But it’s worth pondering when you’re done with projects. What can I do to save my work for the future? I have a directory of all of my projects from my Courier dayson an external hard drive. I have an in-process directory for The Gazette as well.

I hold onto them like I did my old newspaper clippings. Although, I’m confident those clippings will last a lot longer than my web projects.

You really need to set up a back-end project to use PJAX to its fullest. For this presentation, I focused on just the front-end components, which will at least give you an idea of how the library works.

I’ve seen PJAX (or something like it) a few times in the wild. The Chicago Tribune, for instance, uses it (or something similar) on their website. If you scroll down their page long enough, you’ll notice a new page opens up in your browser. But the entire page doesn’t refresh. Instead the URL changes smoothly as you scroll down the page.

Anyways, check out the code on my site and if you have any questions, fire them my way.