Prepared for: Atlanta Press Club, Sept. 15, 2016

AND FOR:

SPJ Region 3 Workshop, Oct. 29, 2016

Meet some tools that can help you bypass roadblocks and timesucks on your beat. You'll go back to your newsroom with some ideas about how to track websites, gather data like Tweets, search Google better, sum and subtotal data with Excel and unlock PDFs.

We'll also look at some ways to get interactive graphics in your online stories, like Google maps, timelines and graphs — all without coding.

That'll be followed by group discussion/Q&A. This is geared toward IT beginners who want an intro to data-gathering and presentation tools.

It will be helpful to have: Excel and a Google account (you can make a dummy account if you don't want to link outside services to your own Gmail). I'll also demo some free sites that will require a signup if you want to use them, including:

You can narrow searches by when the site was last updated, by phrases you don't want, etc.

You can also do it from the search bar, which is nice because you can ask for any filetype, not just the ones listed on the advanced search page:

"tom wolfe" -book

= search for the exact phrase "tom wolfe," but leave out any results that contain the word "book"

georgia history site:edu

= search for the words "georgia" and "history", but only in websites that end in ".edu"

Obama AROUND(0) Zuckerberg

= search for "Obama" within 0 words of "Zuckerberg" (i.e., right beside it).

Obama AROUND(5) Zuckerberg

= search for "Obama" within 5 words of "Zuckerberg"

Note: AROUND() has to be in all caps. And it doesn't always change the results much. But sometimes it does.

landfills site:georgia.gov filetype:xls

= search for the word "landfills" in all websites that end in georgia.gov, but only show me Excel spreadsheets (.xls)

inurl:pdf "Georgia State University"

= search for the phrase "Georgia State University" in any url that has "pdf" in it.
("inurl:pdf" will give you pdfs that Google can't find via "filetype:pdf," according to Henk van Ess, to whom I apologize for cribbing some of this stuff. Read him for more fun searches.)
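If you run the same kinds of searches over and over, you could glue the operators together with a few lines of Python instead of typing them each time. This is only a sketch: the function names `google_query` and `google_url` are made up for this example, and it just builds the text and the link. It doesn't run the search for you.

```python
from urllib.parse import urlencode


def google_query(words="", exact=None, exclude=None, site=None,
                 filetype=None, inurl=None):
    """Assemble a search string from the operators shown above."""
    parts = []
    if exact:
        parts.append(f'"{exact}"')        # exact phrase goes in quotes
    if words:
        parts.append(words)               # plain words, as typed
    if exclude:
        parts.append(f"-{exclude}")       # leave out results with this word
    if site:
        parts.append(f"site:{site}")      # only sites ending in this
    if filetype:
        parts.append(f"filetype:{filetype}")
    if inurl:
        parts.append(f"inurl:{inurl}")
    return " ".join(parts)


def google_url(query):
    """Turn a search string into a link you can paste into a browser."""
    return "https://www.google.com/search?" + urlencode({"q": query})


q = google_query(words="landfills", site="georgia.gov", filetype="xls")
print(q)              # landfills site:georgia.gov filetype:xls
print(google_url(q))
```

Swap in `exact="tom wolfe", exclude="book"` and you get the first example from this section.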

With Plotly's free tier you can make some simple graphs.

So first let's prepare some data to give Plotly.
Let's go back to lotto.xls
And make a Pivot Table that sums up winnings by year.
Then we'll put that info in Plotly.
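For the curious, here's roughly what that Pivot Table is doing under the hood, sketched in Python. The column names `year` and `winnings` are my guesses for illustration, and the three rows are made-up placeholders, not real lotto data. Your lotto.xls may be laid out differently.

```python
from collections import defaultdict

# Made-up placeholder rows; the real lotto.xls columns may be named differently.
rows = [
    {"year": 2014, "winnings": 500},
    {"year": 2014, "winnings": 1500},
    {"year": 2015, "winnings": 250},
]


def sum_by_year(rows):
    """Group rows by year and add up winnings, like a Pivot Table set to SUM."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["year"]] += row["winnings"]
    # Sort by year so the output reads like the Pivot Table's rows.
    return dict(sorted(totals.items()))


print(sum_by_year(rows))  # {2014: 2000, 2015: 250}
```

One row per year with one total each is the shape a graphing tool wants.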

There are many sites that do something similar.

If you put a graph on your page, check it on several browsers and PHONES. I've seen reputable news orgs run graphs that are half cut off & unzoomable on phones. I don't know if it's the CMS or the embed or what. But don't be the person who offers mobile users a crummy graph. If one service doesn't work, try another.

BONUS ROUND, time permitting: Now you ask, great, how do I add more info to this map, like, say the legislator's name? Or party? Or make the blue districts blue and the red districts red?

That is exactly what Fusion Tables were designed for!! Fuse data across multiple sources!

First you'll need this spreadsheet, GA_HOUSE.csv. It lists every House member, their party, and a color code for the web.

Download that spreadsheet and save it in your own Google Drive as a Fusion Table.

Go back to the House map. On the left, go under File -> Merge. Follow the steps to merge the two into a new document. Match the two tables by "DISTRICT." Look closely at "DISTRICT." It's a three-digit number with leading zeroes. Google can match "001" to "001". But "1" and "001" might confuse it. Your merge column has to match up!
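The leading-zero problem is easier to see in a small Python sketch. This is not how Fusion Tables works internally. It's just an illustration of why "1" and "001" need to be made identical before a merge. The helper names and the sample rows are invented for this example.

```python
def pad_district(value, width=3):
    """Turn 1, "1" or " 001" into "001" so both tables use the same key."""
    return str(value).strip().zfill(width)


# Made-up placeholder rows, not real districts or legislators.
map_rows = [
    {"DISTRICT": "001", "shape": "polygon A"},
    {"DISTRICT": "002", "shape": "polygon B"},
]
member_rows = [
    {"DISTRICT": 1, "party": "X"},      # a number, no leading zeroes
    {"DISTRICT": "2", "party": "Y"},    # text, no leading zeroes
]


def merge_on_district(left, right):
    """Join two lists of rows on DISTRICT after padding both sides."""
    lookup = {pad_district(r["DISTRICT"]): r for r in right}
    merged = []
    for row in left:
        key = pad_district(row["DISTRICT"])
        match = lookup.get(key)
        if match is not None:
            # Combine the two rows and keep the padded key.
            merged.append({**match, **row, "DISTRICT": key})
    return merged


for row in merge_on_district(map_rows, member_rows):
    print(row)
```

Without `pad_district`, neither row would match, which is the same kind of silent failure you can hit in a spreadsheet merge.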

But before Problem 8, let's talk about limitations of Google, Plotly, IFTTT, Changedetection and any other outside site you use in publication ...

When you use free services, you're depending on them not to go out of business, for their cloud to not crash, for them to keep the free tier open.

If any of these data disasters happen, they take your data with them — and your embedded thing will appear as a broken picture.

And as for these graphs & maps, you don't get a ton of control over how they look compared to if you were a coder writing something from scratch.

Note: If you're dealing with seriously confidential/sensitive docs, I doubt you should put them in/through any of the services listed above. They probably keep a copy and/or have vulnerabilities.

Bonus Round:
Some other tools, not all of which I've tried.

Two ways to search for deleted web pages or old versions of web pages: use the Wayback Machine (aka Archive.org), or google something and click the GREEN down arrow beside the results. Sometimes one of the choices is "cached."
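Wayback Machine links follow a predictable pattern, so you can build one yourself. A minimal sketch, assuming the usual `web.archive.org/web/<timestamp>/<address>` layout, where the timestamp is a date written as digits (a bare year seems to work too) and the site sends you to the capture nearest that date. The function name `wayback_url` is made up for this example, and the code only builds the link. It doesn't check whether a capture exists.

```python
def wayback_url(page_url, timestamp="2016"):
    """Build a Wayback Machine link for a page near the given date.

    timestamp is digits only, e.g. "2016" or "20160915" (year, month, day).
    """
    return f"https://web.archive.org/web/{timestamp}/{page_url}"


print(wayback_url("http://example.com/", "20160915"))
# https://web.archive.org/web/20160915/http://example.com/
```

If nothing comes up, the page may never have been captured.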

RECAP for Pacer is a plugin for Chrome and Firefox that downloads and stores the federal court docs you view in Pacer and allows other users to see those documents for free. And you can, for free, see what other users have downloaded. Saves $ on your Pacer bill.

Sqoop tracks SEC filings, patent applications and Pacer filings, and I think it lets you search & filter and does email alerts. But I also *think* it's missing the federal courts that don't publish an RSS feed, cough, cough Northern Georgia. I've never messed with the patents & SEC parts.