Engaging notes about data and analysis

Time and tide wait for no one. Add to that: earthquakes. I live in the San Francisco Bay Area, a.k.a. “earthquake country”, in a house built in the 1950s before earthquake building codes had been created. Within the next 30 years, the USGS tells us we can expect a “big one” in the East Bay …

I was chatting with my Dad recently and he brought up a debate he’d heard on the radio between a Republican and a Democratic candidate. The Republican candidate said that in our present-day recession economy, Republican states were better off than Democratic states. My Dad seemed to particularly relish how the Democratic candidate scrambled to defend his …

Here’s an interesting relationship. The graph shows the percentage of Republicans in each state’s lower legislative chamber versus its upper chamber. It would appear that if you’re a Republican, you have the best chance of winning an election in the lower house in states where the upper house is split. You will have a hard time in …

Thanks to Google’s ngrams project page, I have wasted my scarce spare hours looking at micro-trends in literature. A couple of months ago, the Google ngrams project presented a database of all the words from Google’s extensive book collection. Making the books freely available presents copyright issues, but a database of word frequency in …

The Python language has become one of the premier computational languages for scientific research on account of its many useful built-in data-handling methods. Additionally, there are a number of science-oriented packages that rival industry-standard computational packages (I’m mainly thinking of Matlab). The most popular add-on Python science packages are NumPy and SciPy. Python has …

I came across a visualization website that can transform a blog’s text (or the text of any URL) into a visual display. Wordle.net takes a text and churns out a graphic in which each word is sized according to its frequency. Then it arranges all the words together in a vaguely oval shape which describes how big …

Starting a decade ago, more and more schools across the US began implementing data-mining programs to improve school performance. It was claimed that data mining, both in real time and after the school year had ended, could identify students who were in danger of not graduating. Real-time data mining would be a way to find these students early …

I have just completed the installation of the wiki engine à la Wikipedia. I am still working to customize the interface a little. However, the time-consuming task of creating wiki pages for the datasets of the world is the first priority. Won’t you join in and help me by adding to the collection of dataset pages?