My Python experience is a bit like my Spanish. I know a couple of basic phrases, I can find the restroom, and I can click “See Translation” online. In other words, this goal is definitely going to be a learning experience.

This was the first goal I set out to accomplish because it also served as the data cleansing and profiling stage. Once I knew the data was good, I knew I could replicate the analysis across different tools. I created a field mapping to track info for each of the 56 fields:

Shifting to a Python and pandas mindset for the data import and cleansing was more of a challenge than I anticipated. I’m not talking about days or weeks, but 20 or 30 minutes here and there staring at the screen, wondering why it’s not doing what I think it should. Despite these challenges, and an experience less forgiving than what I was used to with SQL Server and DTS/SSIS, I started to think about possible uses for the flexibility and power this approach provides. In other words, the more time I spent troubleshooting and using pandas, the more I appreciated it.
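
For a sense of what an import-and-cleanse pass can look like in pandas, here is a minimal sketch. The sample rows, the column names (`id`, `signup_date`, `amount`, `state`), and the helper functions `load_and_clean` and `profile` are all made up for illustration; they are not the actual 56 fields or the code used for this project.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real source file.
RAW_CSV = """id,signup_date,amount,state
1,2015-01-03, 19.99 ,ny
2,2015-02-14,,CA
3,not a date,5.00,ca
"""


def load_and_clean(source):
    """Read a CSV as text, then convert each column explicitly."""
    # Reading everything as strings first avoids surprises from type inference.
    df = pd.read_csv(source, dtype=str)

    # Normalize text: trim whitespace and standardize case.
    df["state"] = df["state"].str.strip().str.upper()

    # Values that cannot be converted become NaN / NaT instead of raising.
    df["amount"] = pd.to_numeric(df["amount"].str.strip(), errors="coerce")
    df["signup_date"] = pd.to_datetime(
        df["signup_date"], format="%Y-%m-%d", errors="coerce"
    )
    return df


def profile(df):
    """Build a small per-column profile: type, null count, distinct count."""
    return pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),
            "nulls": df.isna().sum(),
            "distinct": df.nunique(),
        }
    )


df = load_and_clean(io.StringIO(RAW_CSV))
summary = profile(df)
print(summary)
```

Using `errors="coerce"` turns bad values into nulls rather than stopping the load, so the profile shows how many rows per field need attention, which is roughly the kind of information a field mapping would track.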

I leaned most heavily on these resources during my Python and pandas meanderings: