Combining CSV Files with Glob

An important part of my job at the USC Language Center is administering placement tests and making the results available to students, advisors, and other administrators. Several times a year, students take our tests using Scantron forms, and I end up with several CSV files, one for each of the languages we offer. I then need to make sure that all those results end up in a single, fixed-width text file that’s compatible with the university’s student information system. It’s one of those data management tasks that is perfect for automation with Python.

Enter glob, a module in Python’s standard library that finds all the pathnames matching a given pattern. For example, it lets me quickly find all the CSV files that I need to combine into a single DataFrame, which I then convert into a fixed-width text file.
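To make the pattern matching concrete, here is a minimal sketch of the file-finding step on its own. The helper name find_csv_files and the file names are hypothetical, made up for illustration, and the demo runs in a temporary folder rather than on real test results.

```python
import glob
import os
import tempfile


def find_csv_files(folder):
    """Return a sorted list of paths to the .csv files directly inside folder."""
    # "*" matches any run of characters, so "*.csv" matches every CSV file name.
    pattern = os.path.join(folder, "*.csv")
    # glob.glob() makes no promise about ordering, so sort for repeatable results.
    return sorted(glob.glob(pattern))


# Demo with hypothetical file names in a throwaway folder.
with tempfile.TemporaryDirectory() as folder:
    for name in ("french.csv", "spanish.csv", "notes.txt"):
        with open(os.path.join(folder, name), "w") as f:
            f.write("id,score\n")
    matches = [os.path.basename(path) for path in find_csv_files(folder)]

print(matches)  # ['french.csv', 'spanish.csv']
```

The text file is skipped because it doesn’t match the pattern, which is the whole appeal: I don’t have to type out each language’s file name by hand.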

I’ve been working through a DataCamp course on cleaning data taught by Daniel Chen. There he advised a slightly different strategy: using glob to build a list of DataFrames, one per file, and then passing that list to the pandas concat() function to combine them all. It would look something like this: