Sunday, November 2, 2008

Importing the USDA SR21 into MySQL

If you're planning on using the USDA Standard Reference in your own software, it helps to have it in a format that your program can actually read. If you're in Windows, this isn't an issue, since the database is available for download in MS Access format. For other platforms, an ASCII version of the database is also available.

The format of the files is pretty easy to deal with; each file is caret-delimited, and text fields are surrounded by tildes. The documentation explains pretty clearly how the tables are laid out. It's simple to put together an SQL file that defines the tables, but it's tedious. And then you have to write a script to parse the data.

I just did those things this morning, and I thought I'd share, to save you the trouble. The SQL is in MySQL format, and the parser is written in Perl. I know that not everybody likes those two things, so if anybody wants to convert my SQL to Oracle (shudder) or PostgreSQL format, I'll be happy to post it. The same goes for converting the Perl script to Ruby (/me waves to Hans) or Python (/me waves to Matt) or to some other language.

Tony: It's only a nightmare for someone who doesn't want to do any work, or who has bought into the anti-perl bs that's running around and isn't willing to think for himself ... it's a *really* simple script.