First Steps to Using Your Data

Aaron Browne
· November 20 2013

Following the instructions on the download page will get Harvest up and running super fast, which is awesome, but Harvest still won't be of much use until it is connected to your data. In this post I'll outline the steps you might follow to make that connection. This might not be enough information for everyone, so stay tuned for future posts, where I'll flesh out each step and provide further hints on using Harvest day-to-day.

Connect Harvest to Your Database

Database connections are handled by the Django framework and defined in the settings files at myproject/conf. Check out the Django tutorial on this topic for help, and look for detailed examples in a future post.
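As a rough sketch, a PostgreSQL connection in one of those settings files might look something like the following. The database name, user, host, port, and the HARVEST_DB_PASSWORD environment variable are all placeholders I made up for illustration, and the right engine string depends on your Django version and database.

```python
import os

# Hypothetical connection settings; every value below is a placeholder.
DATABASES = {
    'default': {
        # Backend path used by Django releases of this era; newer
        # releases use 'django.db.backends.postgresql' instead.
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'mydata',
        'USER': 'harvest',
        # Read the password from the environment rather than hard-coding it.
        'PASSWORD': os.environ.get('HARVEST_DB_PASSWORD', ''),
        'HOST': 'localhost',
        'PORT': '5432',
    }
}
```

Django looks up the 'default' entry unless told otherwise, so that is the key to replace when moving away from the bundled SQLite database.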

Once you connect to a new database (other than the SQLite database that comes with a new Harvest project), you will have to rebuild the database tables used by Django and Harvest. To do this, run python bin/manage.py syncdb --migrate (don't worry, this command won't touch any existing tables in your database). Read about this command in the Django docs. (Pro hint: the --migrate option taps into functionality from South.)

Model Your Data

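Once your database is connected, you'll need to generate Django models for your data. If your data is already in the database, you can use python bin/manage.py inspectdb, which prints model definitions for the existing tables so that you can save them to a models.py file and tidy them up. You can read about this command in the Django docs.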

If your data is not already in a relational database, you might choose to write Django models for your data manually and then write a Python script to extract your data from its current location, transform it as needed, and load it into the database you are using for Harvest (this process is called ETL). Look for a future post with a detailed walk-through of this process. You can start by looking at the Django model reference.
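To make the extract, transform, load idea concrete, here is a minimal sketch that uses only the Python standard library. It assumes a made-up CSV source and a hypothetical patient table, and it loads into an in-memory SQLite database through sqlite3 directly; in a real Harvest project you would more likely load through your Django models into the database Harvest is connected to.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in practice this would be a file or an export.
RAW_CSV = """patient_id,name,birth_year
1, Alice ,1980
2,Bob,
3, Carol,1975
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim whitespace and convert types, using None for blanks."""
    cleaned = []
    for row in rows:
        year = row['birth_year'].strip()
        cleaned.append((
            int(row['patient_id']),
            row['name'].strip(),
            int(year) if year else None,
        ))
    return cleaned


def load(conn, records):
    """Load: create the target table if needed and insert the records."""
    conn.execute(
        'CREATE TABLE IF NOT EXISTS patient ('
        'id INTEGER PRIMARY KEY, name TEXT NOT NULL, birth_year INTEGER)'
    )
    conn.executemany('INSERT INTO patient VALUES (?, ?, ?)', records)
    conn.commit()


conn = sqlite3.connect(':memory:')
load(conn, transform(extract(RAW_CSV)))
rows = conn.execute(
    'SELECT id, name, birth_year FROM patient ORDER BY id'
).fetchall()
```

Keeping the three stages as separate functions makes it easier to swap the source or the target later without touching the cleaning logic.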

After defining your models, you need to tell Harvest which model you want to use as the basis of your queries by defining MODELTREES in your global_settings.py file. Read about this setting in the ModelTree docs and look for a future post about advanced ModelTree configuration.
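A minimal sketch of this setting might look like the following. The app name myapp and the Patient model are hypothetical, and you should check the ModelTree docs for the options your version supports.

```python
# Hypothetical example: 'default' names the tree, and 'model' points at the
# root model for queries, written as 'app_label.ModelName'.
MODELTREES = {
    'default': {
        'model': 'myapp.Patient',
    },
}
```

The root model should be the one your questions usually start from, since other models are reached by following relationships out from it.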

Create DataFields and DataConcepts

Now Harvest is connected to your data, but you need to define the way your data will appear in Harvest before you can start using it. A good place to start is with python bin/manage.py avocado init myproject, which will auto-generate Avocado DataFields for each of the fields in your Django models.

Then, you can create DataConcepts either at the command line or in the Django admin (available at http://localhost:8000/admin/ if you are using the Django development server). Read about the admin in the Django tutorial.

In order for a DataField to show up in the Harvest query view, it must be included in a DataConcept (that has a defined Formatter) and both the DataField and DataConcept must be published. A future post will cover this process in detail.
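The visibility rule above can be spelled out as a small predicate. This is a plain-Python illustration only: FieldStub, ConceptStub, and is_visible are hypothetical stand-ins I wrote for this post, not Avocado's actual model classes or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FieldStub:
    """Stand-in for a DataField; only the attributes the rule needs."""
    name: str
    published: bool = False


@dataclass
class ConceptStub:
    """Stand-in for a DataConcept that groups fields and names a formatter."""
    name: str
    published: bool = False
    formatter: Optional[str] = None
    fields: List[FieldStub] = field(default_factory=list)


def is_visible(data_field, concepts):
    """Return True if the field is published and belongs to at least one
    published concept that has a formatter defined."""
    if not data_field.published:
        return False
    return any(
        c.published and c.formatter is not None and data_field in c.fields
        for c in concepts
    )


age = FieldStub('age', published=True)
weight = FieldStub('weight', published=True)
demographics = ConceptStub('Demographics', published=True,
                           formatter='Default', fields=[age])
vitals = ConceptStub('Vitals', published=False,
                     formatter='Default', fields=[weight])

age_visible = is_visible(age, [demographics, vitals])
weight_visible = is_visible(weight, [demographics, vitals])
```

Here age would show up, while weight would not, because its only concept is still unpublished. If a field is missing from the query view, these are the three things to check.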