*Notice* Behind the scenes, all the records between the `from-date` and `to-date` are requested from the server and only then filtered locally. Also, because office and election year are not included in the source record set, it is necessary to guess them from the committee name and date of donation. As a result, requesting the whole date range triggers a lot of HTTP requests, which in turn takes a while.
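To make the "guess, then filter locally" idea concrete, here is a minimal sketch. It is not the heuristic used in `scraper.py`: the keyword table, the record keys (`committee`, `date`), and the function names are all hypothetical and chosen only for illustration.

```python
import re
from datetime import date

# Hypothetical keyword table; the real scraper's heuristics may differ.
OFFICE_KEYWORDS = {"mayor": "Mayor", "council": "Council"}


def guess_office(committee_name):
    """Guess the office from a keyword in the committee name, or None."""
    lowered = committee_name.lower()
    for keyword, office in OFFICE_KEYWORDS.items():
        if keyword in lowered:
            return office
    return None


def guess_election_year(committee_name, donation_date):
    """Prefer a four-digit year in the committee name, else the donation year."""
    match = re.search(r"\b(19|20)\d{2}\b", committee_name)
    if match:
        return int(match.group(0))
    return donation_date.year


def filter_records(records, office, election_year):
    """Keep only the records whose guessed office and year match."""
    return [
        record
        for record in records
        if guess_office(record["committee"]) == office
        and guess_election_year(record["committee"], record["date"]) == election_year
    ]


# Made-up sample records, only to show the shape of the data.
sample = [
    {"committee": "Smith for Mayor 2014", "date": date(2013, 6, 1)},
    {"committee": "Friends of Jones", "date": date(2012, 3, 1)},
]
mayor_2014 = filter_records(sample, "Mayor", 2014)
```

Because the filtering happens after download, narrowing by office or year does not reduce the number of HTTP requests; only a narrower date range does.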

## API Usage

Feel free to access the Python API. Take a look at the functions in [dc_campaign_finance-scraper/scraper.py](dc_campaign_finance-scraper/scraper.py).

## How did I do it?

### Manual Process

1. Go to [www.ocf.dc.gov/serv/download.asp](http://www.ocf.dc.gov/serv/download.asp)
   ![Screenshot of unfilled in serv/download.asp](http://f.cl.ly/items/3J2k2O05223Y1K2T0C43/District%20of%20Columbia%20%20Office%20of%20Campaign%20Finance%20%20Contribution%20%20%20Expenditure%20Search.png)
2. Fill in `From Date`, `To Date`, and `Payment Type`.
   ![Screenshot of filled in serv/download.asp](http://f.cl.ly/items/0T3N0O1I1W0A1t2W1t3N/District%20of%20Columbia%20%20Office%20of%20Campaign%20Finance%20%20Contribution%20%20%20Expenditure%20Search%20filled%20in.png)
3. Click `Submit` and it sends a `POST` to [www.ocf.dc.gov/serv/download.asp](http://www.ocf.dc.gov/serv/download.asp) and displays the entered form.
   ![Screenshot of submitted form](http://f.cl.ly/items/0Z3k1P2W0l1G2P080o2K/District%20of%20Columbia%20%20Office%20of%20Campaign%20Finance%20%20Contribution%20%20%20Expenditure%20Search%20submitted.png)
4. Click `Click here to download the CSV File` and it sends a `POST` to [www.ocf.dc.gov/serv/download_conexp.asp](http://www.ocf.dc.gov/serv/download_conexp.asp)
5. The response to that `POST` contains the CSV text.

### Automation

#### Selenium

At first I tried using [Selenium with Python](http://selenium-python.readthedocs.org) to fill in the forms and click the buttons. This actually runs a real(ish) browser, executes all the JS, and simulates user input. This worked, but it couldn't really handle the returned CSV text from step 5. In a browser this opens in a new window and downloads to your computer, but the [PhantomJS driver for Selenium and Python](http://www.realpython.com/blog/python/headless-selenium-testing-with-python-and-phantomjs/) wasn't really working for that new window. I might have been able to get it to work eventually, but it prompted me to search for a different approach.

#### Requests

I then started experimenting with [Requests for Python](http://docs.python-requests.org/en/latest) to just make the actual HTTP calls, instead of pretending to be a human and filling in the form. This was 1) faster, 2) less verbose, and 3) easier to understand.

##### Chrome Dev Tools

I fired up Chrome Dev Tools and looked at what requests were being made. I wanted to figure out what request was actually being sent in step 4, so that I could replay it programmatically. However, since that opened in a new window, the Dev Tools didn't save the request.

![GIF of clicking on download button and it downloading in chrome](http://zippy.gfycat.com/PinkAccomplishedBuffalo.gif)

It [isn't possible](http://stackoverflow.com/a/13747562) with Chrome to open a new window with Dev Tools already open.

##### Chrome Net Internals

I then tried [chrome://net-internals/#events](chrome://net-internals/#events) to see the actual HTTP request being processed. I could see it was sending a `POST` to `/serv/download_conexp.asp` and the returned CSV. However, it didn't show the `POST` data or the cookies.

![chrome net internals events showing POST](http://f.cl.ly/items/050P46040W3o2t30431M/Screen%20Shot%202014-06-15%20at%2012.54.33%20PM.png)

##### Charles

For that I found [Charles](http://www.charlesproxy.com/) (`brew cask install charles`), which provides an HTTP proxy to run your web traffic through so that you can inspect every request.
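The same proxy can also be used to inspect the requests made from Python. The sketch below assumes Charles is listening on its default address, `127.0.0.1:8888`; adjust it if your setup differs. The helper names are hypothetical, and `session` is expected to be a `requests.Session` (it is passed in rather than created here).

```python
# Assumption: Charles is running locally on its default port, 8888.
CHARLES_PROXY = "http://127.0.0.1:8888"


def charles_proxies(proxy_url=CHARLES_PROXY):
    """Build the proxies mapping that Requests accepts via `proxies=`."""
    return {"http": proxy_url, "https": proxy_url}


def get_through_charles(session, url):
    """Send a GET via the proxy so the exchange shows up in Charles.

    `session` should be a requests.Session, or any object with a
    compatible `.get(url, proxies=...)` method.
    """
    return session.get(url, proxies=charles_proxies())
```

With Requests installed, `get_through_charles(requests.Session(), "http://www.ocf.dc.gov/serv/download.asp")` should make the request appear in Charles next to the browser's traffic, which makes it easier to compare the two.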

#### Cookie

I checked the `POST` headers for the request and tried making it myself. I got a response of `Your Session is expired. Please try again`.

I found that it was setting a cookie when I requested `/serv/download.asp`. I first tried it with a cookie I got from the browser and IT WORKED! I got back the CSV.
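Replaying the download with a cookie copied from the browser might look roughly like this. It is a sketch, not the code from `scraper.py`: the function name is hypothetical, and the cookie name, cookie value, and form fields are whatever the proxy showed the browser sending, so they are passed in rather than hard-coded.

```python
DOWNLOAD_URL = "http://www.ocf.dc.gov/serv/download_conexp.asp"


def post_with_browser_cookie(session, form_data, cookie_name, cookie_value):
    """POST the download form while presenting a cookie copied from the browser.

    `session` should be a requests.Session. `form_data`, `cookie_name`, and
    `cookie_value` are placeholders for the values captured from the
    browser's own request.
    """
    return session.post(
        DOWNLOAD_URL,
        data=form_data,
        cookies={cookie_name: cookie_value},
    )
```

If it works, the returned response's `.text` holds the CSV.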

So I began using [Requests Sessions](http://docs.python-requests.org/en/latest/user/advanced/#session-objects) to first `GET` `/serv/download.asp` to get a session cookie and then `POST` to `/serv/download_conexp.asp` with that cookie. That didn't work; I got the `Your Session is expired. Please try again` response. So then I tried doing step 3, sending a `POST` to `/serv/download.asp` and then the identical `POST` to `/serv/download_conexp.asp`, thinking maybe the server checked to see if I submitted the form before letting me download. It worked! However, the next day when I tried again I got the `Your Session is expired. Please try again` response. Very weird. I tried getting a cookie from my Chrome session and using that, and it worked. So something about how I get my session in Chrome is different from how I get my session in Requests. I needed to figure out what the difference was.

Then I tried it again and it worked. So who knows. Maybe their site is weird.
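Put together, the sequence that ended up working can be sketched as below. The URLs and the error message come from the steps above; the function name and the form fields (`form_data`) are placeholders for whatever the browser sends. The retry loop is my own assumption about one way to cope with the intermittent `Your Session is expired` response, not something taken from `scraper.py`.

```python
FORM_URL = "http://www.ocf.dc.gov/serv/download.asp"
DOWNLOAD_URL = "http://www.ocf.dc.gov/serv/download_conexp.asp"
EXPIRED_MESSAGE = "Your Session is expired"


def download_csv(session, form_data, attempts=3):
    """Fetch the CSV by mimicking the browser's request sequence.

    `session` should be a requests.Session so that the cookie set by the
    first request is sent along with the later ones.
    """
    for _ in range(attempts):
        session.get(FORM_URL)                   # pick up the session cookie
        session.post(FORM_URL, data=form_data)  # step 3: submit the form
        response = session.post(DOWNLOAD_URL, data=form_data)  # step 4
        if EXPIRED_MESSAGE not in response.text:
            return response.text
    raise RuntimeError("Session still reported as expired after retrying")
```

With Requests installed, this would be called as `download_csv(requests.Session(), form_data)`, where `form_data` holds the same fields the browser posts in step 3.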