How To

Getting Started With Open Data

Let’s give it a try!
The following steps will show you how to make a simple map from the dataset of 311 Service Requests. 311 is New York City’s non-emergency call center that allows citizens to make service requests, file complaints, and get additional information about the City.

Open the “311 Service Requests 2010 to Present” dataset in your web browser. This will take you to the dataset’s Primer page. This page contains important information about the data, such as the date it was last updated, its data dictionary, and a preview of the actual dataset.

At the top of the screen, click “View Data.” This takes you into the dataset, which contains over 14 million rows of service requests. This is a massive amount of data, so let’s narrow it to a shorter time period.

Click on the dark blue “Filter” tab in the upper right corner. The Filter function allows you to narrow the search results into more manageable bits.

Click on the words “Unique Key.” In the drop-down menu, select “Created Date.” Choose a date of your preference and press Enter. The data should now be filtered to show only the service requests created on that day.

To map this data, select the green “Visualize” tab in the upper-right corner of the screen. Within this tab, select “Map.”

Required fields are indicated with a red asterisk. We recommend changing the plot style to a point map. Leave the location field as-is. Click “Apply.”

You have just created a filtered view of 311 Service Requests for a particular day and visualized the locations of these requests on a map.
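For readers who prefer to see the same idea in code, the filter-then-map steps above can be sketched roughly as follows. This is only an illustration: the sample rows are made up, and the field names (created_date, latitude, longitude) are assumptions modeled on a typical 311 export rather than anything stated in this guide. Check the dataset’s data dictionary for the actual column names.

```python
from datetime import date, datetime

# Hypothetical sample rows. The field names are assumptions; confirm them
# against the dataset's data dictionary before relying on them.
SAMPLE_REQUESTS = [
    {"unique_key": "1", "created_date": "2016-03-01T09:15:00",
     "latitude": "40.7128", "longitude": "-74.0060"},
    {"unique_key": "2", "created_date": "2016-03-02T11:00:00",
     "latitude": "40.6782", "longitude": "-73.9442"},
    {"unique_key": "3", "created_date": "2016-03-01T17:40:00",
     "latitude": "40.7831", "longitude": "-73.9712"},
    {"unique_key": "4", "created_date": "2016-03-01T20:05:00"},  # no location
]


def points_for_day(requests, day):
    """Return (latitude, longitude) pairs for requests created on `day`."""
    points = []
    for row in requests:
        created = datetime.fromisoformat(row["created_date"])
        if created.date() != day:
            continue  # the "Created Date" filter step
        lat, lon = row.get("latitude"), row.get("longitude")
        if lat is None or lon is None:
            continue  # rows without coordinates cannot be placed on a map
        points.append((float(lat), float(lon)))
    return points


if __name__ == "__main__":
    for lat, lon in points_for_day(SAMPLE_REQUESTS, date(2016, 3, 1)):
        print(f"{lat:.4f}, {lon:.4f}")
```

The resulting list of coordinate pairs is what a point map would plot; any mapping library could take it from there.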

Your key resource: Data Dictionaries
Data can be complex. Some datasets contain millions of records and many unfamiliar terms. Be sure to review a dataset’s Primer page for additional information, including its data dictionary, where you’ll find definitions of terms and values.

Guidelines for the division of large datasets
In general, if you have difficulty manipulating or downloading larger datasets, restrict the number of records that appear by using the “Filter” function. In this panel, you can “Add a New Filter Condition” and select attribute values that are exact matches with the “is” condition, or that fall within a range of values with the “contains” condition. A narrower selection of results requires less computing power to view and manipulate, and produces smaller data files that are quicker to download to your local device. For more information on filtering tabular datasets, including video tutorials, see the “Filtering Datasets” topic in the Socrata Knowledge Base.
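The same narrowing can also be expressed as a query string when requesting data programmatically. The sketch below is a minimal illustration, not the portal’s own method: the helper names are hypothetical, the column names (borough, created_date) are assumptions, and the `$where` and `$limit` parameters and the `between` operator are taken from Socrata’s query language as generally documented.

```python
from urllib.parse import urlencode


def is_condition(column, value):
    """Exact-match condition, similar in spirit to the "is" filter."""
    # Assumption: single quotes inside a value are escaped by doubling them.
    escaped = value.replace("'", "''")
    return f"{column} = '{escaped}'"


def range_condition(column, start, end):
    """Condition matching values that fall between `start` and `end`."""
    return f"{column} between '{start}' and '{end}'"


def filtered_query(conditions, limit=1000):
    """Combine conditions into a URL query string with a row limit."""
    params = {"$where": " AND ".join(conditions), "$limit": limit}
    # Keep "$" readable in the parameter names; everything else is encoded.
    return urlencode(params, safe="$")


if __name__ == "__main__":
    query = filtered_query(
        [
            is_condition("borough", "BROOKLYN"),
            range_condition("created_date",
                            "2016-03-01T00:00:00", "2016-03-02T00:00:00"),
        ],
        limit=500,
    )
    print(query)
```

Lowering the limit and tightening the conditions both reduce the size of the response, which is the same trade-off the filter panel offers.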

Socrata Knowledge Base

Socrata is the technology company that powers the data catalog. The Socrata Knowledge Base has video tutorials on how to use various tools on Open Data.

Additional Resources

Want to dig deeper? The following blogs and technical resources offer guides on how to do advanced analytics with data on NYC Open Data. If you think we’ve missed anything, drop us a line!

BetaNYC is a civic organization dedicated to improving lives in New York through civic design, technology, and data. They manage data.beta.nyc and host events to discuss Open Data.

NYCityMap is targeted towards non-mapping professionals and provides a wealth of geographic-based information from the input of a single location.

Resident Mario and IQuantNY are blogs by New York civic technologists featuring quantitative analyses of Open Data.

NYC GitHub showcases New York City projects and code. GitHub is a hosting platform for Git, a version control system used by software developers, and serves as a repository for open source software projects.

Data2go.nyc is a free, easy-to-use online mapping and data tool that brings together federal, state, and City data on a broad range of issues critical to the well-being of all New Yorkers.

API Documentation

Socrata APIs provide rich query functionality through the “Socrata Query Language” (SoQL), which borrows heavily from Structured Query Language (SQL). Its paradigms should be familiar to developers who have worked with SQL and easy to learn for those who are new to it.
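To illustrate how closely SoQL follows SQL, the sketch below maps the clauses of a SQL-style aggregate query onto SoQL request parameters. The helper function is hypothetical, and the column names (complaint_type, created_date) are assumptions; the parameter names (`$select`, `$where`, `$group`, `$order`, `$limit`) are the ones commonly documented for SoQL.

```python
def soql_params(select, where=None, group=None, order=None, limit=None):
    """Map SQL-like clauses onto SoQL parameters, skipping unused ones."""
    clauses = {
        "$select": select,
        "$where": where,
        "$group": group,
        "$order": order,
        "$limit": limit,
    }
    return {name: str(value) for name, value in clauses.items()
            if value is not None}


# Roughly equivalent SQL, for comparison:
#   SELECT complaint_type, count(*) AS total
#   FROM requests
#   WHERE created_date >= '2016-03-01T00:00:00'
#   GROUP BY complaint_type
#   ORDER BY total DESC
#   LIMIT 10
params = soql_params(
    select="complaint_type, count(*) AS total",
    where="created_date >= '2016-03-01T00:00:00'",
    group="complaint_type",
    order="total DESC",
    limit=10,
)

if __name__ == "__main__":
    for name, value in params.items():
        print(f"{name} = {value}")
```

Note that there is no FROM clause: in a SODA request, the dataset is identified by the endpoint rather than by the query.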

The “endpoint” of a Socrata Open Data API (SODA) is a unique URL that represents an object or collection of objects. Every Socrata dataset, and even every individual data record, has its own endpoint. The endpoint is what you’ll point your HTTP client at to interact with data resources.
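As a rough sketch of what pointing an HTTP client at an endpoint can look like, the example below builds a dataset endpoint URL and defines a small fetch helper using only the Python standard library. The domain, the dataset identifier, and the /resource/ URL pattern are assumptions that do not appear in this guide; confirm them on the dataset’s API documentation page before use. The function names are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumed values, not stated in this guide; verify them for your dataset.
DOMAIN = "data.cityofnewyork.us"
DATASET_ID = "erm2-nwe9"


def dataset_endpoint(domain, dataset_id, fmt="json"):
    """Build a dataset endpoint URL for the requested output format."""
    return f"https://{domain}/resource/{dataset_id}.{fmt}"


def fetch_rows(endpoint, query=""):
    """Request rows from a JSON endpoint and return them as a Python list.

    `query` is an optional, already-encoded query string.
    This performs a network call, so it is not invoked in this sketch.
    """
    url = f"{endpoint}?{query}" if query else endpoint
    with urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    print(dataset_endpoint(DOMAIN, DATASET_ID))
    print(dataset_endpoint(DOMAIN, DATASET_ID, fmt="csv"))
```

Calling fetch_rows(dataset_endpoint(DOMAIN, DATASET_ID), "$limit=5") would then request a handful of rows, assuming the endpoint is reachable and accepts unauthenticated requests.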