Quotas

All API endpoints are subject to quota-based limitations. According to the domain configuration, authenticated users may have extended quotas compared to anonymous access. Please contact the domain administrator for more information about a user's quotas.

The API response contains three headers to indicate the current state of a user's quota:

X-RateLimit-Limit indicates the total number of API calls the user can make in a single day (resets at midnight UTC)

X-RateLimit-Remaining indicates the remaining number of API calls for the user until reset

{"errorcode":10005,"reset_time":"2017-10-17T00:00:00Z","limit_time_unit":"day","call_limit":1000,"error":"You have exceeded the requests limit for anonymous users."}

When an error occurs, a JSON object describing the error is returned by the API.
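As a minimal sketch of how a client might read this information, the following Python snippet summarises the quota headers and, when present, the error object. The header values are illustrative, the function name `describe_quota` is hypothetical, and the error body is the sample shown above.

```python
import json


def describe_quota(headers, body=""):
    """Summarise quota state from response headers and an optional JSON body."""
    summary = {
        "limit": int(headers["X-RateLimit-Limit"]),
        "remaining": int(headers["X-RateLimit-Remaining"]),
    }
    if body:
        payload = json.loads(body)
        # Error objects carry an "errorcode" key, as in the sample above.
        if isinstance(payload, dict) and "errorcode" in payload:
            summary["error"] = payload["error"]
            summary["reset_time"] = payload.get("reset_time")
    return summary


# Illustrative header values (assumed, not taken from a real response).
sample_headers = {"X-RateLimit-Limit": "1000", "X-RateLimit-Remaining": "0"}
sample_body = (
    '{"errorcode":10005,"reset_time":"2017-10-17T00:00:00Z",'
    '"limit_time_unit":"day","call_limit":1000,'
    '"error":"You have exceeded the requests limit for anonymous users."}'
)

quota = describe_quota(sample_headers, sample_body)
print(quota)
```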

Authentication

An authenticated user can be granted access to restricted datasets and benefit from extended quotas for API calls. The API features an authentication mechanism for users to be granted their specific authorizations.

For the platform to authenticate a user, you need to either:

be logged in to a portal, so that a session cookie authenticating your user is passed along with your API calls

But passing the API key of an authorized user will return the JSON response with the list of accessible datasets for this user on the portal.

Using OAuth2 authorization

Overview

OpenDataSoft implements the OAuth2 authorization flow, allowing third-party application makers to access the data
hosted on an OpenDataSoft platform on behalf of a user without ever handling the user's password, so that user
credentials are never exposed to the application.

The OpenDataSoft OAuth2 authorization flow is compliant with RFC 6749 and makes
use of Bearer Tokens in compliance with RFC 6750.

Application developers who want to use the OpenDataSoft APIs with OAuth2 must go through the following steps, which will be explained in this section.

Register their application with the OpenDataSoft platform.

Request approval from users via an OAuth2 authorization grant.

Request a bearer token that will allow them to query the OpenDataSoft platform APIs for a limited amount of time.

Refresh the Bearer Token when it expires.

Currently, applications are registered on a specific domain and can only access data on this domain.
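RFC 6750 describes sending a bearer token in the `Authorization` request header. The sketch below prepares such a request in Python without sending it. The domain `example.com`, the dataset name and the token value are placeholders, and whether a given OpenDataSoft domain expects the token in this header should be checked against its own documentation.

```python
import urllib.request


def authorized_request(url, access_token):
    """Prepare a request carrying the bearer token in the Authorization header."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {access_token}"},
    )


# Placeholder URL and token, for illustration only.
request = authorized_request(
    "https://example.com/api/records/1.0/search?dataset=my-dataset",
    "ACCESS_TOKEN",
)
print(request.get_header("Authorization"))
```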

Register an application for OAuth2 authentication

Go to the My applications tab of your account page on the domain you want to register the application on.

Fill in the registration form with the following information:

Application name: the name of the application

Type:

confidential: the client password is kept secret from the user and only used from a trusted environment (e.g. a web service, where the client password is stored server-side and never sent to the user)

public: the client password is embedded in a client-side application, making it potentially available to the world (e.g. a mobile or desktop application)

Redirection URL: the URL users will be redirected to after they have granted you permission to access their data

Store the resulting client ID and client secret that will be needed to perform the next steps.

To refresh an expired bearer token, send a request to the /oauth2/token/ endpoint, with the following query parameters:

client_id: the client ID you were given during registration

client_secret: the client secret you were given during registration

refresh_token: the refresh token returned in the bearer token response

grant_type: this should always be set to refresh_token

scopes: a list of space-separated requested scopes. Currently only all is supported

state (optional): a random string of your choice

The response to this request is identical to the bearer token response.
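The parameters above can be assembled as in the following sketch. It only builds the URL and does not send anything, since the HTTP method is not stated in this section. The base URL and the credential values are placeholders, and `refresh_token_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def refresh_token_url(base_url, client_id, client_secret, refresh_token, state=None):
    """Build the /oauth2/token/ URL used to refresh an expired bearer token."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
        "grant_type": "refresh_token",  # always this value for a refresh
        "scopes": "all",                # only "all" is currently supported
    }
    if state is not None:
        params["state"] = state         # optional random string
    return f"{base_url}/oauth2/token/?{urlencode(params)}"


# Placeholder values, for illustration only.
url = refresh_token_url(
    "https://example.com", "my-client-id", "my-client-secret", "my-refresh-token",
    state="xyz123",
)
print(url)
```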

Query Language and Geo Filtering

Filtering features are built into the core of the OpenDataSoft API engine. Many of the previously listed APIs accept
filters as parameters to constrain the list of returned datasets or records.

Note that a given filtering context can simply be copied from one API to another. For example, you can easily build a
user interface which first allows users to visually select the records they are interested in, using full-text
search, facets and geo filtering, and then allows them to download these records with the same filtering context.

Query language

The OpenDataSoft query language makes it possible to express complex boolean conditions as a filtering context.

In most cases, the user query is expressed through the q HTTP parameter.

For the record search API, the list of available fields depends on the schema of the dataset. To fetch the list of
available fields for a given dataset, you may use the search dataset or lookup dataset APIs.

Several operators can be used between the field name and the query value:

:, =, ==: return results whose field exactly matches the given value (provided the field is of text or numeric type)

>, <, >=, <=: return results whose field value is greater than, less than, greater than or equal to, or less than or equal to the given value (provided the field is of date or numeric type)

[start_date TO end_date]: query records whose date is between start_date and end_date

Dates can be specified in two formats: simple (YYYY[[/mm]/dd]) or ISO 8601 (YYYY-mm-DDTHH:MM:SS)
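As a rough illustration, the Python helpers below format dates in the two accepted formats and assemble comparison and range expressions. The helper names are hypothetical, and the spacing around the comparison operator simply mirrors the `birthdate >= #now()` example in this documentation.

```python
from datetime import date, datetime


def simple_date(value):
    """Format a date in the simple YYYY/mm/dd form."""
    return value.strftime("%Y/%m/%d")


def iso_datetime(value):
    """Format a datetime in the ISO 8601 form YYYY-mm-DDTHH:MM:SS."""
    return value.strftime("%Y-%m-%dT%H:%M:%S")


def compare(field, operator, value):
    """Build a comparison such as 'birthdate >= 2000/01/01'."""
    if operator not in {">", "<", ">=", "<="}:
        raise ValueError(f"unsupported comparison operator: {operator}")
    return f"{field} {operator} {value}"


def date_range(start, end):
    """Build a '[start_date TO end_date]' range expression."""
    return f"[{simple_date(start)} TO {simple_date(end)}]"


print(compare("birthdate", ">=", simple_date(date(2000, 1, 1))))
print(date_range(date(2017, 1, 1), date(2017, 12, 31)))
print(iso_datetime(datetime(2017, 10, 17, 8, 30, 0)))
```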

Query language functions

Return all records where birthdate is greater than or equal to the current datetime:

birthdate >= #now()

Return records where birthdate is not set:

#null(birthdate)

Return records where firstname contains exactly "Marie":

#exact(firstname, "Marie")

Advanced functions can be used in the query language. Function names need to be prefixed with a sharp (#) sign.

now: Return the current date. This function should be called as a query value for a field

null: Search for records where no value is defined for the given field

exact: Search for records with a field exactly matching a given value

Available parameters for the #now function

#now(years=-1, hours=-1) -> current date minus a year and an hour

years, months, weeks, days, hours, minutes, seconds, microseconds: these parameters add time to the current date.

#now(year=2001) -> current time, day and month for year 2001

year, month, day, hour, minute, second, microsecond: can also be used to specify an absolute date.

#now(weeks=-2, weekday=1) -> Tuesday before last
#now(weekday=MO(2)) -> Monday after next

weekday: Specifies a day of the week. This parameter accepts either an integer between 0 and 6 (where 0 is Monday and
6 is Sunday) or the first two letters of the day (in English) followed by the cardinal of the first week on which to
start the query.
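When such an expression is sent through the q parameter, the `#` sign has to be percent-encoded, because an unencoded `#` in a URL starts the fragment part. The following sketch builds a `#now(...)` call and encodes it with Python's standard library; `now_call` is a hypothetical helper, not part of the API.

```python
from urllib.parse import urlencode, parse_qs


def now_call(**parameters):
    """Render a #now(...) call from keyword arguments, in the order given."""
    args = ", ".join(f"{name}={value}" for name, value in parameters.items())
    return f"#now({args})"


# "Born within the last year and an hour", reusing the example above.
query = f"birthdate >= {now_call(years=-1, hours=-1)}"

# urlencode percent-encodes '#', '(', ')', '=', ',' and '>' and turns spaces into '+'.
encoded = urlencode({"q": query})
print(encoded)
```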

In the records API, facets are defined at field level. Whether a facet is available on a field depends on the data
producer's choices. Fields (retrieved for instance from the Dataset Lookup API) for which faceting is available can be
easily identified as shown in the example on the right.

When faceting is enabled, facets are returned in the response after the result set.

Every facet has a display value ("name" attribute) and a refine property ("path" attribute) which can be used to alter
the query context.

In the returned result set, only the datasets modified in 2013 will be returned.

Because the refinement is made on the "year" level and the "modified" facet is hierarchical, the next sub-level is
returned: results are broken down by "month".

Excluding

Using the same principle as above, it is possible to exclude from the result set the hits matching a given value of a
given facet. To do so, use the following API parameter: exclude.FACETNAME=FACETVALUE.
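The refine and exclude parameters can be combined in one query string, as in this sketch. It takes lists of pairs so that the same facet can appear several times. The "modified" facet and the value 2013 come from the example above, while the "publisher" facet and its value are purely hypothetical.

```python
from urllib.parse import urlencode


def facet_params(facets=(), refine=(), exclude=()):
    """Build facet, refine.<FACET> and exclude.<FACET> query parameters."""
    params = [("facet", name) for name in facets]
    params += [(f"refine.{name}", value) for name, value in refine]
    params += [(f"exclude.{name}", value) for name, value in exclude]
    return params


params = facet_params(
    facets=["modified"],
    refine=[("modified", "2013")],
    exclude=[("publisher", "Example Org")],  # hypothetical facet and value
)
query_string = urlencode(params)
print(query_string)
```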

This API provides a search facility in the dataset catalog. Full-text search and multi-criteria field queries
are supported, and faceting of the results is provided as well.

Parameters

q: Full-text query performed on the result set

facet: Activate faceting on the specified field (see list of fields in the Query Language section). This parameter can be used multiple times to activate several facets. By default, faceting is disabled

refine.<FACET>: Limit the result set to records where FACET has the specified value. It can be used several times for the same facet or for different facets

exclude.<FACET>: Exclude records where FACET has the specified value from the result set. It can be used several times for the same facet or for different facets

sort: Sorts results by the specified field. By default, the sort is descending. A minus sign - may be used to perform an ascending sort. Sorting is only available on numeric fields (int, double, date and datetime) and on text fields which have the sortable annotation

rows: Number of results to return in a single call. By default, 10 results are returned. While you can request up to 10 000 results in a single call, such requests are not optimal and can be throttled, so you should consider splitting them into smaller ones

start: Index of the first result to return (starting at 0). Use in conjunction with rows to implement paging

pretty_print: If set to true (default is false), pretty prints JSON and JSONP output

format: Format of the response output. Can be json (default), jsonp, csv or rss

This API makes it possible to fetch information about an individual dataset.

Parameters

The dataset identifier is passed as a part of the URL as indicated by the <dataset_id> placeholder in the example on the right.

Other parameters, passed as query parameters, are described below:

pretty_print: If set to true (default is false), pretty prints output

format: Format of the response output. Can be json (default) or jsonp

callback: JSONP callback (only in JSONP requests)

Dataset Records APIs

Record Search API

GET /api/records/1.0/search HTTP/1.1

This API makes it possible to perform complex queries on the records of a dataset, such as full-text search or geo filtering.

It also provides faceted search features on dataset records.

Parameters

dataset: Identifier of the dataset. This parameter is mandatory

q: Full-text query performed on the result set

geofilter.distance: Limit the result set to a geographical area defined by a circle center (WGS84) and radius (in meters): latitude, longitude, distance

geofilter.polygon: Limit the result set to a geographical area defined by a polygon (points expressed in WGS84): ((lat1, lon1), (lat2, lon2), (lat3, lon3))

facet: Activate faceting on the specified field. This parameter can be used multiple times to simultaneously activate several facets. By default, faceting is disabled. Example: facet=city

refine.<FACET>: Limit the result set to records where FACET has the specified value. It can be used several times for the same facet or for different facets

exclude.<FACET>: Exclude records where FACET has the specified value from the result set. It can be used several times for the same facet or for different facets

fields: Restricts the fields to retrieve. This parameter accepts multiple field names separated by commas. Example: fields=field1,field2,field3

pretty_print: If set to true (default is false), pretty prints JSON and JSONP output

format: Format of the response output. Can be json (default), jsonp, csv, geojson, geojsonp

callback: JSONP or GEOJSONP callback

sort: Sorts results by the specified field (in modified, issued, created and records_count). By default, the sort is descending. A minus sign - may be used to perform an ascending sort

rows: Number of results to return in a single call. By default, 10 results are returned. While you can request up to 10 000 results in a single call, such requests are not optimal and can be throttled, so you should consider splitting them into smaller ones or using the Records Download API. Note also that the sum of the start and rows parameters cannot exceed 10 000. This means that the Records Search API cannot return a result at a position greater than 10 000; if you need such results, consider using the Records Download API.

start: Index of the first result to return (starting at 0). Use in conjunction with rows to implement paging
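The sketch below shows one way to page through results with start and rows while staying under the 10 000 position limit described above. It only builds request paths and sends nothing. The page size, the circle passed to geofilter.distance and the helper name `search_pages` are illustrative assumptions.

```python
from urllib.parse import urlencode

MAX_POSITION = 10_000  # start + rows cannot exceed this value


def search_pages(dataset, wanted, page_size=100, extra=None):
    """Yield one record search path per page, capped at MAX_POSITION results."""
    if not 0 < page_size <= MAX_POSITION:
        raise ValueError("page_size must be between 1 and 10 000")
    last = min(wanted, MAX_POSITION)
    for start in range(0, last, page_size):
        params = {
            "dataset": dataset,
            "start": start,
            "rows": min(page_size, last - start),
        }
        params.update(extra or {})
        yield "/api/records/1.0/search?" + urlencode(params)


# Illustrative circle: latitude, longitude, radius in meters.
urls = list(
    search_pages(
        "world-heritage-unesco-list",
        wanted=250,
        extra={"geofilter.distance": "48.8566,2.3522,1000"},
    )
)
for url in urls:
    print(url)
```

Beyond that limit, the Records Download API mentioned above is the intended route.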

Record Lookup API

GET /api/datasets/1.0/<dataset_id>/records/<record_id> HTTP/1.1

Example lookup for record ff1f5b718ce2ee87f18dfaf20610f257979f2f4a in dataset world-heritage-unesco-list:

[{"x":"Afghanistan","max_area":158.9265},{"x":"Albania","max_area":58.9},{"x":"Algeria","max_area":665.03},/*...*/{"x":"Zimbabwe","max_area":676600},{"x":"the Former Yugoslav Republic of Macedonia","max_area":83350}]

x: Field on which the data aggregation will be based. This parameter is mandatory. It allows for analyzing a subset of data according to the different values of the field. The behavior may vary according to the field type. For Date and DateTime fields, the slices are built from the dates using the level of aggregation defined through the precision and periodic parameters. For other field types, the actual field values are used as x values

y.<SERIE>.func: The definition of the analysis aggregation function. Multiple series can be computed at once: simply name this parameter with an arbitrary series name, which you can then reuse to specify the associated aggregated expression. The list of available aggregation functions is: COUNT, AVG, SUM, MIN, MAX, STDDEV, SUMSQUARES. These functions return the result of their execution on the expression provided in y.<SERIE>.expr (or simply the number of records for the COUNT function) for each value of x

y.<SERIE>.expr: Defines the value to be aggregated. This parameter is mandatory for every aggregation function except COUNT. The parameter must use the same series name as the corresponding aggregation function. The parameter may contain the name of a numeric field in the dataset (Int or Double), or a mathematical expression (see below for more details on the expression language).

y.<SERIE>.cumulative: This parameter accepts the values true and false (default). If the parameter is set to true, the results of a series are recursively summed up (serie(x) = serie(x) + serie(x-1))

maxpoints: Limits the maximum number of results returned in the series. By default there is no limit

periodic: Used only when x is of type Date or DateTime. It defines the level at which aggregation is done. Possible values are year (default), month, week, weekday, day, hour, minute. This parameter allows you, for instance, to compute aggregations on months across all years. With a value set to weekday, the output will be: [{"x": {"weekday":0},"series1": 12}, {"x": {"weekday":1},"series1": 30}]. When weekday is used, the generated values range from 0 to 6, where 0 corresponds to Monday and 6 to Sunday

precision: Used only when x is of type Date or DateTime. It defines the precision of the aggregation. Possible values are year, month, week, day (default), hour, minute. If weekday is provided as the periodic parameter, the precision parameter is ignored. This parameter must respect the precision annotation of the field: if the field is annotated with a precision set to day, the series precision can at most be set to day

sort: Sorts the aggregation values according to the specified series, or to the x parameter. By default, the values are sorted in descending order, according to the x parameter. A minus sign ('-') can however be prepended to the argument to make an ascending sort
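To make the naming scheme of the series parameters concrete, the following sketch builds the parameters for two series and illustrates what the cumulative option does to a series. The field names "country" and "area" and the helper `analysis_params` are hypothetical. The sample values 12 and 30 are taken from the weekday output shown above, and the running total is computed locally only to illustrate the formula.

```python
import json
from itertools import accumulate


def analysis_params(x, series):
    """Build x and y.<SERIE>.func / y.<SERIE>.expr parameters.

    `series` maps a series name to a (function, expression) pair.
    """
    params = {"x": x}
    for name, (func, expr) in series.items():
        params[f"y.{name}.func"] = func
        if func != "COUNT":  # COUNT does not need an expression
            params[f"y.{name}.expr"] = expr
    return params


# Hypothetical field names, for illustration only.
params = analysis_params(
    "country",
    {"max_area": ("MAX", "area"), "total": ("COUNT", None)},
)
print(params)

# Effect of cumulative=true: serie(x) = serie(x) + serie(x-1).
sample = '[{"x": {"weekday":0},"series1": 12}, {"x": {"weekday":1},"series1": 30}]'
values = [point["series1"] for point in json.loads(sample)]
running = list(accumulate(values))
print(running)
```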

Expression language

Return the average value of twice the sine of the areas for each category (for the sake of example):