Build a simple Instagram API – Case Study

Revision : July 25, 2016

Instagram has become one of the most widely used applications for users and organizations to instantly share media content with the world. Many Instagram users also want to go a step further and share their Instagram content on their websites, web applications or blogs. In other cases, digital agencies may also need to feature Instagram content from authors other than themselves.

In this case-study, we are going to learn how to build our own API (Application Programming Interface) to get embedded data from Instagram pages, and return it to a requesting (web) application in JSON format. Also, we will discuss the motivations and advantages of creating this self-hosted API versus Instagram's own API.

IMPORTANT : This case-study assumes you are comfortable with PHP commands and syntax. It also assumes you are familiar with concepts such as API, JSON, RESTful and AJAX.

Instagram API

Instagram's own API gives users and developers access to Instagram's data so they can share its content on the websites they create.

Before you can use it, you must first register an application and obtain a client ID and a client secret. Then your application can make requests to the API endpoints with the proper credentials.

The API gives access to different types of data that can be used to develop specific web applications. For instance, there are many third-party tools, widget generators and plugins that allow you to share Instagram content on your site. Some examples of these applications are snapwidget.com, websta.me or instafeed.

Another application example is the jelled lookup, which allows you to get any Instagram user ID by providing its Instagram user name.

Most of these tools and plugins (if not all) use Instagram's (RESTful) API and they all have their own advantages and limitations.

NOTE : We neither endorse nor are affiliated with any of the resources mentioned above. They are mentioned here for reference purposes only.

Why another API?

Despite the flexibility of Instagram's API or other third-party resources available, there may be some reasons you still don't want (or need) to use them :

Your requirements don't need the complexity of Instagram's API

You don't want to sign up and register an application

You want to have access to Instagram data without authentication

You only need to get basic statistical information about an Instagram user, e.g. : ID, full name, biographic information, number of followers, number of accounts followed, etc.

Unlike the /p/{media shortcode}/media/ method, we can get more detailed information like author's name, ID, media location, media ID, caption, etc. It also gives us more flexibility while formatting the response within our HTML page.

We could easily get and manipulate each piece of data from the API's JSON response using jQuery.ajax() like :

IMPORTANT : The API's response only provides the URL of the biggest image size available, which is 640 x 640 px. If you require smaller versions of an image, you could still use the /p/{media shortcode}/media/?size= method described above instead of down-scaling the returned image via CSS.

Bear in mind the oembed method has two main limitations :

Starting on November 3rd, 2014, the JSON response doesn't indicate the type of media. It will always return type : "rich" instead of photo or video.

Although we can get the actual URL of an image, we cannot get the actual URL of a video (the absolute path of an MP4 file). This may be inconvenient if we want to play a video using our preferred (HTML5) video player.

These 2 previous methods are more suitable if you are only sharing a few Instagram media items in your page.

Like the oembed method, the link above will return a JSON response. The problem with this method is that the URL cannot be requested from an AJAX call without triggering a cross-domain error. Since it only returns plain JSON, it cannot be processed as JSONP as we did with the oembed method.

As a workaround, we may need to use a third-party proxy service like whateverorigin.org. This issue was previously addressed in this post.

The main advantage of this method is, unlike the methods previously covered, we only need a single AJAX request to get the latest (20) media posts. The main disadvantage is that it relies on a third-party application/service. If that service becomes unavailable, our implementation will fail.

Building our own API

There are two types of Instagram web pages from which our API can get relevant data and return it as a JSON response to a requesting application :

User page : http://instagram.com/{username}

Media page : http://instagram.com/p/{shortcode}

Both types of pages have embedded data in their source code that is stored in a JavaScript variable. The value of that variable is a JSON-formatted JavaScript object. For instance, if you explore the source code of the Coca Cola Instagram page (or any other user), at the bottom of the page you will find a line like this :

return the extracted data as JSON to the requesting application with the proper header information

Yes, our API can be considered as a proxy service between an Instagram web page and a requesting application, with the following advantages :

There is no limit on the number of requests you can make to the API since they will only count as page visits

You can restrict what domains the API can serve to

You can extend the response and return it as JSONP for cross-site availability

You don't have to rely on any third-party service but your own server availability

You can serve the API from your own or your clients' server(s)

IMPORTANT : The API is built in PHP. You require a server that supports PHP 4.x or 5.x to install it.

Requesting data from the API

Since we will be reading data from a user or a media (Instagram) web page, we need to tell the API the type of request we are making. We can do this by adding a query string or trailing parameter to the request.

For instance, if we named our API file api.php, the query string should look like :

api.php?user={user's URL}

or if we are requesting data from a media page :

api.php?media={media's URL}

HINT : We could pass the full URL of a user or media page to the API like api.php?user=http://instagram.com/cocacola, or simply pass the username or shortcode and let the API build the corresponding full URL. We will be doing the latter in our case-study.
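As a minimal sketch of that idea (not necessarily how the downloadable file does it), the API could inspect the query string and build the full page URL from a bare username or shortcode. The function name build_instagram_url and the character whitelist are assumptions made for illustration; the URL patterns are the two listed earlier.

```php
<?php
// Hypothetical helper: turn api.php?user=... or api.php?media=... into a page URL.
// Returns null when neither parameter is present or the value looks unsafe.
function build_instagram_url($query)
{
    if (array_key_exists('user', $query)) {
        $type = 'user';
    } elseif (array_key_exists('media', $query)) {
        $type = 'media';
    } else {
        return null;
    }

    if (!is_string($query[$type])) {
        return null;
    }
    $value = trim($query[$type]);

    // Assumed whitelist: letters, digits, dot, underscore and hyphen only.
    if (!preg_match('/^[A-Za-z0-9._-]+$/', $value)) {
        return null;
    }

    return $type === 'user'
        ? 'http://instagram.com/' . $value
        : 'http://instagram.com/p/' . $value;
}

// Typical use inside api.php:
// $url = build_instagram_url($_GET);
```

Rejecting anything outside the whitelist keeps the API from being used to fetch arbitrary URLs.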

Processing the request

Within our API file we will be processing two types of input requests :

In order to read the contents of a user or media (web) page, we will use PHP's file_get_contents() function. This function will place the entire content of the (web) file into a string, including text and HTML tags, just like we could see it in the file's source code.

HINT : Since all Instagram pages are returned as HTTPS, it's advisable to set a timeout environment context in case there is some delay in the page response. We can use PHP's stream_context_create() to set this timeout on the fly like :
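A minimal sketch of reading a page with a timeout, assuming a 10-second default and the hypothetical helper names make_timeout_context and fetch_page (neither is taken from the original code):

```php
<?php
// Build a stream context whose HTTP(S) requests give up after $seconds.
function make_timeout_context($seconds)
{
    return stream_context_create(array(
        // Options under the "http" key also apply to https:// URLs.
        'http' => array(
            'timeout' => $seconds,
        ),
    ));
}

// Fetch a page as a string; returns false when the request fails or times out.
function fetch_page($url, $seconds = 10)
{
    // The @ operator suppresses the warning so the caller can test for false.
    return @file_get_contents($url, false, make_timeout_context($seconds));
}

// Typical use (performs a network request):
// $dataFile = fetch_page('http://instagram.com/cocacola');
```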

From here we can proceed to find the ending position of the string we want to trim. Since this string was set in a JavaScript variable, the ending position will be the first occurrence of the script closing tag </script> :

IMPORTANT : Since we are extracting the value of the window._sharedData = { ... }; variable, the trailing semicolon ; after the closing brace } would invalidate our JSON output, so we also need to trim it.

We will use the substr() function again to trim the semicolon at the end of the sub-string :
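The whole extraction step can be sketched as follows. This is an illustration under stated assumptions, not the original listing: the function name extract_shared_data is hypothetical, and the marker string is the one quoted in the comments below this post.

```php
<?php
// Extract the JSON assigned to window._sharedData from a page's HTML source.
// Returns the JSON text, or null when the marker or closing tag is not found.
function extract_shared_data($dataFile)
{
    $marker = 'window._sharedData = ';
    $start_position = strpos($dataFile, $marker);
    if ($start_position === false) {
        return null;
    }
    // Skip past the marker so the sub-string starts at the opening brace.
    $start_position += strlen($marker);

    // The value ends where the enclosing script tag closes.
    $end_position = strpos($dataFile, '</script>', $start_position);
    if ($end_position === false) {
        return null;
    }

    $json = trim(substr($dataFile, $start_position, $end_position - $start_position));

    // Drop the trailing semicolon, which is valid JavaScript but not valid JSON.
    if (substr($json, -1) === ';') {
        $json = substr($json, 0, -1);
    }
    return $json;
}
```

Only the last character is removed, so semicolons inside string values of the JSON are left untouched.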

Allowing cross-domain requests to the API

All AJAX calls are subject to the Same Origin Policy, which means that both the requesting and the serving application must reside in the same domain. If the requesting application resides in another domain, it will receive a cross-origin error while requesting data from the API.

Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at http://www.domain.com/api.php?request=input. This can be fixed by moving the resource to the same domain or enabling CORS.

In most cases, our API will reside in the same domain as our requesting application. However, there may be cases where we want to make the API accessible from other domain(s). In those cases, we need to enable CORS in our API file.

Allow access to all domains

If you want to provide your API as a (public) service and let any application request data regardless of where it is hosted, you just need to place the following header at the top of your API file :
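A minimal sketch of such a header, assuming a hypothetical helper named cors_header. The Access-Control-Allow-Origin header itself is the standard CORS response header; passing a single origin instead of the wildcard restricts access to that one domain.

```php
<?php
// Build the CORS response header. '*' opens the API to every origin;
// a single origin such as 'http://www.domain.com' restricts it to that one.
function cors_header($origin = '*')
{
    return 'Access-Control-Allow-Origin: ' . $origin;
}

// At the top of api.php, before any output is sent:
header(cors_header());
```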

HINT : You don't need to set this header if the requesting application and the API reside in the same domain, however you could grant access to another single requesting application residing in another domain.

This could be useful in case you are installing the API on your client's domain but you also want to make requests from your own domain application for testing purposes.

Allow access to a list of specific domains

If you want to grant access to a short list of domains or sub-domains, e.g. a list of selected clients' domains, jsfiddle, codepen, etc. you can create a simple array of those domains. Then grant access if the requesting domain is found in that array :
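One way to sketch this, assuming the browser sends the standard Origin request header. The entries in the list and the function name cors_header_for are placeholders, not values from the original code.

```php
<?php
// Hypothetical whitelist; replace with the domains you actually want to serve.
$allowed_origins = array(
    'http://www.domain.com',
    'http://client.example.com',
);

// Return the CORS header for an allowed origin, or null when access is denied.
function cors_header_for($origin, $allowed_origins)
{
    if ($origin !== null && in_array($origin, $allowed_origins, true)) {
        // Echo back the matching origin rather than a wildcard.
        return 'Access-Control-Allow-Origin: ' . $origin;
    }
    return null;
}

// Browsers send the requesting origin in the Origin request header.
$origin = isset($_SERVER['HTTP_ORIGIN']) ? $_SERVER['HTTP_ORIGIN'] : null;
$header = cors_header_for($origin, $allowed_origins);
if ($header !== null) {
    header($header);
    // Tell caches that the response varies with the requesting origin.
    header('Vary: Origin');
}
```

Requests from origins outside the list simply receive no CORS header, so the browser blocks the response on the client side.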

JSON vs JSONP

JSON with padding (JSONP) is another way to allow cross-domain calls from JavaScript browser-based clients to the API. JSONP works around the same-origin restriction enforced by web browsers because the response is loaded as a script rather than read through an AJAX call.

Bear in mind that for JSONP to work, our API needs to reply with a JSONP-formatted response. If the API only returns JSON-formatted data, the JSONP request won't work.

According to this performance test, JSON responses are faster than JSONP. You can decide whether your API will return a JSON-formatted or a JSONP-formatted response, or both.

REMINDER : We can only make JSON requests from a same origin application, or from an authorized domain if we have enabled CORS. We can only make JSONP requests if our API knows how to reply with a JSONP response.
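A minimal sketch of an API that can reply in either format, assuming the JSON text has already been extracted. The function name format_response and the callback-name check are assumptions added for illustration.

```php
<?php
// Choose between a JSON and a JSONP reply depending on the query string.
// Returns array(content type header, response body).
function format_response($json, $query)
{
    if (array_key_exists('callback', $query)) {
        $callback = $query['callback'];
        // Assumed safety check: accept only plain identifier-like callback names.
        if (is_string($callback) && preg_match('/^[A-Za-z_$][A-Za-z0-9_$.]*$/', $callback)) {
            // JSONP: wrap the JSON in a call to the requested function.
            return array(
                'Content-Type: application/javascript; charset=utf-8',
                $callback . '(' . $json . ');',
            );
        }
    }
    // Plain JSON when no (valid) callback was requested.
    return array('Content-Type: application/json; charset=utf-8', $json);
}

// Typical use inside api.php, where $json holds the extracted data:
// list($content_type, $body) = format_response($json, $_GET);
// header($content_type);
// echo $body;
```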

First, notice we used PHP's array_key_exists() function to check if the parameter callback was passed in the query string.

If so, we wrap the response in a (JavaScript) function call, which is returned with the proper header. For instance, if we perform this request :

HINT : If you are making a dataType: "jsonp" jQuery AJAX request, you don't need to add any callback parameter to the query string since jQuery.ajax() adds it by default, unless you want to override the callback function name. In that case you may need to add the jsonp and jsonpCallback settings. Refer to jQuery.ajax() documentation for further reference.

A working example

We can create a variety of rich web applications to take advantage of the API response.

For instance, we could create an application that requests the full path of an Instagram (MP4) video and play it in our web page using our favorite (HTML5) media player like JWPlayer, Mediaelement.js or FlowPlayer.

We could even create our own Instagram user-id lookup application like the jelled lookup, etc.

See a demo page that shows some of the possible applications for the API.

IMPORTANT : The amount of data we can collect from an Instagram user or media depends on whether the user profile is public or private.

Last Notes

Bear in mind this API doesn't substitute Instagram's own API in any way but it can be useful in some specific scenarios. The API also has its own limitations, for instance :

We can only request the latest 20 media posts and we cannot use pagination as in Instagram's API to request more than that.

You also need to be familiar with JSON format to process the API's response so it's not a tool to use "out-of-the-box" like other existing third-party applications.

I think the main advantage of the API is that it offers you full control of the process, where only the Instagram pages and your own server availability are required.

It would be interesting to know how you have created your own implementation based on this case-study so please feel free to share.

By the way, the code in this case-study is provided "as is" and it's only intended as a learning tool, and is not offered as a software plugin, application or web service. Also, we are not responsible for changes in Instagram policies or services that may affect the functionality of the API described here.

Download the code

For reference purposes, the complete API file, along with the HTML, JS and CSS demo files, is available on GitHub.

Disclaimer

All trademarks, videos and images remain property of their respective holders, and are used here for demo purposes only.

I created an Instagram mashup mobile app but Apple will not approve it because I don't have a "report as inappropriate" feature. The problem is that the public API does NOT have an endpoint for reporting a post as inappropriate. Any ideas how I could achieve this? Maybe something similar to what you did?

JFK

I think there are two different things and don't see how what I did may help you with your issue, sorry.

eric

great article :) is there any way to grab the list of comments and the location for each post? It seems the only way is to use their developer api. Any help? Thanks.

JFK

To grab more detailed information yes, you need to use their API. You may need to register your application and use server side OAuth authentication.

obscure_reference

Thanks for this! However, when running the complete API setup locally, I keep getting the error "Uncaught SyntaxError: Invalid regular expression: missing /" in Chrome and "SyntaxError: unterminated regular expression literal" in Firefox. Any idea why this is happening?

obscure_reference

I figured it out: As you warned above, the value of "window._sharedData" has changed. On user pages at least, the variable now starts with {"country_code". All you have to do is change "static_root" to "country_code" on line 38 of the PHP file.

JFK

You are correct. I haven't had the chance to update the post but did this fix in the PHP code (line 38)
$start_position = strpos( $dataFile ,'window._sharedData = ' );

obscure_reference

Thanks! That's a great way to future-proof it (at least against further changes to the variable).

CME

Hi JFK ....
i cant see anything after the loading ..just a blank page....what could be the error ?

http://obiv.it/ Artur

Man, you are the best! This tut is 100% complete! Thank you very much!

CME

Hi All, Great tuto. However i caught up an error ..i cant see anything after the loading ..just a blank page....what could be the error ?

Meek

This is one of the best examples I have seen so far. But is it possible to get the Instagram url of each image?