Google provides a decent way for developers to access their API using step by step wizard. But in case of automating data flow using Talend you have to go against the stream in few points. There are several tutorials in the Internet showing super complicated ways of obtaining tokens for the application, refreshing it, handling errors, reporting etc.. But what I wanted was an easy script to generate a token (which I know will not change in the future) to handle simple task of uploading single file to database without all bells and whistles.

To pull Google Sheet from Google drive in a programmatic way using Talend you will need:

Then select the newly created project from the top menu and access the credentials tab on the left

What you have to do right now is to create Oath 2.0 Client id. You should select that option from the Create credentials menu but you will be most likely forced to create OAuth consent screen:

Consent Screen is the typical oath screen you can see in most Google Auth backed applications. This allows app to act on behalf of your Google account. In our case, as we will not show any dialog and rather use hardcoded token, it’s not a big deal for us. However this screen has to be setup for us to move forward. After completing this step, you can continue to create credentials.

Select Web Application (yes, that’s right..) and make sure that for both Authorized JavaScript origins and Authorized redirect URIs you enter: http://localhost.

If user creation is sucessful, you will receive your client ID and secret

Ok, we are closer to the end 🙂 let’s create tokens.

Create refresh token

By default Google issues an access token which allows you to download the content (Google Sheet in our case) from it’s repositories. Problem is, this token expires very quickly and requires either manual refresh or couple additional steps in the program to obtain. Fortunately, you can generate refresh token which can be (guess what) refreshed in Talend without unnecessary hassle.

You will be asked to login to google (consent screen) and after you login, you should see error message saying that page cannot be reached. That’s the localhost we entered during the client id setup. Note the browser url:

Yes, everything after code= is your refresh token you will use in the Talend application. Be aware that this code is show only once so make sure that you have it stored somewhere!

Setup Google Sheet download job

Let’s setup Talend job to make this whole thing running!

As you can see, whole flow is pretty simple and consists of

getting access token using refresh token

querying and downloading files from GDrive

Getting access token

tRESTClient performs query over the GDrive api to get the access token

you require client id, client secret and refresh token you obtained in the previous steps. Also note the grant type and redirect_uri. Those should be specified as on the screenshot above if you setup API as I did in the this tutorial.

tExtractXMLField is pretty straightforward. We need to extract new access token from the response of the API and then store it in global variable and later on update context. I am pretty sure that adding access token to the global variable can be omitted but I was to lazy to change it 😛

Querying and downloading files from GDrive

Having new access token in context, we can proceed with downloading files. First let’s setup tGoogleDriveConnection with the token. Note the application name. This should match with your API access setup.

And then use this connection to search specific folder

I then used if condition for simple name filtering as my folder might accidentaly contain some random files and I don’t want to parse them

And then I am ready to download the file to desired location

And that’s it. Simple Google Sheet downloading using Talend and Google Drive. Without all those fancy stuff – just working as expected.

Hi hamza,
I just run one of my old scripts and it returned the same error you mentioned. This is most likely due to one of the tokens expired.
You are right, as a refresh token you use the string presented in the URL after code= part. Make sure that in the tRestClient you add parameter “grant_type” = “refresh_token”.
Are you sure that you setup google api access with localhost?

Vote Up0Vote Down Reply

10 months ago

Guest

Hamza

Hello darth0s ! Thank you for your answer. I finally understand what’s wrong (in my side maybe not yours or others). My issue was in “Create refresh token” paragraph. This step where we retrieve the code, it’s not the actual “refresh token” but the “Authorization code” that is used to get the refresh token after. To resume : -> Create the “Authorization code” following what is inside the paragraph called “Create refresh token”. You will get the Authorization code to be used to get a refresh token.. -> When you get this code in the url bar address, run a… Read more »