Calling The Power BI REST API From Microsoft Flow, Part 3: Finding All Data Sources Used In A Power BI Tenant

The problem with self-service BI is that you never quite know what your users are up to. For example, what data sources are they using? Are there hundreds of Excel files being used as data sources for reports that you don’t know about? If so, where are they? Could they and should they be replaced by a database or something else more robust? In this post I’ll show you how you can use Microsoft Flow and the Power BI REST API (see part 1 to find out how to create a Flow custom connector to call the Power BI API) to get the details of all the data sources used in all of the workspaces of your Power BI tenant.

I’ll admit that doing this turned out to be a bit trickier than I had expected. My plan was to use the GetDatasetsAsAdmin endpoint to get a list of all datasets, loop over each one and then call the (undocumented, but in the REST API’s Swagger file and therefore in my custom connector) GetDatasourcesAsAdmin endpoint to get the datasources in each dataset. Both of these endpoints require administrative permissions to call, so I made sure my custom connector had the correct permissions (at least Tenant.Read.All – you can check this in the Azure Portal on the app you registered in Azure Active Directory) and I ran the Flow as a user with Power BI Admin permissions. But I kept getting 404 errors when requesting the data sources for certain datasets. Eventually I realised that GetDatasetsAsAdmin returns a list of all of the datasets that have ever been created in a tenant, including ones that have been deleted, with no indication as to which ones are the deleted ones, and the errors were the result of asking for the datasources from deleted datasets. Hmm. Once I realised I could ignore these errors the Flow was fairly easy to build.
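The overall pattern – list every dataset via the admin API, request its datasources, and skip the 404s returned for deleted datasets – can be sketched in Python. This is an illustration, not the Flow itself: `get` here stands in for any authenticated HTTP helper (a hypothetical callable that returns the parsed JSON body, or `None` on a 404).

```python
API = "https://api.powerbi.com/v1.0/myorg/admin"

def collect_datasources(get):
    """Collect one row per (dataset, datasource) pair, ignoring deleted datasets.

    `get` is any callable that performs an authenticated GET and returns the
    parsed JSON body, or None when the API responds with a 404.
    """
    rows = []
    for ds in get(f"{API}/datasets")["value"]:                    # GetDatasetsAsAdmin
        result = get(f"{API}/datasets/{ds['id']}/datasources")    # GetDatasourcesAsAdmin
        if result is None:                                        # 404: dataset was deleted
            continue
        for src in result["value"]:
            rows.append({
                "datasetId": ds["id"],
                "datasetName": ds["name"],
                "datasourceType": src.get("datasourceType"),
                "connectionDetails": src.get("connectionDetails"),
            })
    return rows
```

The key point is the `continue` on a 404: in the Flow this corresponds to letting an iteration of the loop fail without stopping the rest of the run.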

Here’s the top-level view:

Step-by-step:

I use a button to trigger the flow here, but if you want to create an automated report you should probably run this Flow on a schedule.

Initialize variable initialises an array called datasources to hold the details of each datasource:

Return a list of datasets for the organisation calls the GetDatasetsAsAdmin endpoint. As I said, if there was any indication as to whether a dataset had been deleted (if you look at the results returned by the GetGroupsAsAdmin endpoint it tells you whether a group has been deleted or not) then it would be possible to use $filter to declare an OData filter to remove them. But there isn’t.
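For comparison, the admin Groups endpoint does expose a deletion indicator (a `state` property), so there an OData $filter can exclude deleted items. A sketch of building such a request URL, assuming you are calling GetGroupsAsAdmin directly over HTTP ($top is required by that endpoint):

```python
from urllib.parse import urlencode

def groups_as_admin_url(top=100, only_active=True):
    """Build a GetGroupsAsAdmin URL, optionally filtering out deleted groups."""
    params = {"$top": top}
    if only_active:
        params["$filter"] = "state eq 'Active'"   # deleted groups have state 'Deleted'
    return "https://api.powerbi.com/v1.0/myorg/admin/groups?" + urlencode(params)
```

Nothing equivalent is possible for GetDatasetsAsAdmin, which is why the Flow has to tolerate the 404s instead.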

Parse list of datasets parses the JSON containing the list of datasets returned by the previous action. I generated the schema from a sample payload I got from calling the endpoint in the “test” section of the edit screen for the Flow custom connector. I had to edit it to make sure only the “id” and “name” properties are listed as required, as shown below.
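A sketch of what the edited schema might look like, trimmed to the properties the Flow actually uses (the schema generated from a sample payload will list more fields than this):

```json
{
  "type": "object",
  "properties": {
    "value": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "id": { "type": "string" },
          "name": { "type": "string" }
        },
        "required": [ "id", "name" ]
      }
    }
  }
}
```

Marking only “id” and “name” as required matters because deleted datasets (and some edge cases) may omit other properties, and Parse JSON fails if a required property is missing.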

Apply to each dataset loops over each dataset, and I’ll describe that in more detail below.

Create CSV table takes the contents of the array variable defined above and turns it into a CSV file. One very important point to note is that I had to click the “Configure run after” option and state that this action should run even after Apply to each dataset had failed, which it inevitably would after the 404 errors mentioned above. Just because one iteration fails doesn’t mean the whole Flow should be stopped.

Create file simply saves the resulting CSV file into a folder in OneDrive for Business.

Going back to the Apply to each dataset action, here’s what it does:

It loops over every dataset returned by the call to GetDatasetsAsAdmin.

Return a list of datasources for the specified dataset calls GetDatasourcesAsAdmin to get the datasources in each dataset; as I said, this fails when the dataset has been deleted.

Parse datasources parses the JSON returned by the previous action. Once again I generated the schema using a sample payload, and in this case I edited it manually to ensure that only “datasourceType”, “connectionDetails” and “datasourceId” were required.

Apply to each datasource then loops over each datasource (one dataset can have many datasources) and Append to array variable appends some information about each datasource to the datasources array. The connectionDetails property for a datasource returns details such as connection strings and file paths, but since I also wanted the type of the data source and the name and id of the dataset I used the following expression to create an array to append with all this information in:
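As a hedged sketch of the kind of value you can pass to Append to array variable (the action names inside the items() calls are assumptions based on the step names above, and will differ if your actions are named differently):

```json
{
  "datasetId": "@{items('Apply_to_each_dataset')?['id']}",
  "datasetName": "@{items('Apply_to_each_dataset')?['name']}",
  "datasourceType": "@{items('Apply_to_each_datasource')?['datasourceType']}",
  "connectionDetails": "@{string(items('Apply_to_each_datasource')?['connectionDetails'])}"
}
```

Note the string() around connectionDetails: it is an object (server, database, path and so on), so converting it to text flattens it into a single column when Create CSV table runs.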

While this proves to be very useful for Excel files etc., when importing data from different tables in a database (e.g. SQL Server, HANA etc.), can you think of a way to see that level of detail through the Power BI REST API? I see your CSV dump holds a server and a database column, but no table column.
Hence my question =)

The use case would be to inform all the affected Power BI reporting key users when a table structure is changed. I have tinkered around with the REST API myself, but couldn’t find what I was looking for.
