Answered by:

Deleting data content from PowerPivot data model

Question

I don't know if the question has been nailed down. Aside from deleting tables, can we delete the *content* of data within the tables. It doesn't seem crazy that, if you can pull in data from a feed then you should be able to remove
the content out again (without also destroying the user's meta-data work ). Reasons for this include:

- Security (a user may not have rights to see *my* data and should go refresh their own)

- Size (workbook doesn't need to have GB's of irrelevant data saved to disk in a workbook if it was just useful during development phase to a pre-production data feed)

- Bad data (pre-production data feed is not good data)

- User-friendliness (data feed was refreshed 2 years ago and workbook was saved to file server. Users shouldn't be presented with irrelevant data, but should get empty pivot tables until they go do their refresh)

Obviously Excel internally knows how to clear out PowerPivot data, given the prompt shown here:

Answers

So, it may not be an issue for him anymore but it is for me. This is something that the PowerPivot team needs to solve. It is not user friendly and selecting rows, then clicking delete (which only serves to delete the columns) is not only confusing, it
isn't how Excel users are used to working.

All replies

I don't think that we can delete all of content data in the PowerPivot model. If we need to delete all of content data, we must delete the corresponding columns and that's doesn't make sense for data analysis. The security of PowerPivot data model is based
on Excel file security level, it isn't support to implement cell data or columns security.

Thank you for the reply. I can understand why you'd be encouraging me to move forward into the full-fledged back-end solution with SQL Analysis Services. (My software development background is already in Analysis Services and I'm investigating
the PowerPivot alternative to see what level of problems it can solve. In SSAS we simply have a command "Unprocess" that clears data out of the data model.)

Notice that you only focused on the security aspect (first of four points). But this really isn't just about security. And in my case, security is already addressed in the data resource (ie. the data feed). The only thing left to do is
clear out the content data and give the workbook a fresh start. Then the workbook can be distributed to new users as a sample/template.

Are you saying there is some internal/technical reason why the data can't be unprocessed from the data model? (You said deleting one forces you to delete the other). I was hoping I might find an "unprocess" command in
the new VB API for Excel 2013 data models (even if it isn't exposed in the UI). But I'm not finding it there either...

So, it may not be an issue for him anymore but it is for me. This is something that the PowerPivot team needs to solve. It is not user friendly and selecting rows, then clicking delete (which only serves to delete the columns) is not only confusing, it
isn't how Excel users are used to working.

So, it may not be an issue for him anymore but it is for me. This is something that the PowerPivot team needs to solve. It is not user friendly and selecting rows, then clicking delete (which only serves to delete the columns) is not only confusing,
it isn't how Excel users are used to working.

I had this issue today. I had a PowerPivot model connected to views that implemented Geography based security. When I refreshed the model, I would get Worldwide data, but when I distributed the file to the field, I needed it to be empty so they could refresh
and pull only their market. I got really frustrated when there was no simple "Unprocess" like in SSAS. I know this solution is not elegant, and will probably not apply, but here's how I solved it.

I backed up the database I was connected to (here's where I imagine i'll lose most people, as they won't be admins on the SQL side), and when I restored it, I truncated the tables that had the sensitive data. I then pointed PowerPivot to this empty DB, refreshed,
and the data fell out. Then I updated the connection back to true production, saved, and sent the file (it doesn't automatically refresh when you update a connection, thankfully).

When your users receive the file, it is empty, but contains the correct connection data (possibly the password is not saved). They ask for Refresh All, give the password and the live data is loaded for them.

Hi, I was looking for this same solution and was surprised to find pretty low amount of chatter on it. I came up with my own solution, which works for me, but is somewhat specific to the environment. Thought I'd share it for posterity though.

So, use case is exactly the same as Nick's. All tables are populated by views which apply row level security based on mapping users to geographies they are allowed to see. My security is wide open, so when I hit refresh I get it all. I want to be able to
distribute an empty workbook that populates with allowed data for the user that hits refresh.

In our case, we are using ActiveDirectory identity to map users to geographies, so I took one of our testing service accounts that was lying around and gave it access to the objects (the views, in this case) a user would have access to, but did not map it
to any geographies. So it sees all of the views that a user can see, but when it queries any of them, it returns 0 results.

When I'm ready to distribute. I close Excel, then Shift+right click to fire up Excel and choose to run Excel as different user. I use the credentials of my service account to open Excel, open the PowerPivot workbook from that session, and when I hit
refresh, all my tables go empty. Save the workbook. Nice and lightweight, easy to email/upload, etc, no confidential data in workbook.

I use the following solution for David's original problem. When I plan to delete the data from my datamodel, I add an extra WHERE clause to each of my views the datamodel is reading from. The WHERE clause should be evaluated as FALSE, thus no rows are returned.

As I have many views and as I don't want to modify my view definitions all the time, I use a SETVAR variable that controls that I would like to get the data or not. I use SQL Server Management Studio where it should be done as follows. If I set Test = 1,
no data will be returned. If I set it to 0, the views will return all data normally.

-- Set the variable to 1 if you woud like to retrieve empty database for distribution
-- Reuires switching to SQLCMD mode: under Query->SQLCMD
:setvar Test 0
GO
CREATE VIEW dbo.MYVIEW AS
SELECT
-- any select definitions ...
WHERE $(Test) <> 1

So far it is working fine for me. Not the most convenient solution, but works. Please tell your ideas about it.

Microsoft is conducting an online survey to understand your opinion of the Technet Web site. If you choose to participate, the online survey will be presented to you when you leave the Technet Web site.