User Story

The reports table saves a small subset of the raw crash data. Each time a new field that someone wants access to is added to the raw_crash, we have to modify the schema of the reports table. By saving the entire raw crash JSON, we no longer have to modify the reports table.
This will require the crashmover to write to both HBase and PG, and this change enables it to do so: using the `PolyCrashStorage` class in the crashmover with both HBase and Postgres will allow this.
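
To make the "write to both stores" idea concrete, here is a rough sketch of a fan-out storage wrapper. It is not the actual `PolyCrashStorage` code: the class names `InMemoryCrashStorage` and `FanOutCrashStorage`, and the simplified `save_raw_crash(raw_crash, crash_id)` signature, are assumptions made up for illustration, with in-memory dicts standing in for HBase and Postgres.

```python
class InMemoryCrashStorage:
    """Hypothetical stand-in for a real backend such as HBase or Postgres."""

    def __init__(self, name):
        self.name = name
        self.raw_crashes = {}

    def save_raw_crash(self, raw_crash, crash_id):
        # Store a copy so later changes by the caller do not leak in.
        self.raw_crashes[crash_id] = dict(raw_crash)


class FanOutCrashStorage:
    """Forward every save to each configured backend (illustrative only)."""

    def __init__(self, backends):
        self.backends = list(backends)

    def save_raw_crash(self, raw_crash, crash_id):
        failures = []
        for backend in self.backends:
            try:
                backend.save_raw_crash(raw_crash, crash_id)
            except Exception as exc:
                # Keep going so one failing store does not block the others.
                failures.append((backend.name, exc))
        if failures:
            names = ", ".join(name for name, _ in failures)
            raise RuntimeError("failed to save %s to: %s" % (crash_id, names))


# Usage: the crashmover would hold one storage object and save once.
hbase = InMemoryCrashStorage("hbase")
postgres = InMemoryCrashStorage("postgres")
storage = FanOutCrashStorage([hbase, postgres])
storage.save_raw_crash({"ProductName": "Firefox"}, "crash-0001")
```

Whether a failure in one store should abort the whole save or just be reported afterwards is a design choice; the sketch tries every backend and then raises.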

Of course, one thing that could be strange here is that we restrict access to fields with private data, and the raw JSON definitely contains private data, so we might need to restrict access to that as well. That makes it harder for someone like me to run reports against that data, as the user we access the DB with doesn't have access to private data, so we would need to move fields from the raw JSON into a different, public spot again if we want to have access.
Also, note that there can be quite large items in the raw JSON, like the 200 lines of logcat on newer Android versions.

(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #2)
> Of course, one thing that could be strange here is that we restrict access
> to fields with private data, and the raw JSON definitely contains private
> data - so we might need to restrict access to that as well - which makes it
> harder for someone like me to run reports against that data as the user we
> access the DB with doesn't have access to private data, and so we need to
> move fields from that raw JSON into a different, public spot again if we
> want us to have access...
> Also, note that there can be quite huge stuff in the raw JSON, like the 200
> lines of logcat on newer Android versions.
Thanks for raising these issues. This is all easily addressed - and I think largely *already addressed*, for the following reasons:
1) There is a limited amount of PII in PostgreSQL currently, primarily in the reports table and also in the email-related table (a feature that hasn't been turned on yet, afaik).
2) PostgreSQL has supported column-level permissions since version 8.4: http://www.postgresql.org/docs/current/static/sql-grant.html
And the analyst user already operates with a limited view of the data.
3) The raw JSON will be in its own table.
Our suspicion is that we will be creating special reporting tables with aggregate information from the primary JSON table.
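
For point 2, a column-level grant could look roughly like the statement generated below. This is only a sketch: the helper `build_column_grant`, the role name `analyst`, and the column names `uuid` and `product` are assumptions for illustration, not the actual schema or roles. The generated SQL uses the `GRANT SELECT (columns) ON table TO role` form from the PostgreSQL docs linked above.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_column_grant(table, columns, role):
    """Build a column-level GRANT SELECT statement for PostgreSQL 8.4+."""
    if not columns:
        raise ValueError("at least one column is required")
    column_list = ", ".join(quote_ident(column) for column in columns)
    return "GRANT SELECT ({}) ON {} TO {};".format(
        column_list, quote_ident(table), quote_ident(role)
    )


# Hypothetical example: expose only non-private columns to the analyst role.
print(build_column_grant("reports", ["uuid", "product"], "analyst"))
```

The table holding the raw JSON would simply get no grant for that role, while the aggregate reporting tables would be granted in full.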
Let me know if I've missed something here!