Considering the GDPR as a whole

Should versus shall

Sitting down to read through the GDPR is not a casual undertaking, but initial skim-throughs left me wondering about the word should, which one encounters often in the text of the legislation. It seemed odd to me that legislation should merely suggest behaviors and outcomes; I had assumed that legislation is a recital of what you must (or must not) do.

It might be useful to compare the frequency of words like ‘should’ and ‘shall’ (known to English grammar as modal or auxiliary verbs) in the GDPR in order to understand the intentions of its creators. What are they trying to convey with their use of these different modal verbs?

This type of analysis is easy using Python’s Natural Language Toolkit (NLTK). I used the script below to generate the word-frequency counts and dispersion plot that follow. (The lines starting with a ‘#’ sign are comments that explain what the code is doing. The file ending in ‘.txt’ contains the text of the GDPR, extracted from the legislation using Adobe Reader’s ‘save as other/text’ command.)
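As a minimal sketch of the word-frequency step (not necessarily the script used for this post), the counting can be done with the standard library alone. The names `STOP_WORDS`, `tokenize`, `top_words`, and `modal_counts`, the short stop-word list, and the sample sentence are all assumptions made for illustration; a real run would read the extracted GDPR text file instead of the sample string.

```python
import re
from collections import Counter

# A deliberately short, illustrative stop-word list (an assumption);
# NLTK ships a much more complete English list.
STOP_WORDS = {
    "the", "of", "to", "and", "a", "an", "in", "or", "for", "by", "be",
    "is", "are", "that", "on", "with", "as", "this", "such", "which", "it",
}

def tokenize(text):
    """Lower-case the text and return alphabetic tokens, stop words removed."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]

def top_words(text, n=20):
    """Return the n most frequent remaining words with their counts."""
    return Counter(tokenize(text)).most_common(n)

def modal_counts(text, modals=("should", "shall", "may")):
    """Count how often each modal verb of interest occurs."""
    counts = Counter(tokenize(text))
    return {m: counts[m] for m in modals}

# Stand-in for the GDPR text; in practice, read the extracted .txt file here.
sample = (
    "The controller shall inform the data subject. "
    "The processor should assist the controller. "
    "Member States may provide exemptions. "
    "The controller shall keep records."
)

print(top_words(sample, 2))
print(modal_counts(sample))
```

With NLTK installed, `nltk.FreqDist` and the `dispersion_plot` method of `nltk.Text` should cover the same ground as the counting and plotting described here.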

We see that should, may, and shall (marked above in bold) are in the top 20 words (note that stop words have been removed), and are the only verbs (unless you count ‘referred’). Is there a pattern?

The dispersion plot produced by the above code shows a clear boundary:

We see that the shoulds are concentrated in the first part, but that the shalls take over somewhat before the half-way mark (in terms of word offset; note that these word offsets don’t correspond to positions in the original text, given that stop words and punctuation have been removed). Unlike either should or shall, however, may is scattered evenly over the entire document.

This switch-over point between should and shall falls at the beginning of Chapter I, where the recitals end. This is where the GDPR stops enumerating goals, sentiments, and housekeeping matters (such as the relation of the legislation to other EU and member-state laws), and gets down to what you must do.
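The boundary above was read off the plot by eye. One way to locate it numerically, sketched here under stated assumptions, is to pick the word offset that best separates the two modals: the split that leaves the fewest shalls before it and the fewest shoulds after it. The function name `switch_over` and the toy token list are hypothetical; a real run would pass in the token list produced from the GDPR text.

```python
def switch_over(tokens, first="should", second="shall"):
    """Return (split, errors): the offset that best separates `first`
    (expected before the split) from `second` (expected after it),
    and the number of occurrences that fall on the wrong side."""
    # With the split at offset 0, every `first` lies on the wrong side.
    errors = tokens.count(first)
    best_split, best_errors = 0, errors
    for i, tok in enumerate(tokens):
        if tok == first:
            errors -= 1   # this `first` now sits before the split
        elif tok == second:
            errors += 1   # this `second` now sits before the split
        if errors < best_errors:
            best_split, best_errors = i + 1, errors
    return best_split, best_errors

# Toy token stream (an assumption), mimicking recitals followed by articles.
tokens = ["should", "goal", "should", "may", "should",
          "shall", "may", "shall", "should", "shall", "shall"]

print(switch_over(tokens))
```

On the toy list the best split is at offset 5, with a single stray should after it. On the real text, the offset would still need to be mapped back to the original document, since stop words and punctuation have been removed.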

The role of should

How are we to make sense of all these shoulds? We know you should sort your trash and floss your teeth, but how does should fit into legislation? Do you have to do it, or is it just a suggestion?

The interpretation that makes the most sense to me is that the should-clauses are desirable but non-mandatory actions. They express the broad policy goals of the GDPR, goals whose best implementation has to depend on the particular industry and technology in question. By designing our implementation to support as many of the shoulds as possible, we can justify our decisions and fortify our position in the event that we have to defend those decisions.

A strategy for the legislation as a whole

If you’ve read this far you may be wondering how any of this can be useful to you. While it is always useful to ponder key passages of the GDPR in isolation (as does most of the commentary I have seen thus far), I propose that organizations consider incorporating the GDPR into their models (functional, data, procedural) so that every data exposure points to its justification or legal purpose (or badgelink) in one or more sections of the legislation. For example, a particular data element’s exposure may be justified by both data-subject consent and by legitimate business purpose. If challenged, you need to be able to find all justifications.

Just as every element of a functional or data model has to be traceable to a stated requirement, so must every data exposure be justified by at least one specific clause of the GDPR. In effect, the GDPR constitutes a new set of requirements, independent of (and sometimes in conflict with) our system’s original requirements. It has been dropped into our laps and we have to find a way to integrate it as effectively, cheaply, and non-disruptively as possible.

Equally important, we have to be prepared to show that we have complied with the GDPR’s shalls when the auditor comes knocking, when there is a complaint or request from a data subject, or (heaven forbid) when a data incident occurs. We must also perform the reverse operation, which is to consider every should, may, and shall in the regulation and ask whether it applies to our system and, if so, how we can best incorporate it into our practice.

In addition to the shalls, our analysis will help us to show that our compliance measures further the goals set out by the shoulds, and that we have made good-faith efforts to do our part in the may department (for example, participating in industry groups to establish codes of conduct, as set out in Article 40).
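The two directions of traceability described above can be sketched as a small data model. This is only an illustration under stated assumptions: the class `DataExposure`, its fields, the helper functions, and the sample records are all hypothetical, and the article references (Article 6(1)(a) for consent, Article 6(1)(f) for legitimate interests, Article 40 for codes of conduct) are used simply as example labels.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class DataExposure:
    """One data element exposed to one recipient (hypothetical model)."""
    element: str
    recipient: str
    justifications: list = field(default_factory=list)  # GDPR clause labels

def unjustified(exposures):
    """Forward check: exposures that cite no clause at all."""
    return [e for e in exposures if not e.justifications]

def exposures_by_clause(exposures):
    """Reverse index: for each cited clause, the elements relying on it."""
    index = defaultdict(list)
    for e in exposures:
        for clause in e.justifications:
            index[clause].append(e.element)
    return dict(index)

def uncited_clauses(all_clauses, exposures):
    """Reverse check: clauses under review that no exposure refers to yet."""
    cited = exposures_by_clause(exposures)
    return [c for c in all_clauses if c not in cited]

# Illustrative records only.
exposures = [
    DataExposure("email_address", "newsletter service",
                 ["Art. 6(1)(a)", "Art. 6(1)(f)"]),
    DataExposure("date_of_birth", "analytics vendor"),
]
clauses_under_review = ["Art. 6(1)(a)", "Art. 6(1)(f)", "Art. 40"]

print([e.element for e in unjustified(exposures)])
print(exposures_by_clause(exposures))
print(uncited_clauses(clauses_under_review, exposures))
```

The forward check flags exposures that still need a legal basis, while the reverse index answers the auditor's question of which data elements rely on a given clause. Clauses that nothing cites are candidates for the "does this apply to us?" review.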

In future posts I plan to expand the Data Inventory to include references to applicable articles of the GDPR and to incorporate the inventory into a defensive strategy.
