Storing Data Securely on Android

An app's credibility today depends heavily on how the user's private data
is managed. The Android stack has many powerful APIs surrounding
credential and key storage, with specific features only available in
certain versions.

This short series will start off with a simple
approach to get up and running by looking at the storage system and
how to encrypt and store sensitive data via a user-supplied passcode.
In the second tutorial, we
will look at more complex ways of protecting keys and credentials.

The Basics

The
first question to think about is how much data you actually need to acquire. A
good approach is to avoid storing private data if you don't really
have to.

For data that you must store, the Android architecture is ready to help. Since 6.0 Marshmallow, full-disk encryption is enabled by default for devices with the capability. Files and SharedPreferences that are saved by the app are automatically set with the MODE_PRIVATE constant. This means the data can be accessed only by your own app.

It's a good idea to stick to this default. You can set it explicitly when saving a shared preference by passing Context.MODE_PRIVATE to getSharedPreferences().

Avoid storing data on external storage, as the data is then visible to other apps and users. In fact, to make it harder for people to copy your app binary and data, you can prevent users from installing the app on external storage. Adding android:installLocation with a value of internalOnly to the manifest file will accomplish that.

You can also prevent
the app and its data from being backed up. This also prevents the contents of an app's
private data directory from being downloaded using adb backup. To do so, set the android:allowBackup attribute to false in the manifest file. By
default, this attribute is set to true.

These are best
practices, but they won't work for a compromised or rooted device,
and disk encryption is only useful when the device is secured with a
lock screen. This is where having an app-side password that protects its data with encryption is beneficial.

Securing User Data With a Password

Conceal is a great choice for an encryption library because it gets you up and running very quickly without having to worry about the underlying details. However, an exploit targeted at a popular framework will simultaneously affect all of the apps that rely on it.

It's
also important to
be knowledgeable about how encryption systems work in order to be able to tell if you're using a particular framework
securely. So, for this post, we are going to get our
hands dirty by looking at the cryptography
provider directly.

AES and Password-Based Key Derivation

We will use the
recommended AES standard, which
encrypts data given a key. The same key used to encrypt the data is
used to decrypt the data, which is called symmetric encryption. There
are different key sizes, and AES256 (256 bits) is the preferred
length for use with sensitive data.

While the user experience of your app should force a user to use a strong passcode, there is a chance that the same passcode will also be chosen by another user. Putting the security of our encrypted data in the hands of the user is not safe. Our data needs to be secured instead with a key that is random and large enough (i.e. that has enough entropy) to be considered strong. This is why it's never recommended to use a password directly to encrypt data. That is where a Password-Based Key Derivation Function (PBKDF2) comes into play.

PBKDF2 derives a key from
a password by hashing
it many times over with a salt. This is called key stretching. The
salt is just a random sequence of data and makes the derived key
unique even if the same password was used by someone else.
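Generating a salt can be sketched as follows. The class name, method name and the 32-byte length are our own illustrative choices, not values prescribed by any API.

```java
import java.security.SecureRandom;

final class SaltGenerator {
    // Assumption: 32 bytes (256 bits) is an illustrative salt length.
    static final int SALT_LENGTH_BYTES = 32;

    /** Returns a fresh random salt; generate one per encryption and store it with the data. */
    static byte[] newSalt() {
        byte[] salt = new byte[SALT_LENGTH_BYTES];
        new SecureRandom().nextBytes(salt);
        return salt;
    }
}
```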

The salt is generated with the SecureRandom class, which guarantees that the generated output will be hard to predict: it is a "cryptographically strong random number generator". We can now put the salt and password into a password-based encryption object: PBEKeySpec. The object's constructor also takes an iteration count, which makes the key stronger. This is because increasing the number of iterations increases the time it would take to operate on a set of keys during a brute force attack. The PBEKeySpec then gets passed into the SecretKeyFactory, which finally generates the key as a byte[] array. We will wrap that raw byte[] array into a SecretKeySpec object.
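The derivation steps just described might look like the sketch below. The iteration count and the choice of "PBKDF2WithHmacSHA1" as the algorithm name are assumptions made for illustration; pick an iteration count that is as high as your slowest target device tolerates.

```java
import java.security.GeneralSecurityException;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

final class KeyDerivation {
    // Assumptions for illustration only; neither value comes from the article.
    static final int ITERATIONS = 10_000;
    static final int KEY_LENGTH_BITS = 256;

    /** Stretches the password with the salt into a 256-bit AES key. */
    static SecretKeySpec deriveKey(char[] password, byte[] salt) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_LENGTH_BITS);
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
            byte[] keyBytes = factory.generateSecret(spec).getEncoded();
            // Wrap the raw bytes so they can be handed to an AES cipher.
            return new SecretKeySpec(keyBytes, "AES");
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 key derivation failed", e);
        } finally {
            // PBEKeySpec keeps its own copy of the password; clear it when done.
            spec.clearPassword();
        }
    }
}
```

The same password and salt always produce the same key, which is what lets us regenerate the key later instead of storing it.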

Note that the password is passed as a char[] array, and the PBEKeySpec
class stores it as a char[] array as well. char[] arrays are usually
used for encryption functions because while the String class is
immutable, a char[] array containing sensitive information can be
overwritten, removing the sensitive data from the device's physical RAM.
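Overwriting the array is a one-liner. The helper below is a hypothetical name of our own; note that it only clears this one array, so any copy made elsewhere (for example by converting the passcode to a String) is not affected.

```java
import java.util.Arrays;

final class PasswordWiper {
    /** Overwrites every character so the passcode no longer sits in this array. */
    static void wipe(char[] password) {
        Arrays.fill(password, '\0');
    }
}
```

A typical place to call it is a finally block around the code that uses the passcode, so the array is cleared even when an exception is thrown.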

Initialization Vectors

We are now ready to
encrypt the data, but we have one more thing to do. There are
different modes of encryption with AES, but we'll be using the recommended
one: cipher block chaining (CBC). This operates on our data one
block at a time. The great thing about this mode is that each next
unencrypted block of data is XOR’d
with the previous encrypted block to make the encryption stronger.
However, that means the first block is never as unique as all the
others!

If a message to be encrypted were to start off the same as
another message to be encrypted, the beginning encrypted output would
be the same, and that would give an attacker a clue to figuring out
what the message might be. The solution is to use an initialization
vector (IV).

An IV is just a block of random bytes that will be XOR’d
with the first block of user data. Since each block depends on all
blocks processed up until that point, the entire message will be
encrypted uniquely—identical messages encrypted with the same key
will not produce identical results.
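To see the effect of the IV, here is a small sketch that encrypts with a caller-supplied key and IV. The class and method names are our own. The transformation string uses the name PKCS5Padding, which desktop JVM providers accept and which pads 16-byte AES blocks the same way; on Android, "AES/CBC/PKCS7Padding" can be used instead.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class IvDemo {
    /** Returns a fresh 16-byte IV, matching the 128-bit AES block size. */
    static byte[] newIv() {
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    /** Encrypts the message with AES in CBC mode under the given key and IV. */
    static byte[] encrypt(SecretKeySpec key, byte[] iv, byte[] message) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
            return cipher.doFinal(message);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES encryption failed", e);
        }
    }
}
```

Calling encrypt() twice with the same key and message but two different results of newIv() yields two different ciphertexts, while reusing one IV yields identical output.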

A note about SecureRandom: on Android versions 4.3 and under, the Java Cryptography Architecture had a vulnerability due to improper initialization of the underlying pseudorandom number generator (PRNG). If you are targeting versions 4.3 and under, a fix is available.

To create the cipher, we pass the string "AES/CBC/PKCS7Padding" to Cipher.getInstance(). This specifies AES encryption with cipher block chaining. The last part of this string refers to PKCS7, which is an established standard for padding data that doesn't fit perfectly into the block size. (Blocks are 128 bits, and padding is done before encryption.)

To complete our
example, we will put this code in an encrypt method that will package
the result into a HashMap
containing the
encrypted data, along with the salt and initialization vector necessary for decryption.
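One possible shape for that encrypt method is sketched below. The map keys ("salt", "iv", "encrypted"), the iteration count and the salt length are assumptions of ours. The padding is spelled PKCS5Padding so the sketch also runs on a desktop JVM; on Android the "AES/CBC/PKCS7Padding" string works as well.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.HashMap;
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

final class PasswordEncryptor {
    // Illustrative parameters, not taken from the article.
    static final int ITERATIONS = 10_000;
    static final int SALT_BYTES = 32;

    /** Encrypts plainText under a key derived from password; returns ciphertext, salt and IV. */
    static HashMap<String, byte[]> encryptBytes(byte[] plainText, char[] password) {
        SecureRandom random = new SecureRandom();
        byte[] salt = new byte[SALT_BYTES];
        random.nextBytes(salt);
        byte[] iv = new byte[16]; // one AES block
        random.nextBytes(iv);

        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, 256);
        try {
            byte[] keyBytes = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1")
                    .generateSecret(spec).getEncoded();
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(keyBytes, "AES"),
                    new IvParameterSpec(iv));

            // Package everything needed for decryption except the password.
            HashMap<String, byte[]> result = new HashMap<>();
            result.put("salt", salt);
            result.put("iv", iv);
            result.put("encrypted", cipher.doFinal(plainText));
            return result;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Encryption failed", e);
        } finally {
            spec.clearPassword();
        }
    }
}
```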

The Decryption Method

You only need to store the IV and salt with your data. While salts and IVs are considered public, make sure they are not sequentially incremented or reused. To decrypt the data, all we need to do is change the mode passed to Cipher.init() from ENCRYPT_MODE to DECRYPT_MODE.

The decrypt method will take a HashMap that contains the same required information (encrypted data, salt and IV) and return a decrypted byte[] array, given the correct password. The decrypt method will regenerate the encryption key from the password. The key should never be stored!
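A sketch of the decrypt side follows. To make the "only the mode changes" point visible, the cipher work sits in a hypothetical runCipher() helper that takes the mode as a parameter. The map keys and iteration count are assumptions and must match whatever was used for encryption; PKCS5Padding is again the JVM-portable spelling of the padding.

```java
import java.security.GeneralSecurityException;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

final class PasswordDecryptor {
    // Illustrative value; it must equal the iteration count used when encrypting.
    static final int ITERATIONS = 10_000;

    /** Runs AES/CBC in the given mode with a key re-derived from the password and salt. */
    static byte[] runCipher(int mode, byte[] input, char[] password, byte[] salt, byte[] iv) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, 256);
        try {
            byte[] keyBytes = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1")
                    .generateSecret(spec).getEncoded();
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(mode, new SecretKeySpec(keyBytes, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            // A wrong password typically ends up here as a padding error.
            throw new IllegalStateException("Cipher operation failed", e);
        } finally {
            spec.clearPassword();
        }
    }

    /** Decrypts the map produced at encryption time; no checks for missing entries. */
    static byte[] decryptData(Map<String, byte[]> map, char[] password) {
        return runCipher(Cipher.DECRYPT_MODE, map.get("encrypted"), password,
                map.get("salt"), map.get("iv"));
    }
}
```

Passing Cipher.ENCRYPT_MODE to the same helper produces the ciphertext, so a round trip is simply an encrypt call followed by decryptData() with the same password, salt and IV.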

Testing the Encryption and Decryption

To keep the example simple, we are omitting error checking that would make sure the HashMap contains the required key-value pairs. We can now test our
methods to ensure that the data is decrypted correctly after
encryption.

Since SharedPreferences is an XML-backed store that accepts only specific primitives and objects as values, we need to convert our data into a compatible format such as a String object. Base64 allows us to convert the raw data into a String representation that contains only the characters allowed by the XML format. Encrypt both the key and the value so an attacker can't figure out what a value might be for.
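The conversion itself can be sketched with java.util.Base64, which on Android requires API level 26 or later; android.util.Base64 is the alternative on older versions. The class and method names are our own. The resulting Strings are what would be passed to SharedPreferences.Editor.putString().

```java
import java.util.Base64;

final class PrefsEncoding {
    /** Encodes raw encrypted bytes as a String that is safe to store as XML text. */
    static String toPrefString(byte[] encrypted) {
        return Base64.getEncoder().encodeToString(encrypted);
    }

    /** Recovers the raw encrypted bytes from a stored preference String. */
    static byte[] fromPrefString(String stored) {
        return Base64.getDecoder().decode(stored);
    }
}
```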

Here, encryptedKey and encryptedValue are both encrypted byte[] arrays returned from our encryptBytes() method. The IV and salt can be saved into the preferences file or as a separate file. To get back the encrypted bytes from the SharedPreferences, we can apply a Base64 decode on the stored String.

Wiping Insecure Data From Old Versions

Now that the stored data is secure, it may be the case that you have a previous version of the app that had the data stored insecurely. On an upgrade, the data could be wiped and re-encrypted. One way to wipe a file is to overwrite its contents with random data before deleting it.
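A minimal sketch of such an overwrite is shown here. The class and method names are hypothetical, and the code assumes the old data lives in a regular file that the app can write to.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.security.SecureRandom;

final class FileWiper {
    /** Overwrites the file's contents with random bytes, then deletes it. Returns true if deleted. */
    static boolean wipeAndDelete(File file) {
        if (!file.isFile()) {
            return false;
        }
        SecureRandom random = new SecureRandom();
        byte[] buffer = new byte[4096];
        // "rws" asks for each write to be pushed to the storage device synchronously.
        try (RandomAccessFile raf = new RandomAccessFile(file, "rws")) {
            long remaining = raf.length();
            raf.seek(0);
            while (remaining > 0) {
                int chunk = (int) Math.min(buffer.length, remaining);
                random.nextBytes(buffer);
                raf.write(buffer, 0, chunk);
                remaining -= chunk;
            }
        } catch (IOException e) {
            return false;
        }
        return file.delete();
    }
}
```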

In theory, you can just delete your shared preferences by removing the /data/data/com.your.package.name/shared_prefs/your_prefs_name.xml and your_prefs_name.bak files and clearing the in-memory preferences by calling clear() followed by commit() on the SharedPreferences.Editor.

However, instead of attempting to wipe the old data and hoping that it works, it's better to encrypt it in the first place! This is especially true for solid-state drives, which often spread out data writes to different regions to prevent wear. That means that even if you overwrite a file in the filesystem, the physical solid-state memory might preserve your data in its original location on disk.

Conclusion

That wraps up our tutorial on storing encrypted data. In this post, you learned how to securely encrypt and decrypt sensitive data with a user-supplied password. It's easy to do when you know how, but it's important to follow all the best practices to ensure your users' data is truly secure.

In the next
post, we will take a look at how to leverage the KeyStore and
other credential-related APIs to store items safely. In the meantime,
check out some of our other great articles on Android app development.