Can I just confirm that the access keys are used in plain-text form for the Redis lookup, rather than being hashed? I was wondering about the security implications of storing unencrypted keys in the Redis store. Are the keys themselves considered non-secure data?

That's correct: they are not hashed before being put into the keystore. A few notes around that:

The key structure is meaningful: it's orgId+key, which makes it very quick to search for keys.
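A minimal sketch of that composite-key pattern (illustrative Python, not Tyk's actual Go code; the dict stands in for Redis, and the record contents are invented):

```python
# Composite keys of the form orgId+key mean the raw token embeds its owner,
# so listing an org's tokens is a single prefix scan.

def make_storage_key(org_id: str, key_id: str) -> str:
    # The org ID is prepended to the token before storage.
    return org_id + key_id

def keys_for_org(store: dict, org_id: str) -> list:
    # Equivalent to a Redis SCAN with MATCH orgId*
    return [k for k in store if k.startswith(org_id)]

store = {}
store[make_storage_key("org1", "abc123")] = {"rate": 100}
store[make_storage_key("org1", "def456")] = {"rate": 50}
store[make_storage_key("org2", "abc123")] = {"rate": 10}

print(sorted(keys_for_org(store, "org1")))
# -> ['org1abc123', 'org1def456']
```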

It's assumed that keys are ephemeral and will be refreshed regularly. A key isn't like a password, so to some degree they aren't considered secure data.

We could probably do a lot more around securing the data store, which we haven't looked at yet; for example, basic-auth passwords are still stored in plain text in the session object, so there are things that need improving on that front.

I did a bit of experimenting and have an experimental branch (experiment/hash) in the Tyk repo, which uses murmur3 to hash the keys as they pass through the storage manager. The tests seem to be passing, and the only non-functional element is the key listing in the dashboard. Analytics, however, works fine (if you go to the key analytics URL directly).

However, this breaks the multi-organisation structure of Tyk: since keys are segmented into two parts, you would need to hash orgIDs and keyIDs separately before running the search, which would mean pushing the hashing function up a level into the implementation instead of keeping it in the Redis storage driver. There's a way to do it; I just haven't thought it through yet.
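One way to sketch the idea of hashing the two segments separately so an org-prefix scan still works (assumed names, illustrative Python only; murmur3 is not in the stdlib, so a truncated SHA-256 stands in as the segment hash):

```python
import hashlib

def segment_hash(s: str) -> str:
    # Stand-in for murmur3: any stable, fixed-width hash shows the pattern.
    return hashlib.sha256(s.encode()).hexdigest()[:8]

def hashed_storage_key(org_id: str, key_id: str) -> str:
    # Hash each segment independently, then concatenate, so the org
    # prefix remains searchable even though nothing is stored raw.
    return segment_hash(org_id) + segment_hash(key_id)

def hashed_keys_for_org(store: dict, org_id: str) -> list:
    return [k for k in store if k.startswith(segment_hash(org_id))]

store = {
    hashed_storage_key("org1", "abc123"): {"rate": 100},
    hashed_storage_key("org2", "def456"): {"rate": 10},
}
assert len(hashed_keys_for_org(store, "org1")) == 1
```

The trade-off is exactly the one described above: the hashing has to happen before the key reaches the storage driver, because the driver alone can no longer tell where the org segment ends.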

Actually, having thought about it some more: if you are not using the dashboard for key management, everything will work. You can still add keys, view their analytics and settings, and even update them; however, there is no listing.

Segmentation and org-level permissions should, from a security perspective, all still work fine. So actually all that would be required is a change to the UI.

The feature may need to be restructured so that the interface can react if hashed storage is in use.

I've added this to the roadmap for the next version of Tyk (1.6), as it's a valid issue considering that these keys need to be secure.

However, and this is a bit annoying I guess, the new portal feature we are adding allows devs to self-serve API keys, and in their dashboard they get to see their usage graphs. The keys generated here are stored alongside the developer profile in Mongo and are not hashed (we need them for the analytics lookups as well as for some ownership tests).

Which poses a dilemma: we close a potential security hole in the gateway only for it to persist in the portal and dashboard, since if the database is breached there's a treasure trove of key data to exploit.

API keys are also stored alongside analytics data; this would need to be changed to use the hashed representation. The hashed key could then be stored alongside the developer profile, which would make analytics work for the portal.
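A sketch of that join (record shapes and field names are assumptions for illustration, not Tyk's schema): if both the analytics records and the developer profile store only the hashed key, the portal can still attribute usage to a developer without the raw token ever being persisted.

```python
import hashlib

def key_hash(raw_key: str) -> str:
    # Stand-in hash; the point is only that both stores hold the same digest.
    return hashlib.sha256(raw_key.encode()).hexdigest()[:16]

raw = "abc123"
developer = {"email": "dev@example.com", "api_keys": {"api-1": key_hash(raw)}}
analytics = [
    {"key_hash": key_hash(raw), "path": "/widgets", "hits": 10},
    {"key_hash": key_hash("other"), "path": "/widgets", "hits": 3},
]

def usage_for_developer(dev: dict, records: list) -> int:
    # Join analytics to the developer purely on the hashed key.
    hashes = set(dev["api_keys"].values())
    return sum(r["hits"] for r in records if r["key_hash"] in hashes)

print(usage_for_developer(developer, analytics))
# -> 10
```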

In the dashboard, API key rankings (biggest users, etc.) wouldn't be possible, as only the hashed key would be available to the admin.

Overall it's quite a large piece of work to make all the components behave properly, but it's key to securing the solution, so I'm quite eager to solve it.

Well, I've done a bit of architectural re-jigging, and it seems we have a way to avoid some of the stricter security requirements here, so hopefully we are okay! I am planning to implement additional authentication of our own, which means we would only use the Tyk key as a user identifier rather than as a secure credential.

It seems to me that you can't really treat the key as an authentication credential (i.e. always stored hashed/encrypted at rest) unless it is totally separate and distinct from the concept of the user. Analytics and dashboard management would then be based around a user identifier, with the key merely an attribute within it (which can therefore always be stored in a hashed state). I guess the tricky bit is doing that without making the key check more convoluted.

So far, we've done the following:

The gateway will hash all keys as they come in, so they are stored securely.

The gateway will also hash key data that is stored in the DB, so that it is anonymised.

The dashboard "API Keys" UI will be refactored to use a direct lookup: you type the full key in and request the data. The dashboard requests the key data from the gateway via the API, which makes the request to Redis via the hashed key, so the unencrypted key only exists for the duration of the function and must be known beforehand. This means editing and updating keys is fine so long as they are known.

In the next version we are introducing a portal, and the portal has a concept of "Developers". Developer records have a map of api-id:key in their record, and we will store the obfuscated key in this map instead of the raw one. Now you have the relationship you are speaking about: user profile -> key-hash -> analytics data. The auth token is never visible or available to anyone, which means it needs to be regenerated if it is lost (because it is now impossible to retrieve).

The best thing is that you can basically still do all key management with the original key, so long as it is known. This means all integrations can work with raw keys if they need to, and the key is only exposed to the user once and never again.
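The direct-lookup flow above can be sketched like this (illustrative Python with assumed function names; a dict stands in for Redis): only the gateway ever hashes the raw key, and only the hash ever touches the store.

```python
import hashlib

def storage_hash(raw_key: str) -> str:
    # Stand-in for the gateway's hash function.
    return hashlib.sha256(raw_key.encode()).hexdigest()

redis_sim = {}  # stands in for Redis: hash -> session data

def gateway_create_key(raw_key: str, session: dict) -> None:
    # The raw key exists only inside this function; the store sees the hash.
    redis_sim[storage_hash(raw_key)] = session

def gateway_lookup(raw_key: str):
    # Direct lookup: the caller must already know the full raw key.
    # Listing is impossible, since without a raw key there is nothing to hash.
    return redis_sim.get(storage_hash(raw_key))

gateway_create_key("raw-token-1", {"quota": 1000})
assert gateway_lookup("raw-token-1") == {"quota": 1000}
assert gateway_lookup("wrong-token") is None
```

This is also why a lost token has to be regenerated: there is no stored value from which to recover the original.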

This will make expiring keys extremely important, because otherwise floating keys might exist!

Anyway, in the hash branch all tests are now passing and it's stable; we're working on portal integration now to make the UI cleaner.

I'm curious why murmur3 was chosen. Correct me if I'm wrong, but the purpose of hashing the API keys is to mitigate the effect of large data breaches (e.g. someone gets read access to your Redis cluster); this is why we hash passwords. As far as I know, murmur3 is a non-cryptographic hashing function, which means it is not specifically designed to be difficult to reverse.

Thought so. You might be right that a cryptographic function would just be too slow for constant access, but it's good to be aware of all the trade-offs and their implications. Thanks for the answer.

I will give adding a configurable function a try. I've been diving into the code lately and it seems fairly simple to add this as an option.
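A sketch of what a configurable hash option could look like (the config field name "hash_algorithm" and both function choices are assumptions, illustrative Python only; murmur3 needs a third-party package, so CRC32 stands in as the fast non-cryptographic option and SHA-256 as the cryptographic one):

```python
import hashlib
import zlib

HASHERS = {
    # Fast, non-cryptographic (easy to brute-force short inputs):
    "crc32": lambda b: format(zlib.crc32(b) & 0xFFFFFFFF, "08x"),
    # Slower per call, but designed to be hard to reverse:
    "sha256": lambda b: hashlib.sha256(b).hexdigest(),
}

def make_key_hasher(config: dict):
    # Pick the function once at startup so the storage layer stays agnostic.
    algo = config.get("hash_algorithm", "crc32")
    return lambda key: HASHERS[algo](key.encode())

fast = make_key_hasher({"hash_algorithm": "crc32"})
strong = make_key_hasher({"hash_algorithm": "sha256"})
print(len(fast("abc123")), len(strong("abc123")))
# -> 8 64
```

The catch with making it configurable is that the stored digests are only valid for the algorithm that produced them, so switching the option invalidates every existing hashed key.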

As a side note, it somewhat defeats the purpose of hashing if keys and their associated hashes are logged together. E.g.: time="Apr 13 10:52:13" level=info msg="Reset quota for key." inbound-key=56fc0a4e38c3015ba4000001e28df584baa4494359075986d3992817 key=quota-5cdb385c.

Haha, yes, it is an issue, but we thought it would be more valuable for an API owner to be able to view log files and trace activity for token actions than to hash everything, since the log data would otherwise be meaningless.

But yeah, that particular log line might need amending; we don't need to log their quota bucket.
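One way to amend it (a sketch with assumed field names, not Tyk's actual logging code): emit only a single hashed identifier, which keeps log lines traceable per token while never pairing the raw key with its hash.

```python
import hashlib

def key_hash(raw_key: str) -> str:
    # Stand-in for the storage hash; a short digest is enough to correlate lines.
    return hashlib.sha256(raw_key.encode()).hexdigest()[:8]

def log_quota_reset(raw_key: str) -> str:
    # One hashed field replaces the inbound-key=<raw> key=quota-<hash> pair.
    return f'msg="Reset quota for key." key-hash={key_hash(raw_key)}'

raw = "56fc0a4e38c3015ba4000001e28df584baa4494359075986d3992817"
line = log_quota_reset(raw)
assert raw not in line  # the raw token never reaches the logs
```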