Search the History

Searching for credentials in a repository

If you are a developer or a code reviewer, you probably use Git as your source control management tool. As everyone knows, production credentials should be protected; this post explains how old credentials can be extracted from a repository and how to protect them.

Today, almost every company that develops its own product
uses some form of source control management tool.
Such a tool tracks modifications to a source code repository,
helps developers avoid losing work
to conflicting overwrites,
and ensures that they are always working
on the right version of the source code.

Version control systems are commonly either centralized,
where the repository is in one place
and multiple clients access it,
or distributed, where every client holds a full copy of the repository.
Git is one of the biggest of the distributed kind;
it is an open-source source code management system
that allows you to create an independent line of development known as a branch.
With this branch, you can work on your code independently,
and when you are ready with your changes,
you can store them as a commit.
Git can then compare your changes with the main branch
(this is called a diff),
and finally you can merge them into the main branch.
It also allows you to revert changes
and to work on different versions
of the same source code.
Used by millions of developers,
it is the basis of many platforms
such as GitHub, GitLab,
and Bitbucket, among others.

As you know, storing clear text passwords
on your machine, in your code, or anywhere else
(yes, I mean the sticky notes too)
is a huge hole in your security.
OWASP and CWE mark
this as a vulnerability,
but many developers make this mistake
by creating configuration files and uploading them to a repository.

Maybe you are thinking,
"who in the world is going to do that?"
But this practice is more common than it appears.
Recently (September 2019), it was discovered that
a big bank was storing highly sensitive data
in a publicly accessible repository on GitHub.
Maybe your company is doing this right now.

Git disclosure lab

To set up our lab, we are going to create an empty repository.
There we are going to create a database file
with some credentials and commit the change:

If the credentials are later removed and that change goes to production,
the file no longer contains them,
but anyone with access to the repository can still view the earlier commits in its history.
Also, it is common that the credentials are never changed,
because changing them would break some interconnected systems.
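
To make the idea concrete, here is a minimal sketch in Python of how such a search through the history could work. It is not one of the dedicated tools, only an illustration: the regular expression, the function name `find_leaked_credentials`, and the sample log (file name, commit ids, and the password `hunter2`) are all assumptions made up for this example.

```python
import re

# Hypothetical pattern: a key containing a credential-like word, then ':' or '=',
# then a non-empty value (optionally quoted).
CREDENTIAL_PATTERN = re.compile(
    r"(?i)(\w*(?:password|passwd|secret|api[_-]?key|token)\w*)\s*[:=]\s*[\"']?([^\s\"']+)"
)


def find_leaked_credentials(patch_text):
    """Return (commit, key, value) for every added line that looks like a credential."""
    findings = []
    commit = None
    for line in patch_text.splitlines():
        if line.startswith("commit "):
            # Remember which commit the following diff lines belong to.
            commit = line.split()[1]
        elif line.startswith("+") and not line.startswith("+++"):
            # Only inspect added lines; "+++" is the diff file header.
            match = CREDENTIAL_PATTERN.search(line[1:])
            if match:
                findings.append((commit, match.group(1), match.group(2)))
    return findings


# Made-up history in the shape of patch output: the newest commit removed the
# password from the file, but the older commit still shows it.
SAMPLE_LOG = """\
commit 2222222
    Remove credentials from database config

--- a/database.conf
+++ b/database.conf
-db_password = hunter2
+db_password =

commit 1111111
    Add database config

--- /dev/null
+++ b/database.conf
+db_user = admin
+db_password = hunter2
"""

for commit, key, value in find_leaked_credentials(SAMPLE_LOG):
    print(f"{commit}: {key} = {value}")
```

In a real repository, the text passed to the function would be the output of `git log -p --all` instead of the sample string. The point is that the current version of the file is clean, yet the password is still found in the older commit.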

To get credentials from a git repository,
we can use several tools such as:

Solution

As we have seen by now,
if a developer puts sensitive data into a file
and commits the changes,
an attacker could get our credentials
by searching the history of our source code.
What can we do about that?

First of all, we can avoid storing credentials in the code at all
by using environment variables and pipelines;
every major source code management platform
has this feature within their services.
Pipelines are the top-level component of
continuous integration, delivery, and deployment.
With them, we can test, build, and deploy our projects,
and by setting our credentials there as environment variables,
we keep them out of the repository
and support the principle of least privilege,
since only the pipeline needs to read them.
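
On the application side, this could look like the following minimal sketch. The variable names `DB_USER` and `DB_PASSWORD` and the function `load_db_credentials` are assumptions for illustration; use whatever names your pipeline defines.

```python
import os


def load_db_credentials(env=os.environ):
    """Read database credentials from environment variables (hypothetical names)."""
    try:
        return {"user": env["DB_USER"], "password": env["DB_PASSWORD"]}
    except KeyError as missing:
        # Fail loudly instead of falling back to a hard-coded default.
        raise RuntimeError(f"missing environment variable: {missing.args[0]}") from None
```

The code and the repository contain only the names of the variables; the values are set in the pipeline configuration and never reach the commit history.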

Another thing we can do
is to delete them from the repository history
using tools like BFG Repo-Cleaner.
This searches through the commit history
and removes sensitive data.
Using our example, we can put the credentials we want removed into a file:

If, for whatever reason,
we cannot avoid storing passwords in configuration files,
then it is possible to store them
encrypted with a strong cryptographic algorithm.
Please avoid the use of base64 for this endeavor:
it is an encoding, not encryption,
so it can be detected and decoded easily by anyone, without any key.
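
A short sketch in Python shows why base64 offers no protection. The password `hunter2` is a made-up example value.

```python
import base64

# "Protecting" a password with base64...
encoded = base64.b64encode(b"hunter2").decode("ascii")

# ...is undone with one standard library call and no key at all.
decoded = base64.b64decode(encoded).decode("ascii")

print(encoded)  # aHVudGVyMg==
print(decoded)  # hunter2
```

Anyone who finds the encoded string in the history can recover the original password the same way.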

The last thing that we must do
is to revoke any exposed credentials
in order to minimize the damage done.

If you want more information about secure coding,
you can check our rules
about it.