We presented Secret Magpie at Blackhat!

Our full slide deck and a presentation recording are embedded into this page.

Blackhat EU is the standout cyber security conference here in the UK, with some of the best speakers and content from around the world. Cutting-edge research is shared by the top universities and commercial companies, as well as by exceptionally gifted independent researchers. This year was no disappointment, with talks on everything from financial fraud to hardware hacking through electromagnetic fault injection.

We were honoured to be accepted to talk about our research and secret scanning tool, Secret Magpie.

Secret Magpie is a completely free and open-source tool that aims to solve a very difficult problem.

Right now, hackers all over the world are looking for accidentally leaked secrets such as passwords and keys. These secrets can be found in websites, application software packages, container images and source control systems. When attackers find them, they quickly use them to launch ransomware and cryptomining attacks.

What’s a cryptomining attack?

Attackers want to turn their efforts into money, and cryptomining is one of the easiest ways to do this. Once they have credentials for your cloud environment, such as AWS, they create servers which mine cryptocurrency. The more servers they can create, and it’s typically the really expensive servers, the more money they make in cryptocurrencies.

There are a lot of tools already doing amazing work to solve this problem. We love Gitleaks and Trufflehog.

The real issue is that scanning for secrets isn’t as simple a problem as it sounds like it should be!

What are the main issues with secret scanning?

1. Secret scanning has rubbish signatures or high false-positive rates

Secret scanning looks for patterns, and it’s very prone to false-positives. A typical API key may simply be 32 random characters, so to find them we need a pattern that finds 32 character strings. Unfortunately, that’s going to find a LOT of things that are not secrets.

People don’t like problems, so most tools are written to use patterns which produce a low number of false-positives. This essentially means they deliberately do not look for certain patterns, and therefore ignore some potential secrets. Even worse, some signatures rely on the detected secret sitting on the same code line as a keyword, such as ‘cloudflare’.

The solution is to write your own patterns and then handle the high count of false-positives.
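To make the trade-off concrete, here is a minimal sketch in Python. Both patterns and all of the sample values are made up for illustration; they are not the signatures that Gitleaks, Trufflehog or Secret Magpie actually ship. The broad pattern flags every 32-character string, while the narrow pattern only fires when the keyword ‘cloudflare’ sits on the same line.

```python
import re

# Broad pattern (illustrative): any run of exactly 32 alphanumeric characters.
BROAD = re.compile(r"\b[A-Za-z0-9]{32}\b")

# Narrow pattern (illustrative): the same run, but only when the keyword
# 'cloudflare' appears earlier on the same line.
NARROW = re.compile(r"cloudflare.*?\b([A-Za-z0-9]{32})\b", re.IGNORECASE)

# Made-up sample file: two fake keys and one MD5 checksum.
SAMPLE = """\
cloudflare_api_key = "q8Zr2LxV0mTn5HcYb7Kd3WpE9sGfJ1Ua"
# cloudflare token for the staging zone
token = "Xk4Pz9Qw2Rt6Yu8Io1As3Df5Gh7Jk0Lm"
checksum = "d41d8cd98f00b204e9800998ecf8427e"
"""


def scan(pattern, text):
    """Return (line number, matched string) for every match in the text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            # Use the capture group when the pattern has one, else the whole match.
            hits.append((lineno, match.group(match.lastindex or 0)))
    return hits


broad_hits = scan(BROAD, SAMPLE)    # lines 1, 3 and 4
narrow_hits = scan(NARROW, SAMPLE)  # line 1 only

if __name__ == "__main__":
    print("broad :", broad_hits)
    print("narrow:", narrow_hits)
```

The broad pattern reports the checksum on line 4 as a secret (a false positive), while the narrow pattern misses the token on line 3 because its keyword is on the line above (a false negative).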

2. When do we scan?

We can run our scanning tools manually, but this requires significant human effort. It makes a lot more sense to run the detection tools automatically.

We can run our scanning tools automatically in a few different scenarios, but they’ve each got their own issues (see the slide deck). We don’t ever get full coverage that forces someone to take an action.

… and then what happens when you write a new pattern to detect an API key for a service you have only just discovered you use?

Secret Magpie allows you to run a full manual scan of every code repository in your organisation in one go, which allows you to iteratively improve your own patterns and retest them.

You can see our full slide deck below, and check out Secret Magpie on GitHub.

Title slide

Public secrets are meant to be shared, but can gain too many permissions and be used by attackers.
We are more interested in secrets which are never meant to be shared at all.

When we are looking for secrets in source control, we can look in public and private repositories.
Secrets in public repositories are a huge risk as any attacker can find them, but a lot of organisations have awful hygiene in private repositories.
These repositories are cloned onto developer laptops, servers and CI/CD systems.

Let’s take a quick look at Git flow.

Git keeps a version history, and point-in-time commits produce these versions.

In this example, we’ve deleted a README file we added earlier on.
… but has it been deleted?

If that README.md has a secret in it, then we can still recover that secret!
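If you want to try this yourself, the sketch below reproduces the scenario in a throwaway repository. It assumes the `git` command-line tool is installed; the helper names and the fake password are ours, purely for illustration.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run a git command inside the repository and return its stdout."""
    cmd = [
        "git",
        "-c", "user.name=demo",               # throwaway identity for the demo commits
        "-c", "user.email=demo@example.com",
        "-c", "commit.gpgsign=false",         # avoid needing a signing key
        *args,
    ]
    result = subprocess.run(cmd, cwd=repo, check=True, capture_output=True, text=True)
    return result.stdout


def recover_deleted_readme():
    """Commit a README containing a fake secret, delete it, then read it back."""
    with tempfile.TemporaryDirectory() as repo:
        git(repo, "init", "-q")
        readme = Path(repo) / "README.md"
        readme.write_text("password = hunter2-not-a-real-secret\n")
        git(repo, "add", "README.md")
        git(repo, "commit", "-q", "-m", "Add README")

        # Delete the file and commit the deletion.
        git(repo, "rm", "-q", "README.md")
        git(repo, "commit", "-q", "-m", "Delete README")

        still_on_disk = readme.exists()
        # The previous commit still holds the full file contents.
        recovered = git(repo, "show", "HEAD~1:README.md")
        return still_on_disk, recovered


if __name__ == "__main__":
    on_disk, contents = recover_deleted_readme()
    print("README.md still in working tree:", on_disk)
    print("Recovered from history:", contents.strip())
```

The file is gone from the working tree, but `git show` reads it straight back out of the earlier commit.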

One option we have is to rewrite our version history, but this is a bad idea.

Git can squash these commits into just one commit, so now the secret is never added.

The new version history looks like this, but the secret isn’t completely gone.

Developers work in branches, which are forks of the version history. Rewriting history can impact these branches.

Removing this commit, or modifying it, would destroy the common history of these branches.

TL;DR: secrets are generally bad, so let’s go find them so we can deal with the exposure.

Gitleaks and Trufflehog are two of our favourite open-source tools for detecting secrets.

Gitleaks is really fast, and Trufflehog has extra features. Both are great for scanning a single repository.

Both tools work by scanning the difference between commits, as shown here, and looking for patterns which match known secrets.
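As a rough illustration of the idea, the sketch below walks a unified diff and applies a pattern to the added lines only. It is a simplification under our own assumptions, not how Gitleaks or Trufflehog are implemented internally; the rule name is hypothetical and the key is the placeholder value AWS uses in its documentation.

```python
import re

# One illustrative rule: AWS access key IDs start with "AKIA" followed by
# 16 upper-case letters or digits.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}

# A made-up diff between two commits.
DIFF = """\
diff --git a/config.py b/config.py
--- a/config.py
+++ b/config.py
@@ -1,2 +1,3 @@
 DEBUG = False
-REGION = "eu-west-1"
+REGION = "eu-west-2"
+AWS_KEY = "AKIAIOSFODNN7EXAMPLE"
"""


def scan_diff(diff_text):
    """Return (file, rule name, match) for every pattern hit on an added line."""
    findings = []
    current_file = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # File header: remember which file the following hunks belong to.
            current_file = line[4:].removeprefix("b/")
        elif line.startswith("+"):
            # Added line: strip the leading '+' and test every pattern.
            for name, pattern in PATTERNS.items():
                for match in pattern.finditer(line[1:]):
                    findings.append((current_file, name, match.group(0)))
    return findings


findings = scan_diff(DIFF)

if __name__ == "__main__":
    for file, rule, secret in findings:
        print(f"{file}: {rule}: {secret}")
```

Because every commit’s diff is scanned, a secret that was added and later deleted still shows up in the commit that introduced it.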

Ok, so we want to stop secrets entering the code base, when should we scan?

Scanning on every commit can be done on the developer’s laptop, but it can be turned off.
Scanning on push to GitHub or GitLab is a great option, but developers will ignore the results of the check.
Scanning on Pull Request is the most common approach. But what if a Pull Request is never created for a branch of code?

The best practice is to write your own regular expressions to match the secrets your organisation uses, so what do we do when we create a new pattern? How do we scan the historical commits?

Secret Magpie fills this gap for both defenders and pentesters.

Secret Magpie allows you to scan every repository in your SCM, such as GitHub, with both Gitleaks and Trufflehog.

Secret Magpie outputs to a spreadsheet (CSV) or JSON file, but also to an HTML file like this:

The web output is designed to make processing secrets really easy.
In this example, you can mark all the detections in a single file as false positives.
By following this flow, 90% of false positives can be filtered out in a few minutes.

When you’re finished, you can save the results back to a CSV file.
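If you prefer to do the same triage in a script, the flow looks roughly like the sketch below. The column names, status values and sample rows are hypothetical and chosen for illustration; check the headers of your own Secret Magpie export before adapting it.

```python
import csv
import io

# Hypothetical column names for this example.
FIELDS = ["repository", "file", "rule", "status"]

# Made-up detections: two in a test fixture file, one in a deployment config.
findings = [
    {"repository": "web-app", "file": "tests/fixtures/hashes.txt",
     "rule": "generic-32-char", "status": "new"},
    {"repository": "web-app", "file": "tests/fixtures/hashes.txt",
     "rule": "generic-32-char", "status": "new"},
    {"repository": "web-app", "file": "config/deploy.py",
     "rule": "aws-access-key-id", "status": "new"},
]


def mark_file_as_false_positive(rows, repository, file):
    """Mark every untriaged detection in one file as a false positive."""
    marked = 0
    for row in rows:
        if (row["repository"] == repository and row["file"] == file
                and row["status"] == "new"):
            row["status"] = "false positive"
            marked += 1
    return marked


def to_csv(rows):
    """Serialise the triaged detections back to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


marked = mark_file_as_false_positive(findings, "web-app", "tests/fixtures/hashes.txt")
remaining = [row for row in findings if row["status"] == "new"]
report = to_csv(findings)

if __name__ == "__main__":
    print(f"Marked {marked} detections, {len(remaining)} left to review")
    print(report)
```

Dismissing a whole file at a time is what makes the triage quick: one decision about a test fixture clears every detection inside it.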