We presented Secret Magpie at Black Hat!
Our full slide deck and a presentation recording are embedded into this page.
Black Hat EU is the standout cyber security conference here in the UK, with some of the best speakers and content from around the world. Cutting-edge research is shared by the top universities and commercial companies, as well as by exceptionally gifted independent researchers. This year was no disappointment, with talks on everything from financial fraud to hardware hacking through electromagnetic fault injection.
We were honoured to be accepted to talk about our research and secret scanning tool, Secret Magpie.
Secret Magpie is a completely free and open-source tool that aims to solve a very difficult problem.
Right now, hackers all over the world are looking for accidentally leaked secrets such as passwords and keys. These secrets can be found in websites, application software packages, container images and source control systems. Once attackers find them, they quickly use them to launch ransomware and cryptomining attacks.
What’s a cryptomining attack?
Attackers want to turn their efforts into money, and cryptomining is one of the easiest ways to do this. Once they have credentials for your cloud environment, such as AWS, they create servers which mine cryptocurrency. The more servers they can create, and it’s typically the really expensive servers, the more money they make in cryptocurrencies.
There are a lot of tools already doing amazing work to solve this problem. We love Gitleaks and Trufflehog.
The real issue is that scanning for secrets isn’t as simple a problem as it sounds like it should be!
What are the main issues with secret scanning?
1. Secret scanning has rubbish signatures or high false-positive rates
Secret scanning looks for patterns, and it’s very prone to false-positives. A typical API key may simply be 32 random characters, so to find them we need a pattern that finds 32 character strings. Unfortunately, that’s going to find a LOT of things that are not secrets.
People don’t like problems, so most tools are written to use patterns which produce a low number of false-positives. This essentially means they deliberately do not look for certain patterns, and therefore ignore some potential secrets. Even worse, some signatures rely on the detected secret sitting on the same code line as a keyword, such as ‘cloudflare’.
The solution is to write your own patterns and then handle the high count of false-positives.
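To make the trade-off concrete, here is a minimal sketch in Python. It is not taken from Secret Magpie, Gitleaks or Trufflehog: the two regular expressions, the `scan` helper and the sample lines are all made up for illustration. The broad pattern flags every 32-character token, including a harmless checksum, while the keyword-anchored pattern is quiet but misses a token that has no keyword on its line.

```python
import re

# Broad pattern (illustrative): any standalone run of exactly 32 alphanumeric characters.
BROAD = re.compile(r"(?<![A-Za-z0-9])[A-Za-z0-9]{32}(?![A-Za-z0-9])")

# Narrow pattern (illustrative): the same token, but only when the keyword
# 'cloudflare' appears earlier on the same line.
NARROW = re.compile(
    r"(?i)cloudflare.*?(?<![A-Za-z0-9])([A-Za-z0-9]{32})(?![A-Za-z0-9])"
)

# Made-up sample input: two token-like values and one MD5 checksum.
SAMPLE = """\
cloudflare_api_key = "9f8a7b6c5d4e3f2a1b0c9d8e7f6a5b4c"
token = "0a1b2c3d4e5f60718293a4b5c6d7e8f9"
checksum = "d41d8cd98f00b204e9800998ecf8427e"
"""


def scan(pattern, text):
    """Return a (line_number, token) pair for every match of pattern in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            # Use the capture group when the pattern has one, else the whole match.
            hits.append((lineno, match.group(match.lastindex or 0)))
    return hits


broad_hits = scan(BROAD, SAMPLE)    # all three lines, including the checksum
narrow_hits = scan(NARROW, SAMPLE)  # only the line containing the keyword

print(f"broad: {len(broad_hits)} hits, narrow: {len(narrow_hits)} hits")
```

The broad pattern is the noisy one, but it is also the only one that catches the unlabelled token on the second line. That is why a workflow for handling false-positives matters more than a quiet signature.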
2. When do we scan?
We can run our scanning tools manually, but this requires significant human effort. It makes a lot more sense to run the detection tools automatically.
We can run our scanning tools automatically in a few different scenarios, but they’ve each got their own issues (see the slide deck). We don’t ever get full coverage that forces someone to take an action.
… and then what happens when you write a new pattern to detect an API key for a service you have only just discovered you use?
Secret Magpie allows you to run a full manual scan of every code repository in your organisation in one go, which allows you to iteratively improve your own patterns and retest them.
You can see our full slide deck below, and check out Secret Magpie on GitHub.
We are more interested in secrets which are never meant to be shared at all.
Secrets in public repositories are a huge risk as any attacker can find them, but a lot of organisations have awful hygiene in private repositories.
These repositories are cloned onto developer laptops, servers and CI/CD systems.
… but has it been deleted?
Gitleaks and Trufflehog are two of our favourite open-source tools for detecting secrets.
Gitleaks is really fast, and Trufflehog has extra features. Both are great for scanning a single repository.
OK, so we want to stop secrets entering the code base. When should we scan?
Scanning on every commit can be done on the developer’s laptop, but it can be turned off.
Scanning on push to GitHub or GitLab is a great option, but developers will ignore the results of the check.
Scanning on Pull Request is the most common approach. But what if a Pull Request is never created for a branch of code?
In this example, you can mark all the detections in a single file as false positives.
By following this flow, 90% of false positives can be filtered out in a few minutes.
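The idea of dismissing every detection in one file at once can be sketched in a few lines of Python. This is an illustration only, not Secret Magpie's actual data model or code: the `Detection` fields, the `drop_files` helper and the sample results are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One scanner finding. Field names are hypothetical, chosen for this sketch."""
    repository: str
    file: str
    line: int
    secret: str


def drop_files(detections, false_positive_files):
    """Remove every detection whose (repository, file) pair was marked as a false positive."""
    return [
        d for d in detections
        if (d.repository, d.file) not in false_positive_files
    ]


# Made-up scan results: a lock file full of hashes and one genuine-looking finding.
detections = [
    Detection("web-app", "package-lock.json", 12, "hash-one"),
    Detection("web-app", "package-lock.json", 48, "hash-two"),
    Detection("web-app", "package-lock.json", 97, "hash-three"),
    Detection("web-app", "config/settings.py", 5, "possible-api-key"),
]

# A reviewer marks the whole lock file as noise with a single decision.
marked = {("web-app", "package-lock.json")}
remaining = drop_files(detections, marked)

print(f"{len(detections) - len(remaining)} dismissed, {len(remaining)} left to review")
```

Filtering at file level rather than per detection is what makes a noisy pattern manageable: a single decision removes a whole cluster of matches.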
… Or catch the recording here: