CRESTDrive Preparation Guide

Many examinations allow you to bring in a limited amount of exam material. This guide will take you through making the most of the available storage.



Cybersecurity examinations and certifications can be some of the most difficult and yet most valuable things faced within this industry, and their difficulty is only made worse when you are forbidden from accessing the internet during your examination! As Cybersecurity professionals, we become accustomed to having a huge arsenal of tooling at our disposal at all times, and we quickly begin to learn and remember all of the resources that we depend upon in our day to day activities. So, when these examinations force us to collect all of these resources locally, it can be a very unpleasant surprise just how quickly the file sizes add up.

This guide will focus on optimising general purpose tooling and resources, such as CyberChef and the HackTricks book, to fit within the constraints of CREST’s examinations, which allow you to bring 100MB of content into the exam through the “CRESTDrive” service. We will also cover more specific material, such as manuals for database platforms like MongoDB, as the MongoDB Manual is particularly large in its uncompressed form, sitting at almost 700MB of HTML!

⚠️ The content of this guide assumes you are running on a Linux platform, and gives snippets of scripts using the Bash scripting language.

⚠️ If you are on a non-Linux platform, such as Windows or MacOS, Docker can be a very useful platform for running these commands.
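For example, a minimal sketch of such a setup, assuming a Unix-like shell and that Docker is installed (the image tag and package names are illustrative, not prescriptive):

docker run --rm -it -v "$PWD":/work -w /work ubuntu:24.04 bash
# Inside the container, install the tools used throughout this guide
apt-get update && apt-get install -y p7zip-full zstd webp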

CREST very helpfully provides AMIs (Amazon Machine Images) for the examination machines used within the CCT-APP and CCT-INF Practical examinations, meaning that you can validate the produced archives for compatibility before you go into the exam. We strongly recommend doing this, especially if you make use of newer or more exotic formats or compression algorithms, as the AMIs may not always have the latest software versions installed.

Methodology

First, we will cover the methodology we will use for maximising the amount of value we bring into these examinations. This methodology is not specific to any particular manual, tool or resource you may want to take into the exam, and so can be applied to resources not even mentioned here.

The primary goal of what we are doing is to fit as much value as possible into the 100MB limit that CRESTDrive gives us, and the most important part of this is understanding what “value” means in this context. The goal here isn’t to bring in perfect copies of the materials: we will be making heavy use of modern lossy compression algorithms, as well as deleting large amounts of assets within some of these resources, to cut down the file size as much as possible.

To illustrate why we are doing this, we will use the MongoDB manual as an example. As a modern NoSQL database, MongoDB is a well-loved component for examination providers to include within their syllabi. Compared to SQL-oriented database engines, there is a lot less tooling around automated attacks against NoSQL databases, and in general people have a lot less internal knowledge of the specific vulnerabilities that can exist within applications leveraging MongoDB. For this reason, taking in a full copy of the MongoDB manual can be a very tempting thing to do, but the MongoDB manual, when downloaded, is extremely large!

32K     ico
313K    woff
331K    eot
800K    ttf
2M      svg
7M      png
72M     woff2
695M    html

As you can see, the bulk of the MongoDB manual is composed of its HTML content, along with 72MB of fonts. The fonts are easy enough to discard, but the HTML content is much trickier! It may be tempting to just use GZIP compression over the HTML files, but this doesn’t help much.
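Discarding the fonts, for instance, is a single command; a hedged sketch, with the extensions taken from the listing above and run from the manual’s directory:

find . -type f \( -iname "*.woff" -o -iname "*.woff2" -o -iname "*.eot" -o -iname "*.ttf" \) -delete

As for the GZIP attempt, this is what it leaves us with: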

[...]
103M    gz

Even with GZIP compression, the size of the HTML content of the MongoDB manual remains over 100MB, so it simply will not fit within our CRESTDrive budget. So, what is the solution? To apply much more creative compression techniques! With the techniques discussed here, you will be able to achieve something like this:

[...]
40M     zst

For our final archive of all of our exam resources, we will be using 7z with LZMA compression. LZMA as a compression algorithm strikes the perfect balance of compatibility, compression ratio and decompression speed for our more general compression needs. In particular, we will be compressing our content with a command like this one:

7z a -mx=9 -m0=lzma -md=64m ExamPreparation.7z ExamPreparation/

To give a quick breakdown of what this command is doing:

  • 7z - Invokes the 7z command line tool
  • a - Adds content to an archive
  • -mx=9 - Tells 7z to use the highest compression level, spending the most time and achieving the best compression ratios
  • -m0=lzma - Explicitly sets the compression algorithm to LZMA
  • -md=64m - Instructs 7z to use up to 64MB for its dictionary encoding
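On the exam machine (or when validating against the CREST AMIs mentioned earlier), the archive can then be tested and unpacked with the standard 7z verbs, for example:

7z t ExamPreparation.7z   # verify the archive is intact
7z x ExamPreparation.7z   # extract it into the current directory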

CyberChef

ℹ️ Whilst we strongly recommend following this guide to produce the most up to date copy of CyberChef, for your convenience, we have produced a highly compressed copy of the CyberChef application as of 8th April 2025. This file can be downloaded here. CyberChef is being redistributed per the terms of the Apache 2.0 License.

We’ll start with CyberChef, the first tool we’ll cover. CyberChef is a really easy one to pull down locally, and it requires very minimal manual effort to get into a good state to compress and keep offline. As of the time of writing this guide, the latest version of CyberChef sits at 43MB in the ZIP form we can get from GitHub. Taking a look at the composition of the contents of this archive, we can observe that the bulk of it is JavaScript.

1K      ico
75K     html
85K     fnt
125K    txt
236K    png
345K    ttf
643K    css
10M     gz
31M     js

CyberChef compresses really well out of the box: a command such as 7z a -mx=9 -m0=lzma -md=64m CyberChef_v10.19.4.7z CyberChef_v10.19.4/ reduces CyberChef all the way down to only 15MB!
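For reference, the whole process can be scripted end to end; a hedged sketch, assuming the release asset follows CyberChef’s usual CyberChef_<version>.zip naming on GitHub (substitute the latest version number):

VERSION=v10.19.4
wget "https://github.com/gchq/CyberChef/releases/download/${VERSION}/CyberChef_${VERSION}.zip"
unzip "CyberChef_${VERSION}.zip" -d "CyberChef_${VERSION}/"
7z a -mx=9 -m0=lzma -md=64m "CyberChef_${VERSION}.7z" "CyberChef_${VERSION}/"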

The HackTricks Book

Compressing the book

ℹ️ Whilst we strongly recommend following this guide to produce the most up to date copy of HackTricks, for your convenience, we have produced a highly compressed copy of the HackTricks book as of 8th April 2025 with patched search functionality. This file can be downloaded here. The HackTricks book is being redistributed per the terms outlined within the license detailed in the welcome/hacktricks-values-and-faq.html file within the 7Z archive.

The HackTricks book is a little more involved! The HackTricks project does not distribute prebuilt copies of the book, meaning that we will need to build it ourselves. Unfortunately, the way the book is produced means that certain functionality, such as search, will be broken when used offline. We will cover exactly what patches you will need to make to ensure full functionality before going into your exam.

Building the HackTricks book can be achieved by following the instructions in the repository. Once the Docker container is running, you can use

docker logs -f hacktricks

to watch the build’s progress. Once you see the following line:

2025-03-30 22:42:40 [INFO] (warp::server): listening on http://0.0.0.0:3000

the book has been built, and it can be accessed from the “book” directory within the “hacktricks” folder you cloned down. Unfortunately, the book is very big, sitting at 376MB in total.

376M    book

Trying to compress this with the 7z command we decided on will not yield the compression ratio we need, with the resultant archive coming in at 135MB. When we break the book down by file size, we can see the reason why pretty clearly.

[...]
1M      zip
3M      pdf
7M      gif
13M     jpg
28M     html
44M     json
45M     js
109M    png
116M    raw

A very large portion of the HackTricks book is images, along with a pair of “raw” files! The raw files are sample data for a program called “SigDigger”, a digital signal analyser tool. These files are safe to delete, and doing so already saves us over 100MB!
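Deleting them is a one-liner; a hedged sketch, run from within the book directory and assuming the SigDigger samples are the only .raw files present:

find . -type f -iname "*.raw" -delete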

To deal with the images, we can exploit a quirk in how web browsers handle images. Modern web browsers do not care what file extension an image has when loading it; rather, they care about the actual file content and whether the browser can recognise it as a supported image format. For this reason, we can convert all of the images in place to a much smaller format. In this case, our format of choice is going to be lossy WebP, a well-supported, modern format able to achieve much smaller file sizes than even JPEG. We can use the following command to compress our images in place:

find . -type f \( -iname "*.jpg" -o -iname "*.jpeg" -o -iname "*.png" \) -exec cwebp -q 50 {} -o {} \;

ℹ️ The cwebp command can be found within the webp package of most Linux distributions.

Once we have converted all of our images in place to webp, and deleted the two raw files we have no use for, we can see the progress we have made by compressing the file with our 7z command again. This time, the result is just 34MB!
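Concretely, that final step is the same 7z invocation as before, run from the folder containing the built book (the archive name here is just illustrative):

7z a -mx=9 -m0=lzma -md=64m HackTricks.7z book/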

Patching the book

As mentioned earlier, the HackTricks book does not work properly offline in its current form. Most importantly, the search functionality is completely broken! This is pretty trivial to fix, requiring changes to two lines within theme/ht_searcher.js in the book’s built code. The following patch file can also be used if you are on a platform with easy access to the patch command.

--- theme/ht_searcher.js
+++ theme/ht_searcher.js
@@ -474,10 +474,10 @@
     var branch = lang === "en" ? "master" : lang
-    fetch(`https://raw.githubusercontent.com/HackTricks-wiki/hacktricks/refs/heads/${branch}/searchindex.json`)
+    fetch(`searchindex.json`)
         .then(response => response.json())
         .then(json => init(json))        
         .catch(error => { // Try to load searchindex.js if fetch failed
             var script = document.createElement('script');
-            script.src = `https://raw.githubusercontent.com/HackTricks-wiki/hacktricks/refs/heads/${branch}/searchindex.js`;
+            script.src = `searchindex.js`;
             script.onload = () => init(window.search);
             document.head.appendChild(script);
         });
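Saved alongside the built book (the file name here is just an example), the patch can be applied from within the book directory like so:

patch -p0 < fix-ht-search.patch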

MongoDB Manual

As covered in the Methodology section, the MongoDB manual is extremely big, containing almost 700MB of HTML content, but it is possible to get this down to just 36MB for the HTML content!

How? By using ZSTD and training a ZSTD dictionary using the HTML content of the MongoDB manual.

But first, let’s try to compress the HTML content using ZSTD without a dictionary. With the latest version of the MongoDB manual downloaded and extracted, we can use the following command within the directory to achieve this.

find . -type f -name "*.html" -exec zstd --ultra -22 --long=31 {} -o {}.zst \;

ℹ️ The zstd command can be found within the zstd package of most Linux distributions.

And then, looking at file sizes, we see that it’s…

82M     zst

82MB is way too much for this; with the 100MB limit we have for CRESTDrive, it is unacceptable to spend that much on only a single manual. But there is a way we can do better. If we train a ZSTD dictionary on the HTML content of the directory, we can halve this figure. The following command will allow us to train a dictionary. We’re using a dictionary size of 512000 bytes, despite ZSTD’s recommendation that the dictionary be around 100x smaller than the content you wish to train it on. This is a surprisingly fast process!

zstd --train ./**/*.html -o dict.zst --maxdict=512000

Now, with our dictionary in hand, we can recompress, this time, passing the -D dict.zst argument to zstd.

find . -type f -name "*.html" -exec zstd --ultra -22 --long=31 -D dict.zst {} -o {}.zst \;

Looking again, we see that now the file size is down to just 40MB!

40M     zst

Note that you must keep a copy of the dictionary in order to decompress the content. The following command can be used to recursively decompress the ZSTD compressed files within the current directory:

find . -name '*.zst' -exec sh -c 'unzstd -D dict.zst "$0" -o "${0%.*}"' {} \;

Helpful Scripts and Commands

Get the size of a directory

ℹ️ Note: the -b is necessary in case you’re using a filesystem with compression; otherwise du will return the compressed (on-disk) size of the files.

du -bhd0 <directory>

Recursively compress common image types within a directory in place using cwebp

#!/usr/bin/env bash

if [ -z "$1" ]; then
  echo "Usage: $0 <directory>"
  exit 1
fi

# Convert JPEG and PNG images to lossy WebP in place, keeping the original file names
find "$1" -type f \( -iname "*.jpg" -o -iname "*.jpeg" -o -iname "*.png" \) -exec cwebp -q 50 {} -o {} \;

Aggregate the size of all files within a directory, grouped by file extension

#!/usr/bin/env bash

if [ -z "$1" ]; then
  echo "Usage: $0 <directory>"
  exit 1
fi

# du -b prints "<size in bytes>\t<path>" for every file under the directory
find "$1" -type f -exec du -b {} \; | awk -F'\t' '
BEGIN{
    split("K M G", suffixes, " ")
}

# Convert a byte count into a rough human-readable figure (e.g. 2048 -> 2K)
function human_readable(bytes) {
    s=0;
    while (bytes >= 1024) {
        bytes /= 1024;
        s++
    }
    return int(bytes) suffixes[s]
}

# Strip everything up to and including the final dot, leaving the extension
function extension(path) {
    sub(".*\\.", "", path);
    return path
}

{
    sizes[extension($2)]+=$1
}

END {
    for (ext in sizes) {
        print(human_readable(sizes[ext]) "\t\t" ext)
    }
}' | sort -h

Author

Alex Brozych

- DevSecOps consultant -

Alex is an experienced and enthusiastic developer, with an in-depth knowledge of both web and traditional programming languages.
