Cpplumber
Cpplumber is a static analysis tool based on clang that helps detect and keep track of C and C++ source code information that leaks into compiled executable files.
This tool is aimed at people developing software that may contain sensitive information in some debug or private configurations and who want to make sure it doesn’t accidentally ship in release builds. It is also useful for people who simply want to make life harder for reverse engineers looking at their software.
The command-line interface is inspired by Cppcheck’s, to make it easier to get familiar with.
Installation
To function, Cpplumber requires libclang >=10.0.0 to be properly installed on the system.
Linux
On Linux distributions, you can install clang and libclang using your package manager. For example, on Ubuntu with apt:
sudo apt install clang libclang-dev
Windows
On Windows, simply install LLVM using a pre-built installer (e.g., LLVM-*-win64.exe) from the LLVM project’s release page and make sure that libclang.dll is accessible from the PATH environment variable.
Getting started
First test
Here is some simple C code:
#include <stdio.h>

int main()
{
    printf("Magic number: %d\n", 1337);
    return 0;
}
If you save that into file1.c and compile it into an executable:
clang file1.c
And then execute:
# Note: Might be `a.exe` on Windows
cpplumber --bin a.out file1.c
The output from Cpplumber should look something like this:
"Magic number: %d\n" (string literal) leaked at offset 0x14f20 in "/full/path/to/a.out" [declared at /full/path/to/file1.c:5]
It basically tells you that a string literal declared in file1.c at line 5 has been found in the executable file a.out at offset 0x14f20.
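To make the report concrete, here is a minimal Python sketch of what "leaked at offset" means: the bytes of the literal, once escape sequences are resolved, appear verbatim somewhere in the executable. This is only an illustration and not Cpplumber's implementation; the helper name find_offsets and the fake binary contents are made up for the example.

```python
def find_offsets(haystack: bytes, needle: bytes) -> list[int]:
    """Return every offset at which `needle` occurs in `haystack`."""
    offsets = []
    start = haystack.find(needle)
    while start != -1:
        offsets.append(start)
        start = haystack.find(needle, start + 1)
    return offsets


# In the C source, "\n" is an escape sequence; in the binary it is the single byte 0x0A.
literal = "Magic number: %d\n".encode("utf-8")

# Hypothetical stand-in for the contents of a compiled executable.
fake_binary = b"\x7fELF" + b"\x00" * 12 + literal + b"\x00"

for offset in find_offsets(fake_binary, literal):
    print(f"literal found at offset {hex(offset)}")
```

The same literal can occur several times in a binary, which is why the helper returns a list rather than a single offset.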
Checking all files in a folder
On all platforms, you can use glob expressions to include all files in a folder:
cpplumber --bin a.out "src/*"
Note: quotes are important if you’re using a terminal that handles glob expressions itself.
Checking files matching a given file filter
You can also pass more specific glob filters, for example to include only files with given extensions:
cpplumber --bin a.out "src/**/*.cc" "src/**/*.h"
Ignoring certain leak types
Cpplumber can currently report leaks for string literals and class/struct names. However, it’s possible to ignore a given type of leak with a command-line argument:
# Ignore leaks of string literals
cpplumber --bin a.out --ignore-string-literals "src/*"
# Ignore leaks of class and struct names
cpplumber --bin a.out --ignore-struct-names "src/*"
Specifying a minimum size for leaks
By default, Cpplumber ignores leaks strictly smaller than 4 bytes. This reduces the chance of reporting false positives and of cluttering the reports with useless data.
However, it’s possible to modify this behavior if needed:
cpplumber --bin a.out --minimum-leak-size 6 "src/*"
This tells Cpplumber to ignore potential leaks that are smaller than 6 bytes. Keep in mind that the size is in bytes, so, for example, a single UTF-32 character counts as a 4-byte leak.
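Because the threshold is in bytes, the same text can fall on either side of it depending on its encoding. The Python sketch below illustrates this with the default threshold of 4. It assumes the size is that of the encoded data without a terminator, which matches the UTF-32 example above but is otherwise an assumption; encoded_size is a hypothetical helper.

```python
MINIMUM_LEAK_SIZE = 4  # default threshold described above, in bytes


def encoded_size(text: str, encoding: str) -> int:
    """Size in bytes of `text` once encoded, without a terminator or BOM."""
    return len(text.encode(encoding))


samples = [("A", "utf-8"), ("A", "utf-16-le"), ("A", "utf-32-le"), ("abc", "utf-8")]
for text, encoding in samples:
    size = encoded_size(text, encoding)
    # Leaks strictly smaller than the threshold are ignored.
    verdict = "reported" if size >= MINIMUM_LEAK_SIZE else "ignored"
    print(f"{text!r} as {encoding}: {size} bytes -> {verdict}")
```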
Importing a project
CMake
To make things easier, it is possible to specify a compilation database to fetch source file paths and compiler arguments from. With CMake, you can generate a compilation database like so:
cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .
A compile_commands.json file should have been created in the current folder.
You can now run cpplumber with the appropriate argument:
cpplumber --bin a.out --project=compile_commands.json
To ignore certain files or folders you can use suppression filters (see the next section for details).
Visual Studio
Support for Visual Studio’s project and solution files is not yet implemented. In the meantime, you can use tools like Clang Power Tools to manually generate compilation databases from projects and solutions.
Suppressions
Suppression files are YAML configuration files that can be used to prevent some files or artifacts from generating leak reports. Here’s a simple example:
# Files to ignore (can include glob expressions)
files:
- "*\\file2.cc"
- "*\\extern\\*"
# Artifacts to ignore
artifacts:
- nonsensitive_c_string
- nonsensitive_utf32_string
To specify a suppression file:
cpplumber --bin a.out --suppressions-list suppressions.yml "src/*"
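To get a feel for what the file patterns in the example cover, the Python sketch below matches them against a few made-up Windows-style paths with the standard fnmatch module. This is only an approximation: Cpplumber's own glob engine may behave differently (case sensitivity, handling of path separators), and the is_suppressed helper and the paths are hypothetical.

```python
from fnmatch import fnmatchcase

# The two patterns from the suppression file above. In double-quoted YAML,
# "\\" denotes a single backslash, so these target Windows-style paths.
file_patterns = ["*\\file2.cc", "*\\extern\\*"]


def is_suppressed(path: str) -> bool:
    """True if `path` matches at least one of the glob patterns."""
    return any(fnmatchcase(path, pattern) for pattern in file_patterns)


# Hypothetical paths, for illustration only.
for path in [
    "C:\\proj\\src\\file2.cc",
    "C:\\proj\\extern\\zlib\\inflate.c",
    "C:\\proj\\src\\file1.cc",
]:
    print(path, "->", "suppressed" if is_suppressed(path) else "kept")
```

On Linux or macOS, the equivalent patterns would presumably use forward slashes instead.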
Reporting leaks from system headers
By default, Cpplumber ignores potentially leaking data coming from system headers as it’s most likely nonsensitive data. It’s possible to tell Cpplumber to do otherwise with a command-line argument:
cpplumber --bin a.out --report-system-headers "src/*"
Suppressing multiple reports for a single artifact
In larger projects, and especially for string literals, it may happen that multiple declarations exist that lead to a single leak in the compiled binary, or that the same string literal is found multiple times in the target binary.
By default, Cpplumber keeps track of and reports all source-to-binary correspondences, but it’s possible to force it to generate a single report for each leaked artifact like so:
cpplumber --bin a.out --ignore-multiple-locations "src/*"
This allows generating reports that give a compact overview of the unique data leaks that happen across a project.
JSON output
Cpplumber can generate its output in JSON format. You can use the --json argument for that:
cpplumber --json --bin a.out "src/*"
Here’s an example of a JSON report generated by Cpplumber:
{
  "version": {
    "executable": "0.1.0",
    "format": 1
  },
  "leaks": [
    {
      "data_type": "StringLiteral",
      "data": "sensitive_utf8_string",
      "location": {
        "source": {
          "file": "/full/path/to/file.cc",
          "line": 4
        },
        "binary": {
          "file": "/full/path/to/a.out",
          "offset": 86326
        }
      }
    }
  ]
}
The version object
The version object contains information that can help contextualize and parse the rest of the report.
- executable: Version of the cpplumber executable that generated the report
- format: Version of the report’s format
The leaks object
- data_type: The kind of data that has been leaked. Can be StringLiteral, StructName or ClassName.
- data: The data that has been leaked, as declared in the source code
- location: An object that contains two sub-objects which indicate where the leaked data is located, in the source code and in the binary file.
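As a minimal sketch of how such a report might be consumed, the Python snippet below parses the example report with the standard json module and prints one line per leak. It relies only on the fields documented above; the summarize helper is hypothetical, and the format check simply assumes that version 1 is the layout shown here.

```python
import json

# The example report shown above, compacted.
report_text = """
{
  "version": {"executable": "0.1.0", "format": 1},
  "leaks": [
    {
      "data_type": "StringLiteral",
      "data": "sensitive_utf8_string",
      "location": {
        "source": {"file": "/full/path/to/file.cc", "line": 4},
        "binary": {"file": "/full/path/to/a.out", "offset": 86326}
      }
    }
  ]
}
"""


def summarize(report: dict) -> list[str]:
    """Turn each leak entry into a one-line, human-readable summary."""
    lines = []
    for leak in report["leaks"]:
        source = leak["location"]["source"]
        binary = leak["location"]["binary"]
        lines.append(
            f'{leak["data_type"]} {leak["data"]!r} '
            f'declared at {source["file"]}:{source["line"]}, '
            f'found at offset {hex(binary["offset"])} in {binary["file"]}'
        )
    return lines


report = json.loads(report_text)
# Refuse to interpret a layout this sketch was not written for.
if report["version"]["format"] != 1:
    raise SystemExit("unexpected report format version")
for line in summarize(report):
    print(line)
```

Note that the offset is a decimal integer in the JSON report, whereas the text output shows it in hexadecimal.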
Caveats
Cross-platform binary analysis
The size of wide characters is currently resolved automatically. On Linux and Mac, Cpplumber assumes that a wchar_t is 4 bytes long (and encoded as UTF-32), and on Windows, it assumes that a wchar_t is 2 bytes long (and encoded as UTF-16).
So keep in mind that analyzing an EXE file on Linux might miss leaks of wide strings.
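The Python sketch below shows why such leaks can be missed: the same wide string has a different byte representation under each assumption, so searching for the UTF-32 form in a binary that embeds the UTF-16 form finds nothing. Little-endian byte order and the fake executable contents are assumptions made for the example.

```python
wide_literal = "secret"

as_utf16 = wide_literal.encode("utf-16-le")  # 2 bytes per character for this text
as_utf32 = wide_literal.encode("utf-32-le")  # 4 bytes per character

# Hypothetical stand-in for a Windows executable that embeds L"secret".
fake_exe = b"MZ" + b"\x00" * 30 + as_utf16 + b"\x00\x00"

print("UTF-16 size:", len(as_utf16), "bytes")
print("UTF-32 size:", len(as_utf32), "bytes")
print("UTF-16 form found:", as_utf16 in fake_exe)
print("UTF-32 form found:", as_utf32 in fake_exe)
```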
Scaling
Cpplumber depends on the clang crate to use libclang, and it’s not currently possible to parallelize source file parsing with that crate. Expect Cpplumber to take a few minutes to parse source files on larger projects.