# Compiler security tools

The LLVM compilers support several tools and features for improving
the security and reliability of program code.

## Sanitizer support

Not all sanitizers are supported on all targets. The following table lists
which sanitizers are supported on which targets.

**Table 8-1 Sanitizer support**

| **Sanitizer** | **AARCH64-Linux** | **AARCH64-Android** | **Arm-Linux** | **Arm-Android** |
| --- | --- | --- | --- | --- |
| Address | ✓ | ✓ | ✓ | ✓ |
| Data flow | ✓ | X | X | X |
| Leak | ✓ | X | X | X |
| Memory | ✓ | X | X | X |
| Thread | ✓ | X | X | X |
| Undefined<br>behavior | ✓ | X | ✓ | X |

### Special case lists

The behavior of the sanitizers can be controlled for certain
source-level entities (such as functions) by providing a special file
at compile-time. This file is called a *special case list*.

Special case lists are used to do the following things:

- Speed up time-critical functions that are already known to be correct
- Ignore functions that perform low-level operations (such as
traversing thread stacks, which bypasses the stack frame boundaries)
- Ignore functions with known problems

To create a special case list, create a text file that lists the
source-level entities to be ignored. Then pass this file to the
compiler with the [-fsanitize-blacklist](https://docs.qualcomm.com/doc/80-VB419-99/topic/use_the_compilers.html#option-fsanitize-blacklist)
option.

Example case list:

    # Disable checks in function and source file
    fun:my_func
    src:my_file

Each line in a special case list file has the following syntax:

*entity*:*regexp*[=*category*]

Where:

- *entity* specifies the type of source-level entity. It has the
following possible values:

    - src - Source file
    - fun - Function
    - global - Global variable (ASan only)
    - type - Class or structure type (ASan only)

    global and type are specific to the Address Sanitizer. They are used
to suppress error reports for out-of-bound accesses to the specified
global symbols, or to instances of the specified class or structure
type.
- *regexp* specifies a regular expression that specifies the entity
name.
- *category* optionally specifies a category value to associate with
the entity. Category values are specific to each sanitizer.

Empty lines and lines starting with # are ignored. Entity names are
matched as regular expressions, with one exception: \* is treated as a
shell wildcard (equivalent to .\*). For example:

    # Lines starting with # are ignored.
    # Turn off checks for the source file (use absolute path or path relative
    # to the current working directory):
    src:/path/to/source/file.c
    # Turn off checks for a particular function (use mangled names):
    fun:MyFooBar
    fun:_Z8MyFooBarv
    # Extended regular expressions are supported:
    fun:bad_(foo|bar)
    src:bad_source[1-9].c
    # Shell like usage of * is supported (* is treated as .*):
    src:bad/sources/*
    fun:*BadFunction*
    # Specific sanitizer tools may introduce categories.
    src:/special/path/*=special_sources
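The matching rule described above can be approximated in a few lines of C++. This is a minimal sketch, not the compiler's implementation: `pattern_to_regex` and `entity_matches` are hypothetical helper names, and the sketch assumes that a pattern must match the whole entity name.

```cpp
#include <regex>
#include <string>

// Rewrite each '*' as ".*" and leave every other character untouched,
// so the rest of the pattern keeps its regular-expression meaning.
std::string pattern_to_regex(const std::string &pattern) {
    std::string out;
    for (char c : pattern) {
        if (c == '*') {
            out += ".*";
        } else {
            out += c;
        }
    }
    return out;
}

// Assumption: the pattern has to match the complete entity name.
bool entity_matches(const std::string &pattern, const std::string &name) {
    const std::regex re(pattern_to_regex(pattern), std::regex::extended);
    return std::regex_match(name, re);
}
```

With these helpers, `entity_matches("*BadFunction*", "MyBadFunctionImpl")` is true, while `entity_matches("bad_(foo|bar)", "bad_baz")` is false.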

### Usage on Android

Generating an Android LLVM executable with sanitizer instrumentation
requires the following items:

- The Android NDK (for its linker)
- sysroot (for building the executable)

Once the executable is built, push the executable, the sanitizer
runtime library, and the LLVM Symbolizer ([LLVM Symbolizer](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-llvm-symbolizer))
to an Android device. The sanitizer runtime library is a shared object
that must be preloaded into the executable when launched.

The shared object can be found under the LLVM release tools
installation directory:

    export INSTALL_PREFIX=<LLVM_release_tools_install_dir>
    file $INSTALL_PREFIX/lib/clang/*/lib/linux/libclang_rt.<xsan>-arm-android.so

Note

*xsan* specifies a sanitizer library (asan, msan, and so on).

#### Example

Choose one of the examples that are provided in the individual
sanitizer sections (see [Usage](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-address-sanitizer-usage)).

1. Build a C/C++ executable with the sanitizer instrumentation:

        mkdir -p out
        $INSTALL_PREFIX/bin/clang++ -target arm-linux-androideabi -g \
                                    -fsanitize=<san_opt> boom.cc -o out/boom \
                                    --sysroot=<Android_ARM_sysroot> \
                                    --gcc-toolchain=<Android_NDK_toolchain>

    Note

    *san\_opt* specifies a sanitizer option value (address, memory, and so on).

2. Push the executable, sanitizer runtime library, and symbolizer to Android device (Jellybean or later):

        adb push out/boom /data/data/
        adb push $INSTALL_PREFIX/lib/clang/*/lib/linux/libclang_rt.<xsan>-arm-android.so /data/data/
        adb push $INSTALL_PREFIX/arm-linux-androideabi/llvm-symbolizer /data/data/

    Note

    *xsan* specifies a sanitizer library (asan, msan, and so on).

3. Run the sanitizer-instrumented executable:

        adb shell "<san_path>_SYMBOLIZER_PATH=/data/data/llvm-symbolizer \
                   LD_PRELOAD=/data/data/libclang_rt.<xsan>-arm-android.so /data/data/boom"

    Note

    san\_path and xsan specify a sanitizer path variable (ASAN, MSAN, and so on) and library (asan, msan, and so on).

Include the symbolizer in the argument string (as shown in these
steps) only if the sanitizer you are using requires a symbolizer to
resolve the symbol names.

If running the command produces the error `CANNOT LINK EXECUTABLE: could not load library`,
try exporting LD\_LIBRARY\_PATH:

    adb shell "export LD_LIBRARY_PATH=/data/data/ ; \
               <san_path>_SYMBOLIZER_PATH=/data/data/llvm-symbolizer \
               LD_PRELOAD=/data/data/libclang_rt.<xsan>-arm-android.so \
               /data/data/boom"

### Usage on Linux

1. Build a C/C++ executable with sanitizer instrumentation:

        $INSTALL_PREFIX/bin/clang++ -target arm-linux-gnueabi --sysroot=<Linux_ARM_sysroot> \
                                    --gcc-toolchain=<Linux_ARM_toolchain> -g -fsanitize=<san_opt> \
                                    boom.cc -o boom

2. Run the sanitizer-instrumented executable on an Arm Linux system:

        <san_path>_SYMBOLIZER_PATH=$INSTALL_PREFIX/arm-linux-gnueabi/llvm-symbolizer ./boom

    Note

    san\_opt and san\_path specify a sanitizer option value
    (address, memory, and so on) and path variable (ASAN, MSAN, and so on).

    Include the symbolizer (as shown in these steps) only if the
    sanitizer you are using requires a symbolizer to resolve the symbol
    names.

## Address Sanitizer

The LLVM compiler release includes a tool named *Address Sanitizer*
(ASan), which can be used to detect memory errors in C and C++ code.

ASan detects the following memory errors:

- Out-of-bounds accesses to heap, stack, and globals
- Use-after-free
- Use-after-return (to a certain extent)
- Double-free, invalid free
- Memory leaks (experimental)

ASan is a runtime tool that requires compile-time instrumentation of
the code, and a dedicated runtime library. If ASan encounters a bug
during the execution of a program, it halts the execution and
displays (on stderr) an error message and stack trace.

Note

A program instrumented with ASan typically runs 2x slower.

### Usage

To use ASan, you must instrument your C/C++ code and generate an
Android/Linux executable.

To instrument your C/C++ code with ASan, add the following options to
both the compile and link options in LLVM:

    -g -fsanitize=address

The ASan runtime library must be linked to the final executable—be
sure to use clang (not ld) for the final link step.

When linking shared libraries, the ASan runtime is not linked, so
-Wl,-z,defs might cause link errors (do not use it with ASan). To get
reasonable performance, add -O1 or higher. To get nicer stack traces
in error messages, add -fno-omit-frame-pointer. To get perfect stack
traces, it might be necessary to disable inlining (only use -O1) and
tail call elimination (-fno-optimize-sibling-calls). For example:

    % cat example_UseAfterFree.cc
    int main(int argc, char **argv) {
       int *array = new int[100];
       delete[] array;
       return array[argc]; // BOOM
    }

    # Compile and link in one step (the C++ driver links the C++ runtime):
    % clang++ -O1 -g -fsanitize=address -fno-omit-frame-pointer example_UseAfterFree.cc

    # Or compile and link separately:
    % clang++ -O1 -g -fsanitize=address -fno-omit-frame-pointer -c example_UseAfterFree.cc
    % clang++ -g -fsanitize=address example_UseAfterFree.o

If a bug is detected, the program will print an error message to
stderr and exit with a non-zero exit code. ASan exits on the first
detected error. This is by design:

- This approach enables ASan to produce faster and smaller generated
code (both by approximately 5%).
- Fixing bugs becomes unavoidable. ASan does not produce false alarms.
Once memory is corrupted the program is in an inconsistent state,
which can lead to confusing results and potentially misleading
subsequent reports.

### Symbolize reports

To make ASan symbolize its output, you must set the
ASAN\_SYMBOLIZER\_PATH environment variable to point to the
llvm-symbolizer binary (or alternatively ensure that
llvm-symbolizer is in the $PATH). For example:

    % ASAN_SYMBOLIZER_PATH=/usr/local/bin/llvm-symbolizer ./a.out
    ==9442== ERROR: AddressSanitizer heap-use-after-free on address
    0x7f7ddab8c084 at pc 0x403c8c bp 0x7fff87fb82d0 sp 0x7fff87fb82c8
    READ of size 4 at 0x7f7ddab8c084 thread T0
       #0 0x403c8c in main example_UseAfterFree.cc:4
       #1 0x7f7ddabcac4d in __libc_start_main ??:0
    0x7f7ddab8c084 is located 4 bytes inside of 400-byte region [0x7f7ddab8c080,0x7f7ddab8c210)
    freed by thread T0 here:
       #0 0x404704 in operator delete[](void*) ??:0
       #1 0x403c53 in main example_UseAfterFree.cc:4
       #2 0x7f7ddabcac4d in __libc_start_main ??:0
    previously allocated by thread T0 here:
       #0 0x404544 in operator new[](unsigned long) ??:0
       #1 0x403c43 in main example_UseAfterFree.cc:2
       #2 0x7f7ddabcac4d in __libc_start_main ??:0
    ==9442== ABORTING

### Additional checks

ASan performs the following additional checks.

#### Initialization order checking

ASan can optionally detect dynamic initialization order problems,
when initialization of globals defined in one translation unit uses
globals defined in another translation unit. To enable this check at
runtime, set the environment variable:

ASAN\_OPTIONS=check\_initialization\_order=1.
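The check reports cases where the dynamic initializer of one global reads a global from another translation unit that might not be initialized yet. One common way to remove such a dependency is to wrap the global in a function with a local static (often called "construct on first use"). The following is a minimal sketch of that idiom; `app_name` and `kAppNameLength` are hypothetical names, and both definitions are shown in one file only for brevity.

```cpp
#include <cstddef>
#include <string>

// Instead of exposing a global std::string defined in another
// translation unit, expose the value through a function. The local
// static is initialized the first time the function is called, so the
// order in which translation units are initialized no longer matters.
const std::string &app_name() {
    static const std::string name = "demo";
    return name;
}

// A global with a dynamic initializer that depends on app_name().
// This stays well defined even if it lives in a different translation unit.
const std::size_t kAppNameLength = app_name().size();
```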

#### Memory leak detection

For more information on memory leak detection in ASan, see [Section
8.4](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-leak-sanitizer).

### Issue suppression

ASan generally does not produce false positives, so if you see one,
look again. Most likely it is a true positive.

#### Suppress reports in external libraries

Runtime interposition allows ASan to find bugs in code that is not
being recompiled.

If you run into an issue in external libraries, we recommend
immediately reporting it to the library maintainer so that it is
resolved. However, you can use the following suppression mechanism to
unblock yourself and continue with testing. Use this suppression
mechanism only for suppressing issues in external code; it does not
work on code recompiled with ASan.

To suppress errors in external libraries, set the environment
variable ASAN\_OPTIONS to point to a suppression file. You can specify
either the full path to the file, or the path of the file relative to
the location of your executable.

For example:

    ASAN_OPTIONS=suppressions=MyASan.supp

Use the following format to specify the names of the functions or
libraries you want to suppress. You can see these in the error
report. Remember that the narrower the scope of the suppression, the
more bugs you will be able to catch.

    interceptor_via_fun:NameOfCFunctionToSuppress
    interceptor_via_fun:-[ClassName objCMethodToSuppress:]
    interceptor_via_lib:NameOfTheLibraryToSuppress

#### \_\_has\_feature(address\_sanitizer)

In some cases, you might need to execute different code depending on
whether ASan is enabled. The language extension \_\_has\_feature can be
used for this purpose. For example:

    #if defined(__has_feature)
    #  if __has_feature(address_sanitizer)
    // code that builds only under AddressSanitizer
    #  endif
    #endif

\_\_has\_feature is a function-like macro that accepts a single identifier
argument that is the name of a feature. It evaluates to 1 if the
feature is both supported by Clang and standardized in the current
language standard. If not, it evaluates to 0.

Another use of \_\_has\_feature is to check for compiler features not
related to the language standard (such as ASan itself).
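The feature check can also be folded into a constant that ordinary code can branch on. The following is a minimal sketch under stated assumptions: `kBuiltWithASan` and `stress_iterations` are hypothetical names, and the fallback definition means that compilers without this macro always take the non-ASan branch.

```cpp
// Clang defines __has_feature. Give other compilers a fallback that
// reports every feature as absent, so the check below always compiles.
#ifndef __has_feature
#  define __has_feature(x) 0
#endif

#if __has_feature(address_sanitizer)
constexpr bool kBuiltWithASan = true;
#else
constexpr bool kBuiltWithASan = false;
#endif

// Hypothetical helper: ASan-instrumented code runs more slowly, so a
// stress test might run fewer iterations in that configuration.
constexpr int stress_iterations() {
    return kBuiltWithASan ? 1000 : 100000;
}
```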

#### \_\_attribute\_\_((no\_sanitize("address")))

Some code should not be instrumented by ASan. To disable the
instrumentation of a particular function, use the following function
attribute:

    __attribute__((no_sanitize("address")))

Note

The no\_sanitize attribute has the deprecated synonyms no\_sanitize\_address and
no\_address\_safety\_analysis.
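Applied to a function, the attribute is written in front of the definition. The following is a minimal sketch; `checksum_bytes` is a hypothetical function chosen only for illustration, and the attribute affects this one function while the rest of the program is instrumented as usual.

```cpp
#include <cstddef>

// checksum_bytes is a hypothetical helper used only for illustration.
// The attribute asks the compiler not to add ASan checks to the body
// of this function.
__attribute__((no_sanitize("address")))
unsigned checksum_bytes(const unsigned char *data, std::size_t size) {
    unsigned sum = 0;
    for (std::size_t i = 0; i < size; ++i) {
        sum += data[i];
    }
    return sum;
}
```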

#### Blacklist—Runtime suppression

ASan supports the use of sanitizer special case lists to suppress
error reports in the specified source files or functions ([Section
8.1.1](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-special-case-lists)). It also defines the ASan-specific
entity types global and type for suppressing error reports on any
out-of-bound accesses to globals with certain names and types (you
can only specify class or structure types).

ASan defines a sanitizer-specific category, init, which can be used
in a case list to suppress error reports about initialization-order
problems occurring in certain source files or with certain global
variables. For example:

    # Suppress error reports for code in a file or in a function:
    src:bad_file.cpp
    # Ignore all functions with names containing MyFooBar:
    fun:*MyFooBar*
    # Disable out-of-bound checks for global:
    global:bad_array
    # Disable out-of-bound checks for global instances of a given class ...
    type:Namespace::BadClassName
    # ... or a given struct. Use wildcard to deal with anonymous namespace.
    type:Namespace2::*::BadStructName
    # Disable initialization-order checks for globals:
    global:bad_init_global=init
    type:*BadInitClassSubstring*=init
    src:bad/init/files/*=init

### Suppress memory leaks

If the Leak Sanitizer ([Section 8.4](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-leak-sanitizer)) is run as
part of ASan, any memory leak reports it generates can be suppressed
by a separate suppression file, which is passed to the LSan runtime
through the environment variable LSAN\_OPTIONS. For example:

    LSAN_OPTIONS=suppressions=MyLSan.supp

The specified text file (in this case, MyLSan.supp) contains one or
more lines of the following form:

`leak:<pattern>`

Memory leaks will be suppressed if any of the patterns specified
in this file match a function name, source filename, or library name
in the symbolized stack trace of the leak report. For details, see
[Section 8.4](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-leak-sanitizer).

### Limitations

- ASan uses more real memory than a native run. The exact overhead depends
on the allocation sizes: the smaller the allocations, the larger the
overhead.
- ASan uses more stack memory. Up to a 3x increase can occur.
- On 64-bit platforms, ASan maps (but does not reserve) 16+ terabytes
of virtual address space. Thus, tools like ulimit might not work as
expected.
- Static linking is not supported.

### Options

When you run the instrumented executable and ASan does not detect any
errors, there is no output. Conversely, when there is no output,
it can mean either that no errors occurred or that the executable was
not instrumented with the ASan runtime.

To verify that the executable is instrumented with ASan, use the
ASAN\_OPTIONS environment variable and verbosity=1 flag. Doing this
directs the ASan runtime to output a startup message when the
executable is launched. For example (in Bash):

    $ ASAN_OPTIONS=verbosity=1 ./myExe

If no output is generated while verbose mode is enabled, this implies
the executable was not instrumented with ASan. Check that the
-fsanitize=address option was passed to clang for *both* the
compilation step and the linking step.

ASan offers a variety of options for controlling the runtime behavior
and enabling or disabling its functionality. For example, if you are
running out of memory, set quarantine\_size=0. This causes ASan to
miss any use-after-free errors but still detect buffer-overflow
errors.

Similarly, if you are overflowing the stack, set redzone=0 to save
stack space. In this case, you will miss buffer-overflow errors, but
can still detect use-after-free errors.

You can specify multiple options by separating flags with a colon.
For example:

    $ ASAN_OPTIONS=log_path=my-asan-report:redzone=8

#### Descriptions

**verbosity**

> Be more verbose (mostly for testing the tool itself). (Default = 0)

**malloc\_context\_size**

> Number of frames in malloc/free stack traces (0-256). (Default = 30)

**redzone**

> Size of minimal redzone. (Default = 16)

**log\_path**

> Path to log files. If specified as log\_path=PATH, every process will
> write error reports to PATH.PID. (Default = stderr)

**sleep\_before\_dying**

> Sleep for the specified number of seconds before exiting the process
> on failure. (Default = 0)

**quarantine\_size**

> Size of quarantine (in bytes) for finding use-after-free errors.
> Lower values save memory but increase false negatives rate. (Default = 256 MB)

**exitcode**

> Call \_exit(exitcode) on error. (Default = 1)

**abort\_on\_error**

> If set to 1, on error call abort() instead of \_exit(exitcode).
> (Default = 0)

**strict\_memcmp**

> If set to 1, treat memcmp("foo", "bar", 100) as a bug. (Default = 1)

**alloc\_dealloc\_mismatch**

> If set to 1, check for mismatches between malloc()/new/new[] and
> free()/delete/delete[]. (Default = 1)

**handle\_segv**

> If set to 1, ASan installs its own handler for SIGSEGV. (Default = 1)

**allow\_user\_segv\_handler**

> If set to 1, allows you to override the SIGSEGV handler installed by
> ASan. (Default = 0)

**check\_initialization\_order**

> If set to 1, detect existing initialization order problems. (Default = 0)

**strip\_path\_prefix**

> If strip\_path\_prefix=PREFIX, remove the substring .\*PREFIX from the
> reported filenames. (Default = "")

### Notes

The ASan runtime library does not yet demangle symbols, but the LLVM
symbolizer can be used to demangle symbols ([Section 8.8](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-llvm-symbolizer)).

The ASan runtime library cannot be statically linked on Android. The
linker does not load the libc symbols before any others, as it does
on Linux systems. ASan relies on this feature to hijack symbols
before any other shared objects are loaded. Therefore, on Android it
is necessary to use the LD\_PRELOAD trick.

For more information on ASan, go to:
[clang.llvm.org/docs/AddressSanitizer.html](http://clang.llvm.org/docs/AddressSanitizer.html)

## Data Flow Sanitizer

The LLVM compiler release includes a tool named *Data Flow Sanitizer*
(DFSan), which can be used to perform generalized data flow analysis
on C and C++ code. Unlike the other sanitizers, DFSan is not designed
to detect a specific class of bugs on its own. Instead, it provides a
generic dynamic data flow analysis framework to be used by clients to
help detect application-specific issues within their own code.

### Usage

With no program changes, applying DFSan to a program will not alter
its behavior. To use DFSan, the program uses API functions to apply
tags to data to cause it to be tracked, and to check the tag of a
specific data item. DFSan manages the propagation of tags through the
program according to its data flow. The functions are defined in the
sanitizer/dfsan\_interface.h header file. For more information about
each function, see the header file.

### ABI list

DFSan uses a list of functions known as an *ABI list* to decide
whether a call to a specific function should use the operating
system’s native ABI, or whether it should use a variant of this ABI
that also propagates labels through function parameters and return
values.

The ABI list file also controls how labels are propagated in the
former case. DFSan comes with a default ABI list that is intended to
eventually cover the glibc library on Linux. But it might become
necessary to extend the ABI list when a particular library or
function cannot be instrumented (for example, because it is
implemented in assembly or another language that DFSan does not
support) or when a function is called from a library or function that
cannot be instrumented.

DFSan’s ABI list file uses the same format as a sanitizer special
case list ([Section 8.1.1](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-special-case-lists)). The pass treats
every function in the uninstrumented category in the ABI list file as
conforming to the native ABI. Unless the ABI list contains additional
categories for those functions, a call to one of those functions will
produce a warning message because the labeling behavior of the
function is unknown.

DFSan defines the sanitizer-specific categories discard, functional,
and custom to control the sanitizer behavior:

- discard - To the extent that this function writes to
(user-accessible) memory, it also updates labels in shadow memory
(this condition is trivially satisfied for functions that do not
write to user-accessible memory). Its return value is unlabeled.
- functional - Like discard, except the label of its return value is
the union of the label of its arguments.
- custom - Instead of calling the function, a custom wrapper, \_\_dfsw\_F,
is called, where F is the name of the function. This function might
wrap the original function or provide its own implementation.

    This category is generally used for functions that cannot be
    instrumented and that write to user-accessible memory, or for functions
    that have more complex label propagation behavior.

    The signature of \_\_dfsw\_F is based on that of F with each argument
    having a label of type dfsan\_label appended to the argument list. If
    F has a non-void return type, a final argument of type dfsan\_label \*
    is appended, to which the custom function can store the label for the
    return value.

    For example:

        void f(int x);
        void __dfsw_f(int x, dfsan_label x_label);

        void *memcpy(void *dest, const void *src, size_t n);
        void *__dfsw_memcpy(void *dest, const void *src, size_t n,
                            dfsan_label dest_label, dfsan_label src_label,
                            dfsan_label n_label, dfsan_label *ret_label);

If a function defined in the translation unit being compiled belongs
to the uninstrumented category, it will be compiled so as to conform
to the native ABI. Its arguments will be assumed to be unlabeled, but
it will propagate labels in shadow memory.

For example:

    # main is called by the C runtime using the native ABI.
    fun:main=uninstrumented
    fun:main=discard
    # malloc only writes to its internal data structures, not
    # user-accessible memory.
    fun:malloc=uninstrumented
    fun:malloc=discard
    # tolower is a pure function.
    fun:tolower=uninstrumented
    fun:tolower=functional
    # memcpy needs to copy the shadow from the source to the destination region.
    # This is done in a custom function.
    fun:memcpy=uninstrumented
    fun:memcpy=custom

### Example

The following program demonstrates label propagation by checking that
the correct labels are propagated.

    #include <sanitizer/dfsan_interface.h>
    #include <assert.h>

    int main(void) {
       int i = 1;
       dfsan_label i_label = dfsan_create_label("i", 0);
       dfsan_set_label(i_label, &i, sizeof(i));
       int j = 2;
       dfsan_label j_label = dfsan_create_label("j", 0);
       dfsan_set_label(j_label, &j, sizeof(j));
       int k = 3;
       dfsan_label k_label = dfsan_create_label("k", 0);
       dfsan_set_label(k_label, &k, sizeof(k));
       dfsan_label ij_label = dfsan_get_label(i + j);
       assert(dfsan_has_label(ij_label, i_label));
       assert(dfsan_has_label(ij_label, j_label));
       assert(!dfsan_has_label(ij_label, k_label));
       dfsan_label ijk_label = dfsan_get_label(i + j + k);
       assert(dfsan_has_label(ijk_label, i_label));
       assert(dfsan_has_label(ijk_label, j_label));
       assert(dfsan_has_label(ijk_label, k_label));
       return 0;
    }

For more information on DFSan, go to:
[clang.llvm.org/docs/DataFlowSanitizer.html](http://clang.llvm.org/docs/DataFlowSanitizer.html)

## Leak Sanitizer

The LLVM compiler release includes a tool named *Leak Sanitizer*
(LSan), which can be used to detect runtime memory leaks in C and C++
code.

LSan can be combined with the Address Sanitizer
([Address Sanitizer](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-address-sanitizer)) to enable both memory error and leak
detection, or it can be used as a standalone tool.

Note

LSan adds almost no performance overhead until the very end
of the process, when an extra leak detection phase is performed.

### Usage

To use LSan, simply build the program with the Address Sanitizer
([Section 8.2](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-address-sanitizer)). For example:

    $ cat memory-leak.c
    #include <stdlib.h>
    void *p;
    int main() {
       p = malloc(7);
       p = 0; // The memory is leaked here.
       return 0;
    }
    % clang -fsanitize=address -g memory-leak.c ; ./a.out
    ==23646==ERROR: LeakSanitizer: detected memory leaks
    Direct leak of 7 byte(s) in 1 object(s) allocated from:
       #0 0x4af01b in __interceptor_malloc /projects/compiler-rt/lib/asan/asan_malloc_linux.cc:52:3
       #1 0x4da26a in main memory-leak.c:4:7
       #2 0x7f076fd9cec4 in __libc_start_main libc-start.c:287
    SUMMARY: AddressSanitizer: 7 byte(s) leaked in 1 allocation(s).

To use LSan in standalone mode, link the program with the
-fsanitize=leak option. Be sure to use clang (not ld) for the link
step, to ensure that the proper LSan runtime library is linked into
the final executable.

For more information on LSan, go to:
[clang.llvm.org/docs/LeakSanitizer.html](http://clang.llvm.org/docs/LeakSanitizer.html)

## Memory Sanitizer

The LLVM compiler release includes a tool named *Memory Sanitizer*
(MSan), which can be used to detect the use of uninitialized memory
in C and C++ code.

MSan is a runtime tool that requires compile-time instrumentation of
the code, and a dedicated runtime library. If MSan encounters a bug
during the execution of a program, it halts the execution and
displays (on stderr) an error message and stack trace. In addition,
it can optionally display information on where the uninitialized
memory was originally allocated.

### Usage

To instrument your C/C++ code with MSan, add the following option to
both the compile and link options in LLVM:

`-fsanitize=memory`

The MSan runtime library must be linked to the final executable—be
sure to use clang (not ld) for the final link step.

When linking shared libraries, the MSan runtime is not linked, so
-Wl,-z,defs might cause link errors (do not use it with MSan). For
reasonable execution performance use -O1 or higher. For meaningful
stack traces in error messages, use -fno-omit-frame-pointer. For
perfect stack traces, you might need to disable inlining (only use
-O1) and tail call elimination (-fno-optimize-sibling-calls).

For example:

    % cat umr.cc
    #include <stdio.h>

    int main(int argc, char** argv) {
       int* a = new int[10];
       a[5] = 0;
       if (a[argc])
          printf("xx\n");
       return 0;
    }
    % clang++ -fsanitize=memory -fno-omit-frame-pointer -g -O2 umr.cc

If a bug is detected, the program will print an error message to
stderr and exit with a non-zero exit code:

    % ./a.out
    WARNING: MemorySanitizer: use-of-uninitialized-value
       #0 0x7f45944b418a in main umr.cc:6
       #1 0x7f45938b676c in __libc_start_main libc-start.c:226

By default, MSan exits on the first detected error. If you find the
error report hard to understand, try enabling origin tracking
([Origin tracking](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-origin-tracking)).

#### \_\_has\_feature(memory\_sanitizer)

In some cases, you might need to execute different code depending on
whether MSan is enabled. The language extension \_\_has\_feature can be
used for this purpose.

For example:

    #if defined(__has_feature)
    #  if __has_feature(memory_sanitizer)
    // code that builds only under MemorySanitizer
    #  endif
    #endif

\_\_has\_feature is a function-like macro that accepts a single identifier
argument that is the name of a feature. It evaluates to 1 if the
feature is both supported by Clang and standardized in the current
language standard. If not, it evaluates to 0.

Another use of \_\_has\_feature is to check for compiler features not
related to the language standard (such as MSan itself).

#### \_\_attribute\_\_((no\_sanitize\_memory))

Some code should not be instrumented by MSan. To disable the
instrumentation of a particular function, use the following function
attribute:

    __attribute__((no_sanitize_memory))

To avoid false positives, MSan might still instrument such functions.
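The attribute and the feature check are often combined so that the attribute is emitted only in MSan builds. The following is a minimal sketch of that pattern; `NO_SANITIZE_MEMORY` and `count_set_flags` are hypothetical names that do not come from the toolchain.

```cpp
#include <cstddef>

// Fallback for compilers that do not provide __has_feature.
#ifndef __has_feature
#  define __has_feature(x) 0
#endif

// NO_SANITIZE_MEMORY is a hypothetical convenience macro. It expands to
// the attribute only when the translation unit is built with MSan.
#if __has_feature(memory_sanitizer)
#  define NO_SANITIZE_MEMORY __attribute__((no_sanitize_memory))
#else
#  define NO_SANITIZE_MEMORY
#endif

// Hypothetical function that is excluded from MSan reports.
NO_SANITIZE_MEMORY
int count_set_flags(const bool *flags, std::size_t count) {
    int set = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (flags[i]) {
            ++set;
        }
    }
    return set;
}
```

In builds without MSan the macro expands to nothing, so the same source compiles unchanged with other compilers.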

#### Blacklist

MSan supports the use of sanitizer special case lists to suppress
error reports in the specified source files or functions
([Section 8.1.1](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-special-case-lists)). All *Use of uninitialized value*
warnings are suppressed, and all values loaded from memory are
considered fully initialized.

### Report symbolization

MSan uses an external symbolizer to print files and line numbers in
reports. Ensure that the llvm-symbolizer binary is in PATH, or set
the environment variable MSAN\_SYMBOLIZER\_PATH to point to it.

### Origin tracking

MSan can track origins of uninitialized values, similar to Valgrind’s
--track-origins option. This feature is enabled with the
-fsanitize-memory-track-origins=2 option (or simply
-fsanitize-memory-track-origins).

Example of origin tracking (using the code from the preceding
example):

    % cat umr2.cc
    #include <stdio.h>

    int main(int argc, char** argv) {
       int* a = new int[10];
       a[5] = 0;
       volatile int b = a[argc];
       if (b)
          printf("xx\n");
       return 0;
    }
    % clang++ -fsanitize=memory -fsanitize-memory-track-origins=2 -fno-omit-frame-pointer -g -O2 umr2.cc
    % ./a.out
    WARNING: MemorySanitizer: use-of-uninitialized-value
       #0 0x7f7893912f0b in main umr2.cc:7
       #1 0x7f789249b76c in __libc_start_main libc-start.c:226
    Uninitialized value was stored to memory at
       #0 0x7f78938b5c25 in __msan_chain_origin msan.cc:484
       #1 0x7f7893912ecd in main umr2.cc:6
    Uninitialized value was created by a heap allocation
       #0 0x7f7893901cbd in operator new[](unsigned long) msan_new_delete.cc:44
       #1 0x7f7893912e06 in main umr2.cc:4

By default, MSan collects both the allocation points and all
intermediate stores that the uninitialized value went through.

Origin tracking has proved to be very useful for debugging MSan
reports. It slows down program execution by a factor of 1.5x-2x on
top of the usual MSan slowdown, and increases memory overhead.

The -fsanitize-memory-track-origins=1 option enables a slightly
faster mode when MSan collects only allocation points and not
intermediate stores.

### Use-after-destruction detection

MSan supports use-after-destruction detection. After its destructor
is invoked, an object is considered no longer readable, and using the
underlying memory leads to error reports at runtime.

Note

This feature is experimental.

To enable this feature at runtime, perform the following steps:

1. During compilation, specify the -fsanitize-memory-use-after-dtor option.
2. Before running the program, set the MSAN\_OPTIONS=poison\_in\_dtor=1 environment variable.

### Handling external code

MSan requires all program code to be instrumented, including any
libraries that the program depends on (even libc). Failure to do this
can result in the generation of false reports.

Full MSan instrumentation is very difficult to achieve. To make it
easier, the MSan runtime library includes 70+ interceptors for the
most common libc functions. This makes it possible to run
MSan-instrumented programs linked with an uninstrumented version of
libc.
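
When a buffer is filled by code that cannot be instrumented, one possible workaround is to tell MSan explicitly that the memory is initialized. The sketch below assumes that the upstream MSan interface header `sanitizer/msan_interface.h` and its `__msan_unpoison` function are available in this toolchain. The names `external_fill`, `read_external`, and `MSAN_UNPOISON` are hypothetical; `external_fill` only stands in for a call into an uninstrumented library.

```c
#include <stddef.h>
#include <string.h>

/* __has_feature is a Clang extension; provide a fallback for other compilers. */
#ifndef __has_feature
#  define __has_feature(x) 0
#endif

/* Assumption: the upstream MSan interface header ships with the toolchain. */
#if __has_feature(memory_sanitizer)
#  include <sanitizer/msan_interface.h>
#  define MSAN_UNPOISON(p, n) __msan_unpoison((p), (n))
#else
#  define MSAN_UNPOISON(p, n) ((void)(p), (void)(n))
#endif

/* Stand-in for a function that lives in an uninstrumented library. */
static void external_fill(char *dst, size_t n) {
    memset(dst, 'x', n);
}

/* Calls the external function, then marks its output as initialized. */
size_t read_external(char *dst, size_t n) {
    external_fill(dst, n);
    MSAN_UNPOISON(dst, n);
    return n;
}
```

Unpoisoning hides real bugs if the external code did not actually write the whole buffer, so it is best limited to ranges that are known to be filled.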

### Limitations

- MSan uses 2x more real memory than a native run, and 3x with origin
tracking.
- MSan maps (but does not reserve) 64 terabytes of virtual address space.
Thus, tools like ulimit might not work as expected.
- Static linking is not supported.
- Older versions of MSan (LLVM 3.7 and older) didn’t work with
non-position-independent executables, and could fail on some Linux
kernel versions with disabled ASLR. For more information, see the
LLVM documentation for older versions.

For more information on MSan, go to:
[clang.llvm.org/docs/MemorySanitizer.html](http://clang.llvm.org/docs/MemorySanitizer.html)

## Thread Sanitizer

The LLVM compiler release includes a tool named *Thread Sanitizer*
(TSan), which can be used to detect data race conditions in C and C++
code.

TSan is a runtime tool that requires compile-time instrumentation of
the code and a dedicated runtime library. If TSan encounters a bug
during the execution of a program, it displays (on stderr) an error
message.

TSan slows down program execution by a factor of 5x-15x, with a
memory overhead of about 5x-10x.

Note

Currently TSan symbolizes its error output using an
external addr2line process (this will be fixed in the future).

### Usage

To instrument your C/C++ code with TSan, add the following option to
both the compile and link options in LLVM:

`-fsanitize=thread`

The TSan runtime library must be linked to the final executable—be
sure to use clang (not ld) for the final link step.

For reasonable execution performance use -O1 or higher. To include
filenames and line numbers in the generated error messages use -g.

For example:

    % cat projects/compiler-rt/lib/tsan/lit_tests/tiny_race.c
    #include <pthread.h>
    int Global;
    void *Thread1(void *x) {
       Global = 42;
       return x;
    }
    int main() {
       pthread_t t;
       pthread_create(&t, NULL, Thread1, NULL);
       Global = 43;
       pthread_join(t, NULL);
       return Global;
    }
    % clang -fsanitize=thread -g -O1 tiny_race.c

If a data race is detected, the program will print an error message
to stderr:

    % ./a.out
    WARNING: ThreadSanitizer: data race (pid=19219)
      Write of size 4 at 0x7fcf47b21bc0 by thread T1:
        #0 Thread1 tiny_race.c:4 (exe+0x00000000a360)
      Previous write of size 4 at 0x7fcf47b21bc0 by main thread:
        #0 main tiny_race.c:10 (exe+0x00000000a3b4)
      Thread T1 (running) created at:
        #0 pthread_create tsan_interceptors.cc:705 (exe+0x00000000c790)
        #1 main tiny_race.c:9 (exe+0x00000000a3a4)
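
A race like the one in tiny\_race.c is usually removed by synchronizing both writes. The following is a minimal sketch of one possible fix that guards `Global` with a pthread mutex. The names `GlobalLock` and `run_without_race` are hypothetical, and the code is written as a callable function rather than a complete program.

```c
#include <pthread.h>
#include <stddef.h>

static int Global;
static pthread_mutex_t GlobalLock = PTHREAD_MUTEX_INITIALIZER;

/* Worker thread: writes Global while holding the lock. */
static void *Thread1(void *x) {
    pthread_mutex_lock(&GlobalLock);
    Global = 42;
    pthread_mutex_unlock(&GlobalLock);
    return x;
}

/* Runs both writers; returns the final value of Global, or -1 on failure. */
int run_without_race(void) {
    pthread_t t;
    if (pthread_create(&t, NULL, Thread1, NULL) != 0)
        return -1;
    pthread_mutex_lock(&GlobalLock);
    Global = 43;
    pthread_mutex_unlock(&GlobalLock);
    pthread_join(t, NULL);
    /* pthread_join orders this read after the worker's write. */
    return Global;
}
```

The final value is still either 42 or 43 depending on scheduling, but the two writes no longer overlap, so there is no data race left to report.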

#### \_\_has\_feature(thread\_sanitizer)

In some cases, you might need to execute different code depending on
whether TSan is enabled. The language extension \_\_has\_feature can be
used for this purpose. For example:

    #if defined(__has_feature)
    #  if __has_feature(thread_sanitizer)
    // code that builds only under ThreadSanitizer
    #  endif
    #endif

\_\_has\_feature is a function-like macro that accepts a single identifier
argument that is the name of a feature. It evaluates to 1 if the
feature is both supported by Clang and standardized in the current
language standard. If not, it evaluates to 0.

Another use of \_\_has\_feature is to check for compiler features not
related to the language standard (such as TSan itself).
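
The feature test can also be wrapped in ordinary functions so that the rest of the program does not need preprocessor conditionals. The sketch below is one way to do this; `built_with_tsan` and `stress_iterations` are hypothetical names, and the iteration counts are arbitrary illustration values.

```c
/* __has_feature is a Clang extension; provide a fallback for other compilers. */
#ifndef __has_feature
#  define __has_feature(x) 0
#endif

/* Returns 1 when this translation unit was built with -fsanitize=thread. */
int built_with_tsan(void) {
#if __has_feature(thread_sanitizer)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical tuning knob: run fewer iterations under TSan's slowdown. */
int stress_iterations(void) {
    return built_with_tsan() ? 1000 : 100000;
}
```

The fallback definition of `__has_feature` keeps the file buildable with compilers that do not provide the extension.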

#### \_\_attribute\_\_((no\_sanitize\_thread))

Some code should not be instrumented by TSan. To disable the
instrumentation of a particular function, use the following function
attribute:

    __attribute__((no_sanitize_thread))

To avoid false positives and provide meaningful stack traces, TSan
might still instrument such functions.

#### Blacklist

TSan supports the use of sanitizer special case lists to suppress
data race reports in the specified source files or functions
([Section 8.1.1](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-special-case-lists)).

Note

Unlike functions marked with no\_sanitize\_thread,
blacklisted functions are not instrumented at all. This can result in
false positives due to missed synchronization via atomic operations,
and missed stack frames in reports.

### Limitations

- TSan uses more real memory than a native run. At the default
settings, the memory overhead is 5x plus 1 MB per thread. Settings
with 3x (less accurate analysis) and 9x (more accurate analysis)
overhead are also available.
- TSan maps (but does not reserve) a lot of virtual address space.
Thus, tools like ulimit might not work as expected.
- libc/libstdc++ static linking is not supported.
- Non-position-independent executables are not supported. Therefore:
    - When compiling without -fPIC, -fsanitize=thread causes the
compiler to act as though -fPIE had been specified.
    - When linking an executable, -fsanitize=thread causes the compiler
to act as though -pie had been specified.

For more information on TSan, go to:
[clang.llvm.org/docs/ThreadSanitizer.html](http://clang.llvm.org/docs/ThreadSanitizer.html)

## Undefined Behavior Sanitizer

The LLVM compiler release includes a tool named *Undefined Behavior
Sanitizer* (UBSan), which can detect code whose behavior is undefined
according to the C language specification.

UBSan can catch a wide variety of errors, including the following:

- Using misaligned or NULL pointers
- Signed integer overflow
- Conversions to, from, or between floating-point types that result in
overflow

UBSan is a runtime tool that requires compile-time instrumentation of
the code. It includes an optional runtime library that provides
better error reporting.

If UBSan encounters code with undefined behavior during the execution
of a program, it displays (on stderr) an error message, and then
responds according to the type of program behavior:

- After a signed integer overflow, the program continues executing.
- After the invalid use of a NULL pointer, the program is halted.
- After the use of a misaligned pointer, a trap is generated.

### Usage

To instrument your C/C++ code with UBSan, add the following option to
both the compile and link options in LLVM:

`-fsanitize=undefined`

If you link the UBSan runtime library to the final executable, be
sure to use clang++ (not ld) for the final link step, to ensure that
the executable is linked with the proper UBSan runtime libraries.

Note

When using C code, you can link with clang instead of clang++.

For example:

    % cat test.cc
    int main(int argc, char **argv) {
       int k = 0x7fffffff;
       k += argc;
       return 0;
    }
    % clang++ -fsanitize=undefined test.cc
    % ./a.out
    test.cc:3:5: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'
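
Once UBSan reports a signed overflow such as the one above, the usual remedy is to test the operands before adding. The following is a minimal sketch in standard C; the function name `checked_add` is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores a + b in *out and returns true, or returns false if the sum
   would not fit in an int. No overflowing addition is ever executed. */
bool checked_add(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

In the test.cc example, replacing `k += argc` with a call such as `checked_add(k, argc, &k)` would let the program detect the problem itself instead of relying on undefined behavior.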

You can configure UBSan to change the following behavior:

- Enable only a subset of the regular UBSan checks.
- Define how UBSan responds to each type of undefined program behavior
(either continue, halt, or trap).

For example:

    % clang++ -fsanitize=signed-integer-overflow,null,alignment -fno-sanitize-recover=null -fsanitize-trap=alignment

In this example, the program will continue executing after a signed
integer overflow, exit after the invalid use of a NULL pointer, and
trap after the use of a misaligned pointer.

Note

The trap option does not require UBSan runtime support.

### Checks and option values

UBSan performs checks that are individually controlled by option
values passed to the `-fsanitize=<event>` option ([Section 4.3.19](https://docs.qualcomm.com/doc/80-VB419-99/topic/use_the_compilers.html#sec-code-generation)).

- **alignment**
    - Use of a misaligned pointer or creation of a misaligned reference.

- **bool**
    - Load of a boolean value that is neither TRUE nor FALSE.

- **bounds**
    - Out-of-bounds array indexing, in cases where the array bound can be
statically determined.

- **enum**
    - Load of a value of an enumerated type that is not in the range of
representable values for that enumerated type.

- **float-cast-overflow**
    - Conversion to, from, or between floating-point types that would
overflow the destination.

- **float-divide-by-zero**
    - Floating point division by zero.

- **function**
    - Indirect call of a function through a function pointer of the wrong type.

- **integer-divide-by-zero**
    - Integer division by zero.

- **nonnull-attribute**
    - Pass a NULL pointer as a function parameter that is declared to never be NULL.

- **null**
    - Use of a NULL pointer or creation of a NULL reference.

- **object-size**
    - Attempt to use bytes that the optimizer can determine are not part of
the object being accessed.

- **return**
    - In C++, reaching the end of a value-returning function without
returning a value.

- **returns-nonnull-attribute**
    - Return a NULL pointer from a function that is declared to never
return NULL.

- **shift**
    - Shift operators where the amount shifted is greater than or equal to the
promoted bitwidth of the left-hand side or less than zero, or where
the left-hand side is negative. For a signed left shift, also checks
for signed overflow in C, and for unsigned overflow in C++.

- **shift-base**
    - Check only left-hand side of a shift operation.

- **shift-exponent**
    - Check only right-hand side of a shift operation.

- **signed-integer-overflow**
    - Signed integer overflow, including all the checks added by -ftrapv,
and checking for overflow in signed division (INT\_MIN / -1).

- **unreachable**
    - If control flow reaches \_\_builtin\_unreachable.

- **unsigned-integer-overflow**
    - Unsigned integer overflows.

- **vla-bound**
    - A variable-length array whose bound does not evaluate to a positive value.

- **vptr**
    - Use of an object whose vptr indicates that it is of the wrong dynamic
type, or that its lifetime has not begun or has ended. Incompatible
with **-fno-rtti**. Link must be performed by clang++, not clang, to
make sure C++-specific parts of the runtime library and C++ standard
libraries are present.

- **undefined**
    - All of the checks listed in this section, other than
unsigned-integer-overflow.

- **integer**
    - Checks for undefined or suspicious integer behavior (such as an
unsigned integer overflow).
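
To make two of these checks concrete, the sketch below shows guards that prevent the conditions reported by integer-divide-by-zero, signed-integer-overflow (the INT\_MIN / -1 case), and shift-exponent. The helper names `safe_div` and `safe_shl` are hypothetical and are not part of UBSan.

```c
#include <limits.h>
#include <stdbool.h>

/* Division guard: rejects a zero divisor and the INT_MIN / -1 overflow. */
bool safe_div(int num, int den, int *out) {
    if (den == 0 || (num == INT_MIN && den == -1))
        return false;
    *out = num / den;
    return true;
}

/* Shift guard: rejects negative or too-large shift amounts. */
bool safe_shl(unsigned value, int amount, unsigned *out) {
    if (amount < 0 || amount >= (int)(sizeof(unsigned) * CHAR_BIT))
        return false;
    *out = value << amount;
    return true;
}
```

Code that passes through guards like these should run cleanly under the corresponding `-fsanitize=` values, which makes them a convenient target when fixing individual reports.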

### Stack traces and report symbolization

To make UBSan print a symbolized stack trace for each error report,
use the following procedure:

1. Compile with -g and -fno-omit-frame-pointer to get the proper debug
information in your binary.
2. Run the program with the environment variable
`UBSAN_OPTIONS=print_stacktrace=1`.
3. Ensure that the llvm-symbolizer binary is in PATH.

### Issue suppression

UBSan generally does not produce false positives, so if you see one,
look again. Most likely it is a true positive.

#### \_\_attribute\_\_((no\_sanitize("undefined")))

Some code should not be instrumented by UBSan. To disable the
instrumentation of a particular function, use the following function
attribute:

    __attribute__((no_sanitize("undefined")))

All values of `-fsanitize=<event>` can be used in this attribute. For
example, if the function deliberately contains possible signed
integer overflow, you can use the following:

    __attribute__((no_sanitize("signed-integer-overflow")))
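
The sketch below applies the attribute to a function whose arithmetic wraps on purpose. It uses the unsigned-integer-overflow check rather than the signed one, because unsigned wraparound is well defined in C while signed overflow remains undefined even when the check is disabled. The macro `NO_SANITIZE_UNSIGNED_WRAP` and the function `fnv1a` are hypothetical names; the constants are the standard 32-bit FNV-1a parameters.

```c
#include <stdint.h>

/* Hypothetical helper macro: apply the attribute only when building with Clang. */
#if defined(__clang__)
#  define NO_SANITIZE_UNSIGNED_WRAP \
       __attribute__((no_sanitize("unsigned-integer-overflow")))
#else
#  define NO_SANITIZE_UNSIGNED_WRAP
#endif

/* 32-bit FNV-1a hash; the multiplication wraps modulo 2^32 by design. */
NO_SANITIZE_UNSIGNED_WRAP
uint32_t fnv1a(const char *s) {
    uint32_t h = 2166136261u;          /* FNV offset basis */
    for (; *s != '\0'; ++s) {
        h ^= (unsigned char)*s;
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}
```

Limiting the attribute to the one function that needs it keeps the check active everywhere else in the program.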

#### Blacklist

UBSan supports the use of sanitizer special case lists to suppress
error reports in the specified source files or functions
([Special case lists](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-special-case-lists)).

#### Runtime suppression

In some cases, you can suppress UBSan error reports for specific files,
functions, or libraries without recompiling the code. To do so, pass the
path to a suppression file in the UBSAN\_OPTIONS environment variable:

    UBSAN_OPTIONS=suppressions=MyUBSan.supp

In the suppression file, specify the check ([Additional checks](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-additional-checks)) that you
are suppressing, along with the bug location. For example:

    signed-integer-overflow:file-with-known-overflow.cpp
    alignment:function_doing_unaligned_access
    vptr:shared_object_with_vptr_failures.so

Several limitations apply:

- Sometimes the binary must have enough debug information or a symbol
table so the runtime code can figure out the source file or function
name to match against the suppression.
- Only recoverable checks can be suppressed. For the previous example,
you can also pass -fsanitize-recover=signed-integer-overflow,alignment,vptr,
although most of the UBSan checks are recoverable by default.
- Check groups (such as undefined) cannot be used in suppression
files. Only fine-grained checks are supported.

### Notes

In the C language specification, *undefined behavior* is the result
of performing certain erroneous operations that are not flagged with
an error. A single instance of undefined behavior causes *all* of a
program’s output to be considered unpredictable and therefore
useless.

For more information on undefined behavior, go to:
[blog.llvm.org/2011/05/what-every-c-programmer-should-know.html](http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html)

For more information on UBSan, go to:
[clang.llvm.org/docs/UndefinedBehaviorSanitizer.html](http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html)

## LLVM Symbolizer

The LLVM compiler release includes a tool named *LLVM Symbolizer*,
which can be used to convert program addresses into source code
locations.

The Symbolizer is a command-line tool—it reads object filenames and
addresses from the standard input, and writes the corresponding
source code locations to the standard output.

If an object filename is directly specified as a command-line
argument, the Symbolizer treats it as the name of the input object
file, and reads only addresses from the standard input.

Note

To perform its conversion, the Symbolizer uses the symbol
tables and debug information sections that are stored in the object
files.

To start the Symbolizer from a command line, enter the following:

    llvm-symbolizer <options...>

Command options are used to control the symbolizer ([Options](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#llvm-symbolizer-options)).

The Symbolizer normally returns 0 as a program return code. Any other
code values indicate an internal program error.

The Symbolizer is used with ASan ([Section 8.2](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-address-sanitizer)),
MSan ([Section 8.5](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-memory-sanitizer)), and
UBSan ([Section 8.7](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-undefined-behavior-sanitizer)).

### Usage

    $ cat addr.txt
    a.out 0x4004f4
    /tmp/b.out 0x400528
    /tmp/c.so 0x710
    /tmp/mach_universal_binary:i386 0x1f84
    /tmp/mach_universal_binary:x86_64 0x100000f24
    $ llvm-symbolizer < addr.txt
    main
    /tmp/a.cc:4
    f(int, int)
    /tmp/b.cc:11
    h_inlined_into_g
    /tmp/header.h:2
    g_inlined_into_f
    /tmp/header.h:7
    f_inlined_into_main
    /tmp/source.cc:3
    main
    /tmp/source.cc:8
    _main
    /tmp/source_i386.cc:8
    _main
    /tmp/source_x86_64.cc:8
    $ cat addr2.txt
    0x4004f4
    0x401000
    $ llvm-symbolizer -obj=a.out < addr2.txt
    main
    /tmp/a.cc:4
    foo(int)
    /tmp/a.cc:12

### Options

The Symbolizer is controlled by command-line options.

- **-default-arch &lt;arch\_name&gt;**

If a binary contains object files for multiple architectures (for
example, it is a Mach-O universal binary), symbolize the object file
for the specified architecture.

The architecture name is specified as a string value. The default
setting is an empty string.

The architecture can alternatively be specified by passing the string
*binary\_name:arch\_name* as part of the input ([Usage](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#llvm-symbolizer-usage)).

Note

If an architecture is not specified in either way,
addresses will not be symbolized.

- **-demangle**
    - Print demangled function names. The default setting is enabled.

- **-functions=(none|short|linkage)**
    - Specify how function names are printed. The default setting is linkage.
        - none – Omit the function name.
        - short – Print the short function name.
        - linkage – Print the full linkage name.

- **-inlining**
    - If a source code location is in an inlined function, prints all the
inlined frames. The default setting is enabled.

- **-obj**
    - Path to the object file to be symbolized.

- **-use-symbol-table**
    - Favor function names stored in the symbol table over function names
in debug information sections.

## Control flow integrity

Control flow integrity (CFI) is a compiler feature that aborts the
program upon detecting certain forms of undefined behavior that might
allow attackers to subvert the program’s control flow.

When CFI is enabled, the program code is instrumented with fast
checks for indirect calls, and hooks for a function to report
violations of forward-edge control-flow integrity.

CFI is controlled with the compile options -ffcfi and -fno-fcfi. For
example:

    clang -S -emit-llvm -ffcfi -o foo.ll foo.c

### Configuration

CFI must be configured to handle control-flow violations; otherwise,
the violations are ignored by default. It can also be configured to
generate different types of code instrumentation. Different types of
programs can execute more efficiently with different types of
instrumentation.

CFI configuration is performed with the LLVM static compiler tool.

Note

The static compiler is different from the normal LLVM
compiler—it is invoked by the latter to translate LLVM bitcode into
target native code.

To start the static compiler from a command line, enter:

    llc <options...>

Command options are used to control the static compiler
([Options](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#control-flow-integrity-options)).

### Usage

1. Compile the program files, enabling CFI and generating the LLVM
bitcode files:

        clang -S -emit-llvm -ffcfi -o foo.ll foo.c
        clang -S -emit-llvm -ffcfi -o bar.ll bar.c
2. Link the bitcode files:

        llvm-link -o prog.ll foo.ll bar.ll
3. Static-compile the bitcode files, configuring CFI and generating the
relocatable object file:

        llc -cfi-enforcing -cfi-type=sub -jump-table-type=simplified -filetype=obj -o prog.o prog.ll
4. Link the relocatable object file into the executable binary:

        clang -o prog prog.o
5. Run the CFI-instrumented executable binary:

        ./prog

### Options

The LLVM static compiler is controlled by command-line options.

- **-cfi-enforcing**
    - Enforce control-flow integrity.

By default, integrity violations invoke a handler function specified
with -cfi-func-name. If no function is specified, the violation is
ignored.

- **-cfi-func-name=&lt;name&gt;**
    - Specify the handler function that is called when a CFI violation occurs.

Note

This option is superseded by -cfi-enforcing.

- **-cfi-type=(sub|ror|add)**
    - Specify the type of CFI checks to be performed.
        - sub – Subtract pointer from table base, and then mask (default).
        - ror – Use rotate to check offset from table base.
        - add – Mask out high bits and add to aligned base.

- **-jump-table-type=(single|arity|simplified|full)**
    - Specify the type of jump table to use for CFI instrumentation.
        - single – Create a single table for all functions (default).
        - arity – Group functions into tables by the number of arguments they receive.
        - simplified – Create one table per simplified function type.
        - full – Create one table per function type.

Note

We recommend using simplified because it offers the best
balance between security and robustness.

### Handler functions

If a user-defined handler function is specified (with the
-cfi-func-name option), the function must accept two char\*
parameters:

- The first parameter is a C string that will contain the name of the
function where the control-flow integrity violation occurred.
- The second parameter will contain the pointer that violated
control-flow integrity.

To be specified with -cfi-func-name, the handler function must be
defined as a linker symbol.
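
The following is a minimal sketch of what such a handler might look like. The names `cfi_violation_handler` and `format_cfi_report` are hypothetical, and the `void` return type is an assumption, since only the two char\* parameters are specified here. The sketch also assumes that aborting is the desired response to a violation.

```c
#include <stdio.h>
#include <stdlib.h>

/* Formats a one-line report into buf and returns the snprintf result. */
int format_cfi_report(char *buf, size_t size, const char *func_name,
                      void *target) {
    return snprintf(buf, size, "CFI violation in %s: bad target %p",
                    func_name ? func_name : "(unknown)", target);
}

/* Hypothetical handler: external linkage, two char* parameters.
   The void return type is an assumption. */
void cfi_violation_handler(char *func_name, char *target) {
    char msg[256];
    format_cfi_report(msg, sizeof msg, func_name, (void *)target);
    fprintf(stderr, "%s\n", msg);
    abort();  /* Assumed policy: stop the program on any violation. */
}
```

Under these assumptions, the handler would be selected by passing -cfi-func-name=cfi\_violation\_handler to the static compiler. The function must not be declared `static`, so that it remains visible as a linker symbol.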

### Notes

The current CFI implementation does not imply -fsanitize or -flto.
Therefore you must compile each source file to bitcode using -ffcfi,
and then compile and link the bitcode files into native code.

The LLVM Snapdragon compiler includes support for a particular CFI
scheme known as *forward-edge control flow integrity*. It is
supported on Armv7 and Armv8 (AArch32 and AArch64) targets. For more
information on this scheme, go to:
[www.pcc.me.uk/~peter/acad/usenix14.pdf](http://www.pcc.me.uk/~peter/acad/usenix14.pdf)

For more information on CFI, go to:
[clang.llvm.org/docs/ControlFlowIntegrity.html](http://clang.llvm.org/docs/ControlFlowIntegrity.html)

For more information on the LLVM static compiler, go to:
[llvm.org/docs/CommandGuide/llc.html](http://llvm.org/docs/CommandGuide/llc.html)

## Static program analysis

The LLVM compiler release includes the following tools for performing
static analysis on a program:

- Static analyzer

    A tool that analyzes source code to find potential bugs in C and C++
programs. It can be used to analyze individual files or entire
programs.
- Postprocessor

    A tool that creates a summary of the reports that are generated by
performing static analysis while compiling a program.
- Scan-build

    An additional tool for compiling and statically analyzing a program.
It can be used with a make-based build system.

### Static analyzer

The static analyzer is a source code analysis tool that is integrated
into the LLVM compiler. It analyzes a program for potential
bugs—including security threats, memory corruption, and garbage
values—and generates a diagnostic report describing the potential
bugs it detected.

The static analyzer has the following features:

- It supports more than 100 distinct *checkers* that are organized into
the alpha, core, cplusplus, debug, security, unix, and optin categories
- Checkers can be selectively enabled or disabled from the command line
- Disabling a checker category disables all the checkers in that
category
- Selected parts of the program code can be excluded from checking

#### Analyze programs

To use the static analyzer on an entire program, invoke the LLVM
compiler on the program using the `--compile-and-analyze` option
([Compiler security](https://docs.qualcomm.com/doc/80-VB419-99/topic/use_the_compilers.html#sec-compiler-security)). For example:

    clang --compile-and-analyze <dir> <input_files...>

- `--compile-and-analyze <dir>` specifies the directory where the
static analyzer report will be stored. (If the directory does not
exist, the compiler automatically creates it.) The report is
automatically generated in HTML format. The files are named
report\*.html.
- *input\_files* specifies the program source files.

We recommend statically analyzing an entire program at once (as
opposed to selected source files) for the following reasons:

- The generated analysis report files are all stored in a single
location.
- The command option can be passed from the build system, which helps
perform the static analysis and compilation every time the program is
built.
- Because build systems are good at tracking files that have changed,
and compiling only the minimal set of required files, the overall
turnaround time for static analysis is relatively small, making it
reasonable to run the static analyzer with every build.

Note

When using a build system, specifying the same directory
name throughout the build generates all the HTML report files in the
specified directory. The filenames generated for a report are based
on hashing functions, so the report files are not overwritten.

#### Analyze programs using default flags

To use the static analyzer on specific program source files, invoke
the LLVM compiler on the files using the static analyzer options
([Section 4.3.28](https://docs.qualcomm.com/doc/80-VB419-99/topic/use_the_compilers.html#sec-compiler-security)). For example:

    clang --analyze -Xclang --analyzer-output -Xclang html -o <dir> <files>

Where:

- `--analyze` causes the compiler to generate a static analyzer report
instead of a program object file.
- `--analyzer-output html` specifies that the report is generated in HTML
format.

    Note

    -Xclang must be used (twice) to pass the `--analyzer-output html`
option to the compiler. For details, see
[Section 4.3.2](https://docs.qualcomm.com/doc/80-VB419-99/topic/use_the_compilers.html#sec-compilation).
- -o specifies the directory where the report files will be stored. (If
the directory does not exist, the compiler automatically creates
it.) The files are named report\*.html.
- *files* specifies the program source files to be analyzed.

Example of a diagnostic report entry:

    // @file: test.cpp
    int main() {
          int* p = new int();
          return *p;
    }
    warning: Potential leak of memory pointed to by 'p'

Each potential bug flagged in a report includes the path (control and
data) required for locating the bug in the program.
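
A leak report like this one is normally resolved by releasing the allocation on every path that leaves the function. The sketch below is a C analogue of the C++ example (using `calloc` and `free` in place of `new` and `delete`), not the analyzer's own suggested fix. The function name `read_and_release` is hypothetical.

```c
#include <stdlib.h>

/* Reads the stored value, frees the allocation, then returns the value. */
int read_and_release(void) {
    int *p = calloc(1, sizeof *p);   /* zero-initialized, like new int() */
    if (p == NULL)
        return -1;                   /* allocation failed; nothing to free */
    int value = *p;                  /* copy out before releasing */
    free(p);
    return value;
}
```

Copying the value into a local variable before calling `free` avoids trading the leak for a use-after-free.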

Note

To convert static analyzer warnings to errors, use the
`--analyzer-Werror` option.

#### Analyze programs with priority modes

In addition to the default `--compile-and-analyze` flag, the static
analyzer also offers a list of specific directives for more targeted
analysis.

- High Priority mode

    This mode turns on the highest priority checkers that catch critical
issues and have low false positive rates.

    Enable this mode with the `--compile-and-analyze-high` flag.
- Medium Priority mode

    This mode is slightly more inclusive compared to the high priority
mode, but it still avoids some noisy checkers that are enabled by
default.

    Enable this mode with the `--compile-and-analyze-medium` flag.
- KW mode

    This mode reports issues that are typically caught by commercial
static analysis tools. It is especially helpful if you want to catch
those issues early in the development process.

    Enable this mode with the `--compile-and-analyze-kw` flag.

#### Manage checkers

The static analyzer supports more than 100 individual checkers that
can analyze programs for various types of potential bugs. By default,
only a subset of these checkers is enabled, to minimize both the
compile time and the generation of false positives.

To enable additional checkers, invoke the static analyzer using the
-analyzer-checker option ([Section 4.3.28](https://docs.qualcomm.com/doc/80-VB419-99/topic/use_the_compilers.html#sec-compiler-security)).
This option specifies the checker to be enabled (in the following
example, NewDelete).

    clang++ --compile-and-analyze <dir> -Xclang -analyzer-checker=NewDelete test.cpp

To disable individual checkers, invoke the static analyzer using the
-analyzer-disable-checker option ([Section 4.3.28](https://docs.qualcomm.com/doc/80-VB419-99/topic/use_the_compilers.html#sec-compiler-security)). For example:

    clang++ --compile-and-analyze <dir> -Xclang -analyzer-disable-checker=deadcode.deadstore test.cpp

Note

-Xclang must be used to pass the -analyzer-output, -analyzer-checker, and
-analyzer-disable-checker options to the compiler. For details, see
[Section 4.3.28](https://docs.qualcomm.com/doc/80-VB419-99/topic/use_the_compilers.html#sec-compiler-security).

To list all the supported checkers, use the following command:

    clang -cc1 -analyzer-checker-help

To list only the default checkers, use the -v option while running
the analyzer on a test file ([Section 4.3.1](https://docs.qualcomm.com/doc/80-VB419-99/topic/use_the_compilers.html#sec-display)). For
example:

    clang++ -v --compile-and-analyze --analyzer-perf test.cpp

##### Packages

The individual checkers are organized into the following categories:

- alpha
- core
- cplusplus (only for analyzing C++ programs)
- debug
- security
- unix
- optin

Each category (or package) is defined to include a number of
checkers. For example, the checker NullDereference is a core checker,
while NewDelete is a cplusplus checker. Organizing checkers into
packages (and sub-packages) makes it easier to enable/disable
specific sets of checkers.

Example of using the static analyzer with all alpha checkers enabled:

    clang --compile-and-analyze <dir> -Xclang -analyzer-checker=alpha test.cpp

##### Lists

To enable or disable multiple individual checkers, multiple checker
and package names can be specified as a single comma-separated list.
For example:

    clang --compile-and-analyze <dir> -Xclang -analyzer-checker=alpha,core test.cpp

### Cross-file analysis

By default, the Clang static analyzer cannot analyze code that is
called across file boundaries, because the compiler processes each
file individually. Cross-file analysis removes this limitation.

The cross-translation unit (CTU) feature allows the analysis of
called functions even if the definition of the function is external
to the currently analyzed file. This feature allows detection of bugs
in library functions stemming from incorrect usage. If an external
function is invoked, this feature also allows for more precise
analysis of the caller in general.

Enabling CTU analysis is a three-step process:

1. Generate the compile\_commands.json file.

    The makefile is read by the intercept-build script. This script is
responsible for creating a compile\_commands.json file that contains a
list of every build command that is run from this build system.

$share/scan-build-py/bin/intercept-build make
        Copy to clipboard
2. Generate the AST database.

    Generate the ASTs of every function in the build tree by issuing the
following command:

        $share/scan-build-py/bin/analyze-build --cdb <location of the compile_commands json file> \
            -o <location of the intended report directory> \
            --ctu-collect-only --ctudir <location where all the ASTs are to be stored>
3. Run the cross-file analysis.

    Kick off the static analyzer in the usual way (with the
`--compile-and-analyze` flag) but with the following added flags:

        -Xclang -analyzer-checker=debug.ExprInspection
        -Xclang -analyzer-config
        -Xclang -ctu-dir=<location of the intended report directory>

#### Handle false positives

While checking a program for potential bugs, the static analyzer
might report false positives, which are sections of code that the
analyzer incorrectly flags as bugs.

To minimize false positives, the static analyzer enables by default
a set of checkers that has been tested to identify a high percentage
of actual program bugs ([Section 8.10.2.1](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-manage-checkers)).
If necessary, additional checkers can be enabled individually.

However, despite the overall accuracy of the checkers, false
positives can still be generated in several cases. For instance, if
you enable the checker used to analyze dead code, the static analyzer
flags any code that has been conditionally enabled for debugging
purposes, even though that code is intentional.
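
A minimal sketch of such a case follows. The names `kDebugTrace` and `checked_double` are hypothetical. Because the debug switch is a constant set to false, the logging branch can never execute, so a dead-code checker may report it even though the code is there on purpose.

```cpp
#include <cstdio>

// Hypothetical debug switch; set to true only in debugging builds.
static const bool kDebugTrace = false;

int checked_double(int x) {
    int result = x * 2;
    if (kDebugTrace) {
        // Unreachable while kDebugTrace is false, so it looks like dead code.
        std::printf("checked_double(%d) = %d\n", x, result);
    }
    return result;
}
```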

To handle such cases, the static analyzer supports several features
for handling false positives:

- Compile-time blacklist file
- Special comment
- Preprocessor symbol
- Function attribute
- Postprocess blacklist file

We do not recommend using comments or symbols to handle false
positives because they make the code inaccessible to the analyzer.
Instead, report any false positives so the existing checkers can be
improved to eliminate them. For more information on false positives,
go to:

[clang-analyzer.llvm.org/faq.html](http://clang-analyzer.llvm.org/faq.html)

##### Compile-time blacklist file

To silence warnings before they are reported by the analyzer, use the
external file-based suppression mechanism. Enable this mechanism by
adding the following compiler flag:

    -analyzer-suppression-file<file location>

Suppression data in this text file must be entered line-by-line with
the following format:

    <filename.extension> <function identifier> <checker name>

The analyzer reads the contents of this file during the analysis and
does not report any warnings that are mentioned in it.

##### Special comment

Individual lines of code can be excluded from checking by adding the
following comment to the line:

    // clang_sa_ignore [<checker>] [<user_comment_text>]

*checker* specifies a checker, package, or list
([Section 8.10.2.1](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-manage-checkers)) that is excluded from being
applied to the line of code. It must be enclosed in square brackets. For
example:

    g_ptr = new int(0); // clang_sa_ignore [deadcode.DeadStores]
    g_ptr = new int(0); // clang_sa_ignore [alpha] my comment text
    g_ptr = new int(0); // clang_sa_ignore [alpha,deadcode.DeadStores]

##### Preprocessor symbol

One or more lines of code can be conditionally excluded from all
checking by using the preprocessor symbol `__clang_analyzer__`, which is
automatically defined by the static analyzer. For example:

    #ifndef __clang_analyzer__
       // Code excluded from checking
    #endif

When using the preprocessor symbol with the static analyzer, the code
must remain compilable, even though it is not required to be linkable
or executable. For example, to exclude the body of a function from
being analyzed, use the following conditional code:

    #ifdef __clang_analyzer__
    void noisyFunction(); // this version is for analysis only
    #else // __clang_analyzer__
    static void noisyFunction() {
       // function body is generating too many false positives
    }
    #endif // __clang_analyzer__

##### Function attribute

A common source of false positives is non-returning functions such as
assert functions.

Although the static analyzer is aware of the standard library
non-returning functions, if (for example) a program has its own
implementation of asserts, it helps to mark them with the following
function attribute:

    __attribute__((__noreturn__))

Using this attribute greatly improves the static analysis diagnostics
and lessens the number of false positives. For example:

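    #include <stdio.h>
    #include <stdlib.h>

    // The attribute precedes the definition; GCC rejects attributes that
    // follow the declarator of a function definition.
    __attribute__((__noreturn__)) void my_abort(const char* msg) {
       printf("%s", msg);
       exit(1);
    }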
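
The following sketch shows why the attribute matters for a project-specific assert. The names `my_assert_fail`, `MY_ASSERT`, and `first_char` are hypothetical. Without the attribute on the declaration, an analyzer that cannot see the handler's body has to assume that the handler may return, so the path on which `s` is null appears to continue to the dereference.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical assert handler. The attribute on the declaration tells the
// analyzer that control never comes back from this call.
__attribute__((__noreturn__)) void my_assert_fail(const char* expr);

#define MY_ASSERT(cond) ((cond) ? (void)0 : my_assert_fail(#cond))

int first_char(const char* s) {
    MY_ASSERT(s != nullptr);  // the failing path ends here
    return s[0];              // reached only when s is non-null
}

// Definition; in a real project this often lives in a different file.
void my_assert_fail(const char* expr) {
    std::fprintf(stderr, "assertion failed: %s\n", expr);
    std::exit(1);
}
```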

##### Postprocess blacklist file

The blacklist file lists, one per row, the warnings that must be
silenced. Each row uniquely identifies a warning by filename,
function name, and bug description. Place this file in a network
location that is accessible to all developer machines using the
analyzer.

During postprocessing, identify the blacklist file through the
`--blacklist-file <file>` flag. The post-process script does not
report the bugs marked in this file. Besides silencing warnings
across developer machines, this mechanism also allows you to silence
warnings across build variants.

The blacklist file contains row-by-row warnings in the following
format:

    <filename.cpp> <function identifier/Global> <full warning description>

For example:

    test.cpp foo Address of stack memory associated with local variable buff returned to caller [core.StackAddressEscape]
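
To make the row layout concrete, the following sketch splits one row into its three fields. It only illustrates the format above and is not the post-process script's implementation. The `BlacklistRow` type and the `parse_blacklist_row` function are hypothetical, and the sketch assumes that the filename and the function identifier contain no spaces.

```cpp
#include <sstream>
#include <string>

// Hypothetical holder for the three fields of a blacklist row.
struct BlacklistRow {
    std::string file;
    std::string function;     // function identifier, or "Global"
    std::string description;  // full warning text, including the checker name
};

// Splits "<file> <function> <description...>" on the first two runs of
// whitespace. Returns false if any of the three fields is missing.
bool parse_blacklist_row(const std::string& line, BlacklistRow& row) {
    row.description.clear();
    std::istringstream in(line);
    if (!(in >> row.file >> row.function)) {
        return false;
    }
    // The rest of the line, minus leading whitespace, is the description.
    std::getline(in >> std::ws, row.description);
    return !row.description.empty();
}
```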

#### Create whitelist directories

Often, unwieldy build systems make it impossible to turn on the
static analyzer for only some subdirectories. For example, given the
following hierarchy, assume you want to see results for only the
*Qcomm\_code* directories and not the *Open\_source* directory:

    ANDROID

    Open_source     Qcomm_hardware_code     Qcomm_software_code

The build system employed here allows only cflags to be set at a
global level. In this case, create a whitelist file containing
row-wise entries of the folders to be scanned. For example:

    ANDROID/Qcomm_hardware_code
    ANDROID/Qcomm_software_code

The post-process script must be notified of the whitelist file
through the flag, `--whitelist-file <whitelist text file>`. The
script then identifies the row-wise entries and checks them against
any static analysis warnings found. If none of the entries are a
substring of the file location of a specific bug, that bug is
silenced. Thus, having only *Qcomm\_hardware\_code* and
*Qcomm\_software\_code* in the whitelist file will suffice because
they will both be part of the location of the warnings you are
interested in.
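
The substring rule described above can be sketched as follows. This is an illustration of the matching logic only, not the post-process script's code; the function name `is_whitelisted` and its parameters are hypothetical.

```cpp
#include <string>
#include <vector>

// Returns true if at least one whitelist entry occurs as a substring of the
// file location attached to a warning. A warning for which this returns
// false would be silenced.
bool is_whitelisted(const std::string& bug_location,
                    const std::vector<std::string>& whitelist) {
    for (const std::string& entry : whitelist) {
        if (!entry.empty() && bug_location.find(entry) != std::string::npos) {
            return true;
        }
    }
    return false;
}
```

With the entries `Qcomm_hardware_code` and `Qcomm_software_code`, a warning located in a hypothetical file `ANDROID/Qcomm_hardware_code/driver.c` is kept, while one in `ANDROID/Open_source/lib.c` is silenced.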

#### Treat warnings as errors

Static analysis warnings can be treated as errors by appending the
`--analyzer-Werror` flag to the build flags. For example:

    clang --compile-and-analyze <dir> -Xclang --analyzer-Werror test.c

#### Checker categorization by priority

Because checkers fall under two major priority categories (high and
medium), the following flags are available in addition to the regular
`--compile-and-analyze` flag:

    --compile-and-analyze-high

The high flag allows only a small subset of checkers to be turned on.
These checkers are security critical and have an extremely low false
positive rate. Teams adopting this tool should start with the high
flag.

    --compile-and-analyze-medium

The medium flag is slightly more permissive than the high flag. It
enables a few more checkers that are deemed security critical but
have a slightly higher false positive rate.

#### Performance mode

The static analyzer can be run in Performance mode. This mode
prevents the creation of all the raw HTML files, and instead it
displays a summary of the warnings in the console output.

To enable this mode, append the `--analyzer-perf` flag to the
`--compile-and-analyze` flag in place of the `<dir>` option. For example:

    clang --compile-and-analyze --analyzer-perf test.c
    clang --compile-and-analyze-high --analyzer-perf test.c

#### YAML configuration file

The static analyzer can also be configured using a single
configuration file (`--qc-config-file`) that combines all previously
mentioned options in a single location. For example:

    clang --qc-config-file qc_sa_config.txt test.c

The configuration file uses YAML format and is subject to all YAML
syntax restrictions. A sample configuration file with embedded
comments follows.

    #############################################
    # Static Analyzer configuration file        #
    #############################################

    #############################################
    # Global SA options                         #
    #############################################
    ---
    entry-type: global_settings

    # Verbosity suppression level. Most useful is "high" i.e. "high
    # suppression" == fewer false positives. The "high" setting here is
    # equivalent to the command line --compile-and-analyze-high flag.
    # Available options: zero|default|high|medium|low
    verbosity: high

    # Location for individual reports. Absolute or relative path.
    analyzer-output-dir: sa_report_dir

    # Format for output reports. Most useful: html.
    analyzer-output-format: html

    # Same as -analyzer-config stable-report-filename=true; reduces the
    # number of duplicate reports.
    stable-report-filename: true

    # Do not output duplicate reports for different paths to the same
    # location. The same header file can be included from multiple
    # .c/.cpp locations - report issues only once.
    no-duplicate-reports: true

    #############################################
    # Per-checker individual options            #
    # It is applied _after_ the GLOBAL settings #
    #############################################
    ---
    entry-type: checker_config
    config:
      - package: core
        # "default" - use verbosity setting from global_settings
        # "on" - enable checker
        # "off" - disable checker
        checkers:
          # Check for dereferences of null pointers
          - checker-name: NullDereference
            state: off
          # Check for logical errors for function calls and Objective-C
          # message expressions (e.g., uninitialized arguments, null
          # function pointers)
          - checker-name: CallAndMessage
            state: off
      - package: alpha.core
        checkers:
          # Check when casting a malloc'ed type T, whether the size is a
          # multiple of the size of T
          - checker-name: CastSize
            state: default

    # Here follows the full list of all available packages and checkers
    # with individual settings for each.

    #############################################
    # Individual files to skip during SA        #
    #############################################
    ---
    entry-type: suppression_list

    # Blacklist file. See -analyzer-suppression-file option for details
    analyzer-suppression-file: suppression_file

    # Per path report suppression. If any part of the file path matches
    # this list - no report is generated. Currently there is no wildcard
    # support.
    suppress-path:
      - modem_proc/audio_avs/main/voice/algos/mmecns/mmecns_lib/src
      - modem_proc/audio_avs
    ...

### Postprocessor

The postprocessor is a report generator that is implemented as a
standalone script. This post-process script creates a summary of the
report that is generated by using the `--compile-and-analyze` option
([Section 8.10.1](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-static-analyzer)).
To invoke the postprocessor, enter the following command:

    post-process --report-dir <dir>

The postprocessor reads all the files from the directory specified by
the `--report-dir` option and writes in the same directory a summary
report file named index.html. The report title can be specified with
the `--html-title` option. For more information on the postprocessor,
enter the `post-process --help` command.

The post-process script is stored in the $INSTALL\_PREFIX/bin
directory.

Note

In some cases the static analyzer might generate multiple
report files for the same bug. The postprocessor cleans up these
duplicate report files. For this reason, run it regularly to keep the
report directory clean.

### Scan-build

Scan-build is a standalone tool for compiling and statically
analyzing a program. You can use it with a make-based build system.
However, we recommend using the LLVM static analyzer directly
whenever possible (see [Section 8.10.1](https://docs.qualcomm.com/doc/80-VB419-99/topic/compiler_security_tools.html#sec-static-analyzer)).

The scan-build tool enables you to run the static analyzer as part of
the regular build process. Here are two examples of invoking scan-build:

    scan-build clang++ -c test.cpp
    scan-build -v -k -o out-dir -disable-checker deadcode --use-c++=clang++ --use-cc=clang make -j8

The scan-build script is stored in the directory, *$INSTALL\_PREFIX/bin*.
For more information, enter the `scan-build --help` command.

Last Published: May 10, 2024
