This patch implements the initial support for upstreaming [llubi](https://github.com/dtcxzyw/llvm-ub-aware-interpreter). It only provides the minimal functionality to run a simple main function. I hope we can focus on the interface design in this PR, rather than trivial implementations for each instruction.

RFC link: https://discourse.llvm.org/t/rfc-upstreaming-llvm-ub-aware-interpreter/89645

Excluding the driver `llubi.cpp`, this patch contains three components for better decoupling:

+ `Value.h/cpp`: Value representation
+ `Context.h/cpp`: Global state management (e.g., memory) and interpreter configuration
+ `Interpreter.cpp`: The main interpreter loop

Compared to the out-of-tree version, the major differences are listed below:

+ The interpreter logic always returns control to its caller, i.e., it never calls `exit/abort` when immediate UB is triggered.
+ `EventHandler` provides an interface to dump the trace. It also allows callers to inspect the actual values and verify the correctness of analysis passes (e.g., KnownBits/SCEV).
+ The context is designed to be reentrant. That is, you can call `runFunction` multiple times. However, its usefulness remains in doubt due to side effects made by previous calls.
+ `runFunction` handles function calls with a loop, instead of calling itself recursively. As a result, it is no longer bounded by the host's stack depth.
+ Uninitialized memory is planned to be approximated by returning random values each time an uninitialized byte is loaded.

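To make the loop-based call handling concrete, here is a toy model in Python. It is not llubi's code: the `Frame` class, the tuple-based mini instruction set, `run_function`, and `build_chain` are all hypothetical names invented for this sketch. It only shows how an explicit frame stack replaces host recursion, and how an error is handed back to the caller instead of terminating the process.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Frame:
    """One activation record on the explicit call stack (hypothetical)."""
    func: str
    regs: dict = field(default_factory=dict)
    pc: int = 0
    ret_dst: Optional[str] = None  # caller register that receives the result


def run_function(program, entry, args, max_stack_depth=256):
    """Interpret `entry` using a loop and an explicit stack of frames.

    Returns ("ok", value) or ("error", message). Errors are reported to the
    caller instead of terminating the process. Every function in `program`
    is assumed to end with a "ret" instruction.
    """
    stack = [Frame(entry, dict(args))]
    while True:
        frame = stack[-1]
        op, *operands = program[frame.func][frame.pc]
        frame.pc += 1
        if op == "const":
            dst, value = operands
            frame.regs[dst] = value
        elif op == "add":
            dst, lhs, rhs = operands
            frame.regs[dst] = frame.regs[lhs] + frame.regs[rhs]
        elif op == "div":
            dst, lhs, rhs = operands
            if frame.regs[rhs] == 0:
                return ("error", "immediate UB: division by zero")
            frame.regs[dst] = frame.regs[lhs] // frame.regs[rhs]
        elif op == "call":
            dst, callee, arg_regs = operands
            # A limit of 0 disables the depth check.
            if max_stack_depth and len(stack) >= max_stack_depth:
                return ("error", "stack depth limit exceeded")
            callee_regs = {f"arg{i}": frame.regs[r] for i, r in enumerate(arg_regs)}
            # Push a frame and keep looping; the host stack does not grow.
            stack.append(Frame(callee, callee_regs, ret_dst=dst))
        elif op == "ret":
            (src,) = operands
            value = frame.regs[src]
            stack.pop()
            if not stack:
                return ("ok", value)
            stack[-1].regs[frame.ret_dst] = value
        else:
            return ("error", f"unknown instruction: {op}")


def build_chain(depth):
    """Build functions f0..f<depth> where each f<i> returns f<i+1>() + 1."""
    program = {f"f{depth}": [("const", "r", 0), ("ret", "r")]}
    for i in range(depth):
        program[f"f{i}"] = [
            ("call", "r", f"f{i + 1}", []),
            ("const", "one", 1),
            ("add", "r", "r", "one"),
            ("ret", "r"),
        ]
    return program
```

With the depth check disabled, `run_function(build_chain(5000), "f0", {}, max_stack_depth=0)` nests 5001 frames and returns `("ok", 5000)`, which a recursive interpreter written in Python could not do under CPython's default recursion limit. With the default limit of 256 frames, the same call returns an error tuple rather than raising or exiting.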
llubi - LLVM UB-aware Interpreter
=================================

.. program:: llubi

SYNOPSIS
--------

:program:`llubi` [*options*] [*filename*] [*program args*]

DESCRIPTION
-----------

:program:`llubi` directly executes programs in LLVM bitcode format and tracks values
according to LLVM IR semantics. Unlike :program:`lli`, :program:`llubi` is designed to
be aware of undefined behavior during execution. It detects immediate undefined behavior,
such as integer division by zero, and respects poison-generating flags like ``nsw`` and
``nuw``. Because it captures most guardable undefined behavior, it is well suited for
constructing an interestingness test for miscompilation bugs.

If *filename* is not specified, then :program:`llubi` reads the LLVM bitcode for the
program from standard input.

The optional *program args* specified on the command line are passed to the program as
arguments.

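The interestingness-test use case mentioned above can be sketched in Python as follows. This is only an illustration under stated assumptions: ``run_llubi`` and ``is_interesting`` are hypothetical helper names, the step bound is an arbitrary choice, and the file ``after`` is assumed to have been produced from ``before`` by the transformation under test. The only :program:`llubi` behavior relied on is the documented invocation form, the ``-max-steps`` option, and the exit status rules described in EXIT STATUS.

```python
import subprocess


def run_llubi(path, llubi="llubi", max_steps=1000000):
    """Run llubi on `path` and return its exit status.

    -max-steps bounds execution so that a non-terminating candidate cannot
    hang the reduction. The bound itself is an arbitrary choice.
    """
    return subprocess.run([llubi, f"-max-steps={max_steps}", path]).returncode


def is_interesting(before, after, run=run_llubi, expected=0):
    """Return True if `before` behaves as expected but `after` does not.

    `before` must finish with status `expected`, which rules out inputs that
    already trigger undefined behavior. `after` must finish with any other
    status. `run` is injectable so the logic can be exercised without llubi.
    """
    if expected == 1:
        # llubi also exits with 1 on errors, so 1 is not a usable "good" status.
        raise ValueError("exit status 1 cannot be told apart from a llubi error")
    return run(before) == expected and run(after) != expected
```

A driver script for a reducer would typically produce ``after`` from the candidate file, call ``is_interesting``, and exit with status 0 when it returns ``True``. Choosing a test program whose correct result is 0 keeps the check unambiguous, because status 1 is also used for errors.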
GENERAL OPTIONS
---------------

.. option:: -fake-argv0=executable

 Override the ``argv[0]`` value passed into the executing program.

.. option:: -entry-function=function

 Specify the name of the function to execute as the program's entry point.
 By default, :program:`llubi` uses the function named ``main``.

.. option:: -help

 Print a summary of command line options.

.. option:: -verbose

 Print results for each instruction executed.

.. option:: -version

 Print out the version of :program:`llubi` and exit without doing anything else.

INTERPRETER OPTIONS
-------------------

.. option:: -max-mem=N

 Limit the amount of memory (in bytes) that can be allocated by the program, including
 stack, heap, and global variables. If the limit is exceeded, execution will be terminated.
 By default, there is no limit (N = 0).

.. option:: -max-stack-depth=N

 Limit the maximum stack depth to N. If the limit is exceeded, execution will be terminated.
 The default limit is 256. Set N to 0 to disable the limit.

.. option:: -max-steps=N

 Limit the number of instructions executed to N. If the limit is reached, execution will
 be terminated. By default, there is no limit (N = 0).

.. option:: -vscale=N

 Set the value of ``llvm.vscale`` to N. The default value is 4.

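As an illustration of how these options combine on a command line, the following Python sketch assembles an argument list in the order given in the synopsis. The helper name ``llubi_command`` and its keyword parameters are hypothetical. The flag spellings and default values are taken from the option descriptions above, and every flag is passed explicitly so that the resulting command documents its own limits.

```python
def llubi_command(filename, program_args=(), *, entry_function="main",
                  max_mem=0, max_stack_depth=256, max_steps=0, vscale=4,
                  llubi="llubi"):
    """Build an llubi argument list: options, then filename, then program args.

    The defaults mirror the documented ones. For the three limits, 0 means
    "no limit".
    """
    cmd = [
        llubi,
        f"-entry-function={entry_function}",
        f"-max-mem={max_mem}",
        f"-max-stack-depth={max_stack_depth}",
        f"-max-steps={max_steps}",
        f"-vscale={vscale}",
        filename,
    ]
    # Everything after the filename is passed to the interpreted program.
    cmd.extend(program_args)
    return cmd
```

For example, ``llubi_command("test.bc", ["input.txt"], max_steps=100000, max_mem=1 << 20)`` yields a command that stops after 100000 instructions or once the program has allocated more than 1 MiB, and passes ``input.txt`` to the interpreted program. The list can be handed directly to ``subprocess.run``.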
EXIT STATUS
-----------

If :program:`llubi` fails to load the program, or an error occurs during execution (e.g.,
immediate undefined behavior is triggered), it exits with an exit code of 1.
If the return type of the entry function is not an integer type, it returns 0.
Otherwise, it returns the exit code of the program.
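The three rules above can be written down as a small Python model. This is a restatement of the text, not the tool's implementation: ``llubi_exit_status`` and its parameters are hypothetical, and any truncation of large values by the operating system is not modeled.

```python
def llubi_exit_status(load_ok, error, result):
    """Model the documented exit status rules.

    load_ok: whether the program was loaded successfully.
    error:   whether an error (such as immediate undefined behavior)
             occurred during execution.
    result:  the program's integer result, or None if the entry function
             does not return an integer type.
    """
    if not load_ok or error:
        return 1
    if result is None:
        return 0
    return result
```

One consequence is that some statuses are ambiguous to a caller: ``llubi_exit_status(True, True, None)`` and ``llubi_exit_status(True, False, 1)`` both give 1. Scripts that need to distinguish an error from a normal result should therefore avoid test programs that return 1 on success.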