Various improvements to the LLDB Variable Formatting documentation: 1. Use consistent formatting. 2. Polish wording. 3. Add examples. Signed-off-by: Will Hawkins <hawkinsw@obs.cr>
1430 lines
62 KiB
ReStructuredText
Variable Formatting
|
|
===================
|
|
|
|
LLDB has a data formatters subsystem that allows users to define custom display
|
|
options for their variables.
|
|
|
|
Usually, when you type ``frame variable`` or run some expression, LLDB will
automatically choose the way to display your results on a per-type basis, as in
the following example:
|
|
|
|
::
|
|
|
|
(lldb) frame variable
|
|
(uint8_t) x = 'a'
|
|
(intptr_t) y = 124752287
|
|
|
|
Note: ``frame variable`` without additional arguments prints the list of
|
|
variables of the current frame.
|
|
|
|
However, in certain cases, you may want to display certain datatypes in a
different style. To do so, you need to give hints to the debugger as to how
variables should be displayed. The LLDB ``type`` command allows you to do just
that.

Using it, you can change your visualization to look like this:
|
|
|
|
::
|
|
|
|
   (lldb) frame variable
   (uint8_t) x = chr='a' dec=97 hex=0x61
   (intptr_t) y = 0x76f919f
|
|
|
|
In addition, some data structures can encode their data in a way that is not
easily readable to the user, in which case a data formatter can be used to
show the data in a human-readable way. For example, without a formatter,
printing a ``std::deque<int>`` with the elements ``{2, 3, 4, 5, 6}`` would
result in something like:
|
|
|
|
::
|
|
|
|
   (lldb) frame variable a_deque
   (std::deque<int, std::allocator<int> >) $0 = {
      std::_Deque_base<int, std::allocator<int> > = {
         _M_impl = {
            _M_map = 0x000000000062ceb0
            _M_map_size = 8
            _M_start = {
               _M_cur = 0x000000000062cf00
               _M_first = 0x000000000062cf00
               _M_last = 0x000000000062d2f4
               _M_node = 0x000000000062cec8
            }
            _M_finish = {
               _M_cur = 0x000000000062d300
               _M_first = 0x000000000062d300
               _M_last = 0x000000000062d6f4
               _M_node = 0x000000000062ced0
            }
         }
      }
   }
|
|
|
|
which is very hard to understand.
|
|
|
|
Note: ``frame variable <var>`` prints out the variable ``<var>`` in the current
|
|
frame.
|
|
|
|
On the other hand, a proper formatter is able to produce the following output:
|
|
|
|
::
|
|
|
|
   (lldb) frame variable a_deque
   (std::deque<int, std::allocator<int> >) $0 = size=5 {
      [0] = 2
      [1] = 3
      [2] = 4
      [3] = 5
      [4] = 6
   }
|
|
|
|
which is what the user would expect from a good debugger.
|
|
|
|
Note: you can also use ``v <var>`` instead of ``frame variable <var>``.
|
|
|
|
It's worth mentioning that the ``size=5`` string is produced by a summary
|
|
provider and the list of children is produced by a synthetic child provider.
|
|
More information about these providers is available in :ref:`type-summary` and
|
|
:ref:`synthetic-children`, respectively.
|
|
|
|
|
|
There are several features related to data visualization: formats, summaries,
filters, and synthetic children.

To reflect this, the ``type`` command has five subcommands, one for each of
these features plus ``type category``:
|
|
|
|
::
|
|
|
|
type format
|
|
type summary
|
|
type filter
|
|
type synthetic
|
|
type category
|
|
|
|
These commands are meant to bind printing options to types. When variables are
printed, LLDB will first check whether custom printing options have been
associated with a variable's type and, if so, use them instead of picking the
default choices.
|
|
|
|
Each of the commands (except ``type category``) has four subcommands available:
|
|
|
|
- ``add``: associates a new printing option to one or more types
|
|
- ``delete``: deletes an existing association
|
|
- ``list``: provides a listing of all associations
|
|
- ``clear``: deletes all associations
|
|
|
|
Type Format
|
|
-----------
|
|
|
|
Type formats enable you to quickly override the default format for displaying
|
|
primitive types (the usual basic C/C++/ObjC types: int, float, char, ...).
|
|
|
|
If for some reason you want all int variables in your program to print out as
|
|
hex, you can add a format to the int type.
|
|
|
|
This is done by typing
|
|
|
|
::
|
|
|
|
(lldb) type format add --format hex int
|
|
|
|
at the LLDB command line.
|
|
|
|
The ``--format`` (which you can shorten to ``-f``) option accepts a `format
|
|
name`_. Then, you provide one or more types to which you want the
|
|
new format applied.
|
|
|
|
A frequent scenario is that your program has a typedef for a numeric type that
you know represents something that must be printed in a certain way. Again, you
can add a format just to that typedef by using ``type format add`` with the
name of the typedef.
|
|
|
|
But things can quickly get hierarchical. Let's say you have a situation like
|
|
the following:
|
|
|
|
::
|
|
|
|
typedef int A;
|
|
typedef A B;
|
|
typedef B C;
|
|
typedef C D;
|
|
|
|
and you want to show all A's as hex, all C's as byte arrays, and leave the
defaults untouched for other types (despite its contrived look, the example is
far from unrealistic in large software systems).
|
|
|
|
If you simply type
|
|
|
|
::
|
|
|
|
(lldb) type format add -f hex A
|
|
(lldb) type format add -f uint8_t[] C
|
|
|
|
values of type B will be shown as hex and values of type D as byte arrays, as in:
|
|
|
|
::
|
|
|
|
(lldb) frame variable -T
|
|
(A) a = 0x00000001
|
|
(B) b = 0x00000002
|
|
(C) c = {0x03 0x00 0x00 0x00}
|
|
(D) d = {0x04 0x00 0x00 0x00}
|
|
|
|
This is because by default LLDB cascades formats through typedef chains. In
|
|
order to prevent cascading, use the option ``-C`` with the value ``no`` when
|
|
defining the type format:
|
|
|
|
::
|
|
|
|
(lldb) type format add -C no -f hex A
|
|
(lldb) type format add -C no -f uint8_t[] C
|
|
|
|
|
|
Without cascading, the same command gives different results:
|
|
|
|
::
|
|
|
|
(lldb) frame variable -T
|
|
(A) a = 0x00000001
|
|
(B) b = 2
|
|
(C) c = {0x03 0x00 0x00 0x00}
|
|
(D) d = 4
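
The cascading rule described above can be modelled in a few lines of Python.
This is only an illustrative sketch and not LLDB's implementation:
``TYPEDEFS``, ``Format`` and ``resolve_format`` are hypothetical names, and the
model assumes that a format always applies to the type it was registered for
and applies to typedefs of that type only when cascading is enabled.

.. code-block:: python

   from typing import NamedTuple, Optional

   # Hypothetical model of the typedef chain in the example: D -> C -> B -> A -> int.
   TYPEDEFS = {"A": "int", "B": "A", "C": "B", "D": "C"}


   class Format(NamedTuple):
       name: str
       cascade: bool  # models the value given to -C (yes/no)


   def resolve_format(type_name: str, formats: dict) -> Optional[str]:
       """Walk the typedef chain and return the first applicable format name."""
       current = type_name
       while current is not None:
           fmt = formats.get(current)
           # A format registered for the type itself always applies; one found
           # further along the chain applies only if it cascades.
           if fmt is not None and (current == type_name or fmt.cascade):
               return fmt.name
           current = TYPEDEFS.get(current)
       return None  # no custom format: fall back to the default display


   cascading = {"A": Format("hex", True), "C": Format("uint8_t[]", True)}
   non_cascading = {"A": Format("hex", False), "C": Format("uint8_t[]", False)}

   for name in "ABCD":
       print(name, resolve_format(name, cascading), resolve_format(name, non_cascading))

With cascading, the sketch resolves ``B`` to ``hex`` and ``D`` to
``uint8_t[]``; without it, both resolve to ``None`` (the default display),
which mirrors the two listings above.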
|
|
|
|
Note that qualifiers such as ``const`` and ``volatile`` will be stripped when
|
|
matching types. For example:
|
|
|
|
::
|
|
|
|
(lldb) frame var x y z
|
|
(int) x = 1
|
|
(const int) y = 2
|
|
(volatile int) z = 4
|
|
(lldb) type format add -f hex int
|
|
(lldb) frame var x y z
|
|
(int) x = 0x00000001
|
|
(const int) y = 0x00000002
|
|
(volatile int) z = 0x00000004
|
|
|
|
Type formats *can* be applied to pointers and references by using the
``pointer`` "adjective" before the type in the ``type format`` command.
However, they then specify the format to be applied to the pointer or reference
itself, and not to the value being referenced or pointed to. In addition, a
format defined for type ``T`` is by default also applied to values of type
``T*`` and ``T&``. Use ``--skip-pointers`` (``-p``) and ``--skip-references``
(``-r``) to change this behavior: these two options prevent LLDB from applying
a format defined for type ``T`` to values of type ``T*`` and ``T&``,
respectively.
|
|
|
|
::
|
|
|
|
(lldb) type format add -f float32[] int
|
|
(lldb) frame variable pointer *pointer -T
|
|
(int *) pointer = {1.46991e-39 1.4013e-45}
|
|
(int) *pointer = {1.53302e-42}
|
|
(lldb) type format add -f float32[] int -p
|
|
(lldb) frame variable pointer *pointer -T
|
|
(int *) pointer = 0x0000000100100180
|
|
(int) *pointer = {1.53302e-42}
|
|
|
|
If you need to delete a custom format for a type, use ``type format delete``
followed by the name of the type for which to delete the format.
|
|
|
|
|
|
::
|
|
|
|
(lldb) type format delete int
|
|
|
|
To delete *all* formats, use ``type format clear``.
|
|
|
|
To see all the formats defined, use ``type format list``.
|
|
|
|
Instead of installing a type format for the entire debugging session, you
|
|
can specify type formats in an ad hoc manner using the ``-f``
|
|
option to the ``frame variable`` command. For example:
|
|
|
|
::
|
|
|
|
(lldb) frame variable counter -f hex
|
|
|
|
This displays the value of ``counter`` as a hexadecimal number.
|
|
|
|
|
|
Type Formats
|
|
++++++++++++
|
|
|
|
.. _`format name`:
|
|
|
|
+---------------------------+------------------+--------------------------------------------------------------------------+
| **Format name**           | **Abbreviation** | **Description**                                                          |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``default``               |                  | the default LLDB algorithm is used to pick a format                      |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``boolean``               | B                | show this as a true/false boolean, using the customary rule that 0 is    |
|                           |                  | false and everything else is true                                        |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``binary``                | b                | show this as a sequence of bits                                          |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``bytes``                 | y                | show the bytes one after the other                                       |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``bytes with ASCII``      | Y                | show the bytes, but try to display them as ASCII characters as well      |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``character``             | c                | show the bytes as ASCII characters                                       |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``printable character``   | C                | show the bytes as printable ASCII characters                             |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``complex float``         | F                | interpret this value as the real and imaginary part of a complex         |
|                           |                  | floating-point number                                                    |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``c-string``              | s                | show this as a 0-terminated C string                                     |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``decimal``               | d                | show this as a signed integer number (this does not perform a cast, it   |
|                           |                  | simply shows the bytes as an integer with sign)                          |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``enumeration``           | E                | show this as an enumeration, printing the                                |
|                           |                  | value's name if available or the integer value otherwise                 |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``hex``                   | x                | show this in hexadecimal notation (this does                             |
|                           |                  | not perform a cast, it simply shows the bytes as hex)                    |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``float``                 | f                | show this as a floating-point number (this does not perform a cast, it   |
|                           |                  | simply interprets the bytes as an IEEE754 floating-point value)          |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``octal``                 | o                | show this in octal notation                                              |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``OSType``                | O                | show this as a MacOS OSType                                              |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``unicode16``             | U                | show this as UTF-16 characters                                           |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``unicode32``             |                  | show this as UTF-32 characters                                           |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``unsigned decimal``      | u                | show this as an unsigned integer number (this does not perform a cast,   |
|                           |                  | it simply shows the bytes as an unsigned integer)                        |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``pointer``               | p                | show this as a native pointer (unless this is really a pointer, the      |
|                           |                  | resulting address will probably be invalid)                              |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``char[]``                |                  | show this as an array of characters                                      |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``int8_t[], uint8_t[]``   |                  | show this as an array of the corresponding integer type                  |
| ``int16_t[], uint16_t[]`` |                  |                                                                          |
| ``int32_t[], uint32_t[]`` |                  |                                                                          |
| ``int64_t[], uint64_t[]`` |                  |                                                                          |
| ``uint128_t[]``           |                  |                                                                          |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``float32[], float64[]``  |                  | show this as an array of the corresponding                               |
|                           |                  | floating-point type                                                      |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``complex integer``       | I                | interpret this value as the real and imaginary part of a complex integer |
|                           |                  | number                                                                   |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``character array``       | a                | show this as a character array                                           |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``address``               | A                | show this as an address target (symbol/file/line + offset), possibly     |
|                           |                  | also the string this address is pointing to                              |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``hex float``             |                  | show this as hexadecimal floating point                                  |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``instruction``           | i                | show this as a disassembled opcode                                       |
+---------------------------+------------------+--------------------------------------------------------------------------+
| ``void``                  | v                | don't show anything                                                      |
+---------------------------+------------------+--------------------------------------------------------------------------+
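
Several descriptions in the table say that a format "does not perform a cast".
The following Python sketch illustrates what that means by reinterpreting the
same four bytes in different ways. It is not LLDB code: the helper ``views`` is
a hypothetical name, and the sketch assumes a 32-bit little-endian value.

.. code-block:: python

   import struct


   def views(raw: bytes) -> dict:
       """Reinterpret the same four little-endian bytes in several ways."""
       return {
           "decimal": struct.unpack("<i", raw)[0],           # bytes read as a signed integer
           "unsigned decimal": struct.unpack("<I", raw)[0],  # same bytes, read as unsigned
           "hex": "0x" + raw[::-1].hex(),                    # same bytes, most significant first
           "float": struct.unpack("<f", raw)[0],             # same bits read as IEEE 754, no conversion
       }


   print(views(struct.pack("<i", 1065353216)))
   print(views(struct.pack("<i", -1)))

The integer ``1065353216`` has the bit pattern ``0x3f800000``, which reads as
``1.0`` when interpreted as a float, whereas a cast would have produced
``1065353216.0``.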
|
|
|
|
.. _type-summary:
|
|
|
|
Type Summary
|
|
------------
|
|
|
|
Type formats specify the way to format a variable/value with a fundamental type.
When you want to display a variable/value for a user-defined type,\ [#]_ you need
another tool: type summaries.

Type summaries work by extracting information from aggregate types and arranging it
in a user-defined format. Consider the following example.

Without a type summary, for a C++ program with a ``struct`` declared as
|
|
|
|
.. code-block:: C++
|
|
|
|
   struct Person {
     std::string name;
     int age;
   };
|
|
|
|
|
|
LLDB will format a variable that is an instance of the ``Person`` type as
|
|
|
|
::
|
|
|
|
(lldb) frame variable -T p
|
|
(Person) p = {
|
|
(std::string) name = "Jaime"
|
|
(int) age = 44
|
|
}
|
|
|
|
Using a type summary, the LLDB user can specify that an instance of
|
|
the ``Person`` type be formatted as
|
|
|
|
::
|
|
|
|
(lldb) frame variable -T p
|
|
(Person) p = The person is named "Jaime" and is 44 years old.
|
|
|
|
There are two methods for specifying type summaries:

1. by binding a Summary String to the type; and
2. by writing and registering a Python script that returns the summary for a type.
|
|
|
|
|
|
Method (1) was used to format the ``Person`` type shown above:
|
|
|
|
::
|
|
|
|
(lldb) type summary add --summary-string "The person is named ${var.name} and is ${var.age} years old." Person
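
For comparison, here is a sketch of what method (2) might look like for the
same ``Person`` type. The function name ``person_summary`` and the module name
``person_formatters`` used below are hypothetical.
``GetChildMemberWithName``, ``GetSummary`` and ``GetValueAsSigned`` are methods
of ``lldb.SBValue``. The sketch assumes that LLDB already provides a summary
for the ``std::string`` member.

.. code-block:: python

   def person_summary(valobj, internal_dict):
       """Return the summary text for a Person value (an lldb.SBValue)."""
       # The summary of a std::string child is expected to include the quotes.
       name = valobj.GetChildMemberWithName("name").GetSummary()
       age = valobj.GetChildMemberWithName("age").GetValueAsSigned()
       return f"The person is named {name} and is {age} years old."

Assuming the function is saved in a file named ``person_formatters.py``, it
could be loaded with ``command script import person_formatters.py`` and bound
to the type with
``type summary add --python-function person_formatters.person_summary Person``.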
|
|
|
|
Summary Format Matching On Pointers
|
|
-----------------------------------
|
|
|
|
A summary formatter for a type ``T`` may not be appropriate to use
for pointers to that type.

- If the formatter is only appropriate for the type and
  not its pointers, use the ``-p`` option to ``type summary add`` to restrict it
  to match values of type ``T``.
- If the formatter is appropriate for the type and its pointers,
  use the ``-d <number>`` option to ``type summary add`` to specify the number of
  pointer indirections the formatter should match. The default value is 1. If you
  want to also match ``T**``, set ``-d`` to 2, and so on.

In all cases, the value passed to the summary formatter for interpolation
into the Summary String (see :ref:`summary-strings`) will be dereferenced to the type ``T``
if the type ``T`` is reached in at most the number of dereferences you specify
with the ``-d`` option (or one dereference, when using the default).
|
|
|
|
For example, assume that the ``Person`` struct is in the scope of a C++
|
|
program with the following declarations:
|
|
|
|
.. code-block:: C++
|
|
|
|
Person p{.name = "Jaime", .age = 44};
|
|
Person *q{new Person{.name = "Jaime", .age = 44}};
|
|
Person **r{&q};
|
|
|
|
And the following Summary String has been bound to the ``Person`` type using
the ``type summary`` command shown here:
|
|
|
|
::
|
|
|
|
(lldb) type summary add --summary-string "The person is named ${var.name} and is ${var.age} years old." Person
|
|
|
|
|
|
Then LLDB will produce output that may, at first, seem unexpected:
|
|
|
|
::
|
|
|
|
(lldb) frame var p
|
|
(Person) The person is named "Jaime" and is 44 years old.
|
|
(lldb) frame var q
|
|
(Person *) 0x00000000004172b0 The person is named "Jaime" and is 44 years old.
|
|
(lldb) frame var r
|
|
(Person **) 0x00007fffffffd658 error: summary string parsing error
|
|
|
|
Upon closer inspection, LLDB is doing exactly as instructed. ``p`` has type ``T``
and needs to be dereferenced 0 times to reach type ``T``. ``q``
has type ``T*`` and needs to be dereferenced once to reach type ``T``. Because
the Summary String was installed with the default ``-d`` value of 1, the summary formatter
will use ``*q`` for interpolation.

However, ``r`` has type ``T**`` and needs to be dereferenced twice to reach ``T``. Because
the Summary String was installed with the default ``-d`` value of 1, the summary formatter
will use ``*r`` for interpolation. Because ``*r`` has type ``T*`` (and not ``T``), the error
above is to be expected.
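
The rule described in this section can be restated as a small Python model.
This is a sketch of the description above rather than of LLDB's actual code:
``summary_target`` is a hypothetical name, ``pointer_levels`` counts the
``*`` in the value's type, and ``max_depth`` stands for the value given to
``-d``.

.. code-block:: python

   def summary_target(pointer_levels: int, max_depth: int = 1):
       """Model which expression is interpolated for a summary bound to T.

       pointer_levels: 0 for T, 1 for T*, 2 for T**, and so on.
       max_depth: the value given to -d (1 by default, as described above).
       Returns the dereference prefix and whether type T is actually reached.
       """
       derefs = min(pointer_levels, max_depth)
       return "*" * derefs, derefs == pointer_levels


   print(summary_target(0))     # p: used as-is, reaches T
   print(summary_target(1))     # q: *q is used, reaches T
   print(summary_target(2))     # r: *r is used, which is still a pointer
   print(summary_target(2, 2))  # r with -d 2: **r is used, reaches T

In this model, only the third call fails to reach ``T``, which corresponds to
the error shown for ``r`` above.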
|
|
|
|
|
|
.. _summary-strings:
|
|
|
|
Summary Strings
|
|
---------------
|
|
A Summary String contains a sequence of tokens that are processed by LLDB to
|
|
generate the summary for a user-defined type. Summary Strings are written in
|
|
an LLDB-specific control language.\ [#]_
|
|
|
|
Summary Strings can contain

- references to the instance of the user-defined type being formatted by the
  Summary String,
- plain text,
- control characters, and
- special variables that have access to information about the current object
  and the overall program state.

``${var}`` can be used to refer to the variable being formatted by the Summary String.
|
|
|
|
Plain text is any sequence of characters that doesn't contain a ``{``, ``}``, ``$``,
|
|
or ``\`` character, which are the syntax control characters.
|
|
|
|
Special variables are written between a ``${`` prefix and a ``}`` suffix. For example,
``${var}`` refers to the instance of the variable being
formatted by the Summary String. Children of variables can be accessed using the
variable itself (e.g. ``${var.child.otherchild}`` or ``${file.basename}``). A variable
can also be prefixed or suffixed with other symbols to change the way its value is
handled. For example, ``${*var.int_pointer[0-3]}``.
|
|
|
|
The simplest thing you can do is access a member variable of a class or structure
|
|
by typing its expression path. The expression path for a field named ``y`` in a
|
|
user-defined type is simply ``.y``. Thus, to ask the Summary String to display
|
|
field ``y`` of an instance of ``struct A`` (below) you would type ``${var.y}``.
|
|
|
|
Suppose a C++ program contains the following declarations:
|
|
|
|
.. code-block:: C++
|
|
|
|
   struct A {
     int x;
     int y;
   };
   struct B {
     A x;
     A y;
     int *z;
   };
|
|
|
|
If you were writing a Summary String to format an instance of ``B``, the expression
path for the ``y`` member of the ``x`` member would be ``.x.y`` and you would
use ``${var.x.y}`` in the Summary String.

By default, a summary defined for type ``T`` also works for types ``T*`` and ``T&``
(as mentioned above, you can disable this behavior if desired). For this reason,
expression paths do not differentiate between ``.`` and ``->``, and the above
expression path ``.x.y`` would be valid for displaying a value of type ``B`` or
``B*``, and would remain valid even if the actual declaration of ``B`` were instead:
|
|
|
|
.. code-block:: C++
|
|
|
|
   struct B {
     A *x;
     A y;
     int *z;
   };
|
|
|
|
This behavior differs from ``frame variable`` which does enforce the distinction
|
|
between ``T`` and ``T*``. The rationale for this choice is that ignoring this
|
|
distinction enables you to write a Summary String once for type ``T`` and use
|
|
it for both ``T`` and ``T*`` instances. As a Summary String is mostly about
|
|
extracting nested members' information, a pointer to an object is just as good
|
|
as the object itself for that purpose.
|
|
|
|
If you need to access the value of the integer pointed to by ``B::z``, your
|
|
Summary String cannot simply use expression path ``.z`` because that symbol
|
|
refers to the pointer ``z``. To access the value of the integer pointed to by
|
|
``B::z`` in your Summary String, you should use the same expression path but
|
|
dereference it: ``${*var.z}``. The ``*`` tells LLDB to get the object that
|
|
the expression path leads to and then dereference it. In this example, it is
|
|
equivalent to ``*(bObject.z)`` in C/C++ syntax. Because ``.`` and ``->``
|
|
operators can be used interchangeably, there is no need to have dereferences
|
|
in the middle of an expression path. For example, you do not need to type
|
|
``${*(var.x).x}`` to read ``A::x`` as contained in ``*(B::x)``. Instead, you
|
|
can simply write ``${var.x->x}``, or even ``${var.x.x}``. The ``*`` operator
|
|
only binds to the result of the whole expression path, rather than piecewise,
|
|
and there is no way to use parentheses to change that behavior.
|
|
|
|
Formatting Summary Elements
|
|
---------------------------
|
|
|
|
An expression path can include formatting codes. Much like the type formats
discussed previously, you can also customize the way variables are displayed in
Summary Strings, regardless of the format they have applied to their types. To
do that, you can use ``%format`` inside an expression path, as in ``${var.x->x%u}``,
which would display the value of ``x`` as an unsigned integer.
|
|
|
|
Additionally, custom output can be achieved by using an LLVM format string,
|
|
commencing with the ``:`` marker. To illustrate, compare ``${var.byte%x}`` and
|
|
``${var.byte:x-}``. The former uses LLDB's builtin hex formatting (``x``),
|
|
which unconditionally inserts a ``0x`` prefix, and also zero pads the value to
|
|
match the size of the type. The latter uses ``llvm::formatv`` formatting
|
|
(``:x-``), and will print only the hex value, with no ``0x`` prefix, and no
|
|
padding. This raw control is useful when composing multiple pieces into a
|
|
larger whole.
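
The difference between the two notations can be imitated with Python format
specifications. This is only a model of the behavior described above, not the
code LLDB runs: ``lldb_hex`` and ``formatv_hex_no_prefix`` are hypothetical
names, and the size of the type in bytes is passed in explicitly.

.. code-block:: python

   def lldb_hex(value: int, byte_size: int) -> str:
       """Model of %x as described above: 0x prefix, zero-padded to the type's size."""
       return f"0x{value:0{byte_size * 2}x}"


   def formatv_hex_no_prefix(value: int) -> str:
       """Model of :x- as described above: lowercase hex, no prefix, no padding."""
       return f"{value:x}"


   # A one-byte value of 10 under each notation.
   print(lldb_hex(10, 1), formatv_hex_no_prefix(10))  # prints "0x0a a"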
|
|
|
|
You can also use some other special format markers, not available for formats
|
|
themselves, but which carry a special meaning when used in this context:
|
|
|
|
+------------+--------------------------------------------------------------------------+
| **Symbol** | **Description**                                                          |
+------------+--------------------------------------------------------------------------+
| ``%S``     | Use this object's summary (the default for aggregate types)              |
+------------+--------------------------------------------------------------------------+
| ``%V``     | Use this object's value (the default for non-aggregate types)            |
+------------+--------------------------------------------------------------------------+
| ``%@``     | Use a language-runtime specific description (for C++ this does nothing,  |
|            | for Objective-C it calls the NSPrintForDebugger API)                     |
+------------+--------------------------------------------------------------------------+
| ``%L``     | Use this object's location (memory address, register name, ...)          |
+------------+--------------------------------------------------------------------------+
| ``%#``     | Use the count of the children of this object                             |
+------------+--------------------------------------------------------------------------+
| ``%T``     | Use this object's datatype name                                          |
+------------+--------------------------------------------------------------------------+
| ``%N``     | Print the variable's basename                                            |
+------------+--------------------------------------------------------------------------+
| ``%>``     | Print the expression path for this item                                  |
+------------+--------------------------------------------------------------------------+
|
|
|
|
Since LLDB 3.7.0, you can also specify ``${script.var:pythonFuncName}``.
|
|
|
|
It is expected that the function name you use specifies a function whose
|
|
signature is the same as a Python summary function. The return string from the
|
|
function will be placed verbatim in the output.
|
|
|
|
You cannot use element access, or formatting symbols, in combination with this
|
|
syntax. For example, the following:
|
|
|
|
::
|
|
|
|
${script.var.element[0]:myFunctionName%@}
|
|
|
|
is not valid and will cause the summary to fail to evaluate.
|
|
|
|
|
|
Element Inlining
|
|
----------------
|
|
|
|
The ``--inline-children`` (``-c``) option to ``type summary add`` tells LLDB not
to look for a Summary String, but instead to just print a listing of all the
object's children on one line.

As an example, given a type ``pair``:
|
|
|
|
::
|
|
|
|
(lldb) frame variable --show-types a_pair
|
|
(pair) a_pair = {
|
|
(int) first = 1;
|
|
(int) second = 2;
|
|
}
|
|
|
|
If one types the following command:
|
|
|
|
::
|
|
|
|
(lldb) type summary add --inline-children pair
|
|
|
|
the output becomes:
|
|
|
|
::
|
|
|
|
(lldb) frame variable a_pair
|
|
(pair) a_pair = (first=1, second=2)
|
|
|
|
|
|
Of course, one can obtain the same effect by typing
|
|
|
|
::
|
|
|
|
(lldb) type summary add pair --summary-string "(first=${var.first}, second=${var.second})"
|
|
|
|
While the final result is the same, using ``--inline-children`` can often save
time. If one does not need to see the names of the variables, but just their
values, the option ``--omit-names`` (``-O``, uppercase letter o) can be combined with
``--inline-children`` to obtain:
|
|
|
|
::
|
|
|
|
(lldb) frame variable a_pair
|
|
(pair) a_pair = (1, 2)
|
|
|
|
which is of course the same as typing
|
|
|
|
::
|
|
|
|
(lldb) type summary add pair --summary-string "(${var.first}, ${var.second})"
|
|
|
|
Bitfields And Array Syntax
|
|
--------------------------
|
|
|
|
Sometimes, a basic type's value actually represents several different values
|
|
packed together in a bitfield.
|
|
|
|
With the classical view, there is no way to look at them. Hexadecimal display
|
|
can help, but if the bits actually span nibble boundaries, the help is limited.
|
|
|
|
Binary view would show it all without ambiguity, but is often too detailed and
|
|
hard to read for real-life scenarios.
|
|
|
|
To cope with the issue, LLDB supports native bitfield formatting in Summary
Strings. If your expression path leads to a so-called scalar type (the usual
int, float, char, double, short, long, long long, long double and
unsigned variants), you can ask LLDB to only access some bits out of the value
and display them in any format you like. If you only need one bit, you can use
``[n]``, just like indexing an array. To extract multiple bits, you can use a
slice-like syntax: ``[n-m]``, e.g.
|
|
|
|
::
|
|
|
|
(lldb) frame variable float_point
|
|
(float) float_point = -3.14159
|
|
|
|
::
|
|
|
|
(lldb) type summary add --summary-string "Sign: ${var[31]%B} Exponent: ${var[30-23]%x} Mantissa: ${var[0-22]%u}" float
|
|
(lldb) frame variable float_point
|
|
(float) float_point = -3.14159 Sign: true Exponent: 0x00000080 Mantissa: 4788184
|
|
|
|
In this example, LLDB shows the internal representation of a float variable by
|
|
extracting bitfields out of a float object.
|
|
|
|
When typing a range, the extremes n and m are always included, and the order of
|
|
the indices is irrelevant.
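
The same bit ranges can be reproduced outside the debugger, which may help
when checking what a bitfield expression should print. The sketch below is
plain Python, not LLDB code: ``bits`` is a hypothetical helper, and it assumes
that the ``float`` is an IEEE 754 single-precision value.

.. code-block:: python

   import struct


   def bits(value: int, n: int, m: int) -> int:
       """Extract bits n..m of value, both ends included, in either order."""
       low, high = min(n, m), max(n, m)
       width = high - low + 1
       return (value >> low) & ((1 << width) - 1)


   # Reinterpret the four bytes of the float as an unsigned 32-bit integer.
   raw = struct.unpack("<I", struct.pack("<f", -3.14159))[0]

   sign = bits(raw, 31, 31)      # models ${var[31]}
   exponent = bits(raw, 30, 23)  # models ${var[30-23]}
   mantissa = bits(raw, 0, 22)   # models ${var[0-22]}
   print(f"Sign: {bool(sign)} Exponent: {exponent:#010x} Mantissa: {mantissa}")

Because both ends of a range are included and their order does not matter,
``bits(raw, 30, 23)`` and ``bits(raw, 23, 30)`` return the same value.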
|
|
|
|
LLDB also lets you use a similar syntax to display array members inside a Summary String.
|
|
For instance, you may want to display all arrays of a given type using a more compact
|
|
notation than the default, and then just delve into individual array members that prove
|
|
interesting to your debugging task. You can tell LLDB to format arrays in special ways,
|
|
possibly independent of the way the array members' datatype is formatted. For example:
|
|
|
|
::
|
|
|
|
(lldb) frame variable sarray
|
|
(Simple [3]) sarray = {
|
|
[0] = {
|
|
x = 1
|
|
y = 2
|
|
z = '\x03'
|
|
}
|
|
[1] = {
|
|
x = 4
|
|
y = 5
|
|
z = '\x06'
|
|
}
|
|
[2] = {
|
|
x = 7
|
|
y = 8
|
|
z = '\t'
|
|
}
|
|
}
|
|
|
|
(lldb) type summary add --summary-string "${var[].x}" "Simple [3]"
|
|
|
|
(lldb) frame variable sarray
|
|
(Simple [3]) sarray = [1,4,7]
|
|
|
|
The ``[]`` symbol amounts to: if ``var`` is an array whose size is known, apply this Summary String
to every element of the array. Here, we are asking LLDB to display ``.x`` for every element of
the array, and this is exactly what happens. If you find some of those integers anomalous,
you can then inspect that one item in greater detail, without the array format getting in the way:
|
|
|
|
::
|
|
|
|
(lldb) frame variable sarray[1]
|
|
(Simple) sarray[1] = {
|
|
x = 4
|
|
y = 5
|
|
z = '\x06'
|
|
}
|
|
|
|
You can also ask LLDB to only print a subset of the array range by using the
same syntax used to extract bits from bitfields:
|
|
|
|
::
|
|
|
|
(lldb) type summary add --summary-string "${var[1-2].x}" "Simple [3]"
|
|
|
|
(lldb) frame variable sarray
|
|
(Simple [3]) sarray = [4,7]
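
The effect of the two array expressions can be pictured with an ordinary
Python list. This is a model of the output shown above and not of LLDB
internals: the ``Simple`` dataclass and the ``summarize_x`` helper are
hypothetical stand-ins for the C++ type and for the Summary String.

.. code-block:: python

   from dataclasses import dataclass


   @dataclass
   class Simple:
       x: int
       y: int
       z: str


   sarray = [Simple(1, 2, "\x03"), Simple(4, 5, "\x06"), Simple(7, 8, "\t")]


   def summarize_x(array, first=None, last=None):
       """Model ${var[].x} (no bounds) and ${var[first-last].x} (inclusive bounds)."""
       if first is None:
           selected = array  # empty brackets: every element of a sized array
       else:
           selected = array[first:last + 1]  # both bounds are included
       return "[" + ",".join(str(item.x) for item in selected) + "]"


   print(summarize_x(sarray))        # models "${var[].x}"
   print(summarize_x(sarray, 1, 2))  # models "${var[1-2].x}"

The two calls print ``[1,4,7]`` and ``[4,7]``, matching the listings above.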
|
|
|
|
If you are dealing with a pointer that you know is an array, you can use this
|
|
syntax to display the elements contained in the pointed array instead of just
|
|
the pointer value. However, because pointers have no notion of their size, the
|
|
empty brackets [] operator does not work, and you must explicitly provide
|
|
higher and lower bounds.
|
|
|
|
In general, LLDB needs the square brackets ``operator []`` in order to handle
|
|
arrays and pointers correctly, and for pointers it also needs a range. However,
|
|
a few special cases are defined to make your life easier:
|
|
|
|
You can print a 0-terminated string (C-string) using the ``%s`` format, omitting
|
|
square brackets, as in:
|
|
|
|
::
|
|
|
|
(lldb) type summary add --summary-string "${var%s}" "char *"
|
|
|
|
This syntax works for ``char*`` as well as for ``char[]`` because LLDB can rely on the
final ``\0`` terminator to know when the string has ended.
|
|
|
|
LLDB has default Summary Strings for char* and char[] that use this special
|
|
case. On debugger startup, the following are defined automatically:
|
|
|
|
::
|
|
|
|
(lldb) type summary add --summary-string "${var%s}" "char *"
|
|
(lldb) type summary add --summary-string "${var%s}" -x "char \[[0-9]+]"
|
|
|
|
Any of the array formats (``int8_t[]``, ``float32[]``, ...), and the ``y``, ``Y`` and ``a`` formats
|
|
work to print an array of a non-aggregate type, even if square brackets are
|
|
omitted.
|
|
|
|
::
|
|
|
|
(lldb) type summary add --summary-string "${var%int32_t[]}" "int [10]"
|
|
|
|
This feature, however, is not enabled for pointers because there is no way for
|
|
LLDB to detect the end of the pointed data.
|
|
|
|
This also does not work for other formats (e.g. boolean), and you must specify
|
|
the square brackets operator to get the expected output.
|
|
|
|
Python Scripting
|
|
----------------
|
|
|
|
Most of the time, Summary Strings prove good enough for the job of summarizing
|
|
the contents of a variable. However, as soon as you need to do more than
|
|
picking some values and rearranging them for display, Summary Strings stop
|
|
being an effective tool. This is because Summary Strings lack the power to
|
|
actually perform any kind of computation on the value of variables.
|
|
|
|
To solve this issue, you can bind some Python scripting code as a summary for
|
|
your datatype, and that script has the ability to both extract child
|
|
variables as the Summary Strings do, and to perform active computation on the
|
|
extracted values. As a small example, let's say we have a Rectangle class:
|
|
|
|
::
|
|
|
|
|
|
class Rectangle
|
|
{
|
|
private:
|
|
int height;
|
|
int width;
|
|
public:
|
|
Rectangle() : height(3), width(5) {}
|
|
Rectangle(int H) : height(H), width(H*2-1) {}
|
|
Rectangle(int H, int W) : height(H), width(W) {}
|
|
int GetHeight() { return height; }
|
|
int GetWidth() { return width; }
|
|
};
|
|
|
|
Summary Strings are effective to reduce the screen real estate used by the
|
|
default viewing mode, but are not effective if we want to display the area and
|
|
perimeter of ``Rectangle`` objects.
|
|
|
|
To obtain this, we can simply attach a small Python script to the Rectangle
|
|
class, as shown in this example:
|
|
|
|
::
|
|
|
|
(lldb) type summary add -P Rectangle
|
|
Enter your Python command(s). Type 'DONE' to end.
|
|
def function (valobj,internal_dict,options):
|
|
height_val = valobj.GetChildMemberWithName('height')
|
|
width_val = valobj.GetChildMemberWithName('width')
|
|
height = height_val.GetValueAsUnsigned(0)
|
|
width = width_val.GetValueAsUnsigned(0)
|
|
area = height*width
|
|
perimeter = 2*(height + width)
|
|
return 'Area: ' + str(area) + ', Perimeter: ' + str(perimeter)
|
|
DONE
|
|
(lldb) frame variable
|
|
(Rectangle) r1 = Area: 20, Perimeter: 18
|
|
(Rectangle) r2 = Area: 72, Perimeter: 36
|
|
(Rectangle) r3 = Area: 16, Perimeter: 16
|
|
|
|
In order to write effective summary scripts, you need to know the LLDB public
|
|
API, which is the way Python code can access the LLDB object model. For further
|
|
details on the API you should look at the LLDB API reference documentation.
|
|
|
|
|
|
As a brief introduction, your script is encapsulated into a function that is
|
|
passed two parameters, ``valobj`` and ``internal_dict``, and optionally a third, ``options``.
|
|
|
|
``internal_dict`` is an internal support parameter used by LLDB and you should
|
|
not touch it.
|
|
|
|
``valobj`` is the object encapsulating the actual variable being displayed, and
|
|
its type is `SBValue`. Out of the many possible operations on an `SBValue`, the
|
|
basic one is to retrieve the child objects it contains (essentially, the fields
|
|
of the object wrapped by it), by calling ``GetChildMemberWithName()``, passing
|
|
it the child's name as a string.
|
|
|
|
If the variable has a value, you can ask for it, and return it as a string
|
|
using ``GetValue()``, or as a signed/unsigned number using
|
|
``GetValueAsSigned()``, ``GetValueAsUnsigned()``. It is also possible to
|
|
retrieve an `SBData` object by calling ``GetData()`` and then read the object's
|
|
contents out of the `SBData`.
|
|
|
|
If you need to delve into several levels of hierarchy, as you can do with
|
|
Summary Strings, you can use the method ``GetValueForExpressionPath()``,
|
|
passing it an expression path just like those you could use for Summary Strings
|
|
(one of the differences is that dereferencing a pointer does not occur by
|
|
prefixing the path with a ``*``, but by calling the ``Dereference()`` method
|
|
on the returned `SBValue`). If you need to access array slices, you cannot do
|
|
that (yet) via this method call, and you must use ``GetChildAtIndex()``
|
|
querying it for the array items one by one. Also, handling custom formats is
|
|
something you have to deal with on your own.
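
The calls described above can be combined as in the following minimal sketch.
It assumes a hypothetical ``Segment`` type (not part of this document) with a
pointer member ``start`` and an array member ``samples``; the type and all
member names are illustrative only.

.. code-block:: python

   def segment_summary(valobj, internal_dict):
       # Hypothetical layout (an assumption for this sketch):
       #   struct Point { int x; int y; };
       #   struct Segment { Point *start; int samples[4]; };
       start = valobj.GetValueForExpressionPath(".start")
       # Pointers are dereferenced with a method call, not a '*' prefix.
       point = start.Dereference()
       x = point.GetChildMemberWithName("x").GetValueAsSigned(0)
       y = point.GetChildMemberWithName("y").GetValueAsSigned(0)
       # Array slices are not available here, so read the items one by one.
       samples = valobj.GetChildMemberWithName("samples")
       first_two = [
           samples.GetChildAtIndex(i).GetValueAsSigned(0)
           for i in range(min(2, samples.GetNumChildren()))
       ]
       return "start=(%d, %d), samples[0-1]=%s" % (x, y, first_two)

The function only formats plain integers, so any custom display format for the
members would have to be applied by hand.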
|
|
|
|
``options``: Python summary formatters can optionally define this third
argument, which is an object of type ``lldb.SBTypeSummaryOptions`` and allows
a few customizations of the result. Whether to adopt this third argument, and
what its options mean, is up to the individual formatter's writer.
|
|
|
|
Other than interactively typing a Python script there are two other ways for
|
|
you to input a Python script as a summary:
|
|
|
|
- using the --python-script option to type summary add and typing the script
|
|
code as an option argument; as in:
|
|
|
|
::
|
|
|
|
(lldb) type summary add --python-script "height = valobj.GetChildMemberWithName('height').GetValueAsUnsigned(0);width = valobj.GetChildMemberWithName('width').GetValueAsUnsigned(0); return 'Area: %d' % (height*width)" Rectangle
|
|
|
|
|
|
- using the --python-function (-F) option to type summary add and giving the
|
|
name of a Python function with the correct prototype. Most probably, you will
|
|
define (or have already defined) the function in the interactive interpreter,
|
|
or somehow loaded it from a file, using the ``command script import`` command.
|
|
LLDB will emit a warning if it is unable to find the function you passed, but
|
|
will still register the binding.
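
As a sketch of the second approach, a module like the one below could be loaded
with ``command script import``. The file name ``rectangle_summary.py`` and the
function name ``rectangle_area_summary`` are assumptions made for this example;
``__lldb_init_module`` is the hook LLDB calls once the module has been imported.

.. code-block:: python

   # rectangle_summary.py (hypothetical file name)

   def rectangle_area_summary(valobj, internal_dict):
       height = valobj.GetChildMemberWithName("height").GetValueAsUnsigned(0)
       width = valobj.GetChildMemberWithName("width").GetValueAsUnsigned(0)
       return "Area: %d" % (height * width)


   def __lldb_init_module(debugger, internal_dict):
       # Register the function above as the summary for Rectangle, using the
       # name under which this module was imported.
       debugger.HandleCommand(
           "type summary add -F %s.rectangle_area_summary Rectangle" % __name__
       )

Registering the binding from inside the module keeps the function definition
and the ``type summary add`` command in one place.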
|
|
|
|
Regular Expression Typenames
|
|
----------------------------
|
|
|
|
As you noticed, in order to associate the custom Summary String to the array
|
|
types, one must give the array size as part of the typename. This can quickly
|
|
become tiresome when using arrays of different sizes, Simple [3], Simple [9],
|
|
Simple [12], ...
|
|
|
|
If you use the -x option, type names are treated as regular expressions instead
|
|
of type names. This would let you rephrase the above example for arrays of type
|
|
Simple [3] as:
|
|
|
|
::
|
|
|
|
(lldb) type summary add --summary-string "${var[].x}" -x "Simple \[[0-9]+\]"
|
|
(lldb) frame variable
|
|
(Simple [3]) sarray = [1,4,7]
|
|
(Simple [2]) sother = [3,6]
|
|
|
|
The above scenario works for Simple [3] as well as for any other array of
|
|
Simple objects.
|
|
|
|
While this feature is mostly useful for arrays, you could also use regular
|
|
expressions to catch other type sets grouped by name. However, as regular
|
|
expression matching is slower than normal name matching, LLDB will first try to
|
|
match by name in any way it can, and only when this fails, will it resort to
|
|
regular expression matching.
|
|
|
|
One of the ways LLDB uses this feature internally is to match the names of STL
|
|
container classes, regardless of the template arguments provided. The details
|
|
for this are found in ``FormatManager.cpp``.
|
|
|
|
The regular expression language used by LLDB is the POSIX extended language, as
|
|
defined by the Single UNIX Specification, of which macOS is a compliant
|
|
implementation.
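
To get a feel for which type names a pattern accepts before registering it, the
pattern can be tried outside the debugger. The sketch below uses Python's
``re`` module, which is not the engine LLDB uses but treats this simple pattern
the same way. It also assumes that the pattern may match anywhere in the type
name, so anchors (``^`` and ``$``) are shown as a way to be stricter.

.. code-block:: python

   import re

   type_names = ["Simple [3]", "Simple [12]", "Simple", "NotSimple [3]"]

   # The pattern passed with -x in the example above.
   unanchored = re.compile(r"Simple \[[0-9]+\]")
   unanchored_matches = [n for n in type_names if unanchored.search(n)]

   # The same pattern, anchored to the start and end of the type name.
   anchored = re.compile(r"^Simple \[[0-9]+\]$")
   anchored_matches = [n for n in type_names if anchored.search(n)]

   print(unanchored_matches)
   print(anchored_matches)

In this sketch the unanchored pattern also accepts ``NotSimple [3]``, while the
anchored one accepts only the two ``Simple`` arrays.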
|
|
|
|
Named Summaries
|
|
---------------
|
|
|
|
For a given type, there may be different meaningful summary representations.
|
|
However, currently, only one summary can be associated to a type at each
|
|
moment. If you need to temporarily override the association for a variable,
|
|
without changing the Summary String for its type, you can use named
|
|
summaries.
|
|
|
|
Named summaries work by attaching a name to a summary when creating it. Then,
|
|
when there is a need to attach the summary to a variable, the ``frame variable``
command supports a ``--summary`` option that tells LLDB to use the named summary
|
|
given instead of the default one.
|
|
|
|
::
|
|
|
|
(lldb) type summary add --summary-string "x=${var.integer}" --name NamedSummary
|
|
(lldb) frame variable one
|
|
(i_am_cool) one = int = 3, float = 3.14159, char = 69
|
|
(lldb) frame variable one --summary NamedSummary
|
|
(i_am_cool) one = x=3
|
|
|
|
When defining a named summary, binding it to one or more types becomes
|
|
optional. Even if you bind the named summary to a type, and later change the
|
|
Summary String for that type, the named summary will not be changed by that.
|
|
You can delete named summaries by using the type summary delete command, as if
|
|
the summary name was the datatype that the summary is applied to.
|
|
|
|
A summary attached to a variable using the ``--summary`` option has the same
|
|
semantics that a custom format attached using the -f option has: it stays
|
|
attached till you attach a new one, or till you let your program run again.
|
|
|
|
.. _synthetic-children:
|
|
|
|
Synthetic Children
|
|
------------------
|
|
|
|
Summaries work well when one is able to navigate through an expression path. In
|
|
order for LLDB to do so, appropriate debugging information must be available.
|
|
|
|
Some types are opaque, i.e. no knowledge of their internals is provided. When
|
|
that's the case, expression paths do not work correctly.
|
|
|
|
In other cases, the internals are available to use in expression paths, but
|
|
they do not provide a user-friendly representation of the object's value.
|
|
|
|
For instance, consider an STL vector, as implemented by the GNU C++ Library:
|
|
|
|
::
|
|
|
|
(lldb) frame variable numbers -T
|
|
(std::vector<int>) numbers = {
|
|
(std::_Vector_base<int, std::allocator<int> >) std::_Vector_base<int, std::allocator<int> > = {
|
|
(std::_Vector_base<int, std::allocator<int> >::_Vector_impl) _M_impl = {
|
|
(int *) _M_start = 0x00000001001008a0
|
|
(int *) _M_finish = 0x00000001001008a8
|
|
(int *) _M_end_of_storage = 0x00000001001008a8
|
|
}
|
|
}
|
|
}
|
|
|
|
Here, you can see how the type is implemented, and you can write a summary for
|
|
that implementation but that is not going to help you infer what items are
|
|
actually stored in the vector.
|
|
|
|
What you would like to see is probably something like:
|
|
|
|
::
|
|
|
|
(lldb) frame variable numbers -T
|
|
(std::vector<int>) numbers = {
|
|
(int) [0] = 1
|
|
(int) [1] = 12
|
|
(int) [2] = 123
|
|
(int) [3] = 1234
|
|
}
|
|
|
|
Synthetic children are a way to get that result.
|
|
|
|
The feature is based upon the idea of providing a new set of children for a
|
|
variable that replaces the ones available by default through the debug
|
|
information. In the example, we can use synthetic children to provide the
|
|
vector items as children for the std::vector object.
|
|
|
|
In order to create synthetic children, you need to provide a Python class that
|
|
adheres to a given *interface* (the word is italicized because Python has no
explicit notion of interface; by that word we mean a given set of methods must
|
|
be implemented by the Python class):
|
|
|
|
.. code-block:: python

   import lldb

   class SyntheticChildrenProvider:
       def __init__(self, valobj: lldb.SBValue, internal_dict):
           """
           This call should initialize the Python object using valobj as the
           variable to provide synthetic children for.
           """

       def num_children(self, max_children: int) -> int:
           """
           This call should return the number of children that you want your
           object to have[1].
           """

       def get_child_index(self, name: str) -> int:
           """
           This call should return the index of the synthetic child whose name is
           given as the argument. Array subscripting, names in the form "[N]", is
           automatically supported.
           Return -1 if there is no child with the given name.
           """

       def get_child_at_index(self, index: int) -> lldb.SBValue | None:
           """
           This call should return a new LLDB SBValue object representing the
           child at the index given as argument.
           """

       def update(self) -> bool:
           """
           This call should be used to update the internal state of this Python
           object whenever the state of the variables in LLDB changes.[2]
           Also, this method is invoked before any other method in the interface.
           """

       def has_children(self) -> bool:
           """
           This call should return True if this object might have children, and
           False if this object can be guaranteed not to have children.[3]
           """

       def get_value(self) -> lldb.SBValue | None:
           """
           This call can return an SBValue to be presented as the value of the
           synthetic value under consideration.[4]
           """
|
|
|
|
As a warning, exceptions that are thrown by Python formatters are caught
silently by LLDB and should be handled appropriately by the formatter itself.
More specifically, in case of exceptions, LLDB might assume that the given
|
|
object has no children or it might skip printing some children, as they are
|
|
printed one by one.
|
|
|
|
[1] The ``max_children`` argument is optional (since LLDB 3.8.0) and indicates the
|
|
maximum number of children that LLDB is interested in (at this moment). If the
|
|
computation of the number of children is expensive (for example, requires
|
|
traversing a linked list to determine its size) your implementation may return
|
|
``max_children`` rather than the actual number. If the computation is cheap (e.g., the
|
|
number is stored as a field of the object), then you can always return the true
|
|
number of children (that is, ignore the ``max_children`` argument).
|
|
|
|
[2] This method is optional. Also, a boolean value must be returned (since LLDB
|
|
3.1.0). If ``False`` is returned, then whenever the process reaches a new stop,
|
|
this method will be invoked again to generate an updated list of the children
|
|
for a given variable. Otherwise, if ``True`` is returned, then the value is
|
|
cached and this method won't be called again, effectively freezing the state of
|
|
the value in subsequent stops. Beware that returning ``True`` incorrectly could
|
|
show misleading information to the user.
|
|
|
|
[3] This method is optional (since LLDB 3.2.0). While implementing it in terms
|
|
of num_children is acceptable, implementors are encouraged to look for
|
|
optimized coding alternatives whenever reasonable.
|
|
|
|
[4] This method is optional (since LLDB 3.5.2). The `SBValue` you return here
|
|
will most likely be a numeric type (int, float, ...) as its value bytes will be
|
|
used as-if they were the value of the root `SBValue` proper. As a shortcut for
|
|
this, you can inherit from ``lldb.SBSyntheticValueProvider`` and define only
``get_value``; the other methods default in the superclass to no-children
responses.
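
The trade-off described in note [1] can be sketched as follows. The
``LinkedListProvider`` class and the ``List``/``Node`` layout it walks are
hypothetical and only serve to show how ``max_children`` can cut an expensive
count short; the other methods of the interface are omitted.

.. code-block:: python

   class LinkedListProvider:
       # Hypothetical layout (an assumption for this sketch):
       #   struct Node { int value; Node *next; };
       #   struct List { Node *head; };
       def __init__(self, valobj, internal_dict):
           self.valobj = valobj

       def num_children(self, max_children=None):
           count = 0
           node = self.valobj.GetChildMemberWithName("head")
           # A null pointer reads as 0 and ends the walk.
           while node.GetValueAsUnsigned(0) != 0:
               count += 1
               if max_children is not None and count >= max_children:
                   # LLDB is not interested in more children than this.
                   return max_children
               node = node.Dereference().GetChildMemberWithName("next")
           return count

With a long list, the walk stops as soon as ``max_children`` nodes have been
seen instead of traversing every node.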
|
|
|
|
If a synthetic child provider supplies a special child named
|
|
``$$dereference$$`` then it will be used when evaluating ``operator *`` and
|
|
``operator ->`` in the frame variable command and related SB API
|
|
functions. It is possible to declare this synthetic child without
|
|
including it in the range of children displayed by LLDB. For example,
|
|
this subset of a synthetic children provider class would allow the
|
|
synthetic value to be dereferenced without actually showing any
|
|
synthetic children in the UI:
|
|
|
|
.. code-block:: python

   class SyntheticChildrenProvider:
       [...]
       def num_children(self) -> int:
           return 0

       def get_child_index(self, name: str) -> int:
           if name == '$$dereference$$':
               return 0
           return -1

       def get_child_at_index(self, index: int) -> lldb.SBValue | None:
           if index == 0:
               return <valobj resulting from dereference>
           return None
|
|
|
|
|
|
For examples of how synthetic children are created, you are encouraged to look
|
|
at ``examples/synthetic`` in the LLDB trunk. Please be aware that the code in
those files (except ``bitfield/``) is legacy code and is not maintained. You may
especially want to begin with the ``bitfield`` example to get a feel for this
feature, as it is a very easy and well-commented example.
|
|
|
|
The design pattern consistently used in synthetic providers shipping with LLDB
|
|
is to use ``__init__`` to store the `SBValue` instance as a part of self. The
|
|
update function is then used to perform the actual initialization. Once a
|
|
synthetic children provider is written, one must load it into LLDB before it
|
|
can be used. Currently, one can use the LLDB script command to type Python code
|
|
interactively, or use the command script import fileName command to load Python
|
|
code from a Python module (ordinary rules apply to importing modules this way).
|
|
A third option is to type the code for the provider class interactively while
|
|
adding it.
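
A minimal sketch of that design pattern is shown below. The ``IntBuffer`` type
and the ``IntBufferProvider`` class are assumptions made for illustration;
``__init__`` only stores the `SBValue`, and ``update`` reads everything the
other methods need.

.. code-block:: python

   class IntBufferProvider:
       # Hypothetical layout (an assumption for this sketch):
       #   struct IntBuffer { int *data; size_t count; };
       def __init__(self, valobj, internal_dict):
           # Only store the SBValue here; update() does the real work.
           self.valobj = valobj

       def update(self):
           self.data = self.valobj.GetChildMemberWithName("data")
           count = self.valobj.GetChildMemberWithName("count")
           self.count = count.GetValueAsUnsigned(0)
           self.item_type = self.data.GetType().GetPointeeType()
           self.item_size = self.item_type.GetByteSize()
           # False asks LLDB to call update() again at the next stop.
           return False

       def num_children(self):
           return self.count

       def get_child_index(self, name):
           try:
               return int(name.lstrip("[").rstrip("]"))
           except ValueError:
               return -1

       def get_child_at_index(self, index):
           if index < 0 or index >= self.count:
               return None
           offset = index * self.item_size
           return self.data.CreateChildAtOffset(
               "[%d]" % index, offset, self.item_type
           )

Because ``update`` returns ``False``, the cached count and element type are
refreshed at every stop rather than frozen after the first one.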
|
|
|
|
For example, let's pretend we have a class Foo for which a synthetic children
|
|
provider class Foo_Provider is available, in a Python module contained in file
|
|
~/Foo_Tools.py. The following interaction sets Foo_Provider as a synthetic
|
|
children provider in LLDB:
|
|
|
|
::
|
|
|
|
(lldb) command script import ~/Foo_Tools.py
|
|
(lldb) type synthetic add Foo --python-class Foo_Tools.Foo_Provider
|
|
(lldb) frame variable a_foo
|
|
(Foo) a_foo = {
|
|
x = 1
|
|
y = "Hello world"
|
|
}
|
|
|
|
LLDB has synthetic children providers for a core subset of STL classes, both in
|
|
the versions provided by libstdc++ and by libc++, as well as for several
|
|
Foundation classes.
|
|
|
|
Synthetic children extend Summary Strings by enabling a new special variable:
|
|
``${svar``.
|
|
|
|
This symbol tells LLDB to refer expression paths to the synthetic children
|
|
instead of the real ones. For instance,
|
|
|
|
::
|
|
|
|
(lldb) type summary add --expand -x "std::vector<" --summary-string "${svar%#} items"
|
|
(lldb) frame variable numbers
|
|
(std::vector<int>) numbers = 4 items {
|
|
(int) [0] = 1
|
|
(int) [1] = 12
|
|
(int) [2] = 123
|
|
(int) [3] = 1234
|
|
}
|
|
|
|
It's important to mention that LLDB invokes the synthetic child provider before
|
|
invoking the Summary String provider, which allows the latter to have access to
|
|
the actual displayable children. This applies to both inlined Summary Strings
|
|
and Python-based summary providers.
|
|
|
|
|
|
As a warning, when programmatically accessing the children or children count of
|
|
a variable that has a synthetic child provider, notice that LLDB hides the
|
|
actual raw children. For example, suppose we have a ``std::vector``, which has
|
|
an actual in-memory property ``__begin`` marking the beginning of its data.
|
|
After the synthetic child provider is executed, the ``std::vector`` variable
|
|
won't show ``__begin`` as child anymore, even through the SB API. It will have
|
|
instead the children calculated by the provider. In case the actual raw
|
|
children are needed, a call to ``value.GetNonSyntheticValue()`` is enough to
|
|
get a raw version of the value. It is important to remember this when implementing
|
|
Summary String providers, as they run after the synthetic child provider.
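
As an illustration, a Python summary function can use both views of the same
value. The function name ``vector_summary`` is hypothetical, and the member
name ``__begin`` is taken from the description above rather than from any
particular library version.

.. code-block:: python

   def vector_summary(valobj, internal_dict):
       # valobj exposes the synthetic children computed by the provider.
       item_count = valobj.GetNumChildren()
       # The raw view still has the in-memory members, such as __begin.
       raw = valobj.GetNonSyntheticValue()
       begin = raw.GetChildMemberWithName("__begin").GetValueAsUnsigned(0)
       return "%d items, data at 0x%x" % (item_count, begin)

Asking ``valobj`` itself for ``__begin`` would not find it, because the
synthetic children have replaced the raw ones.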
|
|
|
|
|
|
In some cases, if LLDB is unable to use the real object to get a child
|
|
specified in an expression path, it will automatically refer to the synthetic
|
|
children. While in summaries it is best to always use ${svar to make your
|
|
intentions clearer, interactive debugging can benefit from this behavior, as
|
|
in:
|
|
|
|
::
|
|
|
|
(lldb) frame variable numbers[0] numbers[1]
|
|
(int) numbers[0] = 1
|
|
(int) numbers[1] = 12
|
|
|
|
Unlike many other visualization features, however, the access to synthetic
|
|
children only works when using frame variable, and is not supported in
|
|
expression:
|
|
|
|
::
|
|
|
|
(lldb) expression numbers[0]
|
|
Error [IRForTarget]: Call to a function '_ZNSt33vector<int, std::allocator<int> >ixEm' that is not present in the target
|
|
error: Couldn't convert the expression to DWARF
|
|
|
|
The reason for this is that classes might have an overloaded ``operator []``,
|
|
or other special provisions and the expression command chooses to ignore
|
|
synthetic children in the interest of equivalency with code you asked to have
|
|
compiled from source.
|
|
|
|
Filters
|
|
-------
|
|
|
|
Filters are a solution to the display of complex classes. At times, classes
|
|
have many member variables but not all of these are actually necessary for the
|
|
user to see.
|
|
|
|
A filter will solve this issue by only letting the user see those member
|
|
variables they care about. Of course, the equivalent of a filter can be
|
|
implemented easily using synthetic children, but a filter lets you get the job
|
|
done without having to write Python code.
|
|
|
|
For instance, if your class Foobar has member variables named A through Z, but you
|
|
only need to see the ones named B, H and Q, you can define a filter:
|
|
|
|
::
|
|
|
|
(lldb) type filter add Foobar --child B --child H --child Q
|
|
(lldb) frame variable a_foobar
|
|
(Foobar) a_foobar = {
|
|
(int) B = 1
|
|
(char) H = 'H'
|
|
(std::string) Q = "Hello world"
|
|
}
|
|
|
|
Callback-based type matching
|
|
----------------------------
|
|
|
|
Even though regular expression matching works well for the vast majority of data
|
|
formatters (you normally know the name of the type you're writing a formatter
|
|
for), there are some cases where it's useful to look at the type before deciding
|
|
what formatter to apply.
|
|
|
|
As an example scenario, imagine we have a code generator that produces some
|
|
classes that inherit from a common ``GeneratedObject`` class, and we have a
|
|
summary function and a synthetic child provider that work for all
|
|
``GeneratedObject`` instances (they all follow the same pattern). However, there
|
|
is no common pattern in the name of these classes, so we can't register the
|
|
formatter either by name or by regular expression.
|
|
|
|
In that case, you can write a recognizer function like this:
|
|
|
|
.. code-block:: python

   import lldb

   def is_generated_object(sbtype: lldb.SBType, internal_dict) -> bool:
       # Accept any type that lists GeneratedObject among its direct base classes.
       for base in sbtype.get_bases_array():
           if base.GetName() == "GeneratedObject":
               return True
       return False
|
|
|
|
And pass this function to ``type summary add`` and ``type synthetic add`` using
|
|
the flag ``--recognizer-function``.
|
|
|
|
::
|
|
|
|
(lldb) type summary add --expand --python-function my_summary_function --recognizer-function is_generated_object
|
|
(lldb) type synthetic add --python-class my_child_provider --recognizer-function is_generated_object
|
|
|
|
Objective-C Dynamic Type Discovery
|
|
----------------------------------
|
|
|
|
When doing Objective-C development, you may notice that some of your variables
|
|
come out as of type id (for instance, items extracted from NSArray). By
|
|
default, LLDB will not show you the real type of the object. It can, however,
dynamically discover the type of an Objective-C variable, much like the runtime
itself does when invoking a selector. To be shown the result of that
discovery, a special option to ``frame variable`` or ``expression`` is
required: ``--dynamic-type``.
|
|
|
|
|
|
``--dynamic-type`` can have one of three values:
|
|
|
|
- ``no-dynamic-values``: the default, prevents dynamic type discovery
|
|
- ``no-run-target``: enables dynamic type discovery as long as running code on
|
|
the target is not required
|
|
- ``run-target``: enables code execution on the target in order to perform
|
|
dynamic type discovery
|
|
|
|
If you specify a value of either no-run-target or run-target, LLDB will detect
|
|
the dynamic type of your variables and show the appropriate formatters for
|
|
them. As an example:
|
|
|
|
::
|
|
|
|
(lldb) expr @"Hello"
|
|
(NSString *) $0 = 0x00000001048000b0 @"Hello"
|
|
(lldb) expr -d no-run @"Hello"
|
|
(__NSCFString *) $1 = 0x00000001048000b0 @"Hello"
|
|
|
|
Because LLDB uses a detection algorithm that does not need to invoke any
|
|
functions on the target process, no-run-target is enough for this to work.
|
|
|
|
As a side note, the summary for NSString shown in the example is built right
|
|
into LLDB. It was initially implemented through Python (the code is still
|
|
available for reference at CFString.py). However, this is out of sync with the
|
|
current implementation of the NSString formatter (which is a C++ function
|
|
compiled into the LLDB core).
|
|
|
|
Categories
|
|
----------
|
|
|
|
Categories are a way to group related formatters. For instance, LLDB itself
|
|
groups the formatters for STL types in a category named cplusplus. Basically,
categories act like containers in which to store formatters for the same library
|
|
or OS release.
|
|
|
|
By default, several categories are created in LLDB:
|
|
|
|
- default: this is the category where every formatter ends up, unless another category is specified
|
|
- objc: formatters for basic and common Objective-C types that do not specifically depend on macOS
|
|
- cplusplus: formatters for STL types (currently only libc++ and libstdc++ are supported). Enabled when debugging C++ targets.
|
|
- system: truly basic types for which a formatter is required
|
|
- AppKit: Cocoa classes
|
|
- CoreFoundation: CF classes
|
|
- CoreGraphics: CG classes
|
|
- CoreServices: CS classes
|
|
- VectorTypes: compact display for several vector types
|
|
|
|
If you want to use a custom category for your formatters, all the ``type ... add``
|
|
commands provide a ``--category`` (``-w``) option that names the category to add the formatter
|
|
to. To delete the formatter, you then have to specify the correct category.
|
|
|
|
Categories can be in one of two states: enabled and disabled. A category is
|
|
initially disabled, and can be enabled using the ``type category enable`` command.
|
|
To disable an enabled category, the command to use is ``type category disable``.
|
|
|
|
The order in which categories are enabled or disabled is significant, in that
|
|
LLDB uses that order when looking for formatters. Therefore, when you enable a
|
|
category, it becomes the second one to be searched (after default, which always
|
|
stays on top of the list). The default categories are enabled in such a way
|
|
that the search order is:
|
|
|
|
- default
|
|
- objc
|
|
- CoreFoundation
|
|
- AppKit
|
|
- CoreServices
|
|
- CoreGraphics
|
|
- cplusplus
|
|
- VectorTypes
|
|
- system
|
|
|
|
As mentioned, cplusplus contains formatters for C++ STL data types.
|
|
system contains formatters for char* and char[], which reflect the behavior of
|
|
older versions of LLDB which had built-in formatters for these types. Because
|
|
now these are formatters, you can even replace them with your own if you so
|
|
wish.
|
|
|
|
There is no special command to create a category. When you place a formatter in
|
|
a category, if that category does not exist, it is automatically created. For
|
|
instance,
|
|
|
|
::
|
|
|
|
(lldb) type summary add Foobar --summary-string "a foobar" --category newcategory
|
|
|
|
automatically creates a (disabled) category named newcategory.
|
|
|
|
Another way to create a new (empty) category is to enable it, as in:
|
|
|
|
::
|
|
|
|
(lldb) type category enable newcategory
|
|
|
|
However, in this case LLDB warns you that enabling an empty category has no
|
|
effect. If you add formatters to the category after enabling it, they will be
|
|
honored. But an empty category per se does not change the way any type is
|
|
displayed. The reason the debugger warns you is that enabling an empty category
|
|
might be a typo, and you actually wanted to enable a similarly-named but
non-empty category.
|
|
|
|
Finding Formatters 101
|
|
----------------------
|
|
|
|
Searching for a formatter (including formats, since LLDB 3.4.0) given a
|
|
variable goes through a rather intricate set of rules. Namely, what happens is
|
|
that LLDB starts looking in each enabled category, according to the order in
|
|
which they were enabled (latest enabled first). In each category, LLDB does the
|
|
following:
|
|
|
|
- If there is a formatter for the type of the variable, use it
|
|
- If this object is a pointer, and there is a formatter for the pointee type
|
|
that does not skip pointers, use it
|
|
- If this object is a reference, and there is a formatter for the referred type
|
|
that does not skip references, use it
|
|
- If this object is an Objective-C class and dynamic types are enabled, look
|
|
for a formatter for the dynamic type of the object. If dynamic types are
|
|
disabled, or the search failed, look for a formatter for the declared type of
|
|
the object
|
|
- If this object's type is a typedef, go through typedef hierarchy (LLDB might
|
|
not be able to do this if the compiler has not emitted enough information. If
|
|
the required information to traverse typedef hierarchies is missing, type
|
|
cascading will not work. The clang compiler, part of the LLVM project, emits
|
|
the correct debugging information for LLDB to cascade). If at any level of
|
|
the hierarchy there is a valid formatter that can cascade, use it.
|
|
- If everything has failed, repeat the above search, looking for regular
|
|
expressions instead of exact matches
|
|
|
|
If any of those attempts returned a valid formatter to be used, that one is
|
|
used, and the search is terminated (without going to look in other categories).
|
|
If nothing was found in the current category, the next enabled category is
|
|
scanned according to the same algorithm. If there are no more enabled
|
|
categories, the search has failed.
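
The overall shape of this search can be modeled with a short, simplified
sketch. It is a toy model under stated assumptions, not LLDB's implementation:
the ``find_formatter`` function and its data layout are hypothetical, the
pointer, reference and Objective-C steps are left out, and Python's ``re``
module stands in for LLDB's regular expression engine.

.. code-block:: python

   import re

   def find_formatter(type_name, categories, typedefs=None):
       """Toy model of the lookup; categories are ordered by search priority."""
       typedefs = typedefs or {}
       # Candidate names: the type itself, then each step of its typedef chain.
       candidates = [type_name]
       while candidates[-1] in typedefs:
           candidates.append(typedefs[candidates[-1]])
       for category in categories:
           # Exact matches are tried first.
           for name in candidates:
               if name in category["exact"]:
                   return category["exact"][name]
           # Regular expressions are used only when every exact match failed.
           for name in candidates:
               for pattern, formatter in category["regex"]:
                   if re.search(pattern, name):
                       return formatter
       # No enabled category had a formatter: the search has failed.
       return None

The first category that yields a formatter ends the search, which is why the
order in which categories are enabled matters.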
|
|
|
|
**Warning**: previous versions of LLDB defined cascading to mean not only going
|
|
through typedef chains, but also through inheritance chains. This feature has
|
|
been removed since it significantly degrades performance. You need to set up
|
|
your formatters for every type in inheritance chains to which you want the
|
|
formatter to apply.
|
|
|
|
.. [#] These types of variables go by different names depending on the language. In C++, in
|
|
particular, they are known as compound types.
|
|
|
|
.. [#] If you are familiar with the syntax for Frame and Thread Formatting
|
|
you will feel right at home with the syntax for Summary Strings.
|