When lldb stops at the "new binaries loaded" internal breakpoint, it
must read the list of addresses of the new binaries out of process
memory, then send a jGetLoadedDynamicLibrariesInfos packet to
debugserver to get the file path, UUID, and load addresses of all the
segments of each binary.
It's possible for debugserver to find the "new binaries loaded" function
address in the inferior itself, recognize when it has stopped at a
breakpoint there, and expedite some/all of the information lldb is going
to ask for in the stop info packet that we send to lldb. This greatly
improves a stop event where a large batch of binaries is loaded, but it
focuses even more on the single-binary-loaded `dlopen()` use case,
which can be quite expensive when many binaries are loaded one by one.
This PR reduces the packet traffic for new binary load notifications
with the following changes:
1. When debugserver sees a thread that has hit a breakpoint, and the pc
matches the new-binaries-loaded function address, it reads the list of
binaries that have been newly added and includes them in the stop info
packet (or the jThreadsInfo packet) in the `added-binaries` key. The
value is a list (array) of binary addresses.
2. If the number of binaries is small (today: one), debugserver may
collect the full information that jGetLoadedDynamicLibrariesInfos would
send back about it, and also expedite that in the stop info packet (or
jThreadsInfo) in the `detailed-binaries-info` key. This is a JSON
string, and the stop info packet is a semicolon separated series of
key-values, so it must be asciihex encoded, just like the `jstopinfo`
key. In the jThreadsInfo packet, the JSON for the binary information is
included in the response as-is as the value-dictionary.
3. If the remote stub doesn't provide these new keys, lldb will use the
same process as before. However, in
DynamicLoaderMacOS::NotifyBreakpointHit I was reading the load addresses
out of memory individually, with each binary having a 24-byte entry.
lldb's memory cache meant we read 512 bytes per 8-byte read, but when
1000 binaries were being loaded at process launch time, that was 24,000
bytes of VM that we would read in 512 byte batches. This patch changes
that to read the entire VM range that we will be accessing in one large
memory read (as large as the remote gdb RSP stub will support),
dramatically reducing packet traffic in that case.
4. debugserver needs to read the "new binaries loaded" function pointer
out of the "dyld_all_image_infos" structure in the inferior, and it is a
signed function pointer on arm64e processes, so debugserver needs to
strip off the signing bits before comparing the pc. I hoisted the strip
function out of DNBArchImplArm64 into DNBFixAddress(), and the only
complicated bit here is in DNBProcessAddrSize(), when an arm64e
debugserver is debugging an arm64_32 process on a watch. It's not a
common combination (mostly we will have arm64e debugservers debugging
arm64 processes, or arm64_32 debugservers debugging arm64_32 processes),
but it is supported.
5. A very minor enhancement: debugserver now includes a new key,
`sizeof_mh_and_loadcmds`, in the full binary information that
jGetLoadedDynamicLibrariesInfos returns. When lldb needs to read a
binary out of memory, it needs to read the Mach-O header & load
commands, and it doesn't know the full size of that, so we end up doing
one read of the Mach-O header, then the header + load commands. I'm not
using this information in lldb yet, but I would like to, to improve
that.
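As a rough illustration of the asciihex encoding described in item 2
(this is not the debugserver code): the sketch below hex-encodes a JSON
string so it can sit safely inside a semicolon-separated key:value stop
packet. The helper names and the dictionary contents are placeholders
made up for the example, not the real reply layout.

```python
import json


def asciihex_encode(text: str) -> str:
    """Return two lowercase hex digits per UTF-8 byte of text."""
    return text.encode("utf-8").hex()


def asciihex_decode(hex_text: str) -> str:
    """Reverse asciihex_encode()."""
    return bytes.fromhex(hex_text).decode("utf-8")


# Placeholder payload (assumption): the real dictionary is whatever
# jGetLoadedDynamicLibrariesInfos would return for the binary.
detail = {"example_key": "example;value:with#packet$chars"}
payload = json.dumps(detail, separators=(",", ":"))

# The encoded form contains only [0-9a-f], so it cannot collide with the
# ';' and ':' separators used by the stop info packet.
encoded = asciihex_encode(payload)
field = "detailed-binaries-info:" + encoded + ";"

# The receiving side decodes the hex and then parses the JSON.
decoded = json.loads(asciihex_decode(encoded))
```

The same JSON needs no such encoding in jThreadsInfo, because that
response is already JSON.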
At an implementation detail level, ProcessGDBRemote collects these two
new pieces of data from the stop packet / jThreadsInfo, and passes them
to the method that creates a new ThreadGDBRemote. I added two methods to the
Thread base class to retrieve the information. DynamicLoaderMacOS will
try to read the data from the thread that hit the "new binaries loaded"
breakpoint, and if the number of entries matches the number expected by
the register value, it uses them. Otherwise it falls back to fetching
them the traditional way. On an old debugserver that doesn't support
these new expedited fields, DynamicLoaderMacOS will get back a
zero-length list of binary addresses and a null StructuredData
dictionary for the detailed image information, and behave as it always
does. I tested this patch both with and without the debugserver changes.
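The selection logic described above can be sketched roughly as follows.
This is a simplified model in Python, not the DynamicLoaderMacOS code;
`choose_binary_addresses`, `fake_memory_read`, and the addresses are
hypothetical names and values chosen for illustration.

```python
from typing import Callable, List


def choose_binary_addresses(
    expedited: List[int],
    expected_count: int,
    read_from_memory: Callable[[int], List[int]],
) -> List[int]:
    """Use the expedited addresses only if their count matches."""
    # An old stub yields an empty list, which never matches here.
    if expedited and len(expedited) == expected_count:
        return list(expedited)
    # Fall back to the traditional path: read the list from the inferior.
    return read_from_memory(expected_count)


calls: List[int] = []


def fake_memory_read(count: int) -> List[int]:
    """Stand-in for the memory read; records that it was called."""
    calls.append(count)
    return [0x2000 + i for i in range(count)]


# New stub: one expedited address, one expected, so no memory read.
new_stub = choose_binary_addresses([0x100000000], 1, fake_memory_read)

# Old stub: nothing expedited, so the memory read path is taken.
old_stub = choose_binary_addresses([], 2, fake_memory_read)
```

A count mismatch is treated the same as a missing key, so a stub that
sends partial data cannot cause lldb to miss a binary.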
Testing is clearly the big question mark here - I added none. While
writing these patches, I had some bugs and the lldb testsuite on macOS
was very good at finding them, simply with our normal process launching
and dlopen'ing in our existing API tests. I could imagine a test that
would capture the packet log and try to ensure that the expedited
information is being used by lldb and we are not re-fetching the
information, though.
rdar://175033129
---------
Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Co-authored-by: Felipe de Azevedo Piovezan <piovezan.fpi@gmail.com>