llvm-project/libc/docs/dev/code_style.rst
Jeff Bailey a23ddcd3b2 [libc][docs][NFC] Remove dead files and consolidate check.rst (#194442)
Deleted check.rst, Helpers/Styles.rst, dev/cmake_build_rules.rst, and
dev/clang_tidy_checks.rst. Moved the |check| substitution into
rst_prolog in conf.py so it is available globally without per-file
include directives.

Removed all '.. include:: check.rst' lines from hand-written header docs
and from the docgen.py generator that emits them for auto-generated
header pages.

Merged the clang-tidy checks documentation into code_style.rst under a
new 'Static Analysis & Clang-Tidy' section, preserving the
_clang_tidy_checks label for existing cross-references.

Updated code examples in both libc docs and the upstream clang-tidy
check docs to replace the stale LLVM_LIBC_ENTRYPOINT macro with the
current LLVM_LIBC_FUNCTION macro.

Updated dev/index.rst to drop the two deleted toctree entries.
2026-04-30 18:03:55 +01:00


.. _code_style:

===================
The libc code style
===================

Naming style
============

For the most part, the libc project follows the general `coding standards of
the LLVM project <https://llvm.org/docs/CodingStandards.html>`_. The libc
project differs from that standard with respect to the naming style. The
differences are as follows:

#. **Non-const variables** - This includes function arguments, struct and
   class data members, non-const globals and local variables. They all use the
   ``snake_case`` style.
#. **const and constexpr variables** - They use the capitalized
   ``SNAKE_CASE`` irrespective of whether they are local or global.
#. **Functions and methods** - They use the ``snake_case`` style like the
   non-const variables.
#. **Internal type names** - These are types which are internal to the libc
   implementation. They use the ``CapitalizedCamelCase`` style.
#. **Public names** - These are the names as prescribed by the standards and
   will follow the style as prescribed by the standards.
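
The naming rules above can be illustrated with a minimal sketch. Every name in
it (``example``, ``ByteCounter``, ``MAX_BUFFER_SIZE``, ``clamp_to_buffer``) is
hypothetical and chosen only to show the casing; none of them exists in libc.

.. code-block:: c++

   #include <cstddef>

   namespace example {

   // Internal type name: CapitalizedCamelCase.
   struct ByteCounter {
     // Non-const data member: snake_case.
     std::size_t byte_count = 0;

     // Method: snake_case, with a snake_case argument.
     void add_bytes(std::size_t new_bytes) { byte_count += new_bytes; }
   };

   // constexpr global: capitalized SNAKE_CASE.
   constexpr std::size_t MAX_BUFFER_SIZE = 64;

   // Function: snake_case.
   inline std::size_t clamp_to_buffer(std::size_t requested_size) {
     // const local: capitalized SNAKE_CASE, even though it is local.
     const std::size_t LIMIT = MAX_BUFFER_SIZE;
     return requested_size < LIMIT ? requested_size : LIMIT;
   }

   } // namespace example
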

Macro style
===========

We define two kinds of macros:

#. **Build defined** macros are generated by `CMake` or `Bazel` and are passed
   down to the compiler with the ``-D`` command line flag. They start with the
   ``LIBC_COPT_`` prefix. They are used to tune the behavior of the libc.
   They either denote an action or define a constant.

#. **Code defined** macros are defined within the ``src/__support/macros``
   folder. They all start with the ``LIBC_`` prefix.

   * ``src/__support/macros/properties/`` - Build related properties like
     target architecture or enabled CPU features defined by introspecting
     compiler defined preprocessor definitions.

     * ``architectures.h`` - Target architecture properties.
       e.g., ``LIBC_TARGET_ARCH_IS_ARM``.
     * ``compiler.h`` - Host compiler properties.
       e.g., ``LIBC_COMPILER_IS_CLANG``.
     * ``cpu_features.h`` - Target cpu feature availability.
       e.g., ``LIBC_TARGET_CPU_HAS_AVX2``.
     * ``types.h`` - Type properties and availability.
       e.g., ``LIBC_TYPES_HAS_FLOAT128``.
     * ``os.h`` - Target os properties.
       e.g., ``LIBC_TARGET_OS_IS_LINUX``.

   * ``src/__support/macros/config.h`` - Important compiler and platform
     features. Such macros can be used to produce portable code by
     parameterizing compilation based on the presence or lack of a given
     feature. e.g., ``LIBC_HAS_FEATURE``
   * ``src/__support/macros/attributes.h`` - Attributes for functions, types,
     and variables. e.g., ``LIBC_UNUSED``
   * ``src/__support/macros/optimization.h`` - Portable macros for performance
     optimization. e.g., ``LIBC_LIKELY``, ``LIBC_LOOP_NOUNROLL``
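
As a rough sketch of how the two kinds of macros combine in code, the example
below uses a build-defined constant together with a code-defined optimization
hint. ``LIBC_COPT_EXAMPLE_CHUNK_SIZE`` is a hypothetical option invented for
this sketch, and the ``LIBC_LIKELY`` definition is a stand-in so that the
snippet compiles on its own; in libc sources the real definition comes from
``src/__support/macros/optimization.h``.

.. code-block:: c++

   #include <cstddef>

   // Hypothetical build-defined macro: normally passed as
   // -DLIBC_COPT_EXAMPLE_CHUNK_SIZE=<n> by the build system.
   #ifndef LIBC_COPT_EXAMPLE_CHUNK_SIZE
   #define LIBC_COPT_EXAMPLE_CHUNK_SIZE 16
   #endif

   // Stand-in for the code-defined macro, used only when the real header
   // has not been included.
   #ifndef LIBC_LIKELY
   #if defined(__GNUC__) || defined(__clang__)
   #define LIBC_LIKELY(x) __builtin_expect(!!(x), 1)
   #else
   #define LIBC_LIKELY(x) (x)
   #endif
   #endif

   namespace example {

   // Number of chunks needed to cover `size` bytes, rounding up.
   inline std::size_t chunk_count(std::size_t size) {
     if (LIBC_LIKELY(size != 0))
       return (size + LIBC_COPT_EXAMPLE_CHUNK_SIZE - 1) /
              LIBC_COPT_EXAMPLE_CHUNK_SIZE;
     return 0;
   }

   } // namespace example

The build-defined macro tunes behavior from outside the source tree, while the
code-defined macro hides a compiler-specific builtin behind a portable name.
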

Inline functions and variables defined in header files
======================================================

When defining functions and variables inline in header files, we follow certain
rules:

#. The functions should not be given file-static linkage. There can be class
   static methods defined inline however.
#. Instead of using the ``inline`` keyword, functions should be tagged with the
   ``LIBC_INLINE`` macro and variables should be tagged with the
   ``LIBC_INLINE_VAR`` macro defined in ``src/__support/macros/attributes.h``.
   For example:

   .. code-block:: c++

      LIBC_INLINE_VAR constexpr bool foo = true;

      LIBC_INLINE ReturnType function_defined_inline(ArgType arg) {
        ...
      }

#. The ``LIBC_INLINE`` tag should also be added to functions which have
   definitions that are implicitly inline. Examples of such functions are
   class methods (static and non-static) defined inline and ``constexpr``
   functions.
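
The last rule is easy to overlook, so here is a minimal sketch of it. The
``LIBC_INLINE`` definition below is a stand-in that lets the snippet compile
by itself, and the names ``example``, ``Counter`` and ``twice`` are
hypothetical.

.. code-block:: c++

   // Stand-in so the sketch is self-contained; libc code gets this macro from
   // src/__support/macros/attributes.h instead.
   #ifndef LIBC_INLINE
   #define LIBC_INLINE inline
   #endif

   namespace example {

   class Counter {
     int value = 0;

   public:
     // Non-static methods defined in the class body are implicitly inline,
     // but they are still tagged.
     LIBC_INLINE void increment() { ++value; }
     LIBC_INLINE int get() const { return value; }

     // Static methods defined in the class body are tagged as well.
     LIBC_INLINE static int zero() { return 0; }
   };

   // constexpr functions are implicitly inline and are tagged too.
   LIBC_INLINE constexpr int twice(int x) { return 2 * x; }

   } // namespace example
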
Setting ``errno`` from runtime code
===================================
Many libc functions set ``errno`` to indicate an error condition. If LLVM's libc
is being used as the only libc, then the ``errno`` from LLVM's libc is affected.
If LLVM's libc is being used in the :ref:`overlay_mode`, then the ``errno`` from
the system libc is affected. When a libc function, which can potentially affect
the ``errno``, is called from a unit test, we do not want the global ``errno``
(as in, the ``errno`` of the process thread running the unit test) to be
affected. If the global ``errno`` is affected, then the operation of the unit
test infrastructure itself can be affected. To avoid perturbing the unit test
infrastructure around the setting of ``errno``, the following rules are to be
followed:

#. A special macro named ``libc_errno`` defined in ``src/__support/libc_errno.h``
   should be used when setting ``errno`` from libc runtime code. For example,
   code to set ``errno`` to ``EINVAL`` should be:

   .. code-block:: c++

      libc_errno = EINVAL;

#. ``errno`` should be set just before returning from the implementation of the
   public function. It should not be set from within helper functions. Helper
   functions should use idiomatic C++ constructs like
   `cpp::optional <https://github.com/llvm/llvm-project/blob/main/libc/src/__support/CPP/optional.h>`_
   and
   `ErrorOr <https://github.com/llvm/llvm-project/blob/main/libc/src/__support/error_or.h>`_
   to return error values.
#. The header file ``src/__support/libc_errno.h`` is shipped as part of the target
   corresponding to the ``errno`` entrypoint ``libc.src.errno.errno``. We do
   not in general allow dependencies between entrypoints. However, the ``errno``
   entrypoint is the one exception: other entrypoints should explicitly depend
   on it if they set ``errno`` to indicate error conditions.
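
The split between helpers and the public function can be sketched as follows.
This is only an illustration under stated assumptions: ``std::optional``
stands in for ``cpp::optional``, ``libc_errno`` is defined locally as a plain
alias for ``errno`` so that the snippet compiles outside the libc tree, and
the functions ``parse_digit`` and ``digit_value`` are hypothetical.

.. code-block:: c++

   #include <cerrno>
   #include <optional>

   // Stand-in: in libc sources this macro comes from
   // src/__support/libc_errno.h.
   #ifndef libc_errno
   #define libc_errno errno
   #endif

   namespace example {

   // Helper: reports failure through its return value and never touches errno.
   inline std::optional<int> parse_digit(char c) {
     if (c < '0' || c > '9')
       return std::nullopt;
     return c - '0';
   }

   // Shaped like a public function: errno is set in exactly one place, just
   // before returning the error value.
   inline int digit_value(char c) {
     std::optional<int> result = parse_digit(c);
     if (!result) {
       libc_errno = EINVAL;
       return -1;
     }
     return *result;
   }

   } // namespace example

Keeping ``errno`` out of the helper means the helper can be unit tested, and
reused from other entrypoints, without side effects on global state.
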
Assertions in libc runtime code
===============================
Libc developers are encouraged to use assertions freely in the libc runtime
code. However, assertions should be written using the ``LIBC_ASSERT`` macro
defined in ``src/__support/libc_assert.h``. Internally, all it does is print
the assertion expression and exit. It does not implement the semantics of the
standard ``assert`` macro. Hence, it can be used from anywhere in the libc
runtime code without causing recursive calls or chicken-and-egg situations.
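
A usage sketch is shown below. The ``LIBC_ASSERT`` definition in it is a
hypothetical stand-in written only so that the snippet compiles on its own; it
relies on ``<cstdio>`` and ``<cstdlib>``, which the real macro in
``src/__support/libc_assert.h`` cannot assume. The function
``checked_divide`` is likewise invented for illustration.

.. code-block:: c++

   #include <cstdio>
   #include <cstdlib>

   // Hypothetical stand-in: prints the failed expression and terminates.
   #ifndef LIBC_ASSERT
   #define LIBC_ASSERT(COND)                                                  \
     do {                                                                     \
       if (!(COND)) {                                                         \
         std::fputs("Assertion failed: " #COND "\n", stderr);                 \
         std::abort();                                                        \
       }                                                                      \
     } while (false)
   #endif

   namespace example {

   // The precondition is stated with LIBC_ASSERT rather than assert().
   inline int checked_divide(int numerator, int denominator) {
     LIBC_ASSERT(denominator != 0);
     return numerator / denominator;
   }

   } // namespace example
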
Allocations in the libc runtime code
====================================
Some libc functions allocate memory. For example, the ``strdup`` function
allocates new memory into which the input string is duplicated. Allocations
are typically done by calling a function from the ``malloc`` family of
functions. Such functions can fail and return an error value to indicate
allocation failure. To conform to standards, the libc should handle
allocation failures gracefully and surface the error conditions to the user
code as appropriate. Since LLVM's libc is implemented in C++, we want
allocations and deallocations to employ C++ operators ``new`` and ``delete``
as they implicitly invoke constructors and destructors respectively. However,
if we use the default ``new`` and ``delete`` operators, the libc will end up
depending on the C++ runtime. To avoid such a dependence, and to handle
allocation failures gracefully, we use special ``new`` and ``delete`` operators
defined in
`src/__support/CPP/new.h <https://github.com/llvm/llvm-project/blob/main/libc/src/__support/CPP/new.h>`_.
Allocations and deallocations using these operators employ a pattern like
this:

.. code-block:: c++

   #include "src/__support/CPP/new.h"
   #include "src/__support/alloc-checker.h"

   ...

   LIBC_NAMESPACE::AllocChecker ac;
   auto *obj = new (ac) Type(...);
   if (!ac) {
     // handle allocator failure.
   }
   ...
   delete obj;

The only exception to using the above pattern is if allocating using the
``realloc`` function is of value. In such cases, prefer to use only the
``malloc`` family of functions for allocations and deallocations. Allocation
failures will still need to be handled gracefully. Further, keep in mind that
these functions do not call the constructors and destructors of the
allocated/deallocated objects. So, use these functions carefully and only
when it is absolutely clear that constructor and destructor invocation is
not required.
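
For the ``realloc`` exception, a minimal sketch might look like the following.
It uses ``std::realloc`` and ``std::free`` from ``<cstdlib>`` purely to stay
self-contained, and the helper ``grow_buffer`` is hypothetical. The element
type is ``int``, which needs no constructor or destructor calls.

.. code-block:: c++

   #include <cstddef>
   #include <cstdlib>

   namespace example {

   // Grows (or creates) an int buffer so that it holds `new_count` elements.
   // Returns false and leaves `buffer` untouched if the request is empty or
   // the allocation fails, so the caller can surface the error.
   inline bool grow_buffer(int *&buffer, std::size_t new_count) {
     if (new_count == 0)
       return false; // realloc with size 0 is not portable; reject it here
     void *resized = std::realloc(buffer, new_count * sizeof(int));
     if (resized == nullptr)
       return false; // the original buffer is still valid and still owned
     buffer = static_cast<int *>(resized);
     return true;
   }

   } // namespace example

Assigning the result of ``realloc`` to a temporary first is what keeps the
original buffer reachable when the call fails.
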
Warnings in sources
===================
We expect contributions to be free of warnings from the `minimum supported
compiler versions`__ (and newer).

.. __: https://libc.llvm.org/compiler_support.html#minimum-supported-versions

Header Inclusion Policy
=======================
Because llvm-libc supports
`Overlay Mode <https://libc.llvm.org/overlay_mode.html>`__,
`Full Host Build Mode <https://libc.llvm.org/full_host_build.html>`__ and
`Full Cross Build Mode <https://libc.llvm.org/full_cross_build.html>`__, care
must be taken when ``#include``'ing certain headers.

The ``include/`` directory contains public facing headers that users must
consume for fullbuild mode. As such, types defined here will have ABI
implications as these definitions may differ from the underlying system for
overlay mode and are NEVER appropriate to include in ``libc/src/`` without
preprocessor guards for ``LLVM_LIBC_FULL_BUILD``.

Consider an implementation in ``libc/src/`` that needs to refer to
``sigset_t``. Which header should be included: ``<signal.h>``, ``<spawn.h>``,
or ``<sys/select.h>``?

None of the above. Instead, code under ``src/`` should
``#include "hdr/types/sigset_t.h"``, which contains preprocessor guards on
``LLVM_LIBC_FULL_BUILD`` to either include the public type (fullbuild mode) or
the underlying system header (overlay mode).

Implementations in ``libc/src/`` should NOT be ``#include``'ing using ``<>`` or
``"include/*"``, except for these "proxy" headers that first check for
``LLVM_LIBC_FULL_BUILD``.

These "proxy" headers are similarly used when referring to preprocessor
defines. Code under ``libc/src/`` should ``#include`` a proxy header from
``hdr/``, which contains a guard on ``LLVM_LIBC_FULL_BUILD`` to either include
our header from ``libc/include/`` (fullbuild) or the corresponding underlying
system header (overlay).
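
To make the idea of a proxy header concrete, here is a sketch of the shape
such a header could take for ``sigset_t``. It is not a copy of the real
``hdr/types/sigset_t.h``: the include guard name and the path in the fullbuild
branch are assumptions made for illustration.

.. code-block:: c++

   // Sketch of a proxy header such as hdr/types/sigset_t.h.
   #ifndef EXAMPLE_HDR_TYPES_SIGSET_T_H // guard name is hypothetical
   #define EXAMPLE_HDR_TYPES_SIGSET_T_H

   #ifdef LLVM_LIBC_FULL_BUILD
   // Fullbuild mode: use the public type definition shipped by llvm-libc.
   // The exact path below is an assumption.
   #include "include/llvm-libc-types/sigset_t.h"
   #else
   // Overlay mode: defer to the definition from the underlying system.
   #include <signal.h>
   #endif

   #endif // EXAMPLE_HDR_TYPES_SIGSET_T_H

Code under ``libc/src/`` then includes only the proxy header and never has to
decide between the two sources itself.
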
Policy on Assembly sources
==========================
Coding in high level languages such as C++ provides benefits relative to low
level languages like Assembly, such as:

* Improved safety
* Compile time diagnostics
* Instrumentation

  * Code coverage
  * Profile collection
  * Sanitization

* Automatic generation of debug info

While it's not impossible to have Assembly code that correctly provides all of
the above, we do not wish to maintain such Assembly sources in llvm-libc.
That said, there are a few functions provided by llvm-libc that are impossible
to reliably implement in C++ for all compilers supported for building
llvm-libc.
We do use inline or out-of-line Assembly in an intentionally minimal set of
places; typically places where the stack or individual register state must be
manipulated very carefully for correctness, or instances where a specific
instruction sequence does not have a corresponding compiler builtin function
today.
Contributions adding functions implemented purely in Assembly for performance
are not welcome.
Contributors should strive to stick with C++ for as long as it remains
reasonable to do so. Ideally, bugs should be filed against compiler vendors,
and links to those bug reports should appear in commit messages or comments
that seek to add Assembly to llvm-libc.
Patches containing any amount of Assembly ideally should be approved by 2
maintainers. llvm-libc maintainers reserve the right to reject Assembly
contributions that they feel could be better maintained if rewritten in C++,
and to revisit this policy in the future.
LIBC_NAMESPACE_DECL
===================
llvm-libc provides a macro ``LIBC_NAMESPACE`` that names the namespace
containing the internal implementations of libc functions and globals. This
macro should only be used as an identifier for accessing such symbols within
the namespace (like ``LIBC_NAMESPACE::cpp::max``). Any usage of this namespace
for declaring or defining internal symbols should instead use
``LIBC_NAMESPACE_DECL``, which declares ``LIBC_NAMESPACE`` with hidden
visibility.

Example usage:

.. code-block:: c++

   #include "src/__support/macros/config.h" // The macro is defined here.

   namespace LIBC_NAMESPACE_DECL {

   void new_function() {
     ...
   }

   } // namespace LIBC_NAMESPACE_DECL

Having hidden visibility on the namespace ensures extern declarations in a given TU
have known visibility and never generate GOT indirections. The attribute guarantees
this independently of global compile options and build systems.
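
The difference between the two macros can be sketched with stand-in
definitions. Both ``#define`` lines below are assumptions made so that the
snippet compiles outside the libc tree (the namespace name ``example_libc`` in
particular is invented), and ``add_one`` and ``call_site`` are hypothetical.
The attribute syntax on a namespace requires C++17 and a compiler that
understands ``gnu::visibility``.

.. code-block:: c++

   // Stand-ins; libc code gets both macros from the build system and
   // src/__support/macros/config.h.
   #ifndef LIBC_NAMESPACE
   #define LIBC_NAMESPACE example_libc
   #endif
   #ifndef LIBC_NAMESPACE_DECL
   #define LIBC_NAMESPACE_DECL [[gnu::visibility("hidden")]] LIBC_NAMESPACE
   #endif

   // Declaring or defining symbols: open the namespace via LIBC_NAMESPACE_DECL.
   namespace LIBC_NAMESPACE_DECL {
   namespace internal {

   inline int add_one(int value) { return value + 1; }

   } // namespace internal
   } // namespace LIBC_NAMESPACE_DECL

   namespace example {

   // Referring to an existing symbol: qualify it with LIBC_NAMESPACE.
   inline int call_site() { return LIBC_NAMESPACE::internal::add_one(41); }

   } // namespace example
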

.. _clang_tidy_checks:

Static Analysis & Clang-Tidy
=============================

Configuration
-------------
LLVM libc uses layered ``.clang-tidy`` configuration files:

- ``libc/.clang-tidy``: baseline checks for the ``libc`` subtree (currently
  focuses on identifier naming conventions).
- ``libc/src/.clang-tidy``: adds LLVM-libc-specific checks (``llvmlibc-*``) for
  implementation code under ``libc/src`` and also enables
  ``readability-identifier-naming`` and ``llvm-header-guard``. Diagnostics from
  ``llvmlibc-*`` checks are treated as errors.

LLVM-libc checks
----------------
restrict-system-libc-headers
----------------------------
Check name: ``llvmlibc-restrict-system-libc-headers``.
One of libc-project's design goals is to use kernel headers and compiler
provided headers to prevent code duplication on a per platform basis. This
presents a problem when writing implementations since system libc headers are
easy to include accidentally and we can't just use the ``-nostdinc`` flag.
Improperly included system headers can introduce runtime errors because the C
standard outlines function prototypes and behaviors but doesn't define
underlying implementation details such as the layout of a struct.
This check prevents accidental inclusion of system libc headers when writing a
libc implementation.

.. code-block:: c++

   #include <stdio.h>            // Not allowed because it is part of system libc.
   #include <stddef.h>           // Allowed because it is provided by the compiler.
   #include "internal/stdio.h"   // Allowed because it is NOT part of system libc.

implementation-in-namespace
---------------------------
Check name: ``llvmlibc-implementation-in-namespace``.
All LLVM-libc implementation constructs must be enclosed in the
``LIBC_NAMESPACE_DECL`` namespace. See :ref:`code_style` for the full technical
rationale and macro definitions.
This check ensures that top-level declarations in a translation unit are
enclosed within the ``LIBC_NAMESPACE_DECL`` namespace.

.. code-block:: c++

   // Correct: implementation inside the correct namespace.
   namespace LIBC_NAMESPACE_DECL {

   LLVM_LIBC_FUNCTION(char *, strcpy, (char *dest, const char *src)) {}

   // Namespaces within LIBC_NAMESPACE namespace are allowed.
   namespace inner {
   int localVar = 0;
   }

   // Functions with C linkage are allowed.
   extern "C" void str_fuzz() {}

   }

   // Incorrect: implementation not in a namespace.
   LLVM_LIBC_FUNCTION(char *, strcpy, (char *dest, const char *src)) {}

   // Incorrect: outer most namespace is not correct.
   namespace something_else {
   LLVM_LIBC_FUNCTION(char *, strcpy, (char *dest, const char *src)) {}
   }

callee-namespace
----------------
Check name: ``llvmlibc-callee-namespace``.
LLVM-libc is distinct because it is designed to maintain interoperability with
other libc libraries, including the one that lives on the system. This feature
creates some uncertainty about which library a call resolves to especially when
a public header with non-namespaced functions like ``string.h`` is included.
This check ensures any function call resolves to a function within the
LIBC_NAMESPACE namespace.
There are exceptions for the following functions:
``__errno_location`` so that ``errno`` can be set;
``malloc``, ``calloc``, ``realloc``, ``aligned_alloc``, and ``free`` since they
are always external and can be intercepted.

.. code-block:: c++

   namespace LIBC_NAMESPACE_DECL {

   // Allow calls with the fully qualified name.
   LIBC_NAMESPACE::strlen("hello");

   // Allow calls to compiler provided functions.
   (void)__builtin_abs(-1);

   // Bare calls are allowed as long as they resolve to the correct namespace.
   strlen("world");

   // Disallow calling into functions in the global namespace.
   ::strlen("!");

   // Allow calling into specific global functions (explained above).
   ::malloc(10);

   } // namespace LIBC_NAMESPACE_DECL

inline-function-decl
--------------------
Check name: ``llvmlibc-inline-function-decl``.
LLVM libc uses the ``LIBC_INLINE`` macro to tag inline function declarations in
headers. This check enforces that any inline function declaration in a header
begins with ``LIBC_INLINE`` and provides a fix-it to insert the macro.
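
A short sketch of what the check distinguishes is given below. The
``LIBC_INLINE`` definition is a stand-in so that the snippet compiles by
itself, the functions ``square`` and ``cube`` are hypothetical, and the
comments paraphrase the behavior described above rather than quoting actual
diagnostics.

.. code-block:: c++

   // Stand-in so the sketch is self-contained; libc code gets this macro from
   // src/__support/macros/attributes.h instead.
   #ifndef LIBC_INLINE
   #define LIBC_INLINE inline
   #endif

   namespace example {

   // Accepted: the inline function declaration begins with LIBC_INLINE.
   LIBC_INLINE int square(int x) { return x * x; }

   // Would be flagged if this appeared in a libc header: the declaration is
   // inline but does not begin with LIBC_INLINE.
   inline int cube(int x) { return x * x * x; }

   } // namespace example
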