cmake-cxxmodules(7)

Added in version 3.28.

C++20 introduced the concept of "modules" to the language. The design requires build systems to order compilations relative to each other so that import statements are satisfied reliably. CMake's implementation asks the compiler to scan source files for module dependencies during the build, collates scanning results to infer ordering constraints, and tells the build tool how to dynamically update the build graph.

Compilation Strategy

With C++ modules, compiling a set of C++ sources is no longer embarrassingly parallel. That is, any given source may first require the compilation of another source file in order to provide a "BMI" (or "CMI") that C++ compilers use to satisfy import statements in other sources. With included headers, sources could share their declarations so that any consumers could compile independently. With modules, declarations are now generated into these BMI files by the compiler during compilation based on the contents of the source file and its export statements.

The order necessary for compilation must be resolved at build time because it is controlled by the contents of the sources. This means that the ordering needs to be extracted from the sources during the build; otherwise, every source change would require regenerating the build graph via a configure and generate phase to get a correct build.

Build systems must use some way to order these compilations within the build graph. There are multiple ways that are suitable for this, but each has its pros and cons. The strategy that CMake uses involves a step called "scanning", which is the most visible change that modules bring to the build for CMake users. CMake provides multiple ways to control the scanning behavior of source files.

Scanning Control

Whether or not sources get scanned for C++ module usage is dependent on the following queries. The first query that provides a yes/no answer is used.

  • If the source file belongs to a file set of type CXX_MODULES, it will be scanned.

  • If the target does not use at least C++20, it will not be scanned.

  • If the source file is not the language CXX, it will not be scanned.

  • If the CXX_SCAN_FOR_MODULES source file property is set, its value will be used.

  • If the CXX_SCAN_FOR_MODULES target property is set, its value will be used. Set the CMAKE_CXX_SCAN_FOR_MODULES variable to initialize this property on all targets as they are created.

  • Otherwise, the source file will be scanned if the compiler and generator support scanning. See policy CMP0155.

Note that any scanned source will be excluded from any unity build (see UNITY_BUILD) because module-related statements can only happen at one place within a C++ translation unit.
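As a hedged illustration of how the queries above interact, the following sketch combines a CXX_MODULES file set with the target and source file properties. The project, target, file set, and file names (mylib, mylib_modules, mylib.cppm, legacy.cpp, impl.cpp) are hypothetical and exist only for this example.

```cmake
cmake_minimum_required(VERSION 3.28)
project(scan_control_example LANGUAGES CXX)

# Scanning is only considered for targets using C++20 or newer.
set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

add_library(mylib)
target_sources(mylib
  PUBLIC
    # Sources in a CXX_MODULES file set are always scanned.
    FILE_SET mylib_modules TYPE CXX_MODULES FILES
      mylib.cppm
  PRIVATE
    legacy.cpp
    impl.cpp)

# Target-wide setting: do not scan sources outside of CXX_MODULES file sets.
set_target_properties(mylib PROPERTIES CXX_SCAN_FOR_MODULES OFF)

# The source file property is consulted before the target property, so this
# one source is still scanned (for example, because it imports a module).
set_source_files_properties(impl.cpp PROPERTIES CXX_SCAN_FOR_MODULES ON)
```

With this arrangement, legacy.cpp is the only source that is not scanned, so it also remains eligible for unity builds.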

Compiler Support

Compilers for which CMake natively supports module dependency scanning include:

  • MSVC toolset 14.34 and newer (provided with Visual Studio 17.4 and newer)

  • LLVM/Clang 16.0 and newer

  • GCC 14 (for the in-development branch, after 2023-09-20) and newer

import std Support

Support for import std is limited to the following toolchain and standard library combinations:

  • Clang 18.1.2 and newer with -stdlib=libc++

  • MSVC toolset 14.36 and newer (provided with Visual Studio 17.6 Preview 2 and newer)

  • GCC 15 and newer

The CMAKE_CXX_COMPILER_IMPORT_STD variable may be used to detect support for a standard level with the active C++ toolchain.

Note

This support is provided only when experimental support for import std; has been enabled by the CMAKE_EXPERIMENTAL_CXX_IMPORT_STD gate.
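A minimal sketch of detecting and requesting import std follows. It assumes a CMake release that offers the experimental feature (3.30 is assumed here) and that the gate must be set before the CXX language is enabled. The gate value is specific to each CMake version and is deliberately left as a placeholder; the project and file names are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.30)

# Placeholder: the actual gate value is documented for each CMake version
# and is not reproduced here.
set(CMAKE_EXPERIMENTAL_CXX_IMPORT_STD "<gate value for your CMake version>")

project(import_std_example LANGUAGES CXX)

# CMAKE_CXX_COMPILER_IMPORT_STD lists the standard levels for which the
# active toolchain supports `import std`.
if(23 IN_LIST CMAKE_CXX_COMPILER_IMPORT_STD)
  add_executable(app main.cpp)
  target_compile_features(app PRIVATE cxx_std_23)
  # Request `import std` for the sources of this target.
  set_property(TARGET app PROPERTY CXX_MODULE_STD ON)
else()
  message(STATUS "import std is not available for C++23 with this toolchain")
endif()
```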

Generator Support

The generators which support scanning sources for C++ modules are:

  • Ninja

  • Ninja Multi-Config

  • Visual Studio 17 2022

Limitations

There are a number of known limitations of the current C++ module support in CMake. This section does not document known limitations or bugs in compilers, as these can change over time.

For all generators:

  • Header units are not supported.

  • No builtin support for import std; or other compiler-provided modules.

For the Ninja Generators:

  • ninja 1.11 or newer is required.

For the Visual Studio Generators:

  • Only Visual Studio 2022 and MSVC toolsets 14.34 (Visual Studio 17.4) and newer are supported.

  • No support for exporting or installing BMI or module information.

  • No support for compiling BMIs from IMPORTED targets with C++ modules (including import std).

  • No diagnosis of using modules provided by PRIVATE sources from PUBLIC module sources.

Usage

Troubleshooting CMake

This section aims to help diagnose or explain common questions or errors that may arise on the CMake side of its C++ modules support.

File Extension Support

CMake imposes no requirements upon file extensions for modules of any unit type. While there are preferences that differ between toolchains (e.g., .ixx on MSVC and .cppm on Clang), there is no universally agreed upon extension. As such, CMake only requires that the file be recognized as a CXX-language source file. Any recognized extension satisfies this by default, and the LANGUAGE source file property may be used to mark a file with any other extension as CXX.
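As a small sketch of the LANGUAGE approach, the following marks a module interface unit with an unusual extension as C++. The .mxx extension is assumed not to be among those CMake recognizes by default, and the target, file set, and file names are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.28)
project(extension_example LANGUAGES CXX)

set(CMAKE_CXX_STANDARD 20)

add_library(geometry)
target_sources(geometry
  PUBLIC
    FILE_SET geometry_modules TYPE CXX_MODULES FILES
      geometry.mxx)

# Tell CMake to treat the file as a CXX-language source regardless of its
# extension.
set_source_files_properties(geometry.mxx PROPERTIES LANGUAGE CXX)
```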

File Name Requirements

The name of a module has no relation to the name or path of the file in which its declaration resides. The C++ standard has no requirements here and neither does CMake. However, it may be useful to have some pattern in use within a project just for easier navigation within environments that lack IDE-like "find symbol" functionality (e.g., on code review platforms).

Scanning Without Modules

A common problem is files being scanned when they should not be because the project has not yet adopted modules. This usually happens when a project bumps its minimum version or its policy maximum version to 3.28 or newer, or when a project that has already done so starts using C++20. This ends up setting CMP0155 to NEW, which enables scanning of C++ sources with C++20 or newer by default. The easiest way for projects to turn this off is with:

set(CMAKE_CXX_SCAN_FOR_MODULES 0)

near the top of their top-level CMakeLists.txt file. Note that it should not be in the cache as it may otherwise affect projects using it via FetchContent. Attention should also be paid to vendored projects which may want to enable scanning for their own sources as this would change the default for them as well.
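One possible arrangement for a project that vendors a modules-using dependency is sketched below. It relies on the target property being initialized from the variable when each target is created, so the order of the commands matters. The directory names are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.28)
project(mixed_scanning_example LANGUAGES CXX)

# Added while CMAKE_CXX_SCAN_FOR_MODULES is still unset, so targets created
# by this (hypothetical) vendored project keep the default scanning behavior.
add_subdirectory(third_party/modular_dep)

# Targets created from here on have scanning disabled unless they override it.
set(CMAKE_CXX_SCAN_FOR_MODULES 0)
add_subdirectory(src)
```

Because the variable is a normal variable rather than a cache entry, it does not leak into other projects that consume this one.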

Debugging Module Builds

This section aims to help diagnose or explain common errors that may arise on the build side of CMake's C++ modules support.

Circular Import Cycles

The C++ standard does not allow for cycles in the import graph of a translation unit and as such, CMake does not either. Currently CMake will leave it to the build tool to detect this based on the dynamic dependencies used to order module compilations. See CMake Issue 26119 to improve the user experience in this case.

Internal Module Partition Extension

When the implementation of building C++ modules was first investigated, it appeared as though there existed a type of translation unit that represented the intersection of a partition unit and an implementation unit. Initial CMake designs included specific support for these translation units; however, a closer reading of the standard showed that they do not actually exist. These units would have had module M:part; as their module declaration statement. The problem is that this is the exact syntax also used for declaring module partitions that do not contribute to the external interface of the primary module. Only MSVC supports this distinction. Other compilers do not, and will treat such files as an internal partition unit, in which case CMake will raise an error that a module-providing C++ source must be in a FILE_SET of type CXX_MODULES.

The fix is to not use the extension as it provides no further expressivity over not using the extension. All implementation unit source files should instead only use module M; as their module declaration statement regardless of what partition the defined entities are declared within.

Module Visibility

CMake enforces module visibility between and within targets. This essentially means that a module (say, I) provided from a PRIVATE FILE_SET on a target T may not be imported:

  • by other targets depending on T; or

  • by modules provided from a PUBLIC FILE_SET on target T itself.

This is because, in general, all entities imported by a module must also be importable by all potential importers of that module. Even if module I is only used within parts of a module without the export keyword, it may affect things within it in such a way that consumers of the module need to be able to transitively import it to work correctly. As CMake uses the visibility to determine whether to install module interface units, a PRIVATE module interface unit will not be installed, and so usage of any installed module which imports I would not work.

Instead, it is possible to import a PRIVATE C++ module from within an implementation unit as these are not exposed to consumers of any module.
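The visibility rules can be sketched with a target that has both a PUBLIC and a PRIVATE file set. All target, file set, and file names here are hypothetical; detail.cppm is assumed to provide the module I from the discussion above.

```cmake
cmake_minimum_required(VERSION 3.28)
project(visibility_example LANGUAGES CXX)

set(CMAKE_CXX_STANDARD 20)

add_library(T)
target_sources(T
  PUBLIC
    # Importable by T itself and by targets depending on T.
    FILE_SET public_modules TYPE CXX_MODULES FILES
      api.cppm
  PRIVATE
    # Provides module I; must not be imported by api.cppm or by consumers.
    FILE_SET private_modules TYPE CXX_MODULES FILES
      detail.cppm
  PRIVATE
    # An implementation unit; it may import module I.
    impl.cpp)

add_executable(consumer main.cpp)
# consumer may import modules from public_modules only.
target_link_libraries(consumer PRIVATE T)
```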

Design

The design of CMake's C++ module support makes a number of trade-offs compared to other designs. First, CMake's chosen design will be covered. Later sections cover alternative designs that were not chosen for CMake's implementation.

Overall, the designs fall somewhere along two axes:

  Explicit Dynamic    Explicit Static    Explicit Fixed

  Implicit Dynamic    Implicit Static    Implicit Fixed

  • Explicit builds control which modules are visible to each translation unit directly. For example, in a compilation of a source requiring a module M, the compiler will be given information which states the exact BMI file to use when importing the M module.

  • Implicit builds can control module visibility as well, but do so by instead grouping BMIs into directories which are then searched for files to satisfy import statements in the source file.

  • Static builds use a static set of build commands in order to complete the build. There must be support to add edges between nodes at build time.

  • Dynamic builds may create new build commands during the build and schedule any discovered work during the build.

  • Fixed builds are generated with all module dependencies already known.

Design Goals

CMake's implementation of building C++ modules has decided to focus on supporting the following design goals:

  1. Correct builds

  2. Deterministic builds

  3. Support generated sources

  4. Static communication

Correct Builds

Above all else, an incorrect build is a frustrating experience for all involved. A build which does not detect errors and instead lets a build with detectable problems run to completion is a good way to start wild goose chase debugging sessions. CMake would rather err on the side of avoiding such situations.

Deterministic Builds

Given a state on-disk of a build, it should be possible to determine what steps will happen next. This does not mean that the exact order of rules within the build that can be run concurrently is deterministic, but instead that the set of work to be done and its results are deterministic. For example, if there is no dependency between tasks A and B, A should have no effects on the execution of B and vice versa.

Support Generated Sources

Code generation is prevalent in the C++ ecosystem, so only supporting modules in files whose content is known at configure time is not suitable. Without supporting generated sources which use or provide modules, code generation tools are basically cut off from the use of modules. This also means that projects which are to be used from generated sources must also support non-modular ways of using their interfaces (i.e., provide headers). Given that all C++ implementations use strong module ownership for symbol mangling, this is problematic when such interfaces end up referring to compiled symbols in other libraries.

Static Communication

Everything should be statically communicated between different steps of the build. Given the build tools that CMake supports, it is difficult to set up a controllable lifetime for a companion tool to be communicated with during the compilation. Neither make nor ninja provide a mechanism to start a tool to use during the build that is also torn down at the "end" of the build. Instead, communication with the compilers is done through input and output files using dependencies in the build tool to make sure everything is up-to-date. This allows normal debugging strategies for builds to be used, and allows investigations into problems to run the commands the build is performing directly without having to consider other tools not running at the same time.

Use Case Considerations

There are a number of tricky situations a build may find itself in when supporting modules. This section describes them and how CMake supports them within its design goals.

TODO

Selected Design

The general strategy CMake uses is to "scan" sources to extract the ordering dependency information and to update the build graph with new edges between existing nodes. It does so by taking the per-source scan results (represented by P1689R5 files) and "collating" them within a target together with information from dependent targets. The primary task of the collator is to generate "module map" files to pass to each compile rule with the paths to the BMIs needed to satisfy import statements and to inform the build tool of dependencies needed to satisfy those import statements during the compilation. The collator also uses the build-time information to fill out information including install rules for the module interface units, their BMIs, and properties for any exported targets with C++ modules. It also enforces that PRIVATE modules may not be used by other targets or by any PUBLIC module interface unit within the target.

Implementation Details

This section details how CMake actually structures the build graph, the data passed between various parts, as well as the files which contain them. It is intended to be used both as documentation of how it works as well as a guide to help those debugging a module build to understand where to look for various bits of data.

Toolchain (scanning)

Compilers which support modules must also provide a scanning tool. Generally this will either be the compiler itself with some extra flags or a tool shipped with the compiler. The command template for scanning is stored in the CMAKE_CXX_SCANDEP_SOURCE variable. It is expected to write its P1689R5 results to the <DYNDEP_FILE> placeholder. Additionally, it should provide any discovered dependencies to the <DEP_FILE> placeholder. This allows build tools to rerun the scan if any of its dependencies change.

Additionally, toolchains should set the following variables:

  • CMAKE_CXX_MODULE_MAP_FORMAT: The format of the module map describing where dependent BMI files for imported modules exist during compilation. Must be one of gcc, clang, or msvc.

  • CMAKE_CXX_MODULE_MAP_FLAG: The flag used to inform the compiler of the module map file. It should use the <MODULE_MAP_FILE> placeholder.

  • CMAKE_CXX_MODULE_BMI_ONLY_FLAG: The flag used to only compile a BMI file from a module interface unit. This is used when consuming modules from external projects to compile BMI files for use within the current build.

If a toolchain does not provide the CMAKE_CXX_MODULE_BMI_ONLY_FLAG, it will not be able to consume modules provided by IMPORTED targets.
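When debugging, it can help to see what the active toolchain provides for these variables. The following sketch simply prints them after the CXX language has been enabled; the project name is hypothetical.

```cmake
cmake_minimum_required(VERSION 3.28)
project(toolchain_probe LANGUAGES CXX)

# Report the scanning-related toolchain variables described above.
foreach(var IN ITEMS
    CMAKE_CXX_SCANDEP_SOURCE
    CMAKE_CXX_MODULE_MAP_FORMAT
    CMAKE_CXX_MODULE_MAP_FLAG
    CMAKE_CXX_MODULE_BMI_ONLY_FLAG)
  if(DEFINED ${var})
    message(STATUS "${var} = ${${var}}")
  else()
    message(STATUS "${var} is not set by this toolchain")
  endif()
endforeach()
```

If CMAKE_CXX_MODULE_BMI_ONLY_FLAG is reported as not set, modules from IMPORTED targets cannot be consumed with that toolchain.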

Toolchain (import std)

If the toolchain supports import std, it must also provide a toolchain identification module named ${CMAKE_CXX_COMPILER_ID}-CXX-CXXImportStd.

Note

Currently only CMake may provide these files due to the way they are included. Once import std is no longer experimental, external toolchains may provide support independently as well.

This module must provide the _cmake_cxx_import_std command. It will be passed two arguments: the version of the C++ standard (e.g., 23) and the name of a variable in which to place the result of its import std support. The variable should be filled in with CMake source code which declares the __CMAKE::CXX${std} target where ${std} is the version passed in. If the target cannot be made, instead the source code should set the CMAKE_CXX${std}_COMPILER_IMPORT_STD_NOT_FOUND_MESSAGE variable to the reason that import std is not supported in the current configuration. Note that CMake will guard the returned code with conditional checks to ensure that the target is only defined once.

Ideally, the __CMAKE::CXX${std} target will be an IMPORTED INTERFACE target with the std module sources attached to it. However, it may be necessary to actually compile objects for some implementations. It would be good to ask the toolchain vendor to instead provide all such symbols with the standard library itself as otherwise the intended home of such symbols is in danger of running into the ODR with respect to those symbols. The compelling scenario in this case is a C++ executable that links to a library offering a C API while being implemented in C++ itself. If both C++ parts want to use import std and the standard library requires that consumers provide symbols from it, both the C++ executable and the C library will end up providing the symbols as the former uses them directly and the latter needs them so that consumers don't need to know about the C++ implementation detail for its C API.
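The shape of such a module can be sketched as follows. This is not the implementation used for any real toolchain: the file name MyCompiler-CXX-CXXImportStd.cmake, the MY_TOOLCHAIN_STD_MODULE_DIR variable, the std.cppm file name, and the file set name are all assumptions made for illustration, and the sketch assumes the ideal case where an IMPORTED INTERFACE target is sufficient.

```cmake
# Hypothetical contents of MyCompiler-CXX-CXXImportStd.cmake.
# MY_TOOLCHAIN_STD_MODULE_DIR is an assumed variable naming the directory
# that holds the toolchain's `std` module interface source.
function(_cmake_cxx_import_std std variable)
  set(_std_source "${MY_TOOLCHAIN_STD_MODULE_DIR}/std.cppm")

  if(NOT EXISTS "${_std_source}")
    # Report why `import std` is unavailable instead of declaring the target.
    set(${variable}
      "set(CMAKE_CXX${std}_COMPILER_IMPORT_STD_NOT_FOUND_MESSAGE \"The std module source was not found\")\n"
      PARENT_SCOPE)
    return()
  endif()

  # Return CMake source code (as a string) that declares the target.
  set(${variable} "
add_library(__CMAKE::CXX${std} INTERFACE IMPORTED)
target_sources(__CMAKE::CXX${std} INTERFACE
  FILE_SET std TYPE CXX_MODULES
  BASE_DIRS \"${MY_TOOLCHAIN_STD_MODULE_DIR}\"
  FILES \"${_std_source}\")
" PARENT_SCOPE)
endfunction()
```

Note that the function returns source code as a string rather than creating the target itself, matching the description above that CMake guards the returned code so the target is only defined once.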

Configure

During the configure step, CMake needs to track which sources even care about modules at all. See Scanning Control for how each source determines whether it cares about modules or not. CMake tracks this in its internal target representation structure (cmTarget); it may generally be modified by using the target_sources(), target_compile_features(), and set_property() commands.

Additionally, targets may use the CXX_MODULE_STD target property to indicate that import std is desired within the target's sources.

Generate

During generation, CMake needs to add additional rules to make sure that the sources that provide modules can be built before the sources that import those modules. Since CMake uses a static build, the build graph must contain all possible commands for scanning and module generation. The dependency edges between commands to ensure that modules are provided will then ensure that the build graph executes correctly. This means that while all sources may get scanned, only modules that are actually used will be generated.

The first step CMake performs is to generate a synthetic target for each unique usage of a module-providing target. These targets are based on other targets but provide only BMI files for the specific usage. This is because the compatibility of BMI files is extremely narrow. Due to internal workings of toolchains, there can generally only be a single set of settings for a variety of flags for any one compilation, including imported module BMI files. As an example, the C++ standard in use needs to agree across all modules, but there are many settings which may cause incompatibilities.

Note

CMake currently assumes that all usages are compatible and will only create one set of BMIs for each target. See CMake Issue 25916 for progress on this support.

Once all of the synthetic targets are created, CMake looks at each target that has any source that may use C++ modules and creates a command to scan each of them. This command will output a P1689R5-formatted file describing the C++ modules it uses and provides. It will also create a command to collate module dependencies for the eligible compilations. This command depends on the scan results of all eligible sources, information about the target itself, as well as the collate results of any dependent targets which provide C++ modules. The collate step uses a target-specific CXXDependInfo.json file which contains the following information:

  • compiler-*: basic compiler information (id, frontend-variant, and simulate-id) which is used to generate correctly formatted paths when generating paths for the compiler

  • cxx-modules: a map of object files to the FILE_SET information for them which is used to enforce module visibility and generate install rules for module interface unit sources

  • module-dir: where to place BMI files for this target

  • dir-*: the source (src) and build (bld) directories for the current directory (cur) and the top (top) of the project

  • non-module source files, which are used to compute accurate relative paths for the build tool dynamic dependencies

  • exports: The list of exports the target belongs to which are providing C++ module information so that the exported targets can provide accurate module properties as IMPORTED targets

  • bmi-installation: installation information which is used to generate install scripts for BMI files

  • database-info: information required to generate build database information if requested by EXPORT_BUILD_DATABASE

  • sources: other source files in the target which is used to add to the build database if requested

  • config: the configuration for the target which is used to set the appropriate properties in generated export files

  • language: the language whose collation the file describes

  • include-dirs and forward-modules-from-target-dirs: unused for C++

For each compilation, CMake will also provide a module map which will be created during the build by the collate command. How this is provided to the compiler is specified by the CMAKE_CXX_MODULE_MAP_FORMAT and CMAKE_CXX_MODULE_MAP_FLAG toolchain variables.
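When investigating a build, a few of the fields listed above can be printed with a small script. The sketch below assumes the fields are top-level members of the JSON object and takes the path to the file from the user, since its location in the build tree is not specified here. The script name is hypothetical.

```cmake
# Usage (script name is hypothetical):
#   cmake -DDEPEND_INFO=<path to CXXDependInfo.json> -P inspect_depend_info.cmake
if(NOT DEFINED DEPEND_INFO OR NOT EXISTS "${DEPEND_INFO}")
  message(FATAL_ERROR "Pass -DDEPEND_INFO=<path to a CXXDependInfo.json file>")
endif()

file(READ "${DEPEND_INFO}" content)

# Print a few of the scalar fields; missing fields are reported, not fatal.
foreach(key IN ITEMS language config module-dir)
  string(JSON value ERROR_VARIABLE err GET "${content}" "${key}")
  if(err)
    message(STATUS "${key}: <not present>")
  else()
    message(STATUS "${key}: ${value}")
  endif()
endforeach()
```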

Scan

The compiler is expected to implement the scan command. This is because only the compiler itself can reliably answer preprocessor predicates like __has_builtin in order to provide accurate module usage information in the face of arbitrary flags that may be used when compiling sources.

CMake names these files with the .ddi extension which stands for "dynamic dependency information". These files are in P1689R5 format and are used by the collate command to perform its tasks.
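To see what a scan discovered, the modules a .ddi file provides and requires can be listed with a script along these lines. It is a sketch that assumes the P1689R5 layout of a top-level rules array whose entries carry primary-output, provides, and requires members with logical-name fields. The script name is hypothetical.

```cmake
# Usage (script name is hypothetical):
#   cmake -DDDI=<path to a .ddi file> -P list_ddi_modules.cmake
if(NOT DEFINED DDI OR NOT EXISTS "${DDI}")
  message(FATAL_ERROR "Pass -DDDI=<path to a .ddi file>")
endif()
file(READ "${DDI}" ddi)

# Print the "logical-name" of every entry in the `kind` array
# ("provides" or "requires") of the rule at index `rule`.
function(print_names kind rule)
  string(JSON count ERROR_VARIABLE err LENGTH "${ddi}" rules ${rule} ${kind})
  if(err OR count EQUAL 0)
    return()  # the array is absent or empty
  endif()
  math(EXPR last "${count} - 1")
  foreach(i RANGE ${last})
    string(JSON name GET "${ddi}" rules ${rule} ${kind} ${i} logical-name)
    message(STATUS "  ${kind}: ${name}")
  endforeach()
endfunction()

string(JSON rule_count ERROR_VARIABLE err LENGTH "${ddi}" rules)
if(err OR rule_count EQUAL 0)
  message(STATUS "No rules found in ${DDI}")
else()
  math(EXPR last_rule "${rule_count} - 1")
  foreach(r RANGE ${last_rule})
    string(JSON output ERROR_VARIABLE err GET "${ddi}" rules ${r} primary-output)
    if(err)
      set(output "<no primary-output>")
    endif()
    message(STATUS "${output}")
    print_names(provides ${r})
    print_names(requires ${r})
  endforeach()
endif()
```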

Collate

The collate command performs the bulk of the work to make C++ modules work within the build graph. It consumes the following files as input:

  • CXXDependInfo.json from the generate step

  • .ddi files from the scanning results of the target's sources

  • CXXModules.json files output from eligible dependent targets' collate commands

It uses the information from these files to then generate:

  • CXX.dd files to inform the build tool of dependencies that exist between the compilation of a source and the BMI files of the modules that it imports

  • CXXModules.json for usage in collate commands of depending targets

  • *.modmap files for each compilation

  • install-cxx-module-bmi-$<CONFIG>.cmake scripts for the installation of any BMI files (included by the install scripts)

  • target-*-$<CONFIG>.cmake export files for any exports of the target to provide the IMPORTED_CXX_MODULES_<CONFIG> properties

  • CXX_build_database.json build database files for the target when its EXPORT_BUILD_DATABASE property is set

During its processing, it enforces the following guarantees:

C++ modules have the rule that only a single module of a given name may exist within a program. This is not exactly enforceable with the existence of private modules, but it is enforceable for public modules. The enforcement is done by the collate command. Part of the CXXModules.json files is the set of modules that are transitively imported by each module it provides. When a module is then imported, the collate command ensures that all modules with a given name agree upon a given BMI file to provide it.

Compile

Compilation takes the module map file generated by the collate command to find the imported modules during compilation. Because CMake only provides the locations of modules that are discovered by the scan command, any modules missed by it will not be provided to the compilation.

It is possible for toolchains to reject the BMI file that CMake provides to a compilation as incompatible. This is because CMake assumes that all usages are compatible at the moment. See CMake Issue 25916 for progress.

Install

During installation, install scripts which have been written by the collate command during the build are included so that any BMI files that need to be installed are installed. These scripts need to be generated during the build as it is not known what the BMI file names will be during CMake's generation (because CMake names the BMI files after the module name itself). These install scripts are included with the OPTIONAL keyword, so an incomplete build may result in an incomplete installation as well.
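A sketch of install rules that request module interface units and BMI files for a target is shown below. The target, export, file set, and file names as well as the destination directories are hypothetical choices, not required values. Per the limitations listed earlier, this is not expected to work with the Visual Studio generators.

```cmake
cmake_minimum_required(VERSION 3.28)
project(install_example LANGUAGES CXX)

set(CMAKE_CXX_STANDARD 20)

add_library(mylib)
target_sources(mylib
  PUBLIC
    FILE_SET mylib_modules TYPE CXX_MODULES FILES
      mylib.cppm)

install(TARGETS mylib
  EXPORT mylib-targets
  ARCHIVE DESTINATION lib
  # Install the module interface units of the PUBLIC file set.
  FILE_SET mylib_modules DESTINATION lib/cxx/miu
  # Install the BMI files; their names are only known at build time.
  CXX_MODULES_BMI DESTINATION lib/cxx/bmi)

install(EXPORT mylib-targets
  NAMESPACE mylib::
  DESTINATION lib/cmake/mylib
  # Directory (relative to DESTINATION) for per-target module information.
  CXX_MODULES_DIRECTORY cxx-modules)
```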

Alternative Designs

There are alternative designs that CMake does not implement. This section aims to give a brief overview and to explain why they were not chosen for CMake's implementation.

Implicit Builds

  • TODO: limiting outstanding compilations

  • TODO: stale builds

  • TODO: stale modules

  • TODO: duplicate modules

  • TODO: circular builds

  • TODO: private modules

Possible Future Enhancements

This section documents possible future enhancements to CMake's support of C++ modules. Nothing here is a guarantee of future implementation and the ordering is arbitrary.

Separate BMI Generation

Currently CMake uses a single rule to generate the BMI and the object file for a compilation. At least Clang supports compiling an object directly from the BMI. The benefit is that BMI generation is typically a lot faster and generating the BMI as a separate step allows importers to start compiling without waiting for the object to also be generated.

This is not supported in the current implementation as only Clang supports generating an object directly from the BMI. Other compilers either do not support such a two-phase generation (GCC) or need to start object compilation from the source again.

This enhancement conflicts with Easier Source Specification on a single target because CMake must know all BMI-generating sources at generate time rather than build time to create the two-phase rules.

Easier Source Specification

The initial implementation of CMake's module support had used the "just list sources; CMake will figure it out" pattern but it ran into issues related to other metadata requirements that were discovered while implementing CMake support beyond just building the modules-using code.

This enhancement conflicts with Separate BMI Generation on a single target as that requires knowledge of all BMI-generating rules at generate time.

BMI Modification Optimization

Compilers currently always update the BMI causing recompiles of all importers. It is possible to juggle the BMI through a cmake -E copy_if_different pass with ninja's restat = 1 feature to avoid recompiling importers if the BMI doesn't actually change.

Batch Scanning

It is possible to scan all sources within a target at once, which should be faster where sources share transitive includes. This does have side effects for incremental builds, as the update of any source in the target means that all sources in the target are scanned again; given how fast scanning can potentially be, the cost of this should be minimal.

Module Compilation Glossary

BMI
CMI
build database
build system
build tool
C++ module
collate
discovered dependencies
dynamic dependencies
embarrassingly parallel
explicit build
fixed build
implementation unit
implicit build
interface partition unit
interface unit
internal partition unit
module implementation unit
module interface unit
module map
module visibility
ODR
one definition rule
partition unit
scan
static build
strong module ownership
synthetic target
translation unit
weak module ownership