Diffstat (limited to 'tools/build/doc/src/architecture.xml')
-rw-r--r-- tools/build/doc/src/architecture.xml 668
1 files changed, 668 insertions, 0 deletions
diff --git a/tools/build/doc/src/architecture.xml b/tools/build/doc/src/architecture.xml
new file mode 100644
index 0000000000..0b22defef9
--- /dev/null
+++ b/tools/build/doc/src/architecture.xml
@@ -0,0 +1,668 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE appendix PUBLIC "-//Boost//DTD BoostBook XML V1.0//EN"
+ "http://www.boost.org/tools/boostbook/dtd/boostbook.dtd">
+
+ <appendix id="bbv2.arch">
+ <title>Boost.Build v2 architecture</title>
+
+ <sidebar>
+ <para>
+ This document is a work in progress. Do not expect much from it yet.
+ </para>
+ </sidebar>
+
+ <section id="bbv2.arch.overview">
+ <title>Overview</title>
+
+ <!-- FIXME: the below does not mention engine at all, making rest of the
+ text confusing. Things like 'kernel' and 'util' don't have to be
+ mentioned at all. -->
+ <para>
+ The Boost.Build implementation is structured in four components:
+ "kernel", "util", "build" and "tools". The first two are relatively
+ uninteresting, so we will focus on the remaining pair. The "build" component
+ provides the classes necessary to declare targets, determine which properties
+ should be used to build them, and create the dependency graph. The
+ "tools" component provides user-visible functionality. It mostly allows
+ declaring specific kinds of main targets, as well as registering available
+ tools, which are then used when creating the dependency graph.
+ </para>
+ </section>
+
+ <section id="bbv2.arch.build">
+ <title>The build layer</title>
+
+ <para>
+ The build layer has just four main parts -- metatargets (abstract
+ targets), virtual targets, generators and properties.
+
+ <itemizedlist>
+ <listitem><para>
+ Metatargets (see the "targets.jam" module) represent all the
+ user-defined entities that can be built. The "meta" prefix signifies
+ that they do not need to correspond to exact files or even files at all
+ -- they can produce a different set of files depending on the build
+ request. Metatargets are created when Jamfiles are loaded. Each has a
+ <code>generate</code> method which is given a property set and produces
+ virtual targets for the passed properties.
+ </para></listitem>
+ <listitem><para>
+ Virtual targets (see the "virtual-targets.jam" module) correspond to
+ actual atomic updatable entities -- most typically files.
+ </para></listitem>
+ <listitem><para>
+ Properties are just (name, value) pairs, specified by the user and
+ describing how targets should be built. Properties are stored using the
+ <code>property-set</code> class.
+ </para></listitem>
+ <listitem><para>
+ Generators are objects that encapsulate specific tools -- they can
+ take a list of source virtual targets and produce new virtual targets
+ from them.
+ </para></listitem>
+ </itemizedlist>
+
+ </para>
+
+ <para>
+ The build process includes the following steps:
+
+ <orderedlist>
+ <listitem><para>
+ Top-level code calls the <code>generate</code> method of a metatarget
+ with some properties.
+ </para></listitem>
+
+ <listitem><para>
+ The metatarget combines the requested properties with its requirements
+ and passes the result, together with the list of sources, to the
+ <code>generators.construct</code> function.
+ </para></listitem>
+
+ <listitem><para>
+ A generator appropriate for the build properties is selected and its
+ <code>run</code> method is called. The method returns a list of virtual
+ targets.
+ </para></listitem>
+
+ <listitem><para>
+ The virtual targets are returned to the top-level code, and for each instance,
+ the <literal>actualize</literal> method is called to set up nodes and updating
+ actions in the dependency graph kept inside the Boost.Build engine. This dependency
+ graph is then updated, which runs the necessary commands.
+ </para></listitem>
+ </orderedlist>
+ </para>
+
+ <section id="bbv2.arch.build.metatargets">
+ <title>Metatargets</title>
+
+ <para>
+ There are several classes derived from "abstract-target". The
+ "main-target" class represents a top-level main target, the
+ "project-target" class acts like a container holding multiple main
+ targets, and the "basic-target" class is a base class for all further target
+ types.
+ </para>
+
+ <para>
+ Since each main target can have several alternatives, all top-level
+ target objects are actually containers, referring to "real" main target
+ classes. The type of that container is "main-target". For example, given:
+<programlisting>
+alias a ;
+lib a : a.cpp : &lt;toolset&gt;gcc ;
+</programlisting>
+ we would have one top-level "main-target" instance, containing one
+ "alias-target" and one "lib-target" instance. "main-target"'s "generate"
+ method decides which of the alternatives should be used, and calls
+ "generate" on the corresponding instance.
+ </para>
+
+ <para>
+ Each alternative is an instance of a class derived from "basic-target".
+ "basic-target.generate" does several things that should always be done:
+
+ <itemizedlist>
+ <listitem><para>
+ Determines what properties should be used for building the target.
+ This includes looking at requested properties, requirements, and usage
+ requirements of all sources.
+ </para></listitem>
+
+ <listitem><para>
+ Builds all sources.
+ </para></listitem>
+
+ <listitem><para>
+ Computes usage requirements that should be passed back to targets
+ depending on this one.
+ </para></listitem>
+ </itemizedlist>
+
+ For the real work of constructing a virtual target, a new method
+ "construct" is called.
+ </para>
+
+ <para>
+ The "construct" method can be implemented in any way by classes derived
+ from "basic-target", but one specific derived class plays the central role
+ -- "typed-target". That class holds the desired type of file to be
+ produced, and its "construct" method uses the generators module to do the
+ actual work.
+ </para>
+
+ <para>
+ This means that a specific metatarget subclass may avoid using
+ generators altogether. However, this is deprecated and we are trying to
+ eliminate all such subclasses at the moment.
+ </para>
+
+ <para>
+ Note that the <filename>build/targets.jam</filename> file contains a
+ UML diagram which might help.
+ </para>
+ </section>
+
+ <section id="bbv2.arch.build.virtual">
+ <title>Virtual targets</title>
+
+ <para>
+ Virtual targets are atomic updatable entities. Each virtual
+ target can be assigned an updating action -- an instance of the
+ <code>action</code> class. The action class, in turn, contains a list of
+ source targets, properties, and the name of the action that
+ should be executed.
+ </para>
+
+ <para>
+ We try hard to never create equal instances of the
+ <code>virtual-target</code> class. Code creating virtual targets passes
+ them through the <code>virtual-target.register</code> function, which
+ detects if a target with the same name, sources, and properties has
+ already been created. In that case, the preexisting target is returned.
+ </para>
+
+ <!-- FIXME: the below 2 para are rubbish, must be totally rewritten. -->
+ <para>
+ When all virtual targets are produced, they are "actualized". This means
+ that the real file names are computed, and the commands that should be run
+ are generated. This is done by the <code>virtual-target.actualize</code>
+ and <code>action.actualize</code> methods. The first is conceptually
+ simple, while the second needs additional explanation. Commands in Boost.Build
+ are generated in a two-stage process. First, a rule with an appropriate
+ name (for example "gcc.compile") is called and is given a list of target
+ names. The rule sets some variables, like "OPTIONS". After that, the
+ command string is taken and variables are substituted, so a use of OPTIONS
+ inside the command string gets transformed into actual compile options.
+ </para>
+
+ <para>
+ Boost.Build added a third stage to simplify things. It is now possible
+ to automatically convert properties to appropriate variable assignments.
+ For example, &lt;debug-symbols&gt;on would add "-g" to the OPTIONS
+ variable, without requiring this logic to be added manually to gcc.compile.
+ This functionality is part of the "toolset" module.
+ </para>
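+
+ <para>
+ As a rough illustration of these stages, the following Python sketch
+ converts properties to variable values and then substitutes the variables
+ into a command string. It is a simplified model, not the "toolset" module
+ itself: the <code>FLAGS</code> table, the <code>TARGET</code> and
+ <code>SOURCE</code> variable names, and both function names are
+ assumptions made for the example, and list values are simply joined with
+ spaces.
+ </para>
+<programlisting><![CDATA[
```python
import re

# Hypothetical table: property -> (variable name, value to append).
FLAGS = {
    ("debug-symbols", "on"): ("OPTIONS", "-g"),
}


def variables_from_properties(properties):
    """Turn each property listed in FLAGS into a variable value."""
    variables = {}
    for prop in properties:
        if prop in FLAGS:
            name, value = FLAGS[prop]
            variables.setdefault(name, []).append(value)
    return variables


def expand(command, variables):
    """Replace each $(NAME) with the space-joined values of NAME."""
    def substitute(match):
        return " ".join(variables.get(match.group(1), []))
    return re.sub(r"\$\((\w+)\)", substitute, command)


variables = variables_from_properties([("debug-symbols", "on")])
variables["TARGET"] = ["a.o"]
variables["SOURCE"] = ["a.cpp"]
command = expand("g++ $(OPTIONS) -c -o $(TARGET) $(SOURCE)", variables)
print(command)  # g++ -g -c -o a.o a.cpp
```
+]]></programlisting>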
+
+ <para>
+ Note that the <filename>build/virtual-targets.jam</filename> file
+ contains a UML diagram which might help.
+ </para>
+ </section>
+
+ <section id="bbv2.arch.build.properties">
+ <title>Properties</title>
+
+ <para>
+ Above, we noted that metatargets are built with a set of properties.
+ That set is represented by the <code>property-set</code> class. An
+ important point is that handling of property sets can get very expensive.
+ For that reason, we make sure that for each set of (name, value) pairs
+ only one <code>property-set</code> instance is created. The
+ <code>property-set</code> uses extensive caching for all operations, so
+ most work is avoided. The <code>property-set.create</code> function is the
+ factory used to create instances of the <code>property-set</code> class.
+ </para>
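+
+ <para>
+ The "one instance per set of pairs" rule, together with per-instance
+ caching, might be sketched in Python like this. The
+ <code>PropertySet</code> class, its <code>_instances</code> table and the
+ cached <code>get</code> lookup are hypothetical names chosen for the
+ example; the sketch does not reproduce the real
+ <code>property-set</code> implementation.
+ </para>
+<programlisting><![CDATA[
```python
class PropertySet:
    """Hypothetical stand-in for the property-set class."""

    # One shared instance per distinct set of (name, value) pairs.
    _instances = {}

    def __init__(self, properties):
        self.properties = properties
        self._cache = {}

    @classmethod
    def create(cls, raw_properties):
        # Normalise first, so that order and duplicates do not matter.
        key = frozenset(raw_properties)
        if key not in cls._instances:
            cls._instances[key] = cls(key)
        return cls._instances[key]

    def get(self, feature):
        # The result is computed once and then reused by every holder
        # of this shared instance.
        if feature not in self._cache:
            self._cache[feature] = sorted(
                value for name, value in self.properties if name == feature
            )
        return self._cache[feature]


p1 = PropertySet.create([("toolset", "gcc"), ("variant", "debug")])
p2 = PropertySet.create([("variant", "debug"), ("toolset", "gcc")])
print(p1 is p2)  # True
```
+]]></programlisting>
+
+ <para>
+ Since both requests yield the same object, a result cached through one
+ reference is available through the other at no extra cost.
+ </para>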
+ </section>
+ </section>
+
+ <section id="bbv2.arch.tools">
+ <title>The tools layer</title>
+
+ <para>Write me!</para>
+ </section>
+
+ <section id="bbv2.arch.targets">
+ <title>Targets</title>
+
+ <para>NOTE: THIS SECTION IS NOT EXPECTED TO BE READ!
+ There are two user-visible kinds of targets in Boost.Build. First are
+ "abstract" &#x2014; they correspond to things declared by the user, e.g.
+ projects and executable files. The primary thing about abstract targets is
+ that it is possible to request them to be built with a particular set of
+ properties. Each property combination may possibly yield different built
+ files, so abstract targets do not have a direct correspondence to built
+ files.
+ </para>
+
+ <para>
+ File targets, on the other hand, are associated with concrete files.
+ Dependency graphs for abstract targets with specific properties are
+ constructed from file targets. The user has no way to create file targets but
+ can specify rules for detecting source file types, as well as rules for
+ transforming between file targets of different types. That information is
+ used in constructing the final dependency graph, as described in the <link
+ linkend="bbv2.arch.depends">next section</link>.
+ <emphasis role="bold">Note:</emphasis> File targets are not the same entities
+ as Jam targets; the latter are created from file targets at the latest
+ possible moment.
+ <emphasis role="bold">Note:</emphasis> "File target" is the originally
+ proposed name for what we now call virtual targets. It is more
+ understandable by users, but has one problem: virtual targets can
+ potentially be "phony", and not correspond to any file.
+ </para>
+ </section>
+
+ <section id="bbv2.arch.depends">
+ <title>Dependency scanning</title>
+
+ <para>
+ Dependency scanning is the process of finding implicit dependencies, like
+ "#include" statements in C++. The requirements for a correct dependency
+ scanning mechanism are:
+ </para>
+
+ <itemizedlist>
+ <listitem><simpara>
+ <link linkend="bbv2.arch.depends.different-scanning-algorithms">Support
+ for different scanning algorithms</link>. C++ and XML have quite different
+ syntax for includes and rules for looking up the included files.
+ </simpara></listitem>
+
+ <listitem><simpara>
+ <link linkend="bbv2.arch.depends.same-file-different-scanners">Ability
+ to scan the same file several times</link>. For example, a single C++ file
+ may be compiled using different include paths.
+ </simpara></listitem>
+
+ <listitem><simpara>
+ <link linkend="bbv2.arch.depends.dependencies-on-generated-files">Proper
+ detection of dependencies on generated files.</link>
+ </simpara></listitem>
+
+ <listitem><simpara>
+ <link
+ linkend="bbv2.arch.depends.dependencies-from-generated-files">Proper
+ detection of dependencies from a generated file.</link>
+ </simpara></listitem>
+ </itemizedlist>
+
+ <section id="bbv2.arch.depends.different-scanning-algorithms">
+ <title>Support for different scanning algorithms</title>
+
+ <para>
+ Different scanning algorithms are encapsulated by objects called
+ "scanners". Please see the "scanner" module documentation for more
+ details.
+ </para>
+ </section>
+
+ <section id="bbv2.arch.depends.same-file-different-scanners">
+ <title>Ability to scan the same file several times</title>
+
+ <para>
+ As stated above, it is possible to compile a C++ file multiple times,
+ using different include paths. Therefore, include dependencies for those
+ compilations can be different. The problem is that the Boost.Build engine does
+ not allow multiple scans of the same target. To solve that, we pass the
+ scanner object when calling <literal>virtual-target.actualize</literal>
+ and it creates different engine targets for different scanners.
+ </para>
+
+ <para>
+ For each engine target created with a specified scanner, a
+ corresponding one is created without it. The updating action is
+ associated with the scanner-less target, and the target with the scanner
+ is made to depend on it. That way, if sources for that action are touched,
+ all targets, with and without the scanner, are considered outdated.
+ </para>
+
+ <para>
+ Consider the following example: "a.cpp" prepared from "a.verbatim",
+ compiled by two compilers using different include paths and copied into
+ some install location. The dependency graph would look like:
+ </para>
+
+<programlisting>
+a.o (&lt;toolset&gt;gcc) &lt;--(compile)-- a.cpp (scanner1) ----+
+a.o (&lt;toolset&gt;msvc) &lt;--(compile)-- a.cpp (scanner2) ----|
+a.cpp (installed copy) &lt;--(copy) ----------------------- a.cpp (no scanner)
+ ^
+ |
+ a.verbatim -------------------------------+
+</programlisting>
+ </section>
+
+ <section id="bbv2.arch.depends.dependencies-on-generated-files">
+ <title>Proper detection of dependencies on generated files</title>
+
+ <para>
+ This requirement breaks down to the following ones.
+ </para>
+
+ <orderedlist>
+ <listitem><simpara>
+ If when compiling "a.cpp" there is an include of "a.h", the "dir"
+ directory is on the include path, and a target called "a.h" will be
+ generated in "dir", then Boost.Build should discover the include, and create
+ "a.h" before compiling "a.cpp".
+ </simpara></listitem>
+
+ <listitem><simpara>
+ Since Boost.Build almost always generates targets under the "bin"
+ directory, this should be supported as well. That is, in the scenario above,
+ a Jamfile in "dir" might create a main target that generates "a.h". The
+ file will be generated in the "dir/bin" directory, but we still have to
+ recognize the dependency.
+ </simpara></listitem>
+ </orderedlist>
+
+ <para>
+ The first requirement means that when determining what "a.h" means when
+ found in "a.cpp", we have to iterate over all directories in include
+ paths, checking for each one:
+ </para>
+
+ <orderedlist>
+ <listitem><simpara>
+ If there is a file named "a.h" in that directory, or
+ </simpara></listitem>
+
+ <listitem><simpara>
+ If there is a target called "a.h", which will be generated in that
+ directory.
+ </simpara></listitem>
+ </orderedlist>
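+
+ <para>
+ The two checks above can be sketched as a lookup loop in Python. This is a
+ minimal model of the desired semantics, not the engine's implementation:
+ the <code>resolve_include</code> function, the <code>generated</code>
+ mapping from a directory to the file names that will be produced there,
+ and the injectable <code>file_exists</code> predicate are all assumptions
+ made for the example.
+ </para>
+<programlisting><![CDATA[
```python
import os


def resolve_include(header, include_path, generated,
                    file_exists=os.path.isfile):
    """Return ("file", path) or ("target", path) for the first match.

    `generated` maps a directory to the set of file names that some
    target will produce there (a hypothetical representation).
    Returns None when no directory on the include path matches.
    """
    for directory in include_path:
        candidate = os.path.join(directory, header)
        if file_exists(candidate):
            return ("file", candidate)
        if header in generated.get(directory, set()):
            return ("target", candidate)
    return None


generated = {"d2": {"header.h"}}

# No file exists yet, so the header resolves to the generated target.
result = resolve_include("header.h", ["d1", "d2", "d3"], generated,
                         file_exists=lambda path: False)

# An existing file in an earlier directory takes precedence.
native = os.path.join("d1", "header.h")
shadowed = resolve_include("header.h", ["d1", "d2", "d3"], generated,
                           file_exists=lambda path: path == native)
```
+]]></programlisting>
+
+ <para>
+ Walking the directories in order is what keeps the result faithful to the
+ compiler: a generated header is only chosen when no earlier directory
+ already provides the file.
+ </para>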
+
+ <para>
+ Classic Jam has built-in facilities for point (1) above, but that is not
+ enough. It is hard to implement the right semantics without built-in
+ support. For example, we could try to check if there exists a target
+ called "a.h" somewhere in the dependency graph, and add a dependency on
+ it. The problem is that without a file search in the include path, the
+ semantics may be incorrect. For example, one can have an action that
+ generates some "dummy" header, for systems which do not have a native one.
+ Naturally, we do not want to depend on that generated header on platforms
+ where a native one is included.
+ </para>
+
+ <para>
+ There are two design choices for built-in support. Suppose we have files
+ a.cpp and b.cpp, and each one includes header.h, generated by some action.
+ The dependency graph created by classic Jam would look like:
+
+<programlisting>
+a.cpp -----&gt; &lt;scanner1&gt;header.h [search path: d1, d2, d3]
+
+ &lt;d2&gt;header.h --------&gt; header.y
+ [generated in d2]
+
+b.cpp -----&gt; &lt;scanner2&gt;header.h [search path: d1, d2, d4]
+</programlisting>
+ </para>
+
+ <para>
+ In this case, Jam considers all the header.h targets unrelated. The
+ correct dependency graph might be:
+
+<programlisting>
+a.cpp ----
+ \
+ &gt;----&gt; &lt;d2&gt;header.h --------&gt; header.y
+ / [generated in d2]
+b.cpp ----
+</programlisting>
+
+ or
+
+<programlisting>
+a.cpp -----&gt; &lt;scanner1&gt;header.h [search path: d1, d2, d3]
+ |
+ (includes)
+ V
+ &lt;d2&gt;header.h --------&gt; header.y
+ [generated in d2]
+ ^
+ (includes)
+ |
+b.cpp -----&gt; &lt;scanner2&gt;header.h [ search path: d1, d2, d4]
+</programlisting>
+ </para>
+
+ <para>
+ The first alternative was used for some time. The problem, however, is:
+ what include paths should be used when scanning header.h? The second
+ alternative was suggested by Matt Armstrong. It has a similar effect: any
+ target depending on &lt;scanner1&gt;header.h will also depend on
+ &lt;d2&gt;header.h. This way, though, we have two different targets with
+ two different scanners, so those targets can be scanned independently. The
+ first alternative's problem is avoided, so the second alternative is
+ the one implemented now.
+ </para>
+
+ <para>
+ The second sub-requirement is that targets generated under the "bin"
+ directory are handled as well. Boost.Build implements a semi-automatic
+ approach. When compiling C++ files the process is:
+ </para>
+
+ <orderedlist>
+ <listitem><simpara>
+ The main target to which the compiled file belongs is found.
+ </simpara></listitem>
+
+ <listitem><simpara>
+ All other main targets that the found one depends on are found. These
+ include: main targets used as sources as well as those specified as
+ "dependency" properties.
+ </simpara></listitem>
+
+ <listitem><simpara>
+ All directories where files belonging to those main targets will be
+ generated are added to the include path.
+ </simpara></listitem>
+ </orderedlist>
+
+ <para>
+ After this is done, dependencies are found by the approach explained
+ previously.
+ </para>
+
+ <para>
+ Note that if a target uses generated headers from another main target,
+ that main target should be explicitly specified using the dependency
+ property. It would be better to lift this requirement, but it does not
+ seem to be causing any problems in practice.
+ </para>
+
+ <para>
+ For target types other than C++, adding of include paths must be
+ implemented anew.
+ </para>
+ </section>
+
+ <section id="bbv2.arch.depends.dependencies-from-generated-files">
+ <title>Proper detection of dependencies from generated files</title>
+
+ <para>
+ Suppose file "a.cpp" includes "a.h" and both are generated by some
+ action. Note that classic Jam has two stages. In the first stage the
+ dependency graph is built and actions to be run are determined. In the
+ second stage the actions are executed. Initially, neither file exists, so
+ the include is not found. As a result, Jam might attempt to compile
+ a.cpp before creating a.h, causing the compilation to fail.
+ </para>
+
+ <para>
+ The solution in Boost.Jam is to perform additional dependency scans
+ after targets are updated. This breaks the separation between build stages in
+ Jam, a separation that some people consider a good thing, but I am not
+ aware of any better solution.
+ </para>
+
+ <para>
+ In order to understand the rest of this section, you should first read some
+ details about Jam's dependency scanning, available at <ulink url=
+ "http://public.perforce.com:8080/@md=d&amp;cd=//public/jam/src/&amp;ra=s&amp;c=kVu@//2614?ac=10">
+ this link</ulink>.
+ </para>
+
+ <para>
+ Whenever a target is updated, Boost.Jam rescans it for includes.
+ Consider this graph, created before any actions are run.
+<programlisting>
+A -------&gt; C ----&gt; C.pro
+ /
+B --/ C-includes ---&gt; D
+</programlisting>
+ </para>
+
+ <para>
+ Both A and B have a dependency on C and C-includes (the latter dependency
+ is not shown). Say during building we have tried to create A, then tried
+ to create C and successfully created C.
+ </para>
+
+ <para>
+ In that case, the set of includes in C might well have changed. We do
+ not bother to detect precisely which includes were added or removed.
+ Instead we create another internal node C-includes-2. Then we determine
+ what actions should be run to update the target. In fact this means that
+ we perform the first stage logic when already in the execution stage.
+ </para>
+
+ <para>
+ After actions for C-includes-2 are determined, we add C-includes-2 to
+ the list of A's dependencies, and stage 2 proceeds as usual. Unfortunately,
+ we cannot do the same with target B: since it has not been visited yet, the C
+ target does not know that B depends on it. So, we add a flag to C marking it as
+ rescanned. When visiting the B target, the flag is noticed and
+ C-includes-2 is added to the list of B's dependencies as well.
+ </para>
+
+ <para>
+ Note also that internal nodes are sometimes updated too. Consider this
+ dependency graph:
+<programlisting>
+a.o ---&gt; a.cpp
+ a.cpp-includes --&gt; a.h (scanned)
+ a.h-includes ------&gt; a.h (generated)
+ |
+ |
+ a.pro &lt;-------------------------------------------+
+</programlisting>
+ </para>
+
+ <para>
+ Here, our handling of generated headers comes into play. Say that a.h
+ exists but is out of date with respect to "a.pro"; then "a.h (generated)"
+ and "a.h-includes" will be marked for updating, but "a.h (scanned)" will
+ not. We have to rescan "a.h" after it has been created, but since "a.h
+ (generated)" has no associated scanner, it is only possible to rescan
+ "a.h" after "a.h-includes" target has been updated.
+ </para>
+
+ <para>
+ The above considerations led to the decision to rescan a target whenever
+ it is updated, no matter if it is internal or not.
+ </para>
+
+ </section>
+ </section>
+
+ <warning>
+ <para>
+ The remainder of this document is not intended to be read at all. This
+ will be rearranged in the future.
+ </para>
+ </warning>
+
+ <section>
+ <title>File targets</title>
+
+ <para>
+ As described above, file targets correspond to files that Boost.Build
+ manages. Users may be concerned about file targets in three ways: when
+ declaring file target types, when declaring transformations between types
+ and when determining where a file target is to be placed. File targets can
+ also be connected to actions that determine how the target is to be created.
+ Both file targets and actions are implemented in the
+ <literal>virtual-target</literal> module.
+ </para>
+
+ <section>
+ <title>Types</title>
+
+ <para>
+ A file target can be given a type, which determines what transformations
+ can be applied to the file. The <literal>type.register</literal> rule
+ declares new types. A file type can also be assigned a scanner, which is
+ then used to find implicit dependencies. See "<link
+ linkend="bbv2.arch.depends">dependency scanning</link>".
+ </para>
+ </section>
+
+ <section>
+ <title>Target paths</title>
+
+ <para>
+ To distinguish targets built with different properties, they are put in
+ different directories. Rules for determining target paths are given below:
+ </para>
+
+ <orderedlist>
+ <listitem><simpara>
+ All targets are placed under a directory corresponding to the project
+ where they are defined.
+ </simpara></listitem>
+
+ <listitem><simpara>
+ Each non-free, non-incidental property causes an additional element to
+ be added to the target path. That element has the form
+ <literal>&lt;feature-name&gt;-&lt;feature-value&gt;</literal> for
+ ordinary features and <literal>&lt;feature-value&gt;</literal> for
+ implicit ones. [TODO: Add note about composite features].
+ </simpara></listitem>
+
+ <listitem><simpara>
+ If the set of free, non-incidental properties is different from the
+ set of free, non-incidental properties for the project in which the main
+ target that uses the target is defined, a part of the form
+ <literal>main-target-&lt;name&gt;</literal> is added to the target path.
+ <emphasis role="bold">Note:</emphasis> It would be nice to completely
+ track free features also, but this appears to be complex and not
+ extremely needed.
+ </simpara></listitem>
+ </orderedlist>
+
+ <para>
+ For example, we might have these paths:
+<programlisting>
+debug/optimization-off
+debug/main-target-a
+</programlisting>
+ </para>
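+
+ <para>
+ The second rule, which builds one path element per non-free,
+ non-incidental property, can be sketched in Python as below. The
+ <code>target_path</code> function and its parameters are hypothetical, the
+ sets of implicit, free and incidental features are passed in by hand, and
+ the main-target part described in the third rule is left out.
+ </para>
+<programlisting><![CDATA[
```python
def target_path(properties, implicit, free, incidental):
    """Build the property part of a target path (rule 2 only)."""
    elements = []
    for feature, value in properties:
        if feature in free or feature in incidental:
            # Free and incidental properties add no path element.
            continue
        if feature in implicit:
            # Implicit features contribute the bare value.
            elements.append(value)
        else:
            # Ordinary features contribute "name-value".
            elements.append(f"{feature}-{value}")
    return "/".join(elements)


path = target_path(
    [("variant", "debug"), ("optimization", "off"), ("define", "NDEBUG")],
    implicit={"variant"},
    free={"define"},
    incidental=set(),
)
print(path)  # debug/optimization-off
```
+]]></programlisting>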
+ </section>
+ </section>
+
+ </appendix>
+
+<!--
+ Local Variables:
+ mode: xml
+ sgml-indent-data: t
+ sgml-parent-document: ("userman.xml" "chapter")
+ sgml-set-face: t
+ End:
+-->