Diffstat (limited to 'tools/build/doc/src/architecture.xml')
-rw-r--r-- | tools/build/doc/src/architecture.xml | 668 |
1 files changed, 668 insertions, 0 deletions
diff --git a/tools/build/doc/src/architecture.xml b/tools/build/doc/src/architecture.xml new file mode 100644 index 0000000000..0b22defef9 --- /dev/null +++ b/tools/build/doc/src/architecture.xml @@ -0,0 +1,668 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE appendix PUBLIC "-//Boost//DTD BoostBook XML V1.0//EN" + "http://www.boost.org/tools/boostbook/dtd/boostbook.dtd"> + + <appendix id="bbv2.arch"> + <title>Boost.Build v2 architecture</title> + + <sidebar> + <para> + This document is a work in progress. Do not expect much from it yet. + </para> + </sidebar> + + <section id="bbv2.arch.overview"> + <title>Overview</title> + + <!-- FIXME: the below does not mention engine at all, making rest of the + text confusing. Things like 'kernel' and 'util' don't have to be + mentioned at all. --> + <para> + The Boost.Build implementation is structured in four components: + "kernel", "util", "build" and "tools". The first two are relatively + uninteresting, so we will focus on the remaining pair. The "build" component + provides the classes needed to declare targets, determine which properties + should be used to build them, and create the dependency graph. The + "tools" component provides user-visible functionality. It mostly allows + declaring specific kinds of main targets, as well as registering available + tools, which are then used when creating the dependency graph. + </para> + </section> + + <section id="bbv2.arch.build"> + <title>The build layer</title> + + <para> + The build layer has just four main parts -- metatargets (abstract + targets), virtual targets, generators and properties. + + <itemizedlist> + <listitem><para> + Metatargets (see the "targets.jam" module) represent all the + user-defined entities that can be built. The "meta" prefix signifies + that they do not need to correspond to exact files or even files at all + -- they can produce a different set of files depending on the build + request. 
Metatargets are created when Jamfiles are loaded. Each has a + <code>generate</code> method which is given a property set and produces + virtual targets for the passed properties. + </para></listitem> + <listitem><para> + Virtual targets (see the "virtual-target.jam" module) correspond to + actual atomic updatable entities -- most typically files. + </para></listitem> + <listitem><para> + Properties are just (name, value) pairs, specified by the user and + describing how targets should be built. Properties are stored using the + <code>property-set</code> class. + </para></listitem> + <listitem><para> + Generators are objects that encapsulate specific tools -- they can + take a list of source virtual targets and produce new virtual targets + from them. + </para></listitem> + </itemizedlist> + + </para> + + <para> + The build process includes the following steps: + + <orderedlist> + <listitem><para> + Top-level code calls the <code>generate</code> method of a metatarget + with some properties. + </para></listitem> + + <listitem><para> + The metatarget combines the requested properties with its requirements + and passes the result, together with the list of sources, to the + <code>generators.construct</code> function. + </para></listitem> + + <listitem><para> + A generator appropriate for the build properties is selected and its + <code>run</code> method is called. The method returns a list of virtual + targets. + </para></listitem> + + <listitem><para> + The virtual targets are returned to the top-level code, and for each instance, + the <literal>actualize</literal> method is called to set up nodes and updating + actions in the dependency graph kept inside the Boost.Build engine. This dependency + graph is then updated, which runs the necessary commands. + </para></listitem> + </orderedlist> + </para> + + <section id="bbv2.arch.build.metatargets"> + <title>Metatargets</title> + + <para> + There are several classes derived from "abstract-target". 
The + "main-target" class represents a top-level main target, the + "project-target" class acts like a container holding multiple main + targets, and the "basic-target" class is a base class for all further target + types. + </para> + + <para> + Since each main target can have several alternatives, all top-level + target objects are actually containers, referring to "real" main target + classes. The type of that container is "main-target". For example, given: +<programlisting> +alias a ; +lib a : a.cpp : &lt;toolset&gt;gcc ; +</programlisting> + we would have one top-level "main-target" instance, containing one + "alias-target" and one "lib-target" instance. "main-target"'s "generate" + method decides which of the alternatives should be used, and calls + "generate" on the corresponding instance. + </para> + + <para> + Each alternative is an instance of a class derived from "basic-target". + "basic-target.generate" does several things that should always be done: + + <itemizedlist> + <listitem><para> + Determines what properties should be used for building the target. + This includes looking at requested properties, requirements, and usage + requirements of all sources. + </para></listitem> + + <listitem><para> + Builds all sources. + </para></listitem> + + <listitem><para> + Computes usage requirements that should be passed back to targets + depending on this one. + </para></listitem> + </itemizedlist> + + For the real work of constructing a virtual target, a new method + "construct" is called. + </para> + + <para> + The "construct" method can be implemented in any way by classes derived + from "basic-target", but one specific derived class plays the central role + -- "typed-target". That class holds the desired type of file to be + produced, and its "construct" method uses the generators module to do the + actual work. + </para> + + <para> + This means that a specific metatarget subclass may avoid using + generators altogether. 
However, this is deprecated and we are trying to + eliminate all such subclasses at the moment. + </para> + + <para> + Note that the <filename>build/targets.jam</filename> file contains a + UML diagram which might help. + </para> + </section> + + <section id="bbv2.arch.build.virtual"> + <title>Virtual targets</title> + + <para> + Virtual targets are atomic updatable entities. Each virtual + target can be assigned an updating action -- an instance of the + <code>action</code> class. The action class, in turn, contains a list of + source targets, properties, and the name of an action which + should be executed. + </para> + + <para> + We try hard to never create equal instances of the + <code>virtual-target</code> class. Code creating virtual targets passes + them through the <code>virtual-target.register</code> function, which + detects if a target with the same name, sources, and properties has + already been created. In that case, the preexisting target is returned. + </para> + + <!-- FIXME: the below 2 para are rubbish, must be totally rewritten. --> + <para> + When all virtual targets are produced, they are "actualized". This means + that the real file names are computed, and the commands that should be run + are generated. This is done by the <code>virtual-target.actualize</code> + and <code>action.actualize</code> methods. The first is conceptually + simple, while the second needs additional explanation. Commands in Boost.Build + are generated in a two-stage process. First, a rule with an appropriate + name (for example "gcc.compile") is called and is given a list of target + names. The rule sets some variables, like "OPTIONS". After that, the + command string is taken, and variables are substituted, so use of OPTIONS + inside the command string gets transformed into actual compile options. + </para> + + <para> + Boost.Build added a third stage to simplify things. It is now possible + to automatically convert properties to appropriate variable assignments. 
+ For example, &lt;debug-symbols&gt;on would add "-g" to the OPTIONS + variable, without requiring this logic to be added manually to gcc.compile. + This functionality is part of the "toolset" module. + </para> + + <para> + Note that the <filename>build/virtual-target.jam</filename> file + contains a UML diagram which might help. + </para> + </section> + + <section id="bbv2.arch.build.properties"> + <title>Properties</title> + + <para> + Above, we noted that metatargets are built with a set of properties. + That set is represented by the <code>property-set</code> class. An + important point is that handling of property sets can get very expensive. + For that reason, we make sure that for each set of (name, value) pairs + only one <code>property-set</code> instance is created. The + <code>property-set</code> class uses extensive caching for all operations, so + most repeated work is avoided. The <code>property-set.create</code> function is the factory + used to create instances of the <code>property-set</code> class. + </para> + </section> + </section> + + <section id="bbv2.arch.tools"> + <title>The tools layer</title> + + <para>Write me!</para> + </section> + + <section id="bbv2.arch.targets"> + <title>Targets</title> + + <para>NOTE: THIS SECTION IS NOT EXPECTED TO BE READ! + There are two user-visible kinds of targets in Boost.Build. The first kind is + "abstract" targets &mdash; they correspond to things declared by the user, e.g. + projects and executable files. The primary thing about abstract targets is + that it is possible to request them to be built with a particular set of + properties. Each property combination may yield different built + files, so abstract targets do not have a direct correspondence to built + files. + </para> + + <para> + File targets, on the other hand, are associated with concrete files. + Dependency graphs for abstract targets with specific properties are + constructed from file targets. 
The user has no way to create file targets, but + can specify rules for detecting source file types, as well as rules for + transforming between file targets of different types. That information is + used in constructing the final dependency graph, as described in the <link + linkend="bbv2.arch.depends">next section</link>. + <emphasis role="bold">Note:</emphasis> File targets are not the same entities + as Jam targets; the latter are created from file targets at the latest + possible moment. + <emphasis role="bold">Note:</emphasis> "File target" is the originally + proposed name for what we now call virtual targets. It is more + understandable by users, but has one problem: virtual targets can + potentially be "phony", and not correspond to any file. + </para> + </section> + + <section id="bbv2.arch.depends"> + <title>Dependency scanning</title> + + <para> + Dependency scanning is the process of finding implicit dependencies, like + "#include" statements in C++. The requirements for a correct dependency + scanning mechanism are: + </para> + + <itemizedlist> + <listitem><simpara> + <link linkend="bbv2.arch.depends.different-scanning-algorithms">Support + for different scanning algorithms</link>. C++ and XML have quite different + syntax for includes and rules for looking up the included files. + </simpara></listitem> + + <listitem><simpara> + <link linkend="bbv2.arch.depends.same-file-different-scanners">Ability + to scan the same file several times</link>. For example, a single C++ file + may be compiled using different include paths. 
+ </simpara></listitem> + + <listitem><simpara> + <link linkend="bbv2.arch.depends.dependencies-on-generated-files">Proper + detection of dependencies on generated files.</link> + </simpara></listitem> + + <listitem><simpara> + <link + linkend="bbv2.arch.depends.dependencies-from-generated-files">Proper + detection of dependencies from a generated file.</link> + </simpara></listitem> + </itemizedlist> + + <section id="bbv2.arch.depends.different-scanning-algorithms"> + <title>Support for different scanning algorithms</title> + + <para> + Different scanning algorithms are encapsulated by objects called + "scanners". Please see the "scanner" module documentation for more + details. + </para> + </section> + + <section id="bbv2.arch.depends.same-file-different-scanners"> + <title>Ability to scan the same file several times</title> + + <para> + As stated above, it is possible to compile a C++ file multiple times, + using different include paths. Therefore, include dependencies for those + compilations can be different. The problem is that the Boost.Build engine does + not allow multiple scans of the same target. To solve that, we pass the + scanner object when calling <literal>virtual-target.actualize</literal> + and it creates different engine targets for different scanners. + </para> + + <para> + For each engine target created with a specified scanner, a + corresponding one is created without it. The updating action is + associated with the scanner-less target, and the target with the scanner + is made to depend on it. That way, if sources for that action are touched, + all targets, with and without the scanner, are considered outdated. + </para> + + <para> + Consider the following example: "a.cpp" prepared from "a.verbatim", + compiled by two compilers using different include paths and copied into + some install location. 
The dependency graph would look like: + </para> + +<programlisting> +a.o (&lt;toolset&gt;gcc) &lt;--(compile)-- a.cpp (scanner1) ----+ +a.o (&lt;toolset&gt;msvc) &lt;--(compile)-- a.cpp (scanner2) ----| +a.cpp (installed copy) &lt;--(copy) ----------------------- a.cpp (no scanner) + ^ + | + a.verbatim -------------------------------+ +</programlisting> + </section> + + <section id="bbv2.arch.depends.dependencies-on-generated-files"> + <title>Proper detection of dependencies on generated files.</title> + + <para> + This requirement breaks down into the following ones. + </para> + + <orderedlist> + <listitem><simpara> + If "a.cpp" includes "a.h", the "dir" + directory is on the include path, and a target called "a.h" will be + generated in "dir", then Boost.Build should discover the include and create + "a.h" before compiling "a.cpp". + </simpara></listitem> + + <listitem><simpara> + Since Boost.Build almost always generates targets under the "bin" + directory, this should be supported as well. That is, in the scenario above, + a Jamfile in "dir" might create a main target which generates "a.h". The + file will be generated in the "dir/bin" directory, but we still have to + recognize the dependency. + </simpara></listitem> + </orderedlist> + + <para> + The first requirement means that when determining what "a.h" means when + found in "a.cpp", we have to iterate over all directories in include + paths, checking for each one: + </para> + + <orderedlist> + <listitem><simpara> + If there is a file named "a.h" in that directory, or + </simpara></listitem> + + <listitem><simpara> + If there is a target called "a.h", which will be generated in + that directory. + </simpara></listitem> + </orderedlist> + + <para> + Classic Jam has built-in facilities for point (1) above, but that is not + enough. It is hard to implement the right semantics without builtin + support. 
For example, we could try to check if there exists a target + called "a.h" somewhere in the dependency graph, and add a dependency to + it. The problem is that without a file search in the include path, the + semantics may be incorrect. For example, one can have an action that + generates some "dummy" header, for systems which do not have a native one. + Naturally, we do not want to depend on that generated header on platforms + where a native one is included. + </para> + + <para> + There are two design choices for builtin support. Suppose we have files + a.cpp and b.cpp, and each one includes header.h, generated by some action. + The dependency graph created by classic Jam would look like: + +<programlisting> +a.cpp -----> &lt;scanner1&gt;header.h [search path: d1, d2, d3] + + &lt;d2&gt;header.h --------> header.y + [generated in d2] + +b.cpp -----> &lt;scanner2&gt;header.h [search path: d1, d2, d4] +</programlisting> + </para> + + <para> + In this case, Jam thinks the header.h targets are all unrelated. The + correct dependency graph might be: + +<programlisting> +a.cpp ---- + \ + >----> &lt;d2&gt;header.h --------> header.y + / [generated in d2] +b.cpp ---- +</programlisting> + + or + +<programlisting> +a.cpp -----> &lt;scanner1&gt;header.h [search path: d1, d2, d3] + | + (includes) + V + &lt;d2&gt;header.h --------> header.y + [generated in d2] + ^ + (includes) + | +b.cpp -----> &lt;scanner2&gt;header.h [ search path: d1, d2, d4] +</programlisting> + </para> + + <para> + The first alternative was used for some time. The problem, however, is: + what include paths should be used when scanning header.h? The second + alternative was suggested by Matt Armstrong. It has a similar effect: any + target depending on &lt;scanner1&gt;header.h will also depend on + &lt;d2&gt;header.h. This way, though, we have two different targets with + two different scanners, so those targets can be scanned independently. The + first alternative's problem is avoided, so the second alternative is + implemented now. 
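As a rough illustration of the second alternative, the graph can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the <code>DependencyGraph</code> class, its method names and the bracketed node labels are hypothetical and are not part of Boost.Build or Boost.Jam.
<programlisting>
```python
class DependencyGraph:
    """Toy graph: an edge points from a target to something it depends on."""

    def __init__(self):
        self.edges = {}

    def depends(self, target, dependency):
        self.edges.setdefault(target, set()).add(dependency)

    def closure(self, target):
        """Return every node reachable from `target` (hypothetical helper)."""
        seen = set()
        stack = [target]
        while stack:
            node = stack.pop()
            for dep in self.edges.get(node, ()):
                if dep not in seen:
                    seen.add(dep)
                    stack.append(dep)
        return seen


def build_example():
    graph = DependencyGraph()
    # The generated header and the source it is produced from.
    graph.depends("header.h [d2]", "header.y")
    # One scanner-specific node per include path configuration.
    graph.depends("a.cpp", "header.h [scanner1]")
    graph.depends("b.cpp", "header.h [scanner2]")
    # Each scanner-specific node "includes" the shared generated header.
    graph.depends("header.h [scanner1]", "header.h [d2]")
    graph.depends("header.h [scanner2]", "header.h [d2]")
    return graph


graph = build_example()
print(sorted(graph.closure("a.cpp")))
```
</programlisting>
Both sources reach header.y through their own scanner-specific node, while the two scanner-specific nodes stay independent of each other and can be scanned with different include paths.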
+ </para> + + <para> + The second sub-requirement is that targets generated under the "bin" + directory are handled as well. Boost.Build implements a semi-automatic + approach. When compiling C++ files the process is: + </para> + + <orderedlist> + <listitem><simpara> + The main target to which the compiled file belongs is found. + </simpara></listitem> + + <listitem><simpara> + All other main targets that the found one depends on are found. These + include main targets used as sources as well as those specified as + "dependency" properties. + </simpara></listitem> + + <listitem><simpara> + All directories where files belonging to those main targets will be + generated are added to the include path. + </simpara></listitem> + </orderedlist> + + <para> + After this is done, dependencies are found by the approach explained + previously. + </para> + + <para> + Note that if a target uses generated headers from another main target, + that main target should be explicitly specified using the dependency + property. It would be better to lift this requirement, but it does not + seem to be causing any problems in practice. + </para> + + <para> + For target types other than C++, adding include paths must be + implemented anew. + </para> + </section> + + <section id="bbv2.arch.depends.dependencies-from-generated-files"> + <title>Proper detection of dependencies from generated files</title> + + <para> + Suppose file "a.cpp" includes "a.h" and both are generated by some + action. Note that classic Jam has two stages. In the first stage the + dependency graph is built and actions to be run are determined. In the + second stage the actions are executed. Initially, neither file exists, so + the include is not found. As a result, Jam might attempt to compile + a.cpp before creating a.h, causing the compilation to fail. + </para> + + <para> + The solution in Boost.Jam is to perform additional dependency scans + after targets are updated. 
This breaks the separation between build stages in + Jam &mdash; which some people consider a good thing &mdash; but I am not + aware of any better solution. + </para> + + <para> + In order to understand the rest of this section, you should first read some + details about Jam's dependency scanning, available at <ulink url= + "http://public.perforce.com:8080/@md=d&amp;cd=//public/jam/src/&amp;ra=s&amp;c=kVu@//2614?ac=10"> + this link</ulink>. + </para> + + <para> + Whenever a target is updated, Boost.Jam rescans it for includes. + Consider this graph, created before any actions are run. +<programlisting> +A -------> C ----> C.pro + / +B --/ C-includes ---> D +</programlisting> + </para> + + <para> + Both A and B depend on C and C-includes (the latter dependency + is not shown). Say during building we have tried to create A, then tried + to create C and successfully created C. + </para> + + <para> + In that case, the set of includes in C might well have changed. We do + not bother to detect precisely which includes were added or removed. + Instead we create another internal node C-includes-2. Then we determine + what actions should be run to update the target. In fact this means that + we perform the first stage logic when already in the execution stage. + </para> + + <para> + After actions for C-includes-2 are determined, we add C-includes-2 to + the list of A's dependencies, and stage 2 proceeds as usual. Unfortunately, + we cannot do the same with target B, since B has not been visited yet and the C + target does not know that B depends on it. So, we add a flag to C marking it as + rescanned. When visiting the B target, the flag is noticed and + C-includes-2 is added to the list of B's dependencies as well. + </para> + + <para> + Note also that internal nodes are sometimes updated too. 
Consider this + dependency graph: +<programlisting> +a.o ---> a.cpp + a.cpp-includes --> a.h (scanned) + a.h-includes ------> a.h (generated) + | + | + a.pro <-------------------------------------------+ +</programlisting> + </para> + + <para> + Here, our handling of generated headers come into play. Say that a.h + exists but is out of date with respect to "a.pro", then "a.h (generated)" + and "a.h-includes" will be marked for updating, but "a.h (scanned)" will + not. We have to rescan "a.h" after it has been created, but since "a.h + (generated)" has no associated scanner, it is only possible to rescan + "a.h" after "a.h-includes" target has been updated. + </para> + + <para> + The above consideration lead to the decision to rescan a target whenever + it is updated, no matter if it is internal or not. + </para> + + </section> + </section> + + <warning> + <para> + The remainder of this document is not intended to be read at all. This + will be rearranged in the future. + </para> + </warning> + + <section> + <title>File targets</title> + + <para> + As described above, file targets correspond to files that Boost.Build + manages. Users may be concerned about file targets in three ways: when + declaring file target types, when declaring transformations between types + and when determining where a file target is to be placed. File targets can + also be connected to actions that determine how the target is to be created. + Both file targets and actions are implemented in the + <literal>virtual-target</literal> module. + </para> + + <section> + <title>Types</title> + + <para> + A file target can be given a type, which determines what transformations + can be applied to the file. The <literal>type.register</literal> rule + declares new types. File type can also be assigned a scanner, which is + then used to find implicit dependencies. See "<link + linkend="bbv2.arch.depends">dependency scanning</link>". 
+ </para> + </section> + + <section> + <title>Target paths</title> + + <para> + To distinguish targets built with different properties, they are put in + different directories. Rules for determining target paths are given below: + </para> + + <orderedlist> + <listitem><simpara> + All targets are placed under a directory corresponding to the project + where they are defined. + </simpara></listitem> + + <listitem><simpara> + Each non-free, non-incidental property causes an additional element to + be added to the target path. That element has the form + <literal>&lt;feature-name&gt;-&lt;feature-value&gt;</literal> for + ordinary features and <literal>&lt;feature-value&gt;</literal> for + implicit ones. [TODO: Add note about composite features]. + </simpara></listitem> + + <listitem><simpara> + If the set of free, non-incidental properties is different from the + set of free, non-incidental properties for the project in which the main + target that uses the target is defined, a part of the form + <literal>main_target-&lt;name&gt;</literal> is added to the target path. + <emphasis role="bold">Note:</emphasis> It would be nice to completely + track free features also, but this appears to be complex and not + strictly necessary. + </simpara></listitem> + </orderedlist> + + <para> + For example, we might have these paths: +<programlisting> +debug/optimization-off +debug/main-target-a +</programlisting> + </para> + </section> + </section> + + </appendix> + +<!-- + Local Variables: + mode: xml + sgml-indent-data: t + sgml-parent-document: ("userman.xml" "chapter") + sgml-set-face: t + End: +-->