1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
|
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<title>An efficient polymorphic data structure</title>
<link rel="stylesheet" href="../../../doc/src/boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="../index.html" title="The Boost C++ Libraries BoostBook Documentation Subset">
<link rel="up" href="../poly_collection.html" title="Chapter 27. Boost.PolyCollection">
<link rel="prev" href="../poly_collection.html" title="Chapter 27. Boost.PolyCollection">
<link rel="next" href="tutorial.html" title="Tutorial">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../boost.png"></td>
<td align="center"><a href="../../../index.html">Home</a></td>
<td align="center"><a href="../../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
<td align="center"><a href="../../../more/index.htm">More</a></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="../poly_collection.html"><img src="../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../poly_collection.html"><img src="../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="tutorial.html"><img src="../../../doc/src/images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="poly_collection.an_efficient_polymorphic_data_st"></a><a class="link" href="an_efficient_polymorphic_data_st.html" title="An efficient polymorphic data structure">An efficient
polymorphic data structure</a>
</h2></div></div></div>
<p>
Suppose we have a <code class="computeroutput"><span class="identifier">base</span></code> abstract
class with implementations <code class="computeroutput"><span class="identifier">derived1</span></code>,
<code class="computeroutput"><span class="identifier">derived2</span></code> and <code class="computeroutput"><span class="identifier">derived3</span></code>. The memory layout of a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="identifier">base</span><span class="special">*></span></code> (or similar constructs such as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">unique_ptr</span><span class="special"><</span><span class="identifier">base</span><span class="special">>></span></code>
or <a href="../../../libs/ptr_container/" target="_top"><code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">ptr_vector</span><span class="special"><</span><span class="identifier">base</span><span class="special">></span></code></a>) looks like the following:
</p>
<p>
<span class="inlinemediaobject"><img src="img/ptr_vector.png"></span>
</p>
<p>
Elements that are adjacent in the vector are not necessarily allocated contiguously,
much less so if the vector has undergone mid insertions and deletions. A typical
processing operation
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="identifier">base</span><span class="special">*></span> <span class="identifier">v</span><span class="special">;</span>
<span class="special">...</span>
<span class="keyword">for</span><span class="special">(</span><span class="identifier">base</span><span class="special">*</span> <span class="identifier">b</span><span class="special">:</span> <span class="identifier">v</span><span class="special">){</span>
<span class="special">...</span> <span class="comment">// access base's virtual interface</span>
<span class="special">}</span>
</pre>
<p>
is impacted negatively by two factors:
</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
Scattering of elements throughout memory reduces CPU caching efficiency,
which in general favor regular access loops to contiguous memory areas.
</li>
<li class="listitem">
<a href="https://en.wikipedia.org/wiki/Branch_predictor" target="_top">Branch prediction</a>
tries to minimize the effect of running conditional code (such as an <code class="computeroutput"><span class="keyword">if</span></code>-<code class="computeroutput"><span class="keyword">else</span></code>
statement or the invocation of a <code class="computeroutput"><span class="identifier">base</span></code>
virtual function) by speculatively executing a given branch based on past
history. This mechanism is rendered mostly useless when <code class="computeroutput"><span class="identifier">derived1</span></code>,
<code class="computeroutput"><span class="identifier">derived2</span></code> and <code class="computeroutput"><span class="identifier">derived3</span></code> elements are interspersed along
the sequence without a definite pattern.
</li>
</ul></div>
<p>
These limitations are imposed by the very nature of dynamic polymorphism: as
the exact types of the elements accessed through <code class="computeroutput"><span class="identifier">base</span></code>'s
interface are not known, an indirection through <code class="computeroutput"><span class="identifier">base</span><span class="special">*</span></code> (a particular form of <span class="emphasis"><em>type erasure</em></span>)
is required. There is however a critical observation: even though derived types
are not known when traversing a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="identifier">base</span><span class="special">*></span></code>, the information is typically available
<span class="emphasis"><em>at compile time</em></span> at the point of insertion in the vector:
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="identifier">base</span><span class="special">*></span> <span class="identifier">v</span><span class="special">;</span>
<span class="special">...</span>
<span class="identifier">v</span><span class="special">.</span><span class="identifier">insert</span><span class="special">(</span><span class="keyword">new</span> <span class="identifier">derived2</span><span class="special">{...});</span> <span class="comment">// the type derived2 is "forgotten" by v</span>
</pre>
<p>
A suitably designed container can take advantage of this information to arrange
elements contiguously according to their exact type, which results in an internal
data structure (a map of pointers to <a href="http://en.cppreference.com/w/cpp/types/type_info" target="_top"><code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">type_info</span></code></a>
objects, actually) pointing to as many vectors or <span class="emphasis"><em>segments</em></span>
as there are derived classes:
</p>
<p>
<span class="inlinemediaobject"><img src="img/segment_map.png"></span>
</p>
<p>
Traversing such a structure reduces to looping over all the segments one after
another: this is extremely efficient both in terms of caching and branch prediction.
In the process we have however lost the free-order capability of a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="identifier">base</span><span class="special">*></span></code> (free order can only be retained at the
segment level), but if this is not relevant to the user application the potential
performance gains of switching to this structure are large.
</p>
<p>
The discussion has focused on base/derived programming, also known as OOP,
but it also applies to other forms of dynamic polymorphism:
</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
<a href="http://en.cppreference.com/w/cpp/utility/functional/function" target="_top"><code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">function</span></code></a> abstracts callable entities
with the same given signature under a common interface. Internally, pointer
indirections and virtual-like function calls are used. Memory fragmentation
is expected to be lower than with OOP, though, as implementations usually
feature the so-called <a href="http://talesofcpp.fusionfenix.com/post-16/episode-nine-erasing-the-concrete#a-note-on-small-buffer-optimization" target="_top"><span class="emphasis"><em>small
buffer optimization</em></span></a> to avoid heap allocation in some
situations.
</li>
<li class="listitem">
The case of <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">function</span></code> can be seen as a particular
example of a more general form of polymorphism called <a href="https://en.wikipedia.org/wiki/Duck_typing" target="_top"><span class="emphasis"><em>duck
typing</em></span></a>, where unrelated types are treated uniformly
if they conform to the same given <span class="emphasis"><em>interface</em></span> (a specified
set of member functions and/or operations). Duck typing provides the power
of OOP while allowing for greater flexibility as polymorphic types need
not derive from a preexisting base class or in general be designed with
any particular interface in mind --in fact, the same object can be duck-typed
to different interfaces. Among other libraries, <a href="../../../libs/type_erasure" target="_top">Boost.TypeErasure</a>
provides duck typing for C++. Under the hood, duck typing requires pointer
indirection and virtual call implementation techniques analogous to those
of OOP, and so there are the same opportunities for efficient container
data structures as we have described.
</li>
</ul></div>
<p>
Boost.PolyCollection provides three different container class templates dealing
with OOP, <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">function</span></code>-like polymorphism and duck typing
as implemented by Boost.TypeErasure:
</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
<code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">base_collection</span></code>
</li>
<li class="listitem">
<code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">function_collection</span></code>
</li>
<li class="listitem">
<code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">any_collection</span></code>
</li>
</ul></div>
<p>
The interfaces of these containers are mostly the same and follow the usual
conventions of standard library containers.
</p>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright © 2016-2018 Joaquín M López
Muñoz<p>
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
</p>
</div></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="../poly_collection.html"><img src="../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../poly_collection.html"><img src="../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="tutorial.html"><img src="../../../doc/src/images/next.png" alt="Next"></a>
</div>
</body>
</html>
|