author | jbj <devnull@localhost> | 2001-03-21 18:33:35 +0000 |
---|---|---|
committer | jbj <devnull@localhost> | 2001-03-21 18:33:35 +0000 |
commit | 731946f4b90eb1173452dd30f1296dd825155d82 (patch) | |
tree | 67535f54ecb7e5463c06e62044e4efd84ae0291d /db/docs/ref | |
parent | 7ed904da030dc4640ff9bce8458ba07cc09d830d (diff) | |
Initial revision
CVS patchset: 4644
CVS date: 2001/03/21 18:33:35
Diffstat (limited to 'db/docs/ref')
268 files changed, 44765 insertions, 0 deletions
diff --git a/db/docs/ref/am/close.html b/db/docs/ref/am/close.html new file mode 100644 index 000000000..04b8beacb --- /dev/null +++ b/db/docs/ref/am/close.html @@ -0,0 +1,43 @@ +<!--$Id: close.so,v 10.15 2000/12/18 21:05:13 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Closing a database</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/stat.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/cursor.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Closing a database</h1> +<p>The <a href="../../api_c/db_close.html">DB->close</a> function is the standard interface for closing the database. +By default, <a href="../../api_c/db_close.html">DB->close</a> also flushes all modified records from the +database cache to disk. +<p>There is one flag that you can set to customize <a href="../../api_c/db_close.html">DB->close</a>: +<p><dl compact> +<p><dt><a href="../../api_c/db_close.html#DB_NOSYNC">DB_NOSYNC</a><dd>Do not flush cached information to disk. +</dl> +<b>It is important to understand that flushing cached information +to disk only minimizes the window of opportunity for corrupted data, it +does not eliminate the possibility.</b> +<p>While unlikely, it is possible for database corruption to happen if a +system or application crash occurs while writing data to the database. 
To +ensure that database corruption never occurs, applications must either: +<ul type=disc> +<li>Use transactions and logging with automatic recovery. +<li>Use logging and application-specific recovery. +<li>Edit a copy of the database, and, once all applications +using the database have successfully called <a href="../../api_c/db_close.html">DB->close</a>, use +system operations (e.g., the POSIX rename system call) to atomically +replace the original database with the updated copy. +</ul> +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/stat.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/cursor.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/count.html b/db/docs/ref/am/count.html new file mode 100644 index 000000000..92282641b --- /dev/null +++ b/db/docs/ref/am/count.html @@ -0,0 +1,28 @@ +<!--$Id: count.so,v 1.3 2000/12/18 21:05:13 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Data item count</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/join.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/curclose.html"><img src="../../images/next.gif" 
alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Data item count</h1> +<p>Once a cursor has been initialized to reference a particular key in the +database, it can be used to determine the number of data items that are +stored for any particular key. The <a href="../../api_c/dbc_count.html">DBcursor->c_count</a> method returns +this number of data items. The returned value is always one, unless +the database supports duplicate data items, in which case it may be any +number of items. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/join.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/curclose.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/curclose.html b/db/docs/ref/am/curclose.html new file mode 100644 index 000000000..52ccfeb8c --- /dev/null +++ b/db/docs/ref/am/curclose.html @@ -0,0 +1,28 @@ +<!--$Id: curclose.so,v 10.12 2000/12/13 16:48:13 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Closing a cursor</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/count.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a 
href="../../ref/am/stability.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Closing a cursor</h1> +<p>The <a href="../../api_c/dbc_close.html">DBcursor->c_close</a> function is the standard interface for closing a cursor, +after which the cursor may no longer be used. Although cursors are +implicitly closed when the database they point to is closed, it is good +programming practice to explicitly close cursors. In addition, in +transactional systems, cursors may not exist outside of a transaction and +so must be explicitly closed. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/count.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/stability.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/curdel.html b/db/docs/ref/am/curdel.html new file mode 100644 index 000000000..b0fe8f957 --- /dev/null +++ b/db/docs/ref/am/curdel.html @@ -0,0 +1,26 @@ +<!--$Id: curdel.so,v 10.11 2000/03/18 21:43:07 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Deleting records with a cursor</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/curput.html"><img src="../../images/prev.gif" 
alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/curdup.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Deleting records with a cursor</h1> +<p>The <a href="../../api_c/dbc_del.html">DBcursor->c_del</a> function is the standard interface for deleting records from +the database using a cursor. The <a href="../../api_c/dbc_del.html">DBcursor->c_del</a> function deletes the record +currently referenced by the cursor. In all cases, the cursor position is +unchanged after a delete. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/curput.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/curdup.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/curdup.html b/db/docs/ref/am/curdup.html new file mode 100644 index 000000000..6c609b2e5 --- /dev/null +++ b/db/docs/ref/am/curdup.html @@ -0,0 +1,34 @@ +<!--$Id: curdup.so,v 11.5 2000/12/19 14:45:39 sue Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Duplicating a cursor</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/curdel.html"><img src="../../images/prev.gif" alt="Prev"></a><a 
href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/join.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Duplicating a cursor</h1> +<p>Once a cursor has been initialized (e.g., by a call to <a href="../../api_c/dbc_get.html">DBcursor->c_get</a>), +it can be thought of as identifying a particular location in a database. +The <a href="../../api_c/dbc_dup.html">DBcursor->c_dup</a> function permits an application to create a new cursor that +has the same locking and transactional information as the cursor from +which it is copied, and which optionally refers to the same position in +the database. +<p>In order to maintain a cursor position when an application is using +locking, locks are maintained on behalf of the cursor until the cursor is +closed. In cases when an application is using locking without +transactions, cursor duplication is often required to avoid +self-deadlocks. For further details, refer to +<a href="../../ref/lock/am_conv.html">Access method locking conventions</a>. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/am/curdel.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/join.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/curget.html b/db/docs/ref/am/curget.html new file mode 100644 index 000000000..129fa272b --- /dev/null +++ b/db/docs/ref/am/curget.html @@ -0,0 +1,74 @@ +<!--$Id: curget.so,v 10.14 2000/12/18 21:05:13 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Retrieving records with a cursor</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/cursor.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/curput.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Retrieving records with a cursor</h1> +<p>The <a href="../../api_c/dbc_get.html">DBcursor->c_get</a> function is the standard interface for retrieving records from +the database with a cursor. 
The <a href="../../api_c/dbc_get.html">DBcursor->c_get</a> function takes a flag which +controls how the cursor is positioned within the database and returns the +key/data item associated with that positioning. Similar to +<a href="../../api_c/db_get.html">DB->get</a>, <a href="../../api_c/dbc_get.html">DBcursor->c_get</a> may also take a supplied key and retrieve +the data associated with that key from the database. There are several +flags that you can set to customize retrieval. +<h3>Cursor position flags</h3> +<p><dl compact> +<p><dt><a href="../../api_c/dbc_get.html#DB_FIRST">DB_FIRST</a>, <a href="../../api_c/dbc_get.html#DB_LAST">DB_LAST</a><dd>Return the first (last) record in the database. +<p><dt><a href="../../api_c/dbc_get.html#DB_NEXT">DB_NEXT</a>, <a href="../../api_c/dbc_get.html#DB_PREV">DB_PREV</a><dd>Return the next (previous) record in the database. +<p><dt><a href="../../api_c/dbc_get.html#DB_NEXT_DUP">DB_NEXT_DUP</a><dd>Return the next record in the database, if it is a duplicate data item +for the current key. +<p><dt><a href="../../api_c/dbc_get.html#DB_NEXT_NODUP">DB_NEXT_NODUP</a>, <a href="../../api_c/dbc_get.html#DB_PREV_NODUP">DB_PREV_NODUP</a><dd>Return the next (previous) record in the database that is not a +duplicate data item for the current key. +<p><dt><a href="../../api_c/dbc_get.html#DB_CURRENT">DB_CURRENT</a><dd>Return the record from the database currently referenced by the +cursor. +</dl> +<h3>Retrieving specific key/data pairs</h3> +<p><dl compact> +<p><dt><a href="../../api_c/dbc_get.html#DB_SET">DB_SET</a><dd>Return the record from the database that matches the supplied key. In +the case of duplicates the first duplicate is returned and the cursor +is positioned at the beginning of the duplicate list. The user can then +traverse the duplicate entries for the key. +<p><dt><a href="../../api_c/dbc_get.html#DB_SET_RANGE">DB_SET_RANGE</a><dd>Return the smallest record in the database greater than or equal to the +supplied key. 
This functionality permits partial key matches and range +searches in the Btree access method. +<p><dt><a href="../../api_c/db_get.html#DB_GET_BOTH">DB_GET_BOTH</a><dd>Return the record from the database that matches both the supplied key +and data items. This is particularly useful when there are large +numbers of duplicate records for a key, as it allows the cursor to +easily be positioned at the correct place for traversal of some part of +a large set of duplicate records. +</dl> +<h3>Retrieving based on record numbers</h3> +<p><dl compact> +<p><dt><a href="../../api_c/db_get.html#DB_SET_RECNO">DB_SET_RECNO</a><dd>If the underlying database is a Btree, and was configured so that it is +possible to search it by logical record number, retrieve a specific +record based on a record number argument. +<p><dt><a href="../../api_c/dbc_get.html#DB_GET_RECNO">DB_GET_RECNO</a><dd>If the underlying database is a Btree, and was configured so that it is +possible to search it by logical record number, return the record number +for the record referenced by the cursor. +</dl> +<h3>Special-purpose flags</h3> +<p><dl compact> +<p><dt><a href="../../api_c/db_get.html#DB_CONSUME">DB_CONSUME</a><dd>Read-and-delete: the first record (the head) of the queue is returned and +deleted. The underlying database must be a Queue. +<p><dt><a href="../../api_c/dbc_get.html#DB_RMW">DB_RMW</a><dd>Read-modify-write: acquire write locks instead of read locks during +retrieval. This can enhance performance in threaded applications by +reducing the chance of deadlock. +</dl> +<p>In all cases, the cursor is repositioned by a <a href="../../api_c/dbc_get.html">DBcursor->c_get</a> operation +to point to the newly-returned key/data pair in the database. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/am/cursor.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/curput.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/curput.html b/db/docs/ref/am/curput.html new file mode 100644 index 000000000..0d5ef2725 --- /dev/null +++ b/db/docs/ref/am/curput.html @@ -0,0 +1,40 @@ +<!--$Id: curput.so,v 10.12 2000/12/04 18:05:41 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Storing records with a cursor</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/curget.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/curdel.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Storing records with a cursor</h1> +<p>The <a href="../../api_c/dbc_put.html">DBcursor->c_put</a> function is the standard interface for storing records into +the database with a cursor. In general, <a href="../../api_c/dbc_put.html">DBcursor->c_put</a> takes a key and +inserts the associated data into the database, at a location controlled +by a specified flag. 
+<p>There are several flags that you can set to customize storage: +<p><dl compact> +<p><dt><a href="../../api_c/dbc_put.html#DB_AFTER">DB_AFTER</a><dd>Create a new record, immediately after the record currently referenced by +the cursor. +<p><dt><a href="../../api_c/dbc_put.html#DB_BEFORE">DB_BEFORE</a><dd>Create a new record, immediately before the record currently referenced by +the cursor. +<p><dt><a href="../../api_c/dbc_put.html#DB_CURRENT_PUT">DB_CURRENT</a><dd>Replace the data part of the record currently referenced by the cursor. +<p><dt><a href="../../api_c/dbc_put.html#DB_KEYFIRST">DB_KEYFIRST</a><dd>Create a new record as the first of the duplicate records for the +supplied key. +<p><dt><a href="../../api_c/dbc_put.html#DB_KEYLAST">DB_KEYLAST</a><dd>Create a new record, as the last of the duplicate records for the supplied +key. +</dl> +<p>In all cases, the cursor is repositioned by a <a href="../../api_c/dbc_put.html">DBcursor->c_put</a> operation +to point to the newly inserted key/data pair in the database. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/am/curget.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/curdel.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/cursor.html b/db/docs/ref/am/cursor.html new file mode 100644 index 000000000..529285b4a --- /dev/null +++ b/db/docs/ref/am/cursor.html @@ -0,0 +1,41 @@ +<!--$Id: cursor.so,v 10.15 2000/12/18 21:05:13 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Database cursors</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/close.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/curget.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Database cursors</h1> +<p>A database cursor is a reference to a single key/data pair in the +database. It supports traversal of the database and is the only way to +access individual duplicate data items. Cursors are used for operating +on collections of records, for iterating over a database, and for saving +handles to individual records, so that they can be modified after they +have been read. 
+<p>The <a href="../../api_c/db_cursor.html">DB->cursor</a> function is the standard interface for opening a cursor +into a database. Upon return the cursor is uninitialized; positioning +occurs as part of the first cursor operation. +<p>Once a database cursor has been opened, there is a set of access method +operations that can be performed. Each of these operations is performed +using a method referenced from the returned cursor handle. +<p><dl compact> +<dt><a href="../../api_c/dbc_close.html">DBcursor->c_close</a><dd>Close the cursor +<dt><a href="../../api_c/dbc_del.html">DBcursor->c_del</a><dd>Delete a record +<dt><a href="../../api_c/dbc_dup.html">DBcursor->c_dup</a><dd>Duplicate a cursor +<dt><a href="../../api_c/dbc_get.html">DBcursor->c_get</a><dd>Retrieve a record +<dt><a href="../../api_c/dbc_put.html">DBcursor->c_put</a><dd>Store a record +</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/close.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/curget.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/delete.html b/db/docs/ref/am/delete.html new file mode 100644 index 000000000..8ab612fa4 --- /dev/null +++ b/db/docs/ref/am/delete.html @@ -0,0 +1,28 @@ +<!--$Id: delete.so,v 10.14 2000/03/18 21:43:08 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Deleting records</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body 
bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/put.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/sync.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Deleting records</h1> +<p>The <a href="../../api_c/db_del.html">DB->del</a> function is the standard interface for deleting records from +the database. In general, <a href="../../api_c/db_del.html">DB->del</a> takes a key and deletes the +data item associated with it from the database. +<p>If the database has been configured to support duplicate records, the +<a href="../../api_c/db_del.html">DB->del</a> function will remove all of the duplicate records. To remove +individual duplicate records, you must use a Berkeley DB cursor interface. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/put.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/sync.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/error.html b/db/docs/ref/am/error.html new file mode 100644 index 000000000..737e6d662 --- /dev/null +++ b/db/docs/ref/am/error.html @@ -0,0 +1,61 @@ +<!--$Id: error.so,v 10.14 2000/12/18 21:05:13 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Error support</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/verify.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/arch/bigpic.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Error support</h1> +<p>Berkeley DB offers programmatic support for displaying error return values. +<p>The <a href="../../api_c/env_strerror.html">db_strerror</a> interface returns a pointer to the error +message corresponding to any Berkeley DB error return, similar to the ANSI C +strerror interface, but is able to handle both system error returns and +Berkeley DB specific return values. +<p>For example: +<p><blockquote><pre>int ret; +if ((ret = dbp->put(dbp, NULL, &key, &data, 0)) != 0) { + fprintf(stderr, "put failed: %s\n", db_strerror(ret)); + return (1); +} +</pre></blockquote> +<p>There are also two additional error interfaces, <a href="../../api_c/db_err.html">DB->err</a> and +<a href="../../api_c/db_err.html">DB->errx</a>. These interfaces work like the ANSI C X3.159-1989 printf +interface, taking a printf-style format string and argument list, and +writing a message constructed from the format string and arguments. +<p>The <a href="../../api_c/db_err.html">DB->err</a> function appends the standard error string to the constructed +message; the <a href="../../api_c/db_err.html">DB->errx</a> function does not. These interfaces provide simpler +ways of displaying Berkeley DB error messages. 
Error messages can additionally be configured to always include a prefix +(e.g., the program name) using the <a href="../../api_c/db_set_errpfx.html">DB->set_errpfx</a> interface. +<p>For example, if your application +tracks session IDs in a variable called session_id, it can include that +information in its error messages: +<p><blockquote><pre>#define DATABASE "access.db" +int ret; +dbp->set_errpfx(dbp, argv0); +if ((ret = + dbp->open(dbp, DATABASE, NULL, DB_BTREE, DB_CREATE, 0664)) != 0) { + dbp->err(dbp, ret, "%s", DATABASE); + dbp->errx(dbp, + "contact your system administrator: session ID was %d", + session_id); + return (1); +} +</pre></blockquote> +<p>For example, if the program was called my_app, and the open call returned +an EACCES system error, the error messages shown would appear as follows: +<p><blockquote><pre>my_app: access.db: Permission denied. +my_app: contact your system administrator: session ID was 14</pre></blockquote> +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/verify.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/arch/bigpic.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/get.html b/db/docs/ref/am/get.html new file mode 100644 index 000000000..fda7a8eb2 --- /dev/null +++ b/db/docs/ref/am/get.html @@ -0,0 +1,39 @@ +<!--$Id: get.so,v 10.15 2000/12/18 21:05:13 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Retrieving records</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/upgrade.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/put.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Retrieving records</h1> +<p>The <a href="../../api_c/db_get.html">DB->get</a> function is the standard interface for retrieving records from +the database. In general, <a href="../../api_c/db_get.html">DB->get</a> takes a key and returns the +associated data from the database. +<p>There are a few flags that you can set to customize retrieval: +<p><dl compact> +<p><dt><a href="../../api_c/db_get.html#DB_GET_BOTH">DB_GET_BOTH</a><dd>Search for a matching key and data item, i.e., only return success if both +the key and the data items match those stored in the database. +<p><dt><a href="../../api_c/dbc_get.html#DB_RMW">DB_RMW</a><dd>Read-modify-write: acquire write locks instead of read locks during +retrieval. This can enhance performance in threaded applications by +reducing the chance of deadlock. +<p><dt><a href="../../api_c/db_get.html#DB_SET_RECNO">DB_SET_RECNO</a><dd>If the underlying database is a Btree, and was configured so that it +is possible to search it by logical record number, retrieve a specific +record. +</dl> +<p>If the database has been configured to support duplicate records, +<a href="../../api_c/db_get.html">DB->get</a> will always return the first data item in the duplicate +set. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/am/upgrade.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/put.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/join.html b/db/docs/ref/am/join.html new file mode 100644 index 000000000..9d4dcdd09 --- /dev/null +++ b/db/docs/ref/am/join.html @@ -0,0 +1,184 @@ +<!--$Id: join.so,v 10.21 2000/12/18 21:05:13 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Logical join</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/curdup.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/count.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Logical join</h1> +<p>A logical join is a method of retrieving data from a primary database +using criteria stored in a set of secondary indexes. A logical join +requires that your data be organized as a primary database which +contains the primary key and primary data field, and a set of secondary +indexes. 
Each of the secondary indexes is indexed by a different +secondary key, and, for each key in a secondary index, there is a set +of duplicate data items that match the primary keys in the primary +database. +<p>For example, let's assume the need for an application that will return +the names of stores in which one can buy fruit of a given color. We +would first construct a primary database that lists types of fruit as +the key item, and the store where you can buy them as the data item: +<p><blockquote><pre><b>Primary key:</b> <b>Primary data:</b> +apple Convenience Store +blueberry Farmer's Market +peach Shopway +pear Farmer's Market +raspberry Shopway +strawberry Farmer's Market</pre></blockquote> +<p>We would then create a secondary index with the key <b>color</b>, and, +as the data items, the names of fruits of different colors. +<p><blockquote><pre><b>Secondary key:</b> <b>Secondary data:</b> +blue blueberry +red apple +red raspberry +red strawberry +yellow peach +yellow pear</pre></blockquote> +<p>This secondary index would allow an application to look up a color, and +then use the data items to look up the stores where the colored fruit +could be purchased. For example, by first looking up <b>blue</b>, +the data item <b>blueberry</b> could be used as the lookup key in the +primary database, returning <b>Farmer's Market</b>. +<p>Your data must be organized in the following manner in order to use the +<a href="../../api_c/db_join.html">DB->join</a> function: +<p><ol> +<p><li>The actual data should be stored in the database represented by the +DB object used to invoke this function. Generally, this +DB object is called the <i>primary</i>. +<p><li>Secondary indexes should be stored in separate databases, whose keys +are the values of the secondary indexes and whose data items are the +primary keys corresponding to the records having the designated +secondary key value. It is acceptable (and expected) that there may be +duplicate entries in the secondary indexes. 
+<p>These duplicate entries should be sorted for performance reasons, although +it is not required. For more information see the <a href="../../api_c/db_set_flags.html#DB_DUPSORT">DB_DUPSORT</a> flag +to the <a href="../../api_c/db_set_flags.html">DB->set_flags</a> function. +</ol> +<p>What the <a href="../../api_c/db_join.html">DB->join</a> function does is review a list of secondary keys, and, +when it finds a data item that appears as a data item for all of the +secondary keys, it uses that data item as a lookup into the primary +database, and returns the associated data item. +<p>If there were another secondary index that had as its key the +<b>cost</b> of the fruit, a similar lookup could be done on stores +where inexpensive fruit could be purchased: +<p><blockquote><pre><b>Secondary key:</b> <b>Secondary data:</b> +expensive blueberry +expensive peach +expensive pear +expensive strawberry +inexpensive apple +inexpensive pear +inexpensive raspberry</pre></blockquote> +<p>The <a href="../../api_c/db_join.html">DB->join</a> function provides logical join functionality. While not +strictly cursor functionality, in that it is not a method off a cursor +handle, it is more closely related to the cursor operations than to the +standard DB operations. +<p>It is also possible to do lookups based on multiple criteria in a single +operation, e.g., it is possible to look up fruits that are both red and +expensive in a single operation. If the same fruit appeared as a data +item in both the color and expense indexes, then that fruit name would +be used as the key for retrieval from the primary index, and would then +return the store where expensive, red fruit could be purchased. 
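<p>The multiple-criteria lookup described above can be modeled in plain C without calling Berkeley DB at all. The sketch below is only an illustration of the idea, under stated assumptions: the tables are the fruit examples from this page held in memory, and the names <b>struct pair</b>, <b>index_has</b>, <b>primary_get</b> and <b>join_color_cost</b> are hypothetical helpers, not part of the Berkeley DB API. It says nothing about how DB->join is implemented internally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One key/data pair, as it would be stored in a database or index. */
struct pair {
	const char *key;
	const char *data;
};

#define NELEM(a) (sizeof(a) / sizeof((a)[0]))

/* Primary database from the text: fruit -> store. */
static const struct pair primary[] = {
	{ "apple", "Convenience Store" }, { "blueberry", "Farmer's Market" },
	{ "peach", "Shopway" }, { "pear", "Farmer's Market" },
	{ "raspberry", "Shopway" }, { "strawberry", "Farmer's Market" },
};

/* Secondary index from the text: color -> fruit (duplicates allowed). */
static const struct pair color_index[] = {
	{ "blue", "blueberry" }, { "red", "apple" }, { "red", "raspberry" },
	{ "red", "strawberry" }, { "yellow", "peach" }, { "yellow", "pear" },
};

/* Secondary index from the text: cost -> fruit (duplicates allowed). */
static const struct pair cost_index[] = {
	{ "expensive", "blueberry" }, { "expensive", "peach" },
	{ "expensive", "pear" }, { "expensive", "strawberry" },
	{ "inexpensive", "apple" }, { "inexpensive", "pear" },
	{ "inexpensive", "raspberry" },
};

/* Return 1 if the index holds the pair (key, data), otherwise 0. */
static int
index_has(const struct pair *idx, size_t n, const char *key, const char *data)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(idx[i].key, key) == 0 &&
		    strcmp(idx[i].data, data) == 0)
			return (1);
	return (0);
}

/* Look up a primary key; return its data item, or NULL if absent. */
static const char *
primary_get(const char *key)
{
	size_t i;

	for (i = 0; i < NELEM(primary); i++)
		if (strcmp(primary[i].key, key) == 0)
			return (primary[i].data);
	return (NULL);
}

/*
 * Model of a join over the two secondary indexes: every fruit that
 * appears under both the given color and the given cost is used as a
 * key into the primary, and the store found there is saved in out[].
 * Returns the number of stores saved (at most max).
 */
static size_t
join_color_cost(const char *color, const char *cost,
    const char **out, size_t max)
{
	const char *store;
	size_t i, n;

	assert(out != NULL);
	for (i = 0, n = 0; i < NELEM(color_index) && n < max; i++) {
		if (strcmp(color_index[i].key, color) != 0)
			continue;
		if (!index_has(cost_index,
		    NELEM(cost_index), cost, color_index[i].data))
			continue;
		if ((store = primary_get(color_index[i].data)) != NULL)
			out[n++] = store;
	}
	return (n);
}
```

<p>With the tables above, calling join_color_cost("red", "expensive", ...) yields a single store, Farmer's Market, because strawberry is the only fruit listed under both keys.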
+<h3>Example</h3> +<p>Consider the following three databases: +<p><dl compact> +<p><dt>personnel<dd><ul type=disc> +<li>key = SSN +<li>data = record containing name, address, phone number, job title +</ul> +<p><dt>lastname<dd><ul type=disc> +<li>key = lastname +<li>data = SSN +</ul> +<p><dt>jobs<dd><ul type=disc> +<li>key = job title +<li>data = SSN +</ul> +</dl> +<p>Consider the following query: +<p><blockquote><pre>Return the personnel records of all people named smith with the job +title manager.</pre></blockquote> +<p>This query finds all the records in the primary database (personnel) +for which the criteria <b>lastname=smith and job title=manager</b> are +true. +<p>Assume that all databases have been properly opened and have the handles: +pers_db, name_db, job_db. We also assume that we have an active +transaction referenced by the handle txn. +<p><blockquote><pre>DBC *name_curs, *job_curs, *join_curs; +DBC *carray[3]; +DBT key, data; +int ret, tret; +<p> +/* Clear every cursor so the cleanup code can test each for NULL. */ +name_curs = NULL; +job_curs = NULL; +join_curs = NULL; +memset(&key, 0, sizeof(key)); +memset(&data, 0, sizeof(data)); +<p> +if ((ret = + name_db->cursor(name_db, txn, &name_curs)) != 0) + goto err; +key.data = "smith"; +key.size = sizeof("smith"); +if ((ret = + name_curs->c_get(name_curs, &key, &data, DB_SET)) != 0) + goto err; +<p> +if ((ret = job_db->cursor(job_db, txn, &job_curs)) != 0) + goto err; +key.data = "manager"; +key.size = sizeof("manager"); +if ((ret = + job_curs->c_get(job_curs, &key, &data, DB_SET)) != 0) + goto err; +<p> +carray[0] = name_curs; +carray[1] = job_curs; +carray[2] = NULL; +<p> +if ((ret = + pers_db->join(pers_db, carray, &join_curs, 0)) != 0) + goto err; +while ((ret = + join_curs->c_get(join_curs, &key, &data, 0)) == 0) { + /* Process record returned in key/data. */ +} +<p> +/* + * If we exited the loop because we ran out of records, + * then it has completed successfully. 
+ */ +if (ret == DB_NOTFOUND) + ret = 0; +<p> +err: +if (join_curs != NULL && + (tret = join_curs->c_close(join_curs)) != 0 && ret == 0) + ret = tret; +if (name_curs != NULL && + (tret = name_curs->c_close(name_curs)) != 0 && ret == 0) + ret = tret; +if (job_curs != NULL && + (tret = job_curs->c_close(job_curs)) != 0 && ret == 0) + ret = tret; +<p> +return (ret); +</pre></blockquote> +<p>The name cursor is positioned at the beginning of the duplicate list +for <b>smith</b> and the job cursor is placed at the beginning of +the duplicate list for <b>manager</b>. The join cursor is returned +from the logical join call. This code then loops over the join cursor +getting the personnel records of each one until there are no more. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/curdup.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/count.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/open.html b/db/docs/ref/am/open.html new file mode 100644 index 000000000..01c45339e --- /dev/null +++ b/db/docs/ref/am/open.html @@ -0,0 +1,47 @@ +<!--$Id: open.so,v 10.15 2000/12/18 21:05:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Opening a database</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> 
+<td width="1%"><a href="../../ref/am/ops.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/opensub.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Opening a database</h1> +<p>The <a href="../../api_c/db_open.html">DB->open</a> function is the standard interface for opening a database, +and takes five arguments: +<p><dl compact> +<p><dt>file<dd>The name of the file to be opened. +<p><dt>database<dd>An optional database name. +<p><dt>type<dd>The type of database to open. This value will be one of the four access +methods Berkeley DB supports: DB_BTREE, DB_HASH, DB_QUEUE or DB_RECNO, or the +special value DB_UNKNOWN, which allows you to open an existing file +without knowing its type. +<p><dt>flags<dd>A set of flags that customize the open; the flags are described below. +<p><dt>mode<dd>The permissions to give to any created file. +</dl> +<p>There are a few flags that you can set to customize open: +<p><dl compact> +<p><dt><a href="../../api_c/env_open.html#DB_CREATE">DB_CREATE</a><dd>Create the underlying database and any necessary physical files. +<p><dt><a href="../../api_c/env_open.html#DB_NOMMAP">DB_NOMMAP</a><dd>Do not map this database into process memory. +<p><dt><a href="../../api_c/db_open.html#DB_RDONLY">DB_RDONLY</a><dd>Treat the database as read-only. +<p><dt><a href="../../api_c/env_open.html#DB_THREAD">DB_THREAD</a><dd>The returned handle is free-threaded, that is, it can be used +simultaneously by multiple threads within the process. +<p><dt><a href="../../api_c/db_open.html#DB_TRUNCATE">DB_TRUNCATE</a><dd>Physically truncate the underlying database file, discarding all +databases it contained. Underlying filesystem primitives are used to +implement this flag. For this reason it is only applicable to the +physical file and cannot be used to discard individual databases from +within physical files. 
+<p><dt><a href="../../api_c/db_set_feedback.html#DB_UPGRADE">DB_UPGRADE</a><dd>Upgrade the database format as necessary. +</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/ops.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/opensub.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/opensub.html b/db/docs/ref/am/opensub.html new file mode 100644 index 000000000..066ca4b79 --- /dev/null +++ b/db/docs/ref/am/opensub.html @@ -0,0 +1,64 @@ +<!--$Id: opensub.so,v 10.6 2000/12/18 21:05:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Opening multiple databases in a single file</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/open.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/upgrade.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Opening multiple databases in a single file</h1> +<p>Applications may create multiple databases within a single physical +file. 
This is useful when the databases are both numerous and +reasonably small, in order to avoid creating a large number of +underlying files, or when it is desirable to include secondary index +databases in the same file as the primary index database. Multiple +databases are an administrative convenience and using them is unlikely +to affect database performance. To open or create a file that will +include more than a single database, specify a database name when +calling the <a href="../../api_c/db_open.html">DB->open</a> method. +<p>Physical files do not need to be comprised of a single type of database, +and databases in a file may be of any type (e.g., Btree, Hash or Recno), +except for Queue databases. Queue databases must be created one per file +and cannot share a file with any other database type. There is no limit +on the number of databases that may be created in a single file other than +the standard Berkeley DB file size and disk space limitations. +<p>It is an error to attempt to open a second database in a file that was +not initially created using a database name, that is, the file must +initially be specified as capable of containing multiple databases for a +second database to be created in it. +<p>It is not an error to open a file that contains multiple databases without +specifying a database name; however, the database type should be specified +as DB_UNKNOWN and the database must be opened read-only. The handle that +is returned from such a call is a handle on a database whose key values +are the names of the databases stored in the database file and whose data +values are opaque objects. No keys or data values may be modified or +stored using this database handle. +<p>Storing multiple databases in a single file is almost identical to +storing each database in its own separate file. The one crucial +difference is how locking and the underlying memory pool services must +be configured. 
As an example, consider two databases instantiated +in two different physical files. If access to each separate database +is single-threaded, there is no reason to perform any locking of any +kind, and the two databases may be read and written simultaneously. +Further, there would be no requirement to create a shared database +environment in which to open the databases. Because multiple databases +in a file exist in a single physical file, opening two databases in the +same file requires that locking be enabled, unless access to the +databases is known to be single-threaded, that is, only one of the +databases is ever accessed at a time. (As the locks for the two +databases can only conflict during page allocation, this additional +locking is unlikely to affect performance.) Further, the databases must +share an underlying memory pool so that per-physical-file information +is updated correctly. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/open.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/upgrade.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/ops.html b/db/docs/ref/am/ops.html new file mode 100644 index 000000000..5daaddd74 --- /dev/null +++ b/db/docs/ref/am/ops.html @@ -0,0 +1,36 @@ +<!--$Id: ops.so,v 10.16 2000/12/18 21:05:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Access method operations</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access 
methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/renumber.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/open.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Access method operations</h1> +<p>Once a database handle has been created using <a href="../../api_c/db_create.html">db_create</a>, there +are several standard access method operations. Each of these operations +is performed using a method that is referenced from the returned handle. +The operations are as follows: +<p><dl compact> +<p><dt><a href="../../api_c/db_close.html">DB->close</a><dd>Close the database +<dt><a href="../../api_c/db_cursor.html">DB->cursor</a><dd>Open a cursor into the database +<dt><a href="../../api_c/db_del.html">DB->del</a><dd>Delete a record +<dt><a href="../../api_c/db_get.html">DB->get</a><dd>Retrieve a record +<dt><a href="../../api_c/db_open.html">DB->open</a><dd>Open a database +<dt><a href="../../api_c/db_put.html">DB->put</a><dd>Store a record +<dt><a href="../../api_c/db_stat.html">DB->stat</a><dd>Return statistics about the database +<dt><a href="../../api_c/db_sync.html">DB->sync</a><dd>Flush the underlying cache +<dt><a href="../../api_c/db_upgrade.html">DB->upgrade</a><dd>Upgrade a database +</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/renumber.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/open.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/partial.html 
b/db/docs/ref/am/partial.html new file mode 100644 index 000000000..7f3af8f68 --- /dev/null +++ b/db/docs/ref/am/partial.html @@ -0,0 +1,134 @@ +<!--$Id: partial.so,v 10.18 2000/12/18 21:05:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Partial record storage and retrieval</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/stability.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/verify.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Partial record storage and retrieval</h1> +<p>It is possible to both store and retrieve parts of data items in all +Berkeley DB access methods. This is done by setting the +<a href="../../api_c/dbt.html#DB_DBT_PARTIAL">DB_DBT_PARTIAL</a> flag in the <a href="../../api_c/dbt.html">DBT</a> structure passed to the +Berkeley DB interface. +<p>The <a href="../../api_c/dbt.html#DB_DBT_PARTIAL">DB_DBT_PARTIAL</a> flag is based on the values of two fields +of the <a href="../../api_c/dbt.html">DBT</a> structure, <b>dlen</b> and <b>doff</b>. The value +of <b>dlen</b> is the number of bytes of the record in which the +application is interested. The value of <b>doff</b> is the offset from +the beginning of the data item where those bytes start. 
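<p>The meaning of <b>doff</b> and <b>dlen</b> can be modeled on an ordinary byte buffer. The sketch below does not call Berkeley DB; <b>partial_get</b> and <b>partial_put</b> are hypothetical helper names used only for illustration, and the code is an assumption-laden model of the behavior this page describes, not the library's implementation. As a simplification, the retrieval model requires the requested range to lie inside the record.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Model of a partial retrieval: copy the dlen bytes that start doff
 * bytes from the beginning of the record into out, and return the
 * number of bytes copied.  Simplification: the whole range must exist.
 */
static size_t
partial_get(const char *rec, size_t reclen,
    size_t doff, size_t dlen, char *out)
{
	assert(doff + dlen <= reclen);
	memcpy(out, rec + doff, dlen);
	return (dlen);
}

/*
 * Model of a partial store: the dlen bytes starting at doff are
 * replaced by the size bytes in data.  If doff lies past the end of
 * the record, the gap is filled with nul bytes.  Returns a newly
 * allocated record (the caller frees it) and stores its length in
 * *newlenp, or returns NULL if memory cannot be allocated.
 */
static char *
partial_put(const char *rec, size_t reclen, size_t doff, size_t dlen,
    const char *data, size_t size, size_t *newlenp)
{
	size_t head, pad, tail_off, tail_len, newlen;
	char *out;

	/* Bytes of the old record kept in front of the replaced range. */
	head = doff < reclen ? doff : reclen;
	/* Nul padding needed when doff is beyond the current end. */
	pad = doff > reclen ? doff - reclen : 0;
	/* Bytes of the old record kept after the replaced range. */
	tail_off = doff + dlen < reclen ? doff + dlen : reclen;
	tail_len = reclen - tail_off;
	newlen = head + pad + size + tail_len;

	if ((out = malloc(newlen + 1)) == NULL)
		return (NULL);
	memcpy(out, rec, head);
	memset(out + head, 0, pad);
	memcpy(out + head + pad, data, size);
	memcpy(out + head + pad + size, rec + tail_off, tail_len);
	out[newlen] = '\0';	/* Convenience terminator, not counted. */

	*newlenp = newlen;
	return (out);
}
```

<p>In this model the record grows when <b>size</b> is larger than <b>dlen</b> and shrinks when it is smaller, which matches the store examples listed later on this page; for instance, replacing the 5 bytes at offset 10 of ABCDEFGHIJ0123456789 with abcdefghij gives ABCDEFGHIJabcdefghij56789.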
+<p>For example, if the data item were <b>ABCDEFGHIJKL</b>, a <b>doff</b> +value of 3 would indicate that the bytes of interest started at +<b>D</b>, and a <b>dlen</b> value of 4 would indicate that the bytes +of interest were <b>DEFG</b>. +<p>When retrieving a data item from a database, the <b>dlen</b> bytes +starting <b>doff</b> bytes from the beginning of the record are +returned, as if they comprised the entire record. If any or all of the +specified bytes do not exist in the record, the retrieval is still +successful and any existing bytes (and nul bytes for any non-existent +bytes) are returned. +<p>When storing a data item into the database, the <b>dlen</b> bytes +starting <b>doff</b> bytes from the beginning of the specified key's +data record are replaced by the data specified by the <b>data</b> and +<b>size</b> fields. If <b>dlen</b> is smaller than <b>size</b>, the +record will grow, and if <b>dlen</b> is larger than <b>size</b>, the +record will shrink. If the specified bytes do not exist, the record will +be extended using nul bytes as necessary, and the store call will still +succeed. +<p>The following are various examples of the put case for the +<a href="../../api_c/dbt.html#DB_DBT_PARTIAL">DB_DBT_PARTIAL</a> flag. In all examples, the initial data item is 20 +bytes in length: +<p><b>ABCDEFGHIJ0123456789</b> +<p><ol> +<p><li><p><blockquote><pre>size = 20 +doff = 0 +dlen = 20 +data = abcdefghijabcdefghij +<p> +Result: The 20 bytes at offset 0 are replaced by the 20 bytes of data, +i.e., the entire record is replaced. +<p> +ABCDEFGHIJ0123456789 -> abcdefghijabcdefghij +</pre></blockquote> +<p><li><p><blockquote><pre>size = 10 +doff = 20 +dlen = 0 +data = abcdefghij +<p> +Result: The 0 bytes at offset 20 are replaced by the 10 bytes of data, +i.e., the record is extended by 10 bytes. 
+<p> +ABCDEFGHIJ0123456789 -> ABCDEFGHIJ0123456789abcdefghij +</pre></blockquote> +<p><li><p><blockquote><pre>size = 10 +doff = 10 +dlen = 5 +data = abcdefghij +<p> +Result: The 5 bytes at offset 10 are replaced by the 10 bytes of data. +<p> +ABCDEFGHIJ0123456789 -> ABCDEFGHIJabcdefghij56789 +</pre></blockquote> +<p><li><p><blockquote><pre>size = 10 +doff = 10 +dlen = 0 +data = abcdefghij +<p> +Result: The 0 bytes at offset 10 are replaced by the 10 bytes of data, +i.e., 10 bytes are inserted into the record. +<p> +ABCDEFGHIJ0123456789 -> ABCDEFGHIJabcdefghij0123456789 +</pre></blockquote> +<p><li><p><blockquote><pre>size = 10 +doff = 2 +dlen = 15 +data = abcdefghij +<p> +Result: The 15 bytes at offset 2 are replaced by the 10 bytes of data. +<p> +ABCDEFGHIJ0123456789 -> ABabcdefghij789 +</pre></blockquote> +<p><li><p><blockquote><pre>size = 10 +doff = 0 +dlen = 0 +data = abcdefghij +<p> +Result: The 0 bytes at offset 0 are replaced by the 10 bytes of data, +i.e., the 10 bytes are inserted at the beginning of the record. +<p> +ABCDEFGHIJ0123456789 -> abcdefghijABCDEFGHIJ0123456789 +</pre></blockquote> +<p><li><p><blockquote><pre>size = 0 +doff = 0 +dlen = 10 +data = "" +<p> +Result: The 10 bytes at offset 0 are replaced by the 0 bytes of data, +i.e., the first 10 bytes of the record are discarded. +<p> +ABCDEFGHIJ0123456789 -> 0123456789 +</pre></blockquote> +<p><li><p><blockquote><pre>size = 10 +doff = 25 +dlen = 0 +data = abcdefghij +<p> +Result: The 0 bytes at offset 25 are replaced by the 10 bytes of data, +i.e., 10 bytes are inserted into the record past the end of the current +data (\0 represents a nul byte). 
+<p> +ABCDEFGHIJ0123456789 -> ABCDEFGHIJ0123456789\0\0\0\0\0abcdefghij +</pre></blockquote> +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/stability.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/verify.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/put.html b/db/docs/ref/am/put.html new file mode 100644 index 000000000..993dcbeb0 --- /dev/null +++ b/db/docs/ref/am/put.html @@ -0,0 +1,36 @@ +<!--$Id: put.so,v 10.14 2000/03/18 21:43:08 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Storing records</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/get.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/delete.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Storing records</h1> +<p>The <a href="../../api_c/db_put.html">DB->put</a> function is the standard interface for storing records into +the database. In general, <a href="../../api_c/db_put.html">DB->put</a> takes a key and stores the +associated data into the database. 
+<p>There are a few flags that you can set to customize storage: +<p><dl compact> +<p><dt><a href="../../api_c/db_put.html#DB_APPEND">DB_APPEND</a><dd>Simply append the data to the end of the database, treating the database +much like a simple log. This flag is only valid for the Queue and Recno +access methods. +<p><dt><a href="../../api_c/db_put.html#DB_NOOVERWRITE">DB_NOOVERWRITE</a><dd>Only store the data item if the key does not already appear in the database. +</dl> +<p>If the database has been configured to support duplicate records, the +<a href="../../api_c/db_put.html">DB->put</a> function will add the new data value at the end of the duplicate +set. If the database supports sorted duplicates, the new data value is +inserted at the correct sorted location. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/get.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/delete.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/stability.html b/db/docs/ref/am/stability.html new file mode 100644 index 000000000..b5f6d2386 --- /dev/null +++ b/db/docs/ref/am/stability.html @@ -0,0 +1,49 @@ +<!--$Id: stability.so,v 10.20 2000/12/13 16:48:13 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Cursor Stability</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> +<table><tr 
valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/am/curclose.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/partial.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Cursor Stability</h1> +<p>In the absence of locking, no guarantees are made about the stability of +cursors in different processes or threads. However, the Btree, Queue +and Recno access methods guarantee that cursor operations, interspersed +with other cursor or non-cursor operations in the same thread of control +will always return keys in order and will return each non-deleted key/data +pair exactly once. Because the Hash access method uses a dynamic hashing +algorithm, it cannot guarantee any form of stability in the presence of +inserts and deletes unless locking is performed. +<p>If locking was specified when the Berkeley DB file was opened, but transactions +are not in effect, the access methods provide repeatable reads with +respect to the cursor. That is, a <a href="../../api_c/dbc_get.html#DB_CURRENT">DB_CURRENT</a> call on the cursor +is guaranteed to return the same record as was returned on the last call +to the cursor. +<p>With the exception of the Queue access method, in the presence of +transactions, all access method calls between a call to <a href="../../api_c/txn_begin.html">txn_begin</a> +and a call to <a href="../../api_c/txn_abort.html">txn_abort</a> or <a href="../../api_c/txn_commit.html">txn_commit</a> provide degree 3 +consistency (serializable transactions). +<p>The Queue access method permits phantom records to appear between calls. +That is, deleted records are not locked, therefore another transaction may +replace a deleted record between two calls to retrieve it. 
The record would +not appear in the first call but would be seen by the second call. +<p>For all access methods, a cursor scan of the database performed within +the context of a transaction is guaranteed to return each key/data pair +once and only once, except in the following case. If, while performing +a cursor scan using the Hash access method, the transaction performing +the scan inserts a new pair into the database, it is possible that duplicate +key/data pairs will be returned. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/curclose.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/partial.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/stat.html b/db/docs/ref/am/stat.html new file mode 100644 index 000000000..3042ccfee --- /dev/null +++ b/db/docs/ref/am/stat.html @@ -0,0 +1,36 @@ +<!--$Id: stat.so,v 10.17 2000/12/18 21:05:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Database statistics</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/sync.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/close.html"><img src="../../images/next.gif" 
alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Database statistics</h1> +<p>The <a href="../../api_c/db_stat.html">DB->stat</a> function is the standard interface for obtaining database +statistics. Generally, <a href="../../api_c/db_stat.html">DB->stat</a> returns a set of statistics +about the underlying database, e.g., the number of key/data pairs in +the database, how the database was originally configured, and so on. +<p>There are two flags that you can set to customize the returned statistics: +<p><dl compact> +<p><dt><a href="../../api_c/db_stat.html#DB_CACHED_COUNTS">DB_CACHED_COUNTS</a><dd>Request an approximate key and key/data pair count. As obtaining an +exact count can be very performance intensive for large databases, +it is possible to request a previously cached count. Obviously, the +cached count is only an approximate count, and may be out-of-date. +<p><dt><a href="../../api_c/db_stat.html#DB_RECORDCOUNT">DB_RECORDCOUNT</a><dd>If the database is a Queue or Recno database, or a Btree database that +was configured so that it is possible to search it by logical record +number, return only a count of the records in the database. 
+</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/sync.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/close.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/sync.html b/db/docs/ref/am/sync.html new file mode 100644 index 000000000..3d1d61e62 --- /dev/null +++ b/db/docs/ref/am/sync.html @@ -0,0 +1,38 @@ +<!--$Id: sync.so,v 10.15 2000/12/18 21:05:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Flushing the database cache</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/delete.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/stat.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Flushing the database cache</h1> +<p>The <a href="../../api_c/db_sync.html">DB->sync</a> function is the standard interface for flushing all modified +records from the database cache to disk. 
+<p><b>It is important to understand that flushing cached information +to disk only minimizes the window of opportunity for corrupted data, it +does not eliminate the possibility.</b> +<p>While unlikely, it is possible for database corruption to happen if a +system or application crash occurs while writing data to the database. To +ensure that database corruption never occurs, applications must either: +<ul type=disc> +<li>Use transactions and logging with automatic recovery. +<li>Use logging and application-specific recovery. +<li>Edit a copy of the database, and, once all applications +using the database have successfully called <a href="../../api_c/db_close.html">DB->close</a>, use +system operations (e.g., the POSIX rename system call) to atomically +replace the original database with the updated copy. +</ul> +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/delete.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/stat.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/upgrade.html b/db/docs/ref/am/upgrade.html new file mode 100644 index 000000000..21fa87a1e --- /dev/null +++ b/db/docs/ref/am/upgrade.html @@ -0,0 +1,50 @@ +<!--$Id: upgrade.so,v 10.14 2000/12/21 18:37:08 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Upgrading databases</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> 
+<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Method Operations</dl></h3></td> +<td width="1%"><a href="../../ref/am/opensub.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/get.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Upgrading databases</h1> +<p>When upgrading to a new release of Berkeley DB, it may be necessary to upgrade +the on-disk format of already-created database files. <b>Berkeley DB +database upgrades are done in place, and so are potentially +destructive.</b> This means that if the system crashes during the upgrade +procedure, or if the upgrade procedure runs out of disk space, the +databases may be left in an inconsistent and unrecoverable state. To +guard against failure, the procedures outlined in +<a href="../../ref/upgrade/process.html">Upgrading Berkeley DB installations</a> +should be carefully followed. If you are not performing catastrophic +archival as part of your application upgrade process, you should at +least copy your database to archival media, verify that your archival +media is error-free and readable, and that copies of your backups are +stored off-site! +<p>The actual database upgrade is done using the <a href="../../api_c/db_upgrade.html">DB->upgrade</a> +method, or by dumping the database using the old version of the Berkeley DB +software and reloading it using the current version. +<p>After an upgrade, Berkeley DB applications must be recompiled to use the new +Berkeley DB library before they can access an upgraded database. +<b>There is no guarantee that applications compiled against +previous releases of Berkeley DB will work correctly with an upgraded database +format. 
Nor is there any guarantee that applications compiled against +newer releases of Berkeley DB will work correctly with the previous database +format.</b> We do guarantee that any archived database may be upgraded +using a current Berkeley DB software release and the <a href="../../api_c/db_upgrade.html">DB->upgrade</a> +method, and there is no need to step-wise upgrade the database using +intermediate releases of Berkeley DB. Sites should consider archiving +appropriate copies of their application or application sources if they +may need to access archived databases without first upgrading them. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/opensub.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/get.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am/verify.html b/db/docs/ref/am/verify.html new file mode 100644 index 000000000..5c975dd8c --- /dev/null +++ b/db/docs/ref/am/verify.html @@ -0,0 +1,50 @@ +<!--$Id: verify.so,v 10.3 2000/12/18 21:05:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Database verification and salvage</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> <a name="4"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am/partial.html"><img 
src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/error.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Database verification and salvage</h1> +<p>The <a href="../../api_c/db_verify.html">DB->verify</a> method is the standard interface for verifying +that a file, and any databases it may contain, are uncorrupted. In +addition, the method may optionally be called with a file stream +argument to which all key/data pairs found in the database are output. +There are two modes for finding key/data pairs to be output: +<p><ol> +<p><li>If the <a href="../../api_c/db_verify.html#DB_SALVAGE">DB_SALVAGE</a> flag is specified, the key/data pairs in the +database are output. When run in this mode, the database is assumed to +be largely uncorrupted. For example, the <a href="../../api_c/db_verify.html">DB->verify</a> method will +search for pages that are no longer linked into the database, and will +output key/data pairs from such pages. However, key/data items that +have been marked as deleted in the database will not be output, as the +page structures are generally trusted in this mode. +<p><li>If both the <a href="../../api_c/db_verify.html#DB_SALVAGE">DB_SALVAGE</a> and <a href="../../api_c/db_verify.html#DB_AGGRESSIVE">DB_AGGRESSIVE</a> flags are +specified, all possible key/data pairs are output. When run in this mode, +the database is assumed to be seriously corrupted. For example, key/data +pairs that have been deleted will re-appear in the output. In addition, +because pages may have been subsequently re-used and modified during +normal database operations after the key/data pairs were deleted, it is +not uncommon for apparently corrupted key/data pairs to be output in this +mode, even when there is no corruption in the underlying database. 
The +output will almost always have to be edited by hand or other means before +the data is ready for re-load into another database. We recommend that +<a href="../../api_c/db_verify.html#DB_SALVAGE">DB_SALVAGE</a> be tried first, and <a href="../../api_c/db_verify.html#DB_AGGRESSIVE">DB_AGGRESSIVE</a> only tried +if the output from that first attempt is obviously missing data items or +the data is sufficiently valuable that human review of the output is +preferable to any kind of data loss. +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/am/partial.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/error.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/bt_compare.html b/db/docs/ref/am_conf/bt_compare.html new file mode 100644 index 000000000..bf824ca35 --- /dev/null +++ b/db/docs/ref/am_conf/bt_compare.html @@ -0,0 +1,85 @@ +<!--$Id: bt_compare.so,v 10.20 2000/09/10 13:42:12 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Btree comparison</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/malloc.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a 
href="../../ref/am_conf/bt_prefix.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Btree comparison</h1> +<p>The Btree data structure is a sorted, balanced tree structure storing +associated key/data pairs. By default, the sort order is lexicographical, +with shorter keys collating before longer keys. The user can specify the +sort order for the Btree by using the <a href="../../api_c/db_set_bt_compare.html">DB->set_bt_compare</a> function. +<p>Sort routines are passed pointers to keys as arguments. The keys are +represented as <a href="../../api_c/dbt.html">DBT</a> structures. The routine must return an integer +less than, equal to, or greater than zero if the first argument is +considered to be respectively less than, equal to, or greater than the +second argument. The only fields that the routines may examine in the +<a href="../../api_c/dbt.html">DBT</a> structures are the <b>data</b> and <b>size</b> fields. +<p>An example routine that might be used to sort integer keys in the database +is as follows: +<p><blockquote><pre>int +compare_int(dbp, a, b) + DB *dbp; + const DBT *a, *b; +{ + int ai, bi; +<p> + /* + * Returns: + * < 0 if a < b + * = 0 if a = b + * > 0 if a > b + */ + memcpy(&ai, a->data, sizeof(int)); + memcpy(&bi, b->data, sizeof(int)); + return (ai < bi ? -1 : (ai > bi ? 1 : 0)); +} +</pre></blockquote> +<p>Note that the data must first be copied into memory that is appropriately +aligned, as Berkeley DB does not guarantee any kind of alignment of the +underlying data, including for comparison routines. When writing +comparison routines, remember that databases created on machines of +different architectures may have different integer byte orders, for which +your code may need to compensate. 
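<p>As a minimal sketch of compensating for byte order, the routine below
assumes the application stores each key as a 4-byte unsigned integer in
big-endian order, so that the comparison gives the same answer on any
architecture. The <b>key_view</b> structure is a hypothetical stand-in for
the two DBT fields a comparison routine may examine, and the names
<b>decode_be32</b> and <b>compare_be32</b> are illustrative only; none of
them are part of Berkeley DB.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the data and size fields of a DBT. */
struct key_view {
	const void *data;
	uint32_t size;
};

/*
 * Decode a 4-byte big-endian unsigned integer. The bytes are copied
 * first because the key memory may not be aligned.
 */
static uint32_t
decode_be32(const void *p)
{
	unsigned char b[4];

	memcpy(b, p, sizeof(b));
	return (((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	    ((uint32_t)b[2] << 8) | (uint32_t)b[3]);
}

/*
 * Assumes both keys are exactly 4 bytes long (an application choice).
 * Returns < 0, 0 or > 0, comparing without subtraction so that large
 * values cannot overflow.
 */
static int
compare_be32(const struct key_view *a, const struct key_view *b)
{
	uint32_t ai = decode_be32(a->data);
	uint32_t bi = decode_be32(b->data);

	return ((ai > bi) - (ai < bi));
}
```

<p>Because the keys are decoded byte by byte, the result does not depend on
the byte order of the machine running the comparison.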
+<p>An example routine that might be used to sort keys based on the first +five bytes of the key (ignoring any subsequent bytes) is as follows: +<p><blockquote><pre>int +compare_dbt(dbp, a, b) + DB *dbp; + const DBT *a, *b; +{ + u_char *p1, *p2; + size_t len; +<p> + /* + * Returns: + * < 0 if a < b + * = 0 if a = b + * > 0 if a > b + */ + for (p1 = a->data, p2 = b->data, len = 5; len--; ++p1, ++p2) + if (*p1 != *p2) + return ((int)*p1 - (int)*p2); + return (0); +}</pre></blockquote> +<p>All comparison functions must cause the keys in the database to be +well-ordered. The most important implication of being well-ordered is +that the key relations must be transitive, that is, if key A is less +than key B, and key B is less than key C, then the comparison routine +must also return that key A is less than key C. In addition, comparisons +will only be able to return 0 when comparing full length keys; partial +key comparisons must always return a result less than or greater than 0. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/malloc.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/bt_prefix.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/bt_minkey.html b/db/docs/ref/am_conf/bt_minkey.html new file mode 100644 index 000000000..f80ecf1df --- /dev/null +++ b/db/docs/ref/am_conf/bt_minkey.html @@ -0,0 +1,53 @@ +<!--$Id: bt_minkey.so,v 10.14 2000/03/18 21:43:08 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Minimum keys per page</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/bt_prefix.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/bt_recnum.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Minimum keys per page</h1> +<p>The number of keys stored on each page affects the size of a Btree and +how it is maintained. Therefore, it also affects the retrieval and search +performance of the tree. For each Btree, Berkeley DB computes a maximum key +and data size. This size is a function of the page size and the fact that +at least two key/data pairs must fit on any Btree page. Whenever key or +data items exceed the calculated size, they are stored on overflow pages +instead of in the standard Btree leaf pages. +<p>Applications may use the <a href="../../api_c/db_set_bt_minkey.html">DB->set_bt_minkey</a> function to change the minimum +number of keys that must fit on a Btree page from two to another value. +Altering this value in turn alters the on-page maximum size, and can be +used to force key and data items which would normally be stored in the +Btree leaf pages onto overflow pages. +<p>Some data sets can benefit from this tuning. For example, consider an +application using large page sizes, with a data set almost entirely +consisting of small key and data items, but with a few large items. By +setting the minimum number of keys that must fit on a page, the +application can force the outsized items to be stored on overflow pages. +That in turn can potentially keep the tree more compact, that is, with +fewer internal levels to traverse during searches. 
+<p>The following calculation is similar to the one performed by the Btree +implementation. (The <b>minimum_keys</b> value is multiplied by 2 +because each key/data pair requires 2 slots on a Btree page.) +<p><blockquote><pre>maximum_size = page_size / (minimum_keys * 2)</pre></blockquote> +<p>Using this calculation, if the page size is 8KB and the default +<b>minimum_keys</b> value of 2 is used, then any key or data items +larger than 2KB will be forced to an overflow page. If an application +were to specify a <b>minimum_keys</b> value of 100, then any key or data +items larger than roughly 40 bytes would be forced to overflow pages. +<p>It is important to remember that accesses to overflow pages do not perform +as well as accesses to the standard Btree leaf pages, and so setting the +value incorrectly can result in overusing overflow pages and decreasing +the application's overall performance. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/bt_prefix.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/bt_recnum.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/bt_prefix.html b/db/docs/ref/am_conf/bt_prefix.html new file mode 100644 index 000000000..621de75fa --- /dev/null +++ b/db/docs/ref/am_conf/bt_prefix.html @@ -0,0 +1,66 @@ +<!--$Id: bt_prefix.so,v 10.17 2000/07/04 18:28:27 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Btree prefix comparison</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/bt_compare.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/bt_minkey.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Btree prefix comparison</h1> +<p>The Berkeley DB Btree implementation maximizes the number of keys that can be +stored on an internal page by storing only as many bytes of each key as +are necessary to distinguish it from adjacent keys. The prefix comparison +routine is what determines this minimum number of bytes (i.e., the length +of the unique prefix), that must be stored. A prefix comparison function +for the Btree can be specified by calling <a href="../../api_c/db_set_bt_prefix.html">DB->set_bt_prefix</a>. +<p>The prefix comparison routine must be compatible with the overall +comparison function of the Btree, since what distinguishes any two keys +depends entirely on the function used to compare them. This means that +if a prefix comparison routine is specified by the application, a +compatible overall comparison routine must also have been specified. +<p>Prefix comparison routines are passed pointers to keys as arguments. The +keys are represented as <a href="../../api_c/dbt.html">DBT</a> structures. The prefix comparison +function must return the number of bytes of the second key argument that +are necessary to determine if it is greater than the first key argument. +If the keys are equal, the length of the second key should be returned. 
+The only fields that the routines may examine in the <a href="../../api_c/dbt.html">DBT</a> +structures are <b>data</b> and <b>size</b> fields. +<p>An example prefix comparison routine follows: +<p><blockquote><pre>u_int32_t +compare_prefix(dbp, a, b) + DB *dbp; + const DBT *a, *b; +{ + size_t cnt, len; + u_int8_t *p1, *p2; +<p> + cnt = 1; + len = a->size > b->size ? b->size : a->size; + for (p1 = + a->data, p2 = b->data; len--; ++p1, ++p2, ++cnt) + if (*p1 != *p2) + return (cnt); + /* + * They match up to the smaller of the two sizes. + * Collate the longer after the shorter. + */ + if (a->size < b->size) + return (a->size + 1); + if (b->size < a->size) + return (b->size + 1); + return (b->size); +}</pre></blockquote> +<p>The usefulness of this functionality is data dependent, but in some data +sets can produce significantly reduced tree sizes and faster search times. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/bt_compare.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/bt_minkey.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/bt_recnum.html b/db/docs/ref/am_conf/bt_recnum.html new file mode 100644 index 000000000..cdf8970e5 --- /dev/null +++ b/db/docs/ref/am_conf/bt_recnum.html @@ -0,0 +1,34 @@ +<!--$Id: bt_recnum.so,v 10.18 2000/12/04 18:05:41 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Retrieving Btree records by number</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/bt_minkey.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/h_ffactor.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Retrieving Btree records by number</h1> +<p>The Btree access method optionally supports retrieval by logical record +numbers. To configure a Btree to support record numbers, call the +<a href="../../api_c/db_set_flags.html">DB->set_flags</a> function with the <a href="../../api_c/db_set_flags.html#DB_RECNUM">DB_RECNUM</a> flag. +<p>Configuring a Btree for record numbers should not be done lightly. +While often useful, it requires that storing items into the database +be single-threaded, which can severely impact application throughput. +Generally it should be avoided in trees with a need for high write +concurrency. +<p>To determine a key's record number, use the <a href="../../api_c/dbc_get.html#DB_GET_RECNO">DB_GET_RECNO</a> flag +to the <a href="../../api_c/dbc_get.html">DBcursor->c_get</a> function. +<p>To retrieve by record number, use the <a href="../../api_c/db_get.html#DB_SET_RECNO">DB_SET_RECNO</a> flag to the +<a href="../../api_c/db_get.html">DB->get</a> and <a href="../../api_c/dbc_get.html">DBcursor->c_get</a> functions. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/bt_minkey.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/h_ffactor.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/byteorder.html b/db/docs/ref/am_conf/byteorder.html new file mode 100644 index 000000000..e0eef8a45 --- /dev/null +++ b/db/docs/ref/am_conf/byteorder.html @@ -0,0 +1,38 @@ +<!--$Id: byteorder.so,v 10.16 2000/03/18 21:43:08 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Selecting a byte order</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/cachesize.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/dup.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Selecting a byte order</h1> +<p>The database files created by Berkeley DB can be created in either little- or +big-endian formats. +<p>The byte order used for the underlying database can be specified by +calling the <a href="../../api_c/db_set_lorder.html">DB->set_lorder</a> function. 
If no order is selected, the +native format of the machine on which the database is created will be +used. +<p>Berkeley DB databases are architecture independent, and any format database can +be used on a machine with a different native format. In this case, as +each page that is read into or written from the cache must be converted +to or from the host format, databases with non-native formats will +incur a performance penalty for the run-time conversion. +<p><b>It is important to note that the Berkeley DB access methods do no data +conversion for application specified data. Key/data pairs written on a +little-endian format architecture will be returned to the application +exactly as they were written when retrieved on a big-endian format +architecture.</b> +<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/cachesize.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/dup.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/cachesize.html b/db/docs/ref/am_conf/cachesize.html new file mode 100644 index 000000000..d0534767f --- /dev/null +++ b/db/docs/ref/am_conf/cachesize.html @@ -0,0 +1,86 @@ +<!--$Id: cachesize.so,v 10.18 2000/03/18 21:43:08 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Selecting a cache size</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr 
valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/pagesize.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/byteorder.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Selecting a cache size</h1> +<p>The size of the cache used for the underlying database can be specified +by calling the <a href="../../api_c/db_set_cachesize.html">DB->set_cachesize</a> function. +Choosing a cache size is, unfortunately, an art. Your cache must be at +least large enough for your working set plus some overlap for unexpected +situations. +<p>When using the Btree access method, you must have a cache big enough for +the minimum working set for a single access. This will include a root +page, one or more internal pages (depending on the depth of your tree), +and a leaf page. If your cache is any smaller than that, each new page +will force out the least-recently-used page, and Berkeley DB will re-read the +root page of the tree anew on each database request. +<p>If your keys are of moderate size (a few tens of bytes) and your pages +are on the order of 4K to 8K, most Btree applications +will be only three levels. For +example, using 20-byte keys with 20 bytes of data associated with each +key, an 8KB page can hold roughly 400 keys and 200 key/data pairs, so a +fully populated three-level Btree will hold 32 million key/data pairs, +and a tree with only a 50% page-fill factor will still hold 16 million +key/data pairs. We rarely expect trees to exceed five levels, although +Berkeley DB will support trees up to 255 levels. +<p>The rule-of-thumb is that cache is good, and more cache is better. +Generally, applications benefit from increasing the cache size up to a +point, at which the performance will stop improving as the cache size +increases. 
When this point is reached, one of two things has happened: +either the cache is large enough that the application is almost never +having to retrieve information from disk, or your application is doing +truly random accesses, and therefore increasing the size of the cache doesn't +significantly increase the odds of finding the next requested information +in the cache. The latter is fairly rare -- almost all applications show +some form of locality of reference. +<p>That said, it is important not to increase your cache size beyond the +capabilities of your system, as that will result in reduced performance. +Under many operating systems, tying down enough virtual memory will cause +your memory and potentially your program to be swapped. This is +especially likely on systems without unified OS buffer caches and virtual +memory spaces, as the buffer cache was allocated at boot time and so +cannot be adjusted based on application requests for large amounts of +virtual memory. +<p>For example, even if accesses are truly random within a Btree, your +access pattern will favor internal pages over leaf pages, so your cache +should be large enough to hold all internal pages. In the steady state, +this requires at most one I/O per operation to retrieve the appropriate +leaf page. +<p>You can use the <a href="../../utility/db_stat.html">db_stat</a> utility to monitor the effectiveness of +your cache. The following output is excerpted from the output of that +utility's <b>-m</b> option: +<p><blockquote><pre>prompt: db_stat -m +131072 Cache size (128K). +4273 Requested pages found in the cache (97%). +134 Requested pages not found in the cache. +18 Pages created in the cache. +116 Pages read into the cache. +93 Pages written from the cache to the backing file. +5 Clean pages forced from the cache. +13 Dirty pages forced from the cache. +0 Dirty buffers written by trickle-sync thread. +130 Current clean buffer count. +4 Current dirty buffer count. 
+</pre></blockquote> +<p>The statistics for this cache say that there have been 4,407 requests of +the cache (4,273 found in the cache plus 134 not found), and only 116 of +those requests required an I/O from disk. This +means that the cache is working well, yielding a 97% cache hit rate. The +<a href="../../utility/db_stat.html">db_stat</a> utility will present these statistics both for the cache +as a whole and for each file within the cache separately. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/pagesize.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/byteorder.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/dup.html b/db/docs/ref/am_conf/dup.html new file mode 100644 index 000000000..eec5302cb --- /dev/null +++ b/db/docs/ref/am_conf/dup.html @@ -0,0 +1,71 @@ +<!--$Id: dup.so,v 10.21 2000/12/18 21:05:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Duplicate data items</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/byteorder.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/malloc.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> 
+<p> +<h1 align=center>Duplicate data items</h1> +<p>The Btree and Hash access methods support the creation of multiple data +items for a single key item. By default, multiple data items are not +permitted, and each database store operation will overwrite any previous +data item for that key. To configure Berkeley DB for duplicate data items, call +the <a href="../../api_c/db_set_flags.html">DB->set_flags</a> function with the <a href="../../api_c/db_set_flags.html#DB_DUP">DB_DUP</a> flag. +<p>By default, Berkeley DB stores duplicates in the order in which they were added; +that is, each new duplicate data item will be stored after any already +existing data items. This default behavior can be overridden by using +the <a href="../../api_c/dbc_put.html">DBcursor->c_put</a> function and one of the <a href="../../api_c/dbc_put.html#DB_AFTER">DB_AFTER</a>, <a href="../../api_c/dbc_put.html#DB_BEFORE">DB_BEFORE</a>, +<a href="../../api_c/dbc_put.html#DB_KEYFIRST">DB_KEYFIRST</a> or <a href="../../api_c/dbc_put.html#DB_KEYLAST">DB_KEYLAST</a> flags. Alternatively, Berkeley DB +may be configured to sort duplicate data items as described below. +<p>When stepping through the database sequentially, duplicate data items will +be returned individually, as a key/data pair, where the key item only +changes after the last duplicate data item has been returned. For this +reason, duplicate data items cannot be accessed using the +<a href="../../api_c/db_get.html">DB->get</a> function, as it always returns the first of the duplicate data +items. Duplicate data items should be retrieved using the Berkeley DB cursor +interface, <a href="../../api_c/dbc_get.html">DBcursor->c_get</a>. +<p>There is an interface flag that permits applications to request the +following data item only if it <b>is</b> a duplicate data item of the +current entry; see <a href="../../api_c/dbc_get.html#DB_NEXT_DUP">DB_NEXT_DUP</a> for more information. 
There is also an +interface flag that permits applications to request the following data +item only if it <b>is not</b> a duplicate data item of the current +entry; see <a href="../../api_c/dbc_get.html#DB_NEXT_NODUP">DB_NEXT_NODUP</a> and <a href="../../api_c/dbc_get.html#DB_PREV_NODUP">DB_PREV_NODUP</a> for more +information. +<p>It is also possible to maintain duplicate records in sorted order. Sorting +duplicates will significantly increase performance when searching them +and performing logical joins, common operations when creating secondary +indexes. To configure Berkeley DB to sort duplicate data items, the application +must call the <a href="../../api_c/db_set_flags.html">DB->set_flags</a> function with the <a href="../../api_c/db_set_flags.html#DB_DUPSORT">DB_DUPSORT</a> flag (in +addition to the <a href="../../api_c/db_set_flags.html#DB_DUP">DB_DUP</a> flag). In addition, a custom sorting +function may be specified using the <a href="../../api_c/db_set_dup_compare.html">DB->set_dup_compare</a> function. If the +<a href="../../api_c/db_set_flags.html#DB_DUPSORT">DB_DUPSORT</a> flag is given, but no comparison routine is specified, +then Berkeley DB defaults to the same lexicographical sorting used for Btree +keys, with shorter items collating before longer items. +<p>If the duplicate data items are unsorted, applications may store identical +duplicate data items, or, for those that just like the way it sounds, +<i>duplicate duplicates</i>. +<p><b>In this release it is an error to attempt to store identical +duplicate data items when duplicates are being stored in a sorted order.</b> +This restriction is expected to be lifted in a future release. There is +an interface flag that permits applications to disallow storing duplicate +data items when the database has been configured for sorted duplicates; +see <a href="../../api_c/db_put.html#DB_NODUPDATA">DB_NODUPDATA</a> for more information. 
Applications not wanting +to permit duplicate duplicates in databases configured for sorted +duplicates should begin using the <a href="../../api_c/db_put.html#DB_NODUPDATA">DB_NODUPDATA</a> flag immediately. +<p>For further information on how searching and insertion behave in the +presence of duplicates (sorted or not), see the <a href="../../api_c/db_get.html">DB->get</a>, +<a href="../../api_c/db_put.html">DB->put</a>, <a href="../../api_c/dbc_get.html">DBcursor->c_get</a> and <a href="../../api_c/dbc_put.html">DBcursor->c_put</a> documentation. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/byteorder.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/malloc.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/extentsize.html b/db/docs/ref/am_conf/extentsize.html new file mode 100644 index 000000000..15d940c15 --- /dev/null +++ b/db/docs/ref/am_conf/extentsize.html @@ -0,0 +1,38 @@ +<!--$Id: extentsize.so,v 1.2 2000/11/20 21:45:19 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Selecting a Queue extent size</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/recno.html"><img src="../../images/prev.gif" alt="Prev"></a><a 
href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/re_source.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Selecting a Queue extent size</h1> +<p>In Queue databases, records are allocated sequentially and directly +mapped to an offset within the file storage for the database. As +records are deleted from the Queue, pages will become empty and will +not be reused in normal queue operations. To facilitate the reclamation +of disk space a Queue may be partitioned into extents. Each extent is +kept in a separate physical file. Extent files are automatically +created as needed and destroyed when they are emptied of records. +<p>The extent size specifies the number of pages that make up each extent. +By default, if no extent size is specified, the Queue resides in a +single file and disk space is not reclaimed. In choosing an extent size +there is a tradeoff between the amount of disk space used and the +overhead of creating and deleting files. If the extent size is too +small, the system will pay a performance penalty, creating and deleting +files frequently. In addition, if the active part of the queue spans +many files, all those files will need to be open at the same time, +consuming system and process file resources. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/recno.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/re_source.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/h_ffactor.html b/db/docs/ref/am_conf/h_ffactor.html new file mode 100644 index 000000000..6c30f0fc3 --- /dev/null +++ b/db/docs/ref/am_conf/h_ffactor.html @@ -0,0 +1,31 @@ +<!--$Id: h_ffactor.so,v 10.11 2000/03/18 21:43:08 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Page fill factor</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/bt_recnum.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/h_hash.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Page fill factor</h1> +<p>The density, or page fill factor, is an approximation of the number of +keys allowed to accumulate in any one bucket, determining when the hash +table grows or shrinks. If you know the average sizes of the keys and +data in your dataset, setting the fill factor can enhance performance. 
+A reasonable rule to use to compute fill factor is: +<p><blockquote><pre>(pagesize - 32) / (average_key_size + average_data_size + 8)</pre></blockquote> +<p>The desired density within the hash table can be specified by calling +the <a href="../../api_c/db_set_h_ffactor.html">DB->set_h_ffactor</a> function. If no density is specified, one will +be selected dynamically as pages are filled. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/bt_recnum.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/h_hash.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/h_hash.html b/db/docs/ref/am_conf/h_hash.html new file mode 100644 index 000000000..d42edee1c --- /dev/null +++ b/db/docs/ref/am_conf/h_hash.html @@ -0,0 +1,39 @@ +<!--$Id: h_hash.so,v 10.12 2000/07/04 18:28:27 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Specifying a database hash</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/h_ffactor.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/h_nelem.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> 
+<h1 align=center>Specifying a database hash</h1> +<p>The database hash determines in which bucket a particular key will reside. +The goal of hashing keys is to distribute keys equally across the database +pages. It is therefore important that the hash function work well with +the specified keys so that the resulting bucket usage is relatively +uniform. A hash function that does not work well can effectively turn +the table into a sequential list. +<p>No hash performs equally well on all possible data sets. It is possible +that applications may find that the default hash function performs poorly +with a particular set of keys. The distribution resulting from the hash +function can be checked using the <a href="../../utility/db_stat.html">db_stat</a> utility. By comparing the +number of hash buckets and the number of keys, one can decide if the entries +are hashing in a well-distributed manner. +<p>The hash function for the hash table can be specified by calling the +<a href="../../api_c/db_set_h_hash.html">DB->set_h_hash</a> function. If no hash function is specified, a default +function will be used. Any application-specified hash function must +take a reference to a DB object, a pointer to a byte string and +its length as arguments, and return an unsigned, 32-bit hash value. 
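+<p>As a minimal sketch of a function with the argument list described above, the following implements the 32-bit FNV-1a hash. The names <b>EXAMPLE_DB</b> and <b>example_hash</b> are hypothetical, and the opaque handle type is only a stand-in so the fragment compiles without the Berkeley DB headers; FNV-1a is a swapped-in illustration, not the default Berkeley DB hash.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the library's DB handle type; hypothetical, illustration only. */
typedef struct example_db EXAMPLE_DB;

/*
 * 32-bit FNV-1a over a byte string. The handle argument is unused here;
 * it is present only to mirror the argument list described in the text.
 */
uint32_t
example_hash(EXAMPLE_DB *dbp, const void *bytes, uint32_t length)
{
	const unsigned char *p = bytes;
	uint32_t h = 2166136261u;	/* FNV offset basis */
	uint32_t i;

	(void)dbp;
	for (i = 0; i < length; i++) {
		h ^= p[i];		/* fold in one byte */
		h *= 16777619u;		/* multiply by the FNV prime */
	}
	return (h);
}
```

+<p>In a real application the first parameter would be the library's own handle type, and whether a replacement hash spreads keys more evenly than the default can only be judged against the actual key set, for example with the db_stat comparison described above.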
+<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/h_ffactor.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/h_nelem.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/h_nelem.html b/db/docs/ref/am_conf/h_nelem.html new file mode 100644 index 000000000..8c510d6db --- /dev/null +++ b/db/docs/ref/am_conf/h_nelem.html @@ -0,0 +1,32 @@ +<!--$Id: h_nelem.so,v 10.12 2000/03/18 21:43:08 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Hash table size</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/h_hash.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/recno.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Hash table size</h1> +<p>When setting up the hash database, knowing the expected number of elements +that will be stored in the hash table is useful. This value can be used +by the Hash access method implementation to more accurately construct the +necessary number of buckets that the database will eventually require. 
+<p>The anticipated number of elements in the hash table can be specified by +calling the <a href="../../api_c/db_set_h_nelem.html">DB->set_h_nelem</a> function. If not specified, or set too low, +hash tables will expand gracefully as keys are entered, although a slight +performance degradation may be noticed. In order for the estimated number +of elements to be a useful value to Berkeley DB, the <a href="../../api_c/db_set_h_ffactor.html">DB->set_h_ffactor</a> function +must also be called to set the page fill factor. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/h_hash.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/recno.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/intro.html b/db/docs/ref/am_conf/intro.html new file mode 100644 index 000000000..15fed60f6 --- /dev/null +++ b/db/docs/ref/am_conf/intro.html @@ -0,0 +1,45 @@ +<!--$Id: intro.so,v 10.22 2000/12/18 21:05:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: What are the available access methods?</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/simple_tut/close.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img 
src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/select.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>What are the available access methods?</h1> +<p>Berkeley DB currently offers four access methods: Btree, Hash, Queue and Recno. +<h3>Btree</h3> +<p>The Btree access method is an implementation of a sorted, balanced tree +structure. Searches, insertions, and deletions in the tree all take O(log +base_b N) time, where base_b is the average number of keys per page, and +N is the total number of keys stored. Often, inserting ordered data into +Btree implementations results in pages that are only half-full. Berkeley DB +makes ordered (or inverse ordered) insertion the best case, resulting in +nearly full-page space utilization. +<h3>Hash</h3> +<p>The Hash access method data structure is an implementation of Extended +Linear Hashing, as described in "Linear Hashing: A New Tool for File and +Table Addressing", Witold Litwin, <i>Proceedings of the 6th +International Conference on Very Large Databases (VLDB)</i>, 1980. +<h3>Queue</h3> +<p>The Queue access method stores fixed-length records with logical record +numbers as keys. It is designed for fast inserts at the tail and has a +special cursor consume operation that deletes and returns a record from +the head of the queue. The Queue access method uses record level locking. +<h3>Recno</h3> +<p>The Recno access method stores both fixed and variable-length records with +logical record numbers as keys, optionally backed by a flat text (byte +stream) file. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/simple_tut/close.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/select.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/logrec.html b/db/docs/ref/am_conf/logrec.html new file mode 100644 index 000000000..fd9fb0141 --- /dev/null +++ b/db/docs/ref/am_conf/logrec.html @@ -0,0 +1,45 @@ +<!--$Id: logrec.so,v 10.23 2000/12/04 18:05:41 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Logical record numbers</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/select.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/pagesize.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Logical record numbers</h1> +<p>The Berkeley DB Btree, Queue and Recno access methods can operate on logical +record numbers. In all cases for the Queue and Recno access methods, +and in some cases with the Btree access method, a record number is +specified to reference a specific key/data pair. 
In the case of Btree +supporting duplicate data items, the logical record number refers to a +key and all of its data items. +<p>Record numbers are 32-bit unsigned types, which limits the number of +logical records in a database to 4,294,967,295. The first record in the +database is record number 1. +<p>Record numbers in Recno databases can be configured to run in either +mutable or fixed mode: mutable, where logical record numbers change as +records are deleted or inserted, and fixed, where record numbers never +change regardless of the database operation. Record numbers in Btree +databases are always mutable, and as records are deleted or inserted, the +logical record number for other records in the database can change. See +<a href="../../ref/am_conf/renumber.html">Logically renumbering records</a> for +more information. +<p>Record numbers in Queue databases are always fixed, and never change +regardless of the database operation. +<p>Configuring Btree databases to support record numbers can severely limit +the throughput of applications with multiple concurrent threads writing +the database, because locations used to store record counts often become +hot spots that many different threads all need to update. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/select.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/pagesize.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/malloc.html b/db/docs/ref/am_conf/malloc.html new file mode 100644 index 000000000..12e57383c --- /dev/null +++ b/db/docs/ref/am_conf/malloc.html @@ -0,0 +1,31 @@ +<!--$Id: malloc.so,v 10.19 2000/03/18 21:43:09 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Non-local memory allocation</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/dup.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/bt_compare.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Non-local memory allocation</h1> +<p>Berkeley DB can allocate memory for returned key/data pairs which then becomes +the responsibility of the application. See <a href="../../api_c/dbt.html#DB_DBT_MALLOC">DB_DBT_MALLOC</a> or +<a href="../../api_c/dbt.html#DB_DBT_REALLOC">DB_DBT_REALLOC</a> for further information. 
+<p>On systems where there may be multiple library versions of malloc (notably +Windows NT), the Berkeley DB library could allocate memory from a different heap +than the application will use to free it. To avoid this problem, the +allocation routine to be used for allocating such key/data items can be +specified by calling the <a href="../../api_c/db_set_malloc.html">DB->set_malloc</a> or +<a href="../../api_c/db_set_realloc.html">DB->set_realloc</a> functions. If no allocation function is specified, the +underlying C library functions are used. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/dup.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/bt_compare.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/pagesize.html b/db/docs/ref/am_conf/pagesize.html new file mode 100644 index 000000000..41cab5ec4 --- /dev/null +++ b/db/docs/ref/am_conf/pagesize.html @@ -0,0 +1,66 @@ +<!--$Id: pagesize.so,v 10.20 2000/03/18 21:43:09 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Selecting a page size</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/logrec.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img 
src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/cachesize.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Selecting a page size</h1> +<p>The size of the pages used in the underlying database can be specified by +calling the <a href="../../api_c/db_set_pagesize.html">DB->set_pagesize</a> function. The page size must be a power of two, with a minimum of 512 bytes +and a maximum of 64K bytes. If +no page size is specified by the application, a page size is selected +based on the underlying filesystem I/O block size. (A page size selected +in this way has a lower limit of 512 bytes and an upper limit of 16K +bytes.) +<p>There are four issues to consider when selecting a page size: overflow +record sizes, locking, I/O efficiency, and recoverability. +<p>First, the page size implicitly sets the size of an overflow record. +Overflow records are key or data items that are too large to fit on a +normal database page because of their size, and are therefore stored in +overflow pages. Overflow pages are pages that exist outside of the normal +database structure. For this reason, there is often a significant +performance penalty associated with retrieving or modifying overflow +records. Selecting a page size that is too small, and which forces the +creation of large numbers of overflow pages, can seriously impact the +performance of an application. +<p>Second, in the Btree, Hash and Recno Access Methods, the finest-grained +lock that Berkeley DB acquires is for a page. (The Queue Access Method +generally acquires record-level locks rather than page-level locks.) +Selecting a page size that is too large, and which causes threads or +processes to wait because other threads of control are accessing or +modifying records on the same page, can impact the performance of your +application. +<p>Third, the page size specifies the granularity of I/O from the database +to the operating system. 
Berkeley DB will give a page-sized unit of bytes to +the operating system to be scheduled for writing to the disk. For many +operating systems, there is an internal <b>block size</b> which is used +as the granularity of I/O from the operating system to the disk. If the +page size is smaller than the block size, the operating system may be +forced to read a block from the disk, copy the page into the buffer it +read, and then write out the block to disk. Obviously, it will be much +more efficient for Berkeley DB to write filesystem-sized blocks to the operating +system and for the operating system to write those same blocks to the +disk. Selecting a page size that is too small, and which causes the +operating system to coalesce or otherwise manipulate Berkeley DB pages, can +impact the performance of your application. Alternatively, selecting a +page size that is too large may cause Berkeley DB and the operating system to +write more data than is strictly necessary. +<p>Fourth, when using the Berkeley DB Transactional Data Store product, the page size may affect the errors +from which your database can recover. See +<a href="../../ref/transapp/reclimit.html">Berkeley DB Recoverability</a> for more +information. 
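+<p>The size limits stated at the top of this page (a power of two, no smaller than 512 bytes and no larger than 64K bytes) can be checked before a candidate value is handed to the library. The following is a minimal sketch of such a check; <b>example_pagesize_ok</b> and the two macros are hypothetical names, not part of the Berkeley DB interface.

```c
#include <stdint.h>

/* Limits taken from the text above; the macro names are hypothetical. */
#define	EXAMPLE_MIN_PAGESIZE	512u
#define	EXAMPLE_MAX_PAGESIZE	(64u * 1024u)

/*
 * Return 1 if pagesize is a power of two within the documented limits,
 * 0 otherwise.
 */
int
example_pagesize_ok(uint32_t pagesize)
{
	if (pagesize < EXAMPLE_MIN_PAGESIZE || pagesize > EXAMPLE_MAX_PAGESIZE)
		return (0);
	/* A power of two has exactly one bit set. */
	return ((pagesize & (pagesize - 1)) == 0);
}
```

+<p>A check like this only confirms that a value is legal. Which legal value is best still depends on the overflow, locking, I/O and recoverability tradeoffs discussed above.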
+<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/logrec.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/cachesize.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/re_source.html b/db/docs/ref/am_conf/re_source.html new file mode 100644 index 000000000..2095a9698 --- /dev/null +++ b/db/docs/ref/am_conf/re_source.html @@ -0,0 +1,62 @@ +<!--$Id: re_source.so,v 10.14 2000/11/20 21:45:19 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Flat-text backing files</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/extentsize.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/renumber.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Flat-text backing files</h1> +<p>It is possible to back any Recno database (either fixed or variable +length) with a flat-text source file. This provides fast read (and +potentially write) access to databases that are normally created and +stored as flat-text files. 
The backing source file may be specified by +calling the <a href="../../api_c/db_set_re_source.html">DB->set_re_source</a> function. +<p>The backing source file will be read to initialize the database. In the +case of variable length records, the records are assumed to be separated +as described for the <a href="../../api_c/db_set_re_delim.html">DB->set_re_delim</a> function. For example, +standard UNIX byte stream files can be interpreted as a sequence of +variable length records separated by ASCII newline characters. This is +the default. +<p>When cached data would normally be written back to the underlying database +file (e.g., when the <a href="../../api_c/db_close.html">DB->close</a> or <a href="../../api_c/db_sync.html">DB->sync</a> functions are called), the +in-memory copy of the database will be written back to the backing source +file. +<p>The backing source file must already exist (but may be zero-length) when +<a href="../../api_c/db_open.html">DB->open</a> is called. By default, the backing source file is read +lazily, i.e., records are not read from the backing source file until they +are requested by the application. If multiple processes (not threads) are +accessing a Recno database concurrently and either inserting or deleting +records, the backing source file must be read in its entirety before more +than a single process accesses the database, and only that process should +specify the backing source file as part of the <a href="../../api_c/db_open.html">DB->open</a> call. +This can be accomplished by calling the <a href="../../api_c/db_set_flags.html">DB->set_flags</a> function with the +<a href="../../api_c/db_set_flags.html#DB_SNAPSHOT">DB_SNAPSHOT</a> flag. +<p>Reading and writing the backing source file cannot be transactionally +protected because it involves filesystem operations that are not part of +the Berkeley DB transaction methodology. 
For this reason, if a temporary +database is used to hold the records (a NULL was specified as the file +argument to <a href="../../api_c/db_open.html">DB->open</a>), <b>it is possible to lose the +contents of the backing source file if the system crashes at the right +instant</b>. If a permanent file is used to hold the database (a file name +was specified as the file argument to <a href="../../api_c/db_open.html">DB->open</a>), normal database +recovery on that file can be used to prevent information loss. It is +still possible that the contents of the backing source file itself will +be corrupted or lost if the system crashes. +<p>For all of the above reasons, the backing source file is generally used +to specify databases that are read-only for Berkeley DB applications, and that +are either generated on the fly by software tools, or modified using a +different mechanism such as a text editor. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/extentsize.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/renumber.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/recno.html b/db/docs/ref/am_conf/recno.html new file mode 100644 index 000000000..1a7128e0e --- /dev/null +++ b/db/docs/ref/am_conf/recno.html @@ -0,0 +1,69 @@ +<!--$Id: recno.so,v 11.10 2000/12/04 18:05:41 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Managing record-based databases</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/h_nelem.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/extentsize.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Managing record-based databases</h1> +<p>When using fixed- or variable-length record-based databases, particularly +with flat-text backing files, there are several items that the user can +control. The Recno access method can be used to store either variable- +or fixed-length data items. By default, the Recno access method stores +variable-length data items. The Queue access method can only store +fixed-length data items. +<h3>Record Delimiters</h3> +<p>When using the Recno access method to store variable-length records, +records read from any backing source file are separated by a specific +byte value which marks the end of one record and the beginning of the +next. This delimiting value is ignored except when reading records from +a backing source file, that is, records may be stored into the database +that include the delimiter byte. However, if such records are written +out to the backing source file and the backing source file is +subsequently read into a database, the records will be split where +delimiting bytes were found. +<p>For example, UNIX text files can usually be interpreted as a sequence of +variable-length records separated by ASCII newline characters. This byte +value (ASCII 0x0a) is the default delimiter. 
Applications may specify a +different delimiting byte using the <a href="../../api_c/db_set_re_delim.html">DB->set_re_delim</a> interface. +If no backing source file is being used, there is no reason to set the +delimiting byte value. +<h3>Record Length</h3> +<p>When using the Recno or Queue access methods to store fixed-length +records, the record length must be specified. Since the Queue access +method always uses fixed-length records, the user must always set the +record length prior to creating the database. Setting the record length +is what causes the Recno access method to store fixed-length, not +variable-length, records. +<p>The length of the records is specified by calling the +<a href="../../api_c/db_set_re_len.html">DB->set_re_len</a> function. The default length of the records is 0 bytes. +Any record read from a backing source file or otherwise stored in the +database that is shorter than the declared length will automatically be +padded as described for the <a href="../../api_c/db_set_re_pad.html">DB->set_re_pad</a> function. Any record stored +that is longer than the declared length results in an error. For +further information on backing source files, see +<a href="../../ref/am_conf/re_source.html">Flat-text backing files</a>. +<h3>Record Padding Byte Value</h3> +<p>When storing fixed-length records in a Queue or Recno database, a pad +character may be specified by calling the <a href="../../api_c/db_set_re_pad.html">DB->set_re_pad</a> function. Any +record read from the backing source file or otherwise stored in the +database that is shorter than the expected length will automatically be +padded with this byte value. If fixed-length records are specified but +no pad value is specified, a space character (0x20 in the ASCII +character set) will be used. For further information on backing source +files, see <a href="../../ref/am_conf/re_source.html">Flat-text backing +files</a>. 
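The length and padding rules above can be sketched in plain C. This is only a model of the behavior described for fixed-length records, not Berkeley DB code: `model_fix_record` and its parameters are hypothetical names, with `re_len` and `re_pad` standing in for the values an application would set through DB->set_re_len and DB->set_re_pad.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the Berkeley DB API): models how a
 * record is normalized before being stored as a fixed-length record.
 *
 * src/src_len  the record supplied by the application or backing file
 * dst          output buffer of at least re_len bytes
 * re_len       declared record length
 * re_pad       pad byte, e.g. ' ' (0x20), the documented default
 *
 * Returns 0 on success, -1 if the record is longer than re_len.
 */
int
model_fix_record(const char *src, size_t src_len,
    char *dst, size_t re_len, int re_pad)
{
	/* A record longer than the declared length is an error. */
	if (src_len > re_len)
		return (-1);

	/* Copy the record, then fill the remainder with the pad byte. */
	memcpy(dst, src, src_len);
	memset(dst + src_len, re_pad, re_len - src_len);
	return (0);
}
```

With a declared length of 8 and the default pad byte, the 3-byte record "abc" would be stored as "abc" followed by five spaces, while a 9-byte record would be rejected.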
+<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/h_nelem.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/extentsize.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/renumber.html b/db/docs/ref/am_conf/renumber.html new file mode 100644 index 000000000..7c3594dff --- /dev/null +++ b/db/docs/ref/am_conf/renumber.html @@ -0,0 +1,80 @@ +<!--$Id: renumber.so,v 10.23 2000/12/18 21:05:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Logically renumbering records</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/re_source.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/ops.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Logically renumbering records</h1> +<p>Records stored in the Queue and Recno access methods are accessed by +logical record number. 
In all cases in Btree databases, and optionally +in Recno databases (see the <a href="../../api_c/db_set_flags.html">DB->set_flags</a> function and the +<a href="../../api_c/db_set_flags.html#DB_RENUMBER">DB_RENUMBER</a> flag for more information), record numbers are +mutable. This means that the record numbers may change as records are +added to and deleted from the database. The deletion of record number +4 causes any records numbered 5 and higher to be renumbered downward by +1; the addition of a new record after record number 4 causes any +records numbered 5 and higher to be renumbered upward by 1. In all +cases in Queue databases, and by default in Recno databases, record +numbers are not mutable, and the addition or deletion of records to the +database will not cause already-existing record numbers to change. For +this reason, new records cannot be inserted between already-existing +records in databases with immutable record numbers. +<p>Cursors pointing into a Btree database or a Recno database with mutable +record numbers maintain a reference to a specific record, rather than +a record number, that is, the record they reference does not change as +other records are added or deleted. For example, if a database contains +three records with the record numbers 1, 2, and 3, and the data items +"A", "B", and "C", respectively, the deletion of record number 2 ("B") +will cause the record "C" to be renumbered downward to record number 2. +A cursor positioned at record number 3 ("C") will be adjusted and +continue to point to "C" after the deletion. Similarly, a cursor +previously referencing the now deleted record number 2 will be +positioned between the new record numbers 1 and 2, and an insertion +using that cursor will appear between those records. In this manner +records can be added and deleted to a database without disrupting the +sequential traversal of the database by a cursor. 
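The renumbering and cursor adjustment described above can be modeled with a small in-memory array. The sketch below is an illustration under stated assumptions, not the library's implementation: `struct model_db`, `struct model_cursor` and `model_delete` are hypothetical names, and a cursor whose record was deleted is represented by a flag meaning "positioned just before the record that now has this number".

```c
#include <stddef.h>

#define MODEL_MAX 16

/* Hypothetical model of a database with mutable record numbers. */
struct model_db {
	char	data[MODEL_MAX];	/* data[i] holds record number i + 1 */
	size_t	nrecs;			/* number of records currently stored */
};

/*
 * Hypothetical cursor.  If deleted is 0 the cursor references record
 * number recno.  If deleted is 1 the referenced record is gone and the
 * cursor sits between record numbers recno - 1 and recno.
 */
struct model_cursor {
	size_t	recno;
	int	deleted;
};

/*
 * Delete record number recno (1-based), renumber the records after it
 * downward by 1, and adjust every cursor in curs[0..ncurs-1].
 * Returns 0 on success, -1 if recno does not exist.
 */
int
model_delete(struct model_db *db, size_t recno,
    struct model_cursor *curs, size_t ncurs)
{
	size_t i;

	if (recno < 1 || recno > db->nrecs)
		return (-1);

	/* Shift later records down: record n + 1 becomes record n. */
	for (i = recno; i < db->nrecs; i++)
		db->data[i - 1] = db->data[i];
	db->nrecs--;

	for (i = 0; i < ncurs; i++) {
		if (curs[i].recno > recno)
			curs[i].recno--;	/* still references the same record */
		else if (curs[i].recno == recno)
			curs[i].deleted = 1;	/* now sits in the gap */
	}
	return (0);
}
```

Starting from records 1, 2, 3 holding "A", "B", "C", deleting record 2 leaves "C" as record 2; a cursor that was on record 3 is moved to record 2 and still references "C", and a cursor that was on record 2 is marked as sitting between the new records 1 and 2.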
+<p>Only cursors created using a single DB handle can adjust each +other's position in this way, however. If multiple DB handles +have a renumbering Recno database open simultaneously (as when multiple +processes share a single database environment), a record referred to by +one cursor could change underfoot if a cursor created using another +DB handle inserts or deletes records into the database. For +this reason, applications using Recno databases with mutable record +numbers will usually make all accesses to the database using a single +DB handle and cursors created from that handle, or will +otherwise single-thread access to the database, e.g., by using the +Berkeley DB Concurrent Data Store product. +<p>In any Queue or Recno databases, creating new records will cause the +creation of multiple records if the record number being created is more +than one greater than the largest record currently in the database. For +example, creating record number 28, when record 25 was previously the +last record in the database, will implicitly create records 26 and 27 +as well as 28. All first, last, next and previous cursor operations +will automatically skip over these implicitly created records. So, if +record number 5 is the only record the application has created, +implicitly creating records 1 through 4, the <a href="../../api_c/dbc_get.html">DBcursor->c_get</a> interface +with the <a href="../../api_c/dbc_get.html#DB_FIRST">DB_FIRST</a> flag will return record number 5, not record +number 1. Attempts to explicitly retrieve implicitly created records +by their record number will result in a special error return, +<a href="../../ref/program/errorret.html#DB_KEYEMPTY">DB_KEYEMPTY</a>. +<p>In any Berkeley DB database, attempting to retrieve a deleted record, using +a cursor positioned on the record, results in a special error return, +<a href="../../ref/program/errorret.html#DB_KEYEMPTY">DB_KEYEMPTY</a>. 
In addition, when using Queue databases or Recno +databases with immutable record numbers, attempting to retrieve a deleted +record by its record number will also result in the <a href="../../ref/program/errorret.html#DB_KEYEMPTY">DB_KEYEMPTY</a> +return. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/re_source.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am/ops.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/am_conf/select.html b/db/docs/ref/am_conf/select.html new file mode 100644 index 000000000..3838b3467 --- /dev/null +++ b/db/docs/ref/am_conf/select.html @@ -0,0 +1,116 @@ +<!--$Id: select.so,v 10.23 2000/03/18 21:43:09 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Selecting an access method</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td> +<td width="1%"><a href="../../ref/am_conf/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/logrec.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Selecting an access method</h1> +<p>The Berkeley DB access method implementation unavoidably interacts with each +application's data 
set, locking requirements and data access patterns. +For this reason, one access method may result in dramatically better +performance for an application than another one. Applications whose data +could be stored in more than one access method may want to benchmark their +performance using the different candidates. +<p>One of the strengths of Berkeley DB is that it provides multiple access methods +with almost identical interfaces. This +means that it is simple to modify an application to use a different access +method. Applications can easily benchmark the different Berkeley DB access +methods against each other for their particular data set and access pattern. +<p>Most applications choose between using the Btree or Hash access methods +or between using the Queue or Recno access methods, because each of the +two pairs offers similar functionality. +<h3>Hash or Btree?</h3> +<p>The Hash and Btree access methods should be used when logical record +numbers are not the primary key used for data access. (If logical record +numbers are a secondary key used for data access, the Btree access method +is a possible choice, as it supports simultaneous access by a key and a +record number.) +<p>Keys in Btrees are stored in sorted order and the relationship between +them is defined by that sort order. For this reason, the Btree access +method should be used when there is any locality of reference among keys. +Locality of reference means that accessing one particular key in the +Btree implies that the application is more likely to access keys near to +the key being accessed, where "near" is defined by the sort order. For +example, if keys are timestamps, and it is likely that a request for an +8AM timestamp will be followed by a request for a 9AM timestamp, the +Btree access method is generally the right choice. 
Or, for example, if +the keys are names, and the application will want to review all entries +with the same last name, the Btree access method is again a good choice. +<p>There is little difference in performance between the Hash and Btree +access methods on small data sets, where all, or most of, the data set +fits into the cache. However, when a data set is large enough that +significant numbers of data pages no longer fit into the cache, then the +Btree locality of reference described above becomes important for +performance reasons. For example, there is no locality of reference for +the Hash access method, and so key "AAAAA" is as likely to be stored on +the same data page with key "ZZZZZ" as with key "AAAAB". In the Btree +access method, because items are sorted, key "AAAAA" is far more likely +to be near key "AAAAB" than key "ZZZZZ". So, if the application exhibits +locality of reference in its data requests, then the Btree page read into +the cache to satisfy a request for key "AAAAA" is much more likely to be +useful to satisfy subsequent requests from the application than the Hash +page read into the cache to satisfy the same request. This means that +for applications with locality of reference, the cache is generally much +"hotter" for the Btree access method than the Hash access method, and +the Btree access method will make many fewer I/O calls. +<p>However, when a data set becomes even larger, the Hash access method can +outperform the Btree access method. The reason for this is that Btrees +contain more metadata pages than Hash databases. The data set can grow +so large that metadata pages begin to dominate the cache for the Btree +access method. If this happens, the Btree can be forced to do an I/O +for each data request because the probability that any particular data +page is already in the cache becomes quite small. Because the Hash access +method has fewer metadata pages, its cache stays "hotter" longer in the +presence of large data sets. 
In addition, once the data set is so large +that both the Btree and Hash access methods are almost certainly doing +an I/O for each random data request, the fact that Hash does not have to +walk several internal pages as part of a key search becomes a performance +advantage for the Hash access method as well. +<p>Application data access patterns strongly affect all of these behaviors. +For example, accessing the data by walking a cursor through the database +will greatly mitigate the large data set behavior described above because +each I/O into the cache will satisfy a fairly large number of subsequent +data requests. +<p>In the absence of information on application data and data access +patterns, for small data sets either the Btree or Hash access methods +will suffice. For data sets larger than the cache, we normally recommend +using the Btree access method. If you have truly large data, then the +Hash access method may be a better choice. The <a href="../../utility/db_stat.html">db_stat</a> utility +is a useful tool for monitoring how well your cache is performing. +<h3>Queue or Recno?</h3> +<p>The Queue or Recno access methods should be used when logical record +numbers are the primary key used for data access. The advantage of the +Queue access method is that it performs record-level locking and for this +reason supports significantly higher levels of concurrency than the Recno +access method. The advantage of the Recno access method is that it +supports a number of additional features beyond those supported by the +Queue access method, such as variable-length records and support for +backing flat-text files. +<p>Logical record numbers can be mutable or fixed: mutable, where logical +record numbers can change as records are deleted or inserted, and fixed, +where record numbers never change regardless of the database operation. +It is possible to store and retrieve records based on logical record +numbers in the Btree access method. 
However, those record numbers are +always mutable, and as records are deleted or inserted, the logical record +number for other records in the database will change. The Queue access +method always runs in fixed mode, and logical record numbers never change +regardless of the database operation. The Recno access method can be +configured to run in either mutable or fixed mode. +<p>In addition, the Recno access method provides support for databases whose +permanent storage is a flat text file and the database is used as a fast, +temporary storage area while the data is being read or modified. +<table><tr><td><br></td><td width="1%"><a href="../../ref/am_conf/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/logrec.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/arch/apis.html b/db/docs/ref/arch/apis.html new file mode 100644 index 000000000..d1ae91b5a --- /dev/null +++ b/db/docs/ref/arch/apis.html @@ -0,0 +1,74 @@ +<!--$Id: apis.so,v 10.26 2000/03/18 21:43:09 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Programmatic APIs</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Architecture</dl></h3></td> +<td width="1%"><a href="../../ref/arch/progmodel.html"><img src="../../images/prev.gif" alt="Prev"></a><a 
href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/arch/script.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Programmatic APIs</h1> +<p>The Berkeley DB subsystems can be accessed through interfaces from multiple +languages. The standard library interface is ANSI C. Applications can +also use Berkeley DB via C++ or Java, as well as from scripting languages. +Environments can be shared among applications written using any of these +APIs. For example, you might have a local server written in C or C++, a +script for an administrator written in Perl or Tcl, and a web-based user +interface written in Java, all sharing a single database environment. +<h3>C</h3> +<p>The Berkeley DB library is written entirely in ANSI C. C applications use a +single include file: +<p><blockquote><pre>#include <db.h></pre></blockquote> +<h3>C++</h3> +<p>The C++ classes provide a thin wrapper around the C API, with the major +advantages being improved encapsulation and an optional exception +mechanism for errors. C++ applications use a single include file: +<p><blockquote><pre>#include <db_cxx.h></pre></blockquote> +<p>The classes and methods are named in a fashion that directly corresponds +to structures and functions in the C interface. Likewise, arguments to +methods appear in the same order as in the C interface, except that the +explicit <b>this</b> pointer is removed. The #defines used for flags are identical +between the C and C++ interfaces. +<p>As a rule, each C++ object has exactly one structure from the underlying +C API associated with it. The C structure is allocated with each +constructor call and deallocated with each destructor call. Thus, the +rules the user needs to follow in allocating and deallocating structures +are the same between the C and C++ interfaces. 
+<p>To ensure portability to many platforms, both new and old, Berkeley DB makes as +few assumptions as possible about the C++ compiler and library. For +example, it does not expect STL, templates or namespaces to be available. +The newest C++ feature used is exceptions, which are used liberally to +transmit error information. Even the use of exceptions can be disabled +at runtime. +<h3>JAVA</h3> +<p>The Java classes provide a layer around the C API that is almost identical +to the C++ layer. The classes and methods are, for the most part, +identical to those of the C++ layer. Db constants and #defines are represented as +"static final int" values. Error conditions are communicated as Java +exceptions. +<p>As in C++, each Java object has exactly one structure from the underlying +C API associated with it. The Java structure is allocated with each +constructor or open call, but is deallocated only by the Java garbage +collector. Because the timing of garbage collection is not predictable, +applications should take care to do a close when finished with any object +that has a close method. +<h3>Dbm/Ndbm, Hsearch</h3> +<p>Berkeley DB supports the standard UNIX interfaces <a href="../../api_c/dbm.html">dbm</a> (or its +<a href="../../api_c/dbm.html">ndbm</a> variant) and <a href="../../api_c/hsearch.html">hsearch</a>. After including a new header +file and recompiling, <a href="../../api_c/dbm.html">dbm</a> programs will run orders of magnitude +faster and their underlying databases can grow as large as necessary. +Historic <a href="../../api_c/dbm.html">dbm</a> applications fail once some number of entries have been +inserted into the database, where the number depends on the effectiveness +of the hashing function on the particular data set. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/arch/progmodel.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/arch/script.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/arch/bigpic.gif b/db/docs/ref/arch/bigpic.gif Binary files differnew file mode 100644 index 000000000..48c52aed5 --- /dev/null +++ b/db/docs/ref/arch/bigpic.gif diff --git a/db/docs/ref/arch/bigpic.html b/db/docs/ref/arch/bigpic.html new file mode 100644 index 000000000..6c945744e --- /dev/null +++ b/db/docs/ref/arch/bigpic.html @@ -0,0 +1,114 @@ +<!--$Id: bigpic.so,v 8.21 2000/12/18 21:05:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: The big picture</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Architecture</dl></h3></td> +<td width="1%"><a href="../../ref/am/error.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/arch/progmodel.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>The big picture</h1> +<p>The previous chapters in this Reference Guide have described applications +that use the Berkeley DB Access Methods for fast data storage and retrieval. 
+The applications we describe here and in subsequent chapters are similar +in nature to the Access Method applications, but they are also fully +recoverable in the face of application or system failure. +<p>Application code that only uses the Berkeley DB Access Methods might appear as +follows: +<p><blockquote><pre>switch (ret = dbp->put(dbp, NULL, &key, &data, 0)) { +case 0: + printf("db: %s: key stored.\n", (char *)key.data); + break; +default: + dbp->err(dbp, ret, "dbp->put"); + exit (1); +}</pre></blockquote> +<p>The underlying Berkeley DB architecture that supports this is: +<p align=center><img src="smallpic.gif" alt="small"> +<p>As you can see from this diagram, the application makes calls into the +Access Methods, and the Access Methods use the underlying shared memory +buffer cache to hold recently used file pages in main memory. +<p>When applications require recoverability, their calls to the Access +Methods must be wrapped in calls to the transaction subsystem. The +application must inform Berkeley DB where to begin and end transactions, and +must be prepared for the possibility that an operation may fail at any +particular time, causing the transaction to abort. +<p>An example of transaction-protected code might appear as follows: +<p><blockquote><pre>retry: if ((ret = txn_begin(dbenv, NULL, &tid)) != 0) { + dbenv->err(dbenv, ret, "txn_begin"); + exit (1); + } +<p> + switch (ret = dbp->put(dbp, tid, &key, &data, 0)) { + case DB_LOCK_DEADLOCK: + (void)txn_abort(tid); + goto retry; + case 0: + printf("db: %s: key stored.\n", (char *)key.data); + break; + default: + dbenv->err(dbenv, ret, "dbp->put"); + exit (1); + } +<p> + if ((ret = txn_commit(tid)) != 0) { + dbenv->err(dbenv, ret, "txn_commit"); + exit (1); + }</pre></blockquote> +<p>In this example, the same operation is being done as before, but it +is wrapped in transaction calls. 
The transaction is started with +<a href="../../api_c/txn_begin.html">txn_begin</a>, and finished with <a href="../../api_c/txn_commit.html">txn_commit</a>. If the operation +fails due to a deadlock, then the transaction is aborted using +<a href="../../api_c/txn_abort.html">txn_abort</a>, after which the operation may be retried. +<p>There are actually five major subsystems in Berkeley DB, as follows: +<p><dl compact> +<p><dt>The Access Methods<dd>The Access Method subsystem provides general-purpose support for creating +and accessing database files formatted as Btrees, Hashed files, and +Fixed- and Variable-length records. These modules are useful in the +absence of transactions for applications that need fast, formatted file +support. See <a href="../../api_c/db_open.html">DB->open</a> and <a href="../../api_c/db_cursor.html">DB->cursor</a> for more +information. These functions were already discussed in detail in the +previous chapters. +<p><dt>The Memory Pool<dd>The memory pool subsystem is the general-purpose shared memory buffer pool +used by Berkeley DB. This is the shared memory cache that allows multiple +processes and threads within processes to share access to databases. This +module is useful outside of the Berkeley DB package for processes that require +portable, page-oriented, cached, shared file access. +<p><dt>Transactions<dd>The transaction subsystem allows a group of database changes to be +treated as an atomic unit so that either all of the changes are done, or +none of the changes are done. The transaction subsystem implements the +Berkeley DB transaction model. This module is useful outside of the Berkeley DB +package for processes that want to transaction protect their own data +modifications. +<p><dt>Locking<dd>The locking subsystem is the general-purpose lock manager used by Berkeley DB. +This module is useful outside of the Berkeley DB package for processes that +require a portable, fast, configurable lock manager. 
+<p><dt>Logging<dd>The logging subsystem is the write-ahead logging used to support the Berkeley DB +transaction model. It is largely specific to the Berkeley DB package, and +unlikely to be useful elsewhere except as a supporting module for the +Berkeley DB transaction subsystem. +</dl> +<p>Here is a more complete picture of the Berkeley DB library: +<p align=center><img src="bigpic.gif" alt="large"> +<p>In this example, the application makes calls to the Access Methods and to +the transaction subsystem. The Access Methods and transaction subsystem +in turn make calls into the Buffer Pool, Locking and Logging subsystems +on behalf of the application. +<p>The underlying subsystems can each be called independently. For +example, the Buffer Pool subsystem can be used apart from the rest of +Berkeley DB by applications simply wanting a shared memory buffer pool, or +the Locking subsystem may be called directly by applications that are +doing their own locking outside of Berkeley DB. However, this usage is fairly +rare, and most applications will either use only the Access Methods, or +the Access Methods wrapped in calls to the transaction interfaces. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/am/error.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/arch/progmodel.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/arch/progmodel.html b/db/docs/ref/arch/progmodel.html new file mode 100644 index 000000000..04284f4f3 --- /dev/null +++ b/db/docs/ref/arch/progmodel.html @@ -0,0 +1,41 @@ +<!--$Id: progmodel.so,v 10.25 2000/03/18 21:43:09 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Programming model</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Architecture</dl></h3></td> +<td width="1%"><a href="../../ref/arch/bigpic.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/arch/apis.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Programming model</h1> +<p>The Berkeley DB distribution is a database library, where the library is linked +into the address space of the code which uses it. The code using Berkeley DB +may be an application or it may be a server providing functionality to a +number of clients via some form of inter-process or remote-process +communication (IPC/RPC). 
+<p>In the application model, one or more applications link the Berkeley DB library +directly into their address spaces. There may be many threads of control +in this model, as Berkeley DB supports locking for both multiple processes and +for multiple threads within a process. This model provides significantly +faster access to the database functionality, but implies trust among all +threads of control sharing the database environment as they will have the +ability to read, write and potentially corrupt each other's data. +<p>In the client-server model, developers write a database server application +that accepts requests via some form of IPC and issues calls to the Berkeley DB +interfaces based on those requests. In this model, the database server +is the only application linking the Berkeley DB library into its address space. +The client-server model trades performance for protection, as it does not +require that the applications share a protection domain with the server, +but IPC/RPC is slower than a function call. Of course, in addition, this +model greatly simplifies the creation of network client-server applications. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/arch/bigpic.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/arch/apis.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/arch/script.html b/db/docs/ref/arch/script.html new file mode 100644 index 000000000..411cff460 --- /dev/null +++ b/db/docs/ref/arch/script.html @@ -0,0 +1,29 @@ +<!--$Id: script.so,v 10.12 2000/03/18 21:43:09 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Scripting languages</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Architecture</dl></h3></td> +<td width="1%"><a href="../../ref/arch/apis.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/arch/utilities.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Scripting languages</h1> +<h3>Perl</h3> +<p>Two Perl APIs are distributed with the Berkeley DB release. The Perl interface +to Berkeley DB version 1.85 is called DB_File. The Perl interface to Berkeley DB +version 2 is called BerkeleyDB. See <a href="../../ref/perl/intro.html">Using Berkeley DB with Perl</a> for more information. +<h3>Tcl</h3> +<p>A Tcl API is distributed with the Berkeley DB release. 
See +<a href="../../ref/tcl/intro.html">Using Berkeley DB with Tcl</a> for more +information. +<table><tr><td><br></td><td width="1%"><a href="../../ref/arch/apis.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/arch/utilities.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/arch/smallpic.gif b/db/docs/ref/arch/smallpic.gif Binary files differnew file mode 100644 index 000000000..5eb7ae8da --- /dev/null +++ b/db/docs/ref/arch/smallpic.gif diff --git a/db/docs/ref/arch/utilities.html b/db/docs/ref/arch/utilities.html new file mode 100644 index 000000000..72bfe52b2 --- /dev/null +++ b/db/docs/ref/arch/utilities.html @@ -0,0 +1,62 @@ +<!--$Id: utilities.so,v 10.23 2000/05/23 20:57:50 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Supporting utilities</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Architecture</dl></h3></td> +<td width="1%"><a href="../../ref/arch/script.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/env/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Supporting utilities</h1> +<p>There are several stand-alone utilities that provide supporting +functionality for 
the Berkeley DB environment: +<p><dl compact> +<p><dt><a href="../../utility/berkeley_db_svc.html">berkeley_db_svc</a><dd>The <a href="../../utility/berkeley_db_svc.html">berkeley_db_svc</a> utility is the Berkeley DB RPC server, providing +standard server functionality for client applications. +<p><dt><a href="../../utility/db_archive.html">db_archive</a><dd>The <a href="../../utility/db_archive.html">db_archive</a> utility supports database backup, archival and log +file administration. It facilitates log reclamation and the creation of +database snapshots. Generally, some form of log archival must be done if +a database environment has been configured for logging or transactions. +<p><dt><a href="../../utility/db_checkpoint.html">db_checkpoint</a><dd>The <a href="../../utility/db_checkpoint.html">db_checkpoint</a> utility runs as a daemon process, monitoring +the database log and periodically issuing checkpoints. It facilitates +log reclamation and the creation of database snapshots. Generally, some +form of database checkpointing must be done if a database environment has +been configured for transactions. +<p><dt><a href="../../utility/db_deadlock.html">db_deadlock</a><dd>The <a href="../../utility/db_deadlock.html">db_deadlock</a> utility runs as a daemon process, periodically +traversing the database lock structures and aborting transactions when it +detects a deadlock. Generally, some form of deadlock detection must be +done if a database environment has been configured for locking. +<p><dt><a href="../../utility/db_dump.html">db_dump</a><dd>The <a href="../../utility/db_dump.html">db_dump</a> utility writes a copy of the database to a flat-text +file in a portable format. +<p><dt><a href="../../utility/db_load.html">db_load</a><dd>The <a href="../../utility/db_load.html">db_load</a> utility reads the flat-text file produced by +<a href="../../utility/db_dump.html">db_dump</a> and loads it into a database file. 
+<p><dt><a href="../../utility/db_printlog.html">db_printlog</a><dd>The <a href="../../utility/db_printlog.html">db_printlog</a> utility displays the contents of Berkeley DB log files +in a human-readable and parseable format. +<p><dt><a href="../../utility/db_recover.html">db_recover</a><dd>The <a href="../../utility/db_recover.html">db_recover</a> utility runs after an unexpected Berkeley DB or system +failure to restore the database to a consistent state. Generally, some +form of database recovery must be done if databases are being modified. +<p><dt><a href="../../utility/db_stat.html">db_stat</a> <dd>The <a href="../../utility/db_stat.html">db_stat</a> utility displays statistics for databases and database +environments. +<p><dt><a href="../../utility/db_upgrade.html">db_upgrade</a><dd>The <a href="../../utility/db_upgrade.html">db_upgrade</a> utility provides a command-line interface for +upgrading underlying database formats. +<p><dt><a href="../../utility/db_verify.html">db_verify</a><dd>The <a href="../../utility/db_verify.html">db_verify</a> utility provides a command-line interface for +verifying the database format. +</dl> +<p>All of the functionality implemented for these utilities is also available +as part of the standard Berkeley DB API. This means that threaded applications +can easily create a thread that calls the same Berkeley DB functions as do the +utilities. This often simplifies an application environment by removing +the necessity for multiple processes to negotiate database and database +environment creation and shutdown. 
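+<p>The idea of replacing a stand-alone utility with a thread inside the application can be sketched roughly as follows. This is a minimal sketch, not the utilities' actual implementation: <b>maint_task</b>, <b>maint_start</b>, <b>maint_join</b> and <b>fake_checkpoint</b> are hypothetical names, and the callback is a placeholder for whichever Berkeley DB function the corresponding utility would call.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Hypothetical callback type.  In a real application the callback would
 * wrap the Berkeley DB call made by the utility being replaced; that call
 * is deliberately not shown here.
 */
typedef int (*maint_fn)(void *);

struct maint_task {
	pthread_t tid;		/* Thread running the maintenance loop. */
	maint_fn fn;		/* Work to perform on each round. */
	void *arg;		/* Opaque argument passed to fn. */
	unsigned rounds;	/* Number of rounds to run. */
	unsigned failures;	/* Rounds for which fn returned non-zero. */
};

/* Thread body: run the callback a fixed number of times. */
static void *
maint_main(void *p)
{
	struct maint_task *t = p;
	unsigned i;

	/* A long-running daemon thread would normally sleep between rounds. */
	for (i = 0; i < t->rounds; i++)
		if (t->fn(t->arg) != 0)
			t->failures++;
	return (NULL);
}

/* Start the maintenance thread; returns 0 on success. */
static int
maint_start(struct maint_task *t, maint_fn fn, void *arg, unsigned rounds)
{
	assert(fn != NULL);
	t->fn = fn;
	t->arg = arg;
	t->rounds = rounds;
	t->failures = 0;
	return (pthread_create(&t->tid, NULL, maint_main, t));
}

/* Wait for the maintenance thread to finish; returns 0 on success. */
static int
maint_join(struct maint_task *t)
{
	return (pthread_join(t->tid, NULL));
}

/* Placeholder for real maintenance work: counts how often it is called. */
static int
fake_checkpoint(void *arg)
{
	++*(unsigned *)arg;
	return (0);
}
```

+<p>An application would call <b>maint_start</b> once after opening its environment and <b>maint_join</b> during shutdown, so no second process has to coordinate environment creation or removal. Compile with whatever thread flags your platform requires.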
+<table><tr><td><br></td><td width="1%"><a href="../../ref/arch/script.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/env/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/aix.html b/db/docs/ref/build_unix/aix.html new file mode 100644 index 000000000..102e1a01f --- /dev/null +++ b/db/docs/ref/build_unix/aix.html @@ -0,0 +1,60 @@ +<!--$Id: aix.so,v 11.11 2000/05/04 17:11:19 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: AIX</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/notes.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/freebsd.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>AIX</h1> +<p><ol> +<p><li><b>I can't compile and run multi-threaded applications.</b> +<p>Special compile-time flags are required when compiling threaded +applications on AIX. If you are compiling a threaded application, +you must compile with the _THREAD_SAFE flag and load with specific +libraries, e.g., "-lc_r". 
Specifying the compiler name with a +trailing "_r" usually performs the right actions for the system. +<p><blockquote><pre>xlc_r ... +cc -D_THREAD_SAFE -lc_r ...</pre></blockquote> +<p>The Berkeley DB library will automatically build with the correct options. +<hr size=1 noshade> +<p><li><b>I can't run using the <a href="../../api_c/env_open.html#DB_SYSTEM_MEM">DB_SYSTEM_MEM</a> option to +<a href="../../api_c/env_open.html">DBENV->open</a>.</b> +<p>AIX 4.1 only allows applications to map 10 system shared memory segments. +In AIX 4.3 this has been raised to 256K segments, but only if you set the +EXTSHM environment variable, for example, "export EXTSHM=ON". +<hr size=1 noshade> +<p><li><b>I can't create database files larger than 1GB on AIX.</b> +<p>Berkeley DB does not include large-file support for AIX systems by default. +Sleepycat Software has been told that the following changes will add +large-file support on AIX 4.2 and later releases, but we have not +tested them ourselves. +<p>Add the following lines to the <b>db_config.h</b> file in your build +directory: +<p><blockquote><pre>#ifdef HAVE_FILE_OFFSET_BITS +#define _LARGE_FILES /* AIX specific. */ +#endif</pre></blockquote> +<p>Change the source code for <b>os/os_open.c</b> to always specify the +<b>O_LARGEFILE</b> flag to the <b>open</b>(2) system call. +<p>Recompile Berkeley DB from scratch. +<p>Note that the documentation for the IBM Visual Age compiler states that +it does not support the 64-bit filesystem APIs necessary for creating +large files, and that the ibmcxx product must be used instead. We have +not heard whether the GNU gcc compiler supports the 64-bit APIs. +<p>Finally, to create large files under AIX, the filesystem has to be +configured to support large files and the system-wide user hard-limit for +file sizes has to be greater than 1GB. 
+</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/notes.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/freebsd.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/conf.html b/db/docs/ref/build_unix/conf.html new file mode 100644 index 000000000..289e9559e --- /dev/null +++ b/db/docs/ref/build_unix/conf.html @@ -0,0 +1,143 @@ +<!--$Id: conf.so,v 10.33 2000/12/04 18:05:41 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Configuring Berkeley DB</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/flags.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Configuring Berkeley DB</h1> +<p>There are several options that you can specify when configuring Berkeley DB. +While only the Berkeley DB specific ones are described here, most of the +standard GNU autoconf options are available and supported. 
To see a +complete list of the options, specify the --help flag to the configure +program. +<p>The Berkeley DB specific options are as follows: +<p><dl compact> + <a name="4"><!--meow--></a> +<p><dt><a name="--disable-bigfile">--disable-bigfile</a><dd>Some systems, notably versions of HP/UX and Solaris, require special +compile-time options in order to create files larger than 2^32 bytes. +These options are automatically enabled when Berkeley DB is compiled. For this +reason, binaries built on current versions of these systems may not run +on earlier versions of the system, as the library and system calls +necessary for large files are not available. To disable building with +these compile-time options, enter --disable-bigfile as an argument to +configure. + <a name="5"><!--meow--></a> +<p><dt><a name="--enable-compat185">--enable-compat185</a><dd>To compile or load Berkeley DB 1.85 applications against this release of the +Berkeley DB library, enter --enable-compat185 as an argument to configure. +This will include Berkeley DB 1.85 API compatibility code in the library. + <a name="6"><!--meow--></a> +<p><dt><a name="--enable-cxx">--enable-cxx</a><dd>To build the Berkeley DB C++ API, enter --enable-cxx as an argument to +configure. +<p><dt><a name="--enable-debug">--enable-debug</a><dd>To build Berkeley DB with <b>-g</b> as a compiler flag and with <b>DEBUG</b> +#defined during compilation, enter --enable-debug as an argument to +configure. This will create a Berkeley DB library with debugging symbols, as +well as load various routines that can be called from a debugger to +display pages, cursor queues and so forth. This option should not be +specified when configuring to build production binaries, although there +shouldn't be any significant performance degradation. +<p><dt><a name="--enable-debug_rop">--enable-debug_rop</a><dd>To build Berkeley DB to output log records for read operations, enter +--enable-debug_rop as an argument to configure. 
This option should not +be specified when configuring to build production binaries, as you will +lose a significant amount of performance. +<p><dt><a name="--enable-debug_wop">--enable-debug_wop</a><dd>To build Berkeley DB to output log records for write operations, enter +--enable-debug_wop as an argument to configure. This option should not +be specified when configuring to build production binaries, as you will +lose a significant amount of performance. +<p><dt><a name="--enable-diagnostic">--enable-diagnostic</a><dd>To build Berkeley DB with debugging run-time sanity checks, enter +--enable-diagnostic as an argument to configure. This will cause a +number of special checks to be performed when Berkeley DB is running. This +option should not be specified when configuring to build production +binaries, as you will lose a significant amount of performance. + <a name="7"><!--meow--></a> +<p><dt><a name="--enable-dump185">--enable-dump185</a><dd>To convert Berkeley DB 1.85 (or earlier) databases to this release of Berkeley DB, +enter --enable-dump185 as an argument to configure. This will build the +<a href="../../utility/db_dump.html">db_dump185</a> utility which can dump Berkeley DB 1.85 and 1.86 databases +in a format readable by the Berkeley DB <a href="../../utility/db_load.html">db_load</a> utility. +<p>The system libraries with which you are loading the <a href="../../utility/db_dump.html">db_dump185</a> +utility must already contain the Berkeley DB 1.85 library routines for this to +work, as the Berkeley DB distribution does not include them. If you are using +a non-standard library for the Berkeley DB 1.85 library routines, you will have +to change the Makefile that the configuration step creates to load the +<a href="../../utility/db_dump.html">db_dump185</a> utility with that library. 
+ <a name="8"><!--meow--></a> + <a name="9"><!--meow--></a> +<p><dt><a name="--enable-dynamic">--enable-dynamic</a><dd>To build a dynamic shared library version of Berkeley DB, instead of the default +static library, specify --enable-dynamic. Dynamic libraries are built +using <a href="http://www.gnu.org/software/libtool/libtool.html">the +GNU Project's Libtool</a> distribution, which supports shared library builds +on many, although not all, systems. +<p>Berkeley DB can be configured to build either a static or a dynamic library, +but not both at once. You should not attempt to build both library +types in the same directory, as they have incompatible object file +formats. To build both static and dynamic libraries, create two +separate build directories, and configure and build them separately. + <a name="10"><!--meow--></a> +<p><dt><a name="--enable-java">--enable-java</a><dd>To build the Berkeley DB Java API, enter --enable-java as an argument to +configure. To build Java, you must also configure the option +--enable-dynamic. Before configuring, you must set your PATH environment +variable to include javac. Note, it is not sufficient to include a +symbolic link to javac in your PATH, because the configuration process +uses the location of javac to determine the location of the Java include +files (e.g., jni.h). On some systems additional include directories may +be needed to process jni.h, see <a href="flags.html">Changing compile or load +options</a> for more information. +<p><dt><a name="--enable-posixmutexes">--enable-posixmutexes</a><dd>To force Berkeley DB to use the POSIX pthread mutex interfaces for underlying +mutex support, enter --enable-posixmutexes as an argument to configure. +The Berkeley DB library requires that the POSIX pthread implementation support +mutexes shared between multiple processes, as described for the +pthread_condattr_setpshared and pthread_mutexattr_setpshared interfaces. 
+In addition, this configuration option requires that Berkeley DB be linked with +the -lpthread library. On systems where POSIX mutexes are the preferred +mutex support (e.g., HP-UX), they will be selected automatically. + <a name="11"><!--meow--></a> +<p><dt><a name="--enable-rpc">--enable-rpc</a><dd>To build the Berkeley DB RPC client code and server utility, enter --enable-rpc +as an argument to configure. The --enable-rpc option requires that RPC +libraries already be installed on your system. +<p><dt><a name="--enable-shared">--enable-shared</a><dd>The --enable-shared configure argument is an alias for --enable-dynamic. + <a name="12"><!--meow--></a> +<p><dt><a name="--enable-tcl">--enable-tcl</a><dd>To build the Berkeley DB Tcl API, enter --enable-tcl as an argument to +configure. This configuration option expects to find Tcl's tclConfig.sh +file in the <b>/usr/local/lib</b> directory. See the --with-tcl +option for instructions on specifying a non-standard location for the +Tcl installation. See <a href="../../ref/tcl/intro.html">Loading Berkeley DB +with Tcl</a> for information on sites from which you can download Tcl and +which Tcl versions are compatible with Berkeley DB. To configure the Berkeley DB +Tcl API, you must also specify the --enable-dynamic option. + <a name="13"><!--meow--></a> +<p><dt><a name="--enable-test">--enable-test</a><dd>To build the Berkeley DB test suite, enter --enable-test as an argument to +configure. To run the Berkeley DB test suite, you must also specify the +--enable-dynamic and --enable-tcl options. +<p><dt><a name="--enable-uimutexes">--enable-uimutexes</a><dd>To force Berkeley DB to use the UNIX International (UI) mutex interfaces for +underlying mutex support, enter --enable-uimutexes as an argument to +configure. This configuration option requires that Berkeley DB be linked with +the -lthread library. 
On systems where UI mutexes are the preferred mutex +support (e.g., SCO's UnixWare 2), they will be selected automatically. +<p><dt><a name="--enable-umrw">--enable-umrw</a><dd>Rational Software's Purify product and other run-time tools complain +about uninitialized reads/writes of structure fields whose only purpose +is padding, as well as when heap memory that was never initialized is +written to disk. Specify the --enable-umrw option during configuration +to mask these errors. This option should not be specified when +configuring to build production binaries, as you will lose a significant +amount of performance. +<p><dt><a name="--with-tcl=DIR">--with-tcl=DIR</a><dd>To build the Berkeley DB Tcl API, enter --with-tcl=DIR, replacing DIR with +the directory in which the Tcl tclConfig.sh file may be found. See +<a href="../../ref/tcl/intro.html">Loading Berkeley DB with Tcl</a> for information +on sites from which you can download Tcl and which Tcl versions are +compatible with Berkeley DB. To configure the Berkeley DB Tcl API, you must also +specify the --enable-dynamic option. 
+</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/flags.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/flags.html b/db/docs/ref/build_unix/flags.html new file mode 100644 index 000000000..5b70b3d8d --- /dev/null +++ b/db/docs/ref/build_unix/flags.html @@ -0,0 +1,60 @@ +<!--$Id: flags.so,v 10.6 2000/12/01 00:19:10 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Changing compile or load options</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/conf.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/install.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Changing compile or load options</h1> +<p>You can specify compiler and/or compile and load time flags by using +environment variables during Berkeley DB configuration. 
For example, if you +want to use a specific compiler, specify the CC environment variable +before running configure: +<p><blockquote><pre>prompt: env CC=gcc ../dist/configure</pre></blockquote> +<p>Using anything other than the native compiler will almost certainly mean +that you'll want to check the flags specified to the compiler and +loader, too. +<p>To specify debugging and optimization options for the C compiler, +use the CFLAGS environment variable: +<p><blockquote><pre>prompt: env CFLAGS=-O2 ../dist/configure</pre></blockquote> +<p>To specify header file search directories and other miscellaneous options +for the C preprocessor and compiler, use the CPPFLAGS environment variable: +<p><blockquote><pre>prompt: env CPPFLAGS=-I/usr/contrib/include ../dist/configure</pre></blockquote> +<p>To specify debugging and optimization options for the C++ compiler, +use the CXXFLAGS environment variable: +<p><blockquote><pre>prompt: env CXXFLAGS=-Woverloaded-virtual ../dist/configure</pre></blockquote> +<p>To specify miscellaneous options or additional library directories for +the linker, use the LDFLAGS environment variable: +<p><blockquote><pre>prompt: env LDFLAGS="-N32 -L/usr/local/lib" ../dist/configure</pre></blockquote> +<p>If you want to specify additional libraries, set the LIBS environment +variable before running configure. For example: +<p><blockquote><pre>prompt: env LIBS="-lposix -lsocket" ../dist/configure</pre></blockquote> +<p>would specify two additional libraries to load, "posix" and "socket". +<p>Make sure that you prepend -L to any library directory names and that you +prepend -I to any include file directory names! Also, if the arguments +you specify contain blank or tab characters, be sure to quote them as +shown above, i.e. with single or double quotes around the values you're +specifying for LIBS. +<p>The env command is available on most systems, and simply sets one or more +environment variables before running a command. 
If the env command is +not available to you, you can set the environment variables in your shell +before running configure. For example, in sh or ksh, you could do: +<p><blockquote><pre>prompt: LIBS="-lposix -lsocket" ../dist/configure</pre></blockquote> +<p>and in csh or tcsh, you could do: +<p><blockquote><pre>prompt: setenv LIBS "-lposix -lsocket" +prompt: ../dist/configure</pre></blockquote> +<p>See your command shell's manual page for further information. +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/conf.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/install.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/freebsd.html b/db/docs/ref/build_unix/freebsd.html new file mode 100644 index 000000000..3d3ff8116 --- /dev/null +++ b/db/docs/ref/build_unix/freebsd.html @@ -0,0 +1,57 @@ +<!--$Id: freebsd.so,v 11.12 2000/03/18 21:43:10 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: FreeBSD</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/aix.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a 
href="../../ref/build_unix/hpux.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>FreeBSD</h1> +<p><ol> +<p><li><b>I can't compile and run multi-threaded applications.</b> +<p>Special compile-time flags are required when compiling threaded +applications on FreeBSD. If you are compiling a threaded application, +you must compile with the _THREAD_SAFE and -pthread flags: +<p><blockquote><pre>cc -D_THREAD_SAFE -pthread ...</pre></blockquote> +<p>The Berkeley DB library will automatically build with the correct options. +<hr size=1 noshade> +<p><li><b>I get occasional failures when running RPC-based programs under FreeBSD clients.</b> +<p>There is a known bug in the XDR implementation in the FreeBSD C library, +from Version 2.2 up to version 4.0-RELEASE, that causes certain sized +messages to fail and return a zero-filled reply to the client. A bug +report (#16028) has been filed with FreeBSD. The following patch is the +FreeBSD fix: +<p><blockquote><pre>*** /usr/src/lib/libc/xdr/xdr_rec.c.orig Mon Jan 10 10:20:42 2000 +--- /usr/src/lib/libc/xdr/xdr_rec.c Wed Jan 19 10:53:45 2000 +*************** +*** 558,564 **** + * but we don't have any way to be certain that they aren't + * what the client actually intended to send us. + */ +! if ((header & (~LAST_FRAG)) == 0) + return(FALSE); + rstrm->fbtbc = header & (~LAST_FRAG); + return (TRUE); +--- 558,564 ---- + * but we don't have any way to be certain that they aren't + * what the client actually intended to send us. + */ +! 
if (header == 0) + return(FALSE); + rstrm->fbtbc = header & (~LAST_FRAG); + return (TRUE); +</pre></blockquote> +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/aix.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/hpux.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/hpux.html b/db/docs/ref/build_unix/hpux.html new file mode 100644 index 000000000..3fc50d73c --- /dev/null +++ b/db/docs/ref/build_unix/hpux.html @@ -0,0 +1,89 @@ +<!--$Id: hpux.so,v 11.11 2000/12/14 17:04:02 krinsky Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: HP-UX</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/freebsd.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/irix.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>HP-UX</h1> +<p><ol> +<p><li><b>I can't specify the <a href="../../api_c/env_open.html#DB_SYSTEM_MEM">DB_SYSTEM_MEM</a> flag to <a href="../../api_c/env_open.html">DBENV->open</a>.</b> +<p>The <b>shmget</b>(2) interfaces are not always used on HP-UX, even 
+though they exist, as anonymous memory allocated using <b>shmget</b>(2) +cannot be used to store the standard HP-UX msemaphore semaphores. For +this reason, it may not be possible to specify the <a href="../../api_c/env_open.html#DB_SYSTEM_MEM">DB_SYSTEM_MEM</a> +flag on some versions of HP-UX. (We have only seen this problem on HP-UX +10.XX, so the simplest workaround may be to upgrade your HP-UX release.) +<hr size=1 noshade> +<p><li><b>I can't specify both the <a href="../../api_c/env_open.html#DB_PRIVATE">DB_PRIVATE</a> and <a href="../../api_c/env_open.html#DB_THREAD">DB_THREAD</a> +flags to <a href="../../api_c/env_open.html">DBENV->open</a>.</b> +<p>It is not possible to store the standard HP-UX msemaphore semaphores in +memory returned by <b>malloc</b>(3) in some versions of HP-UX. For +this reason, it may not be possible to specify both the <a href="../../api_c/env_open.html#DB_PRIVATE">DB_PRIVATE</a> +and <a href="../../api_c/env_open.html#DB_THREAD">DB_THREAD</a> flags on some versions of HP-UX. (We have only seen +this problem on HP-UX 10.XX, so the simplest workaround may be to upgrade +your HP-UX release.) +<hr size=1 noshade> +<p><li><b>During configuration I see a message that large file support has +been turned off.</b> +<p>Some HP-UX system include files redefine "open" when big-file support (the +HAVE_FILE_OFFSET_BITS and _FILE_OFFSET_BITS #defines) is enabled. This +causes problems when compiling for C++, where "open" is a legal +identifier, used in the Berkeley DB C++ API. For this reason, we automatically +turn off big-file support when Berkeley DB is configured with a C++ API. This +should not be a problem for applications unless there is a need to create +databases larger than 2GB. +<hr size=1 noshade> +<p><li><b>I can't compile and run multi-threaded applications.</b> +<p>Special compile-time flags are required when compiling threaded +applications on HP-UX. 
If you are compiling a threaded application, you +must compile with the _REENTRANT flag: +<p><blockquote><pre>cc -D_REENTRANT ...</pre></blockquote> +<p>The Berkeley DB library will automatically build with the correct options. +<hr size=1 noshade> +<p><li><b>An ENOMEM error is returned from <a href="../../api_c/env_open.html">DBENV->open</a> or +<a href="../../api_c/env_remove.html">DBENV->remove</a>.</b> +<p>Due to the constraints of the PA-RISC memory architecture, HP-UX does not +allow a process to map a file into its address space multiple times. +For this reason, each Berkeley DB environment may be opened only once by a +process on HP-UX, i.e., calls to <a href="../../api_c/env_open.html">DBENV->open</a> will fail if the +specified Berkeley DB environment has been opened and not subsequently closed. +<hr size=1 noshade> +<p><li><b>When compiling with gcc, I see the following error: +<p><blockquote><pre>#error "Large Files (ILP32) not supported in strict ANSI mode."</pre></blockquote></b> +<p>We believe this is an error in the HP-UX include files, but we don't +really understand it. The only workaround we have found is to add +-D__STDC_EXT__ to the C preprocessor defines as part of compilation. +<hr size=1 noshade> +<p><li><b>When using the Tcl or Perl APIs (including running the test suite) I +see the error "Can't shl_load() a library containing Thread Local Storage".</b> +<p>This problem happens when HP-UX has been configured to use pthread mutex +locking and an attempt is made to call Berkeley DB using the Tcl or Perl APIs. We +have never found any way to fix this problem as part of the Berkeley DB build +process. To work around the problem, rebuild tclsh or perl and modify its build +process to explicitly link it against the HP-UX pthread library (currently +/usr/lib/libpthread.a). 
+<hr size=1 noshade> +<p><li><b>When running an executable that has been dynamically linked +against the Berkeley DB library, I see the error "Can't find path for shared library" +even though I've correctly set the SHLIB_PATH environment variable.</b> +<p>By default, some versions of HP-UX ignore the dynamic library search path +specified by the SHLIB_PATH environment variable. To work around this, specify +the "+s" flag to ld when linking, or run +<p><blockquote><pre>chatr +s enable -l /full/path/to/libdb-3.2.sl ...</pre></blockquote> +<p>on the executable that is not working. +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/freebsd.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/irix.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/install.html b/db/docs/ref/build_unix/install.html new file mode 100644 index 000000000..7beb6f705 --- /dev/null +++ b/db/docs/ref/build_unix/install.html @@ -0,0 +1,60 @@ +<!--$Id: install.so,v 10.12 2000/12/01 00:19:10 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Installing Berkeley DB</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/flags.html"><img 
src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/shlib.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Installing Berkeley DB</h1> +<p>Berkeley DB installs the following files into the following locations, with the +following default values: +<p><table border=1 align=center> +<tr><th>Configuration Variables</th><th>Default value</th></tr> +<tr><td>--prefix</td><td>/usr/local/BerkeleyDB.<b>Major</b>.<b>Minor</b></td></tr> +<tr><td>--exec_prefix</td><td>$(prefix)</td></tr> +<tr><td>--bindir</td><td>$(exec_prefix)/bin</td></tr> +<tr><td>--includedir</td><td>$(prefix)/include</td></tr> +<tr><td>--libdir</td><td>$(exec_prefix)/lib</td></tr> +<tr><td>docdir</td><td>$(prefix)/docs</td></tr> +<tr><th>Files</th><th>Default location</th></tr> +<tr><td>include files</td><td>$(includedir)</td></tr> +<tr><td>libraries</td><td>$(libdir)</td></tr> +<tr><td>utilities</td><td>$(bindir)</td></tr> +<tr><td>documentation</td><td>$(docdir)</td></tr> +</table> +<p>With one exception, this follows the GNU Autoconf and GNU Coding +Standards installation guidelines; please see that documentation for +more information and rationale. +<p>The single exception is the Berkeley DB documentation. The Berkeley DB +documentation is provided in HTML format, not in UNIX-style man or GNU +info format. For this reason, Berkeley DB configuration does not support +<b>--infodir</b> or <b>--mandir</b>. To change the default +installation location for the Berkeley DB documentation, modify the Makefile +variable, <b>docdir</b>. +<p>To move the entire installation tree to somewhere besides +<b>/usr/local</b>, change the value of <b>prefix</b>. +<p>To move the binaries and libraries to a different location, change the +value of <b>exec_prefix</b>. The values of <b>includedir</b> and +<b>libdir</b> may be similarly changed.
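+<p>As a minimal sketch of how the defaults in the table above resolve, the following shell fragment expands them for an assumed release of 3.2. The <b>Major</b>.<b>Minor</b> values are illustrative only, and the shell variables simply mirror the table; they are not read from the Berkeley DB Makefile.

```shell
# Assumed release numbers, for illustration only; substitute your own.
major=3
minor=2

# Defaults from the table: each directory is derived from prefix
# or exec_prefix, and exec_prefix itself defaults to prefix.
prefix=/usr/local/BerkeleyDB.$major.$minor
exec_prefix=$prefix
bindir=$exec_prefix/bin
includedir=$prefix/include
libdir=$exec_prefix/lib
docdir=$prefix/docs

# Show where utilities, include files, libraries and documentation land.
printf '%s\n' "$bindir" "$includedir" "$libdir" "$docdir"
```

+<p>Changing only <b>prefix</b> in this fragment moves all four directories, because each default is derived from it.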
+<p>Any of these values except for <b>docdir</b> may be set as part +of configuration: +<p><blockquote><pre>prompt: ../dist/configure --bindir=/usr/local/bin</pre></blockquote> +<p>Any of these values, including <b>docdir</b>, may be changed when doing +the install itself: +<p><blockquote><pre>prompt: make prefix=/usr/contrib/bdb install</pre></blockquote> +<p>The Berkeley DB installation process will attempt to create any directories that +do not already exist on the system. +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/flags.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/shlib.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/intro.html b/db/docs/ref/build_unix/intro.html new file mode 100644 index 000000000..b2c0d613b --- /dev/null +++ b/db/docs/ref/build_unix/intro.html @@ -0,0 +1,60 @@ +<!--$Id: intro.so,v 10.18 2000/12/04 18:05:41 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Building for UNIX</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/debug/common.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a 
href="../../ref/build_unix/conf.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Building for UNIX</h1> +<p>The Berkeley DB distribution builds up to four separate libraries: the base C +API Berkeley DB library and the optional C++, Java and Tcl API libraries. For +portability reasons, each library is standalone and contains the full Berkeley DB +support necessary to build applications; that is, the C++ API Berkeley DB +library does not require any other Berkeley DB libraries to build and run C++ +applications. +<p>The Berkeley DB distribution uses the Free Software Foundation's +<a href="http://sourceware.cygnus.com/autoconf">autoconf</a> and +<a href="http://www.gnu.org/software/libtool/libtool.html">libtool</a> +tools to build on UNIX platforms. In general, the standard configuration +and installation options for these tools apply to the Berkeley DB distribution. +<p>To perform the default UNIX build of Berkeley DB, first change to the +<b>build_unix</b> directory, and then enter the following two commands: +<p><blockquote><pre>../dist/configure +make</pre></blockquote> +<p>This will build the Berkeley DB library.
+<p>To install the Berkeley DB library, enter: +<p><blockquote><pre>make install</pre></blockquote> +<p>To rebuild Berkeley DB, enter: +<p><blockquote><pre>make clean +make</pre></blockquote> +<p>If you change your mind about how Berkeley DB is to be configured, you must start +from scratch by entering: +<p><blockquote><pre>make realclean +../dist/configure +make</pre></blockquote> +<p>To build multiple UNIX versions of Berkeley DB in the same source tree, create a +new directory at the same level as the build_unix directory, and then +configure and build in that directory: +<p><blockquote><pre>mkdir build_bsdos3.0 +cd build_bsdos3.0 +../dist/configure +make</pre></blockquote> +<p>If you have trouble with any of these commands, please send email to the +addresses found in the Sleepycat Software contact information. In that +email, please provide a complete copy of the commands that you entered +and any output, along with a copy of any <b>config.log</b> or +<b>config.cache</b> files created during configuration. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/debug/common.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/conf.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/irix.html b/db/docs/ref/build_unix/irix.html new file mode 100644 index 000000000..af31b6e68 --- /dev/null +++ b/db/docs/ref/build_unix/irix.html @@ -0,0 +1,30 @@ +<!--$Id: irix.so,v 11.4 2000/03/18 21:43:10 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: IRIX</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/hpux.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/linux.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>IRIX</h1> +<p><ol> +<p><li><b>I can't compile and run multi-threaded applications.</b> +<p>Special compile-time flags are required when compiling threaded +applications on IRIX. 
If you are compiling a threaded application, you +must compile with the _SGI_MP_SOURCE flag: +<p><blockquote><pre>cc -D_SGI_MP_SOURCE ...</pre></blockquote> +<p>The Berkeley DB library will automatically build with the correct options. +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/hpux.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/linux.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/linux.html b/db/docs/ref/build_unix/linux.html new file mode 100644 index 000000000..b6e2b93fb --- /dev/null +++ b/db/docs/ref/build_unix/linux.html @@ -0,0 +1,30 @@ +<!--$Id: linux.so,v 11.4 2000/03/18 21:43:10 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Linux</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/irix.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/osf1.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Linux</h1> +<p><ol> +<p><li><b>I can't compile and run multi-threaded applications.</b> +<p>Special compile-time flags are 
required when compiling threaded +applications on Linux. If you are compiling a threaded application, you +must compile with the _REENTRANT flag: +<p><blockquote><pre>cc -D_REENTRANT ...</pre></blockquote> +<p>The Berkeley DB library will automatically build with the correct options. +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/irix.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/osf1.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/notes.html b/db/docs/ref/build_unix/notes.html new file mode 100644 index 000000000..dcb975e3c --- /dev/null +++ b/db/docs/ref/build_unix/notes.html @@ -0,0 +1,138 @@ +<!--$Id: notes.so,v 10.42 2001/01/09 18:49:53 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Architecture independent FAQs</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/test.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/aix.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Architecture independent 
FAQs</h1> +<p><ol> +<p><li><b>When compiling with gcc, I get unreferenced symbols, for example: +<p><blockquote><pre>symbol __muldi3: referenced symbol not found +symbol __cmpdi2: referenced symbol not found</pre></blockquote></b> +<p>On systems where they're available (e.g., HP-UX, Solaris), Berkeley DB uses +64-bit integral types. As far as we can tell, some versions of gcc +don't support these types. The simplest workaround is to reconfigure +Berkeley DB using the --disable-bigfile configuration option, and then rebuild. +<hr size=1 noshade> +<p><li><b>My C++ program traps during a failure in a DB call on my +gcc-based system.</b> +<p>We believe there are some severe bugs in the implementation of exceptions +for some gcc compilers. Exceptions require some interaction between the +compiler, assembler, and runtime libraries, and we're not sure exactly what +is at fault, but one failing combination is gcc 2.7.2.3 running on SuSE +Linux 6.0. The problem on this system can be seen with a rather simple +test case of an exception thrown from a shared library and caught in the +main program. +<p>A variation of this problem seems to occur on AIX, although we believe it +does not necessarily involve shared libraries on that platform. +<p>If you see a trap that occurs when an exception might be thrown by the DB +runtime, we suggest that you use static libraries instead of dynamic +(shared) libraries. See the documentation for configuration. If this +doesn't work, and you have a choice of compilers, try using a more recent +gcc or a non-gcc based compiler to build Berkeley DB. +<p>Finally, you can disable the use of exceptions in the C++ runtime for +Berkeley DB by using the <a href="../../api_c/db_create.html#DB_CXX_NO_EXCEPTIONS">DB_CXX_NO_EXCEPTIONS</a> flag with +<a href="../../api_c/env_create.html">db_env_create</a> or <a href="../../api_c/db_create.html">db_create</a>. When this flag is on, all +C++ methods fail by returning an error code rather than throwing an +exception.
+<hr size=1 noshade> +<p><li><b>I get unexpected results and database corruption when running +threaded programs.</b> +<p><b>I get error messages that mutex (e.g., pthread_mutex_XXX or +mutex_XXX) functions are undefined when linking applications with Berkeley DB.</b> +<p>On some architectures (e.g., Solaris and HP-UX), the Berkeley DB library uses the ISO POSIX standard +pthreads and UNIX International (UI) threads interfaces for underlying +mutex support. To resolve the undefined references, specify the appropriate compiler or +compiler flags, or link with the appropriate thread library when loading +your application: +<p><blockquote><pre>cc ... -lpthread ... +cc ... -lthread ... +xlc_r ... +cc ... -mt ...</pre></blockquote> +<p>See the appropriate architecture-specific Reference Guide pages for more +information. +<p>On systems where more than one type of mutex is available, it may be +necessary for applications to use the same threads package from which +Berkeley DB draws its mutexes, e.g., if Berkeley DB was built to use the POSIX +pthreads mutex calls for mutex support, the application may need to be +written to use the POSIX pthreads interfaces for its threading model. +While this is only conjecture at this time and we know of no systems that +actually have this requirement, it's not unlikely that some exist. +<p>In a few cases, Berkeley DB can be configured to use specific underlying mutex +interfaces. You can use the <a href="../../ref/build_unix/conf.html#--enable-posixmutexes">--enable-posixmutexes</a> and +<a href="../../ref/build_unix/conf.html#--enable-uimutexes">--enable-uimutexes</a> configuration options to specify the POSIX and Unix +International (UI) threads packages. This should not, however, be +necessary in most cases. +<p>In some cases, it is vitally important to make sure that you load +the correct library.
For example, on Solaris systems, there are POSIX +pthread interfaces in the C library, and so applications can link Berkeley DB +using only the C library and not see any undefined symbols. However, the C +library POSIX pthread mutex support is insufficient for Berkeley DB and Berkeley DB +cannot detect that fact. Similar errors can arise when applications +(e.g., tclsh) use dlopen to dynamically load Berkeley DB as a library. +<p>If you are seeing problems in this area after you've confirmed that you're +linking with the correct libraries, there are two other things you can +try. First, if your platform supports inter-library dependencies, we +recommend that you change the Berkeley DB Makefile to specify the appropriate +threads library when creating the Berkeley DB dynamic library, as an +inter-library dependency. Second, if your application is using dlopen to +dynamically load Berkeley DB, specify the appropriate thread library on the link +line when you load the application itself. +<hr size=1 noshade> +<p><li><b>I get core dumps when running programs that fork children.</b> +<p>Berkeley DB handles should not be shared across process forks; each forked +child should acquire its own Berkeley DB handles. +<hr size=1 noshade> +<p><li><b>I get reports of uninitialized memory reads and writes when +running software analysis tools (e.g., Rational Software Corp.'s Purify +tool).</b> +<p>For performance reasons, Berkeley DB does not write the unused portions of +database pages or fill in unused structure fields. To turn off these +errors when running software analysis tools, build with the +--enable-umrw configuration option. +<hr size=1 noshade> +<p><li><b>Berkeley DB programs or the test suite fail unexpectedly.</b> +<p>The Berkeley DB architecture does not support placing the shared memory regions +on remote filesystems, e.g., the Network File System (NFS) or the Andrew +File System (AFS).
For this reason, the shared memory regions (normally +located in the database home directory) must reside on a local filesystem. +See <a href="../../ref/env/region.html">Shared Memory Regions</a> for more +information. +<p>With respect to running the test suite, always check to make sure that +TESTDIR is not on a remote mounted filesystem. +<hr size=1 noshade> +<p><li><b>The <a href="../../utility/db_dump.html">db_dump185</a> utility fails to build.</b> +<p>The <a href="../../utility/db_dump.html">db_dump185</a> utility is the utility that supports conversion +of Berkeley DB 1.85 and earlier databases to current database formats. If +the errors look something like: +<p><blockquote><pre>cc -o db_dump185 db_dump185.o +ld: +Unresolved: +dbopen</pre></blockquote> +<p>it means that the Berkeley DB 1.85 code was not found in the standard +libraries. To build <a href="../../utility/db_dump.html">db_dump185</a>, the Berkeley DB version 1.85 code +must have already been built and installed on the system. 
If the Berkeley DB +1.85 header file is not found in a standard place, or the library is +not part of the standard libraries used for loading, you will need to +edit your Makefile, and change the lines: +<p><blockquote><pre>DB185INC= +DB185LIB=</pre></blockquote> +<p>So that the system Berkeley DB 1.85 header file and library are found, e.g., +<p><blockquote><pre>DB185INC=/usr/local/include +DB185LIB=-ldb185</pre></blockquote> +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/test.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/aix.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/osf1.html b/db/docs/ref/build_unix/osf1.html new file mode 100644 index 000000000..42ac8e767 --- /dev/null +++ b/db/docs/ref/build_unix/osf1.html @@ -0,0 +1,30 @@ +<!--$Id: osf1.so,v 11.6 2000/10/30 20:46:06 sue Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: OSF/1</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/linux.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/qnx.html"><img src="../../images/next.gif" 
alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>OSF/1</h1> +<p><ol> +<p><li><b>I can't compile and run multi-threaded applications.</b> +<p>Special compile-time flags are required when compiling threaded +applications on OSF/1. If you are compiling a threaded application, you +must compile with the _REENTRANT flag: +<p><blockquote><pre>cc -D_REENTRANT ...</pre></blockquote> +<p>The Berkeley DB library will automatically build with the correct options. +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/linux.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/qnx.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/qnx.html b/db/docs/ref/build_unix/qnx.html new file mode 100644 index 000000000..29c90dc98 --- /dev/null +++ b/db/docs/ref/build_unix/qnx.html @@ -0,0 +1,58 @@ +<!--$Id: qnx.so,v 11.5 2000/11/29 15:03:24 sue Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: QNX</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/osf1.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/sco.html"><img 
src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>QNX</h1> +<p><ol> +<p><li><b>To what versions of QNX has DB been ported?</b> +<p>Berkeley DB has been ported to the QNX Neutrino technology which is commonly +referred to as QNX RTP (Real-Time Platform). Berkeley DB has not been +ported to earlier versions of QNX, such as QNX 4.25. +<p><li><b>What is the impact of QNX's use of <b>shm_open</b>(2) for +shared memory regions?</b> +<p>QNX requires the use of the POSIX <b>shm_open</b>(2) and +<b>shm_unlink</b>(2) calls for shared memory regions that will later +be mapped into memory using <b>mmap</b>(2). QNX's implementation +of the shared memory functions requires that the name given must begin +with a slash, and that no other slash may appear in the name. +<p>In order to comply with those requirements and allow relative pathnames +to find the same environment, Berkeley DB uses only the last component of the +home directory path and the name of the shared memory file, separated +by a colon, as the name specified to the shared memory functions. For +example, if an application specifies a home directory of +<b>/home/db/DB_DIR</b>, Berkeley DB will use <b>/DB_DIR:__db.001</b> as +the name for the shared memory area argument to <b>shm_open</b>(2). +<p>The impact of this decision is that the last component of all +environment home directory pathnames on QNX must be unique with respect +to each other. Additionally, Berkeley DB requires that environments use home +directories for QNX in order to generate a reasonable entry in the +shared memory area. +<p><li><b>What are the implications of QNX's requirement to use +<b>shm_open</b>(2) in order to use <b>mmap</b>(2)?</b> +<p>QNX requires that files mapped with <b>mmap</b>(2) be opened using +<b>shm_open</b>(2). There are other places in addition to the +environment shared memory regions, where Berkeley DB tries to memory map files +if it can. 
+<p>The memory pool subsystem normally attempts to use <b>mmap</b>(2) +even when using private memory, as indicated by the <a href="../../api_c/env_open.html#DB_PRIVATE">DB_PRIVATE</a> +flag to <a href="../../api_c/env_open.html">DBENV->open</a>. In the case of QNX, if an application is +using private memory, Berkeley DB will not attempt to map the memory and will +instead use the local cache. +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/osf1.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/sco.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/sco.html b/db/docs/ref/build_unix/sco.html new file mode 100644 index 000000000..dda8e6d1d --- /dev/null +++ b/db/docs/ref/build_unix/sco.html @@ -0,0 +1,29 @@ +<!--$Id: sco.so,v 11.7 2000/10/30 20:46:06 sue Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: SCO</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/qnx.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/solaris.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> 
+<h1 align=center>SCO</h1> +<p><ol> +<p><li><b>If I build with gcc, programs like db_dump and db_stat dump core immediately +when invoked.</b> +<p>We suspect gcc or the runtime loader may have a bug, but we haven't +tracked it down. If you want to use gcc, we suggest building static +libraries. +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/qnx.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/solaris.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/shlib.html b/db/docs/ref/build_unix/shlib.html new file mode 100644 index 000000000..2819651cd --- /dev/null +++ b/db/docs/ref/build_unix/shlib.html @@ -0,0 +1,94 @@ +<!--$Id: shlib.so,v 10.9 2000/03/18 21:43:10 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Dynamic shared libraries</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/install.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/test.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Dynamic shared libraries</h1> +<p><b>Warning</b>:
the following information is intended to be generic and +is likely to be correct for most UNIX systems. Unfortunately, dynamic +shared libraries are not standard between UNIX systems, so there may be +information here that is not correct for your system. If you have +problems, consult your compiler and linker manual pages or your system +administrator. +<p>The Berkeley DB dynamic shared libraries are created with the name +libdb-<b>major</b>.<b>minor</b>.so, where <b>major</b> is the major +version number and <b>minor</b> is the minor version number. Other +shared libraries are created if Java and Tcl support are enabled, +specifically libdb_java-<b>major</b>.<b>minor</b>.so and +libdb_tcl-<b>major</b>.<b>minor</b>.so. +<p>On most UNIX systems, when any shared library is created, the linker +stamps it with a "SONAME". In the case of Berkeley DB, the SONAME is +libdb-<b>major</b>.<b>minor</b>.so. It is important to realize that +applications linked against a shared library remember the SONAMEs of the +libraries they use and not the underlying names in the filesystem. +<p>When the Berkeley DB shared library is installed, links are created in the +install lib directory so that libdb-<b>major</b>.<b>minor</b>.so, +libdb-<b>major</b>.so and libdb.so all reference the same library. This +library will have an SONAME of libdb-<b>major</b>.<b>minor</b>.so. +<p>Any previous versions of the Berkeley DB libraries that are present in the +install directory (such as libdb-2.7.so or libdb-2.so) are left unchanged. +(Removing or moving old shared libraries is one drastic way to identify +applications that have been linked against those vintage releases.) +<p>Once you have installed the Berkeley DB libraries, unless they are installed in +a directory where the linker normally looks for shared libraries, you will +need to specify the installation directory as part of compiling and +linking against Berkeley DB. 
Consult your system manuals or system +administrator for ways to specify a shared library directory when +compiling and linking applications with the Berkeley DB libraries. Many systems +support environment variables (e.g., LD_LIBRARY_PATH, LD_RUN_PATH), or +system configuration files (e.g., /etc/ld.so.conf) for this purpose. +<p><b>Warning</b>: some UNIX installations may have an already existing +<b>/usr/lib/libdb.so</b>, and this library may be an incompatible +version of Berkeley DB. +<p>We recommend that applications link against libdb.so (e.g., using -ldb). +Even though the linker uses the file named libdb.so, the executable file +for the application remembers the library's SONAME +(libdb-<b>major</b>.<b>minor</b>.so). This has the effect of marking +the applications with the versions they need at link time. Because +applications locate their needed SONAMEs when they are executed, all +previously linked applications will continue to run using the library they +were linked with, even when a new version of Berkeley DB is installed and the +file <b>libdb.so</b> is replaced with a new version. +<p>Applications that know they are using features specific to a particular +Berkeley DB release can be linked to that release. For example, an application +wanting to link to Berkeley DB major release "3" can link using -ldb-3, and +applications that know about a particular minor release number can specify +both major and minor release numbers, for example, -ldb-3.5. +<p>If you want to link with Berkeley DB before performing library installation, +the "make" command will have created a shared library object in the +<b>.libs</b> subdirectory of the build directory, such as +<b>build_unix/.libs/libdb-major.minor.so</b>. 
If you want to link a +file against this library, with, for example, a major number of "3" and +a minor number of "5", you should be able to do something like: +<p><blockquote><pre>cc -L BUILD_DIRECTORY/.libs -o testprog testprog.o -ldb-3.5 +env LD_LIBRARY_PATH="BUILD_DIRECTORY/.libs:$LD_LIBRARY_PATH" ./testprog</pre></blockquote> +<p>where <b>BUILD_DIRECTORY</b> is the full directory path to the directory +where you built Berkeley DB. +<p>The libtool program (which is configured in the build_unix directory) can +be used to set the shared library path and run a program. For example, +<p><blockquote><pre>libtool gdb db_dump</pre></blockquote> +<p>runs the gdb debugger on the db_dump utility after setting the appropriate +paths. Libtool may not know what to do with arbitrary commands (it is +hardwired to recognize "gdb" and some other commands). If it complains, +the --mode argument will usually resolve the problem: +<p><blockquote><pre>libtool --mode=execute my_debugger db_dump</pre></blockquote> +<p>On most systems, using libtool in this way is exactly equivalent to +setting the LD_LIBRARY_PATH environment variable and then executing the +program. On other systems, libtool has the virtue of knowing about +any other details needed on systems that don't behave in this typical way. 
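<p>The LD_LIBRARY_PATH setting shown above can be sketched as a small shell helper. This is only an illustration, not part of the Berkeley DB distribution: the function name <b>prepend_path</b> and the default build path are hypothetical. It avoids one detail of the plain form, which leaves a trailing colon when LD_LIBRARY_PATH is unset; on some systems the dynamic linker treats an empty path entry as the current directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of Berkeley DB): prepend a directory to a
# colon-separated search path.
# Usage: prepend_path DIR [EXISTING_PATH]
prepend_path() {
    dir=$1
    existing=${2-}
    # ${existing:+:$existing} expands to ":$existing" only when $existing
    # is non-empty, so an unset variable does not leave a trailing colon.
    printf '%s\n' "${dir}${existing:+:$existing}"
}

# BUILD_DIRECTORY is a placeholder; substitute the directory where you
# built Berkeley DB.
BUILD_DIRECTORY=${BUILD_DIRECTORY:-/path/to/db/build_unix}
LIB_PATH=$(prepend_path "$BUILD_DIRECTORY/.libs" "${LD_LIBRARY_PATH-}")

# Print the command rather than running it, as testprog is only an example.
echo "env LD_LIBRARY_PATH=$LIB_PATH ./testprog"
```

<p>The script only prints the command it would run; replace the final echo with the command itself once the path looks right.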
+<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/install.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/test.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/solaris.html b/db/docs/ref/build_unix/solaris.html new file mode 100644 index 000000000..8239537a8 --- /dev/null +++ b/db/docs/ref/build_unix/solaris.html @@ -0,0 +1,90 @@ +<!--$Id: solaris.so,v 11.14 2000/09/13 17:22:20 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Solaris</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/sco.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/sunos.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Solaris</h1> +<p><ol> +<p><li><b>I can't compile and run multi-threaded applications.</b> +<p>Special compile-time flags and additional libraries are required when +compiling threaded applications on Solaris. 
If you are compiling a +threaded application, you must compile with the -D_REENTRANT flag and link +with the libpthread or libthread libraries: +<p><blockquote><pre>cc -mt ... +cc -D_REENTRANT ... -lthread +cc -D_REENTRANT ... -lpthread</pre></blockquote> +<p>The Berkeley DB library will automatically build with the correct options. +<hr size=1 noshade> +<p><li><b>I've installed gcc on my Solaris system, but configuration +fails because the compiler doesn't work.</b> +<p>On some versions of Solaris, there is a cc executable in the user's path, +but all it does is display an error message and fail: +<p><blockquote><pre>% which cc +/usr/ucb/cc +% cc +/usr/ucb/cc: language optional software package not installed</pre></blockquote> +<p>As Berkeley DB always uses the native compiler in preference to gcc, this is a +fatal error. If the error message you're seeing is: +<p><blockquote><pre>checking whether the C compiler (cc -O ) works... no +configure: error: installation or configuration problem: C compiler cannot create executables.</pre></blockquote> +<p>then this may be the problem you're seeing. The simplest workaround is +to set your CC environment variable to the system compiler, e.g.: +<p><blockquote><pre>env CC=gcc ../dist/configure</pre></blockquote> +<p>and reconfigure. +<p>If you are using the --enable-cxx option, you may also want to specify +a C++ compiler, e.g.: +<p><blockquote><pre>env CC=gcc CCC=g++ ../dist/configure</pre></blockquote> +<hr size=1 noshade> +<p><li><b>I get the error +"libc internal error: _rmutex_unlock: rmutex not held", followed by a core +dump, when running threaded or Java programs.</b> +<p>This is a known bug in Solaris 2.5 and it is fixed by Sun patch 103187-25. 
+<hr size=1 noshade> +<p><li><b>I get error reports of non-existent files, corrupted metadata +pages and core dumps.</b> +<p>Solaris 7 contains a bug in the threading libraries (-lpthread, -lthread) +which causes the wrong version of the pwrite routine to be linked into +the application if the thread library is linked in after the C +library. The result will be that the pwrite function is called rather +than pwrite64. To work around the problem, use an explicit link order +when creating your application. +<p>Sun Microsystems is tracking this problem with Bug Ids 4291109 and 4267207, +and patch 106980-09 to Solaris 7 fixes the problem. +<p><blockquote><pre>Bug Id: 4291109 +Duplicate of: 4267207 +Category: library +Subcategory: libthread +State: closed +Synopsis: pwrite64 mapped to pwrite +Description: +When libthread is linked after libc, there is a table of functions in +libthread that gets "wired into" libc via _libc_threads_interface(). +The table in libthread is wrong in both Solaris 7 and on28_35 for the +TI_PWRITE64 row (see near the end).</pre></blockquote> +<hr size=1 noshade> +<p><li><b>During configuration I see a message that large file support has +been turned off.</b> +<p>The Solaris 8 system include files redefine "open" when big-file support (the +HAVE_FILE_OFFSET_BITS and _FILE_OFFSET_BITS #defines) is enabled. This +causes problems when compiling for C++, where "open" is a legal +identifier, used in the Berkeley DB C++ API. For this reason, we automatically +turn off big-file support when Berkeley DB is configured with a C++ API. This +should not be a problem for applications unless there is a need to create +databases larger than 2GB. 
+</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/sco.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/sunos.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/sunos.html b/db/docs/ref/build_unix/sunos.html new file mode 100644 index 000000000..cecccaefb --- /dev/null +++ b/db/docs/ref/build_unix/sunos.html @@ -0,0 +1,30 @@ +<!--$Id: sunos.so,v 11.4 2000/03/18 21:43:10 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: SunOS</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/solaris.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/ultrix.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>SunOS</h1> +<p><ol> +<p><li><b>I can't specify the <a href="../../api_c/env_open.html#DB_SYSTEM_MEM">DB_SYSTEM_MEM</a> flag to <a href="../../api_c/env_open.html">DBENV->open</a>.</b> +<p>The <b>shmget</b>(2) interfaces are not used on SunOS releases prior +to 5.0, even though they apparently exist, as the distributed include +files did not allow them to be 
compiled. For this reason, it will not be +possible to specify the <a href="../../api_c/env_open.html#DB_SYSTEM_MEM">DB_SYSTEM_MEM</a> flag on those versions of +SunOS. +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/solaris.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/ultrix.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/test.html b/db/docs/ref/build_unix/test.html new file mode 100644 index 000000000..9ae398980 --- /dev/null +++ b/db/docs/ref/build_unix/test.html @@ -0,0 +1,49 @@ +<!--$Id: test.so,v 10.19 2000/06/28 14:33:57 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Running the test suite under UNIX</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/shlib.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/notes.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Running the test suite under UNIX</h1> +<p>The Berkeley DB test suite is built if you specify --enable-test as an +argument when configuring Berkeley DB. 
+<p>Before running the tests for the first time, you may need to edit the +<b>include.tcl</b> file in your build directory. The Berkeley DB +configuration assumes you intend to use the version of the tclsh utility +included in the Tcl installation with which Berkeley DB was configured to run +the test suite, and further assumes that the test suite will be run with +the libraries pre-built in the Berkeley DB build directory. If either of these +assumptions is incorrect, you will need to edit the <b>include.tcl</b> +file and change the line that reads: +<p><blockquote><pre>set tclsh_path ...</pre></blockquote> +<p>to correctly specify the full path to the version of tclsh with which you +are going to run the test suite. You may also need to change the line +that reads: +<p><blockquote><pre>set test_path ...</pre></blockquote> +<p>to correctly specify the path from the directory where you are running +the test suite to the location of the Berkeley DB Tcl API library you built. +It may not be necessary that this be a full path if you have configured +your system's dynamic shared library mechanisms to search the directory +where you built or installed the Tcl library. +<p>All Berkeley DB tests are run from within <b>tclsh</b>. After starting tclsh, +you must source the file <b>test.tcl</b> in the test directory. For +example, if you built in the <b>build_unix</b> directory of the +distribution, this would be done using the command: +<p><blockquote><pre>% source ../test/test.tcl</pre></blockquote> +<p>Once you have executed that command and the "%" prompt has returned +without errors, you are ready to run tests in the test suite. 
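<p>Before editing <b>include.tcl</b>, it can help to see what the two settings described above currently contain. The following is a minimal sketch, assuming the file uses plain "set tclsh_path ..." and "set test_path ..." lines as shown; the function name <b>show_test_settings</b> is hypothetical and not part of the distribution.

```shell
#!/bin/sh
# Hypothetical helper (not part of Berkeley DB): print the tclsh_path and
# test_path settings found in an include.tcl file.
# Usage: show_test_settings FILE
show_test_settings() {
    # Match lines of the form "set tclsh_path ..." or "set test_path ...",
    # allowing leading whitespace.
    grep -E '^[[:space:]]*set[[:space:]]+(tclsh_path|test_path)[[:space:]]' "$1"
}

# When run from the build directory, report the current settings.
if [ -f include.tcl ]; then
    show_test_settings include.tcl
fi
```

<p>If either printed path does not match the tclsh or library you intend to use, edit the corresponding line as described above.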
+<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/shlib.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/notes.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_unix/ultrix.html b/db/docs/ref/build_unix/ultrix.html new file mode 100644 index 000000000..e71946c88 --- /dev/null +++ b/db/docs/ref/build_unix/ultrix.html @@ -0,0 +1,27 @@ +<!--$Id: ultrix.so,v 11.4 2000/03/18 21:43:10 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Ultrix</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for UNIX systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/sunos.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_win/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Ultrix</h1> +<p><ol> +<p><li><b>Configuration complains that mmap(2) interfaces aren't being used.</b> +<p>The <b>mmap</b>(2) interfaces are not used on Ultrix, even though +they exist, as they are known to not work correctly. 
+</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/sunos.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_win/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_vxworks/faq.html b/db/docs/ref/build_vxworks/faq.html new file mode 100644 index 000000000..cea733d7f --- /dev/null +++ b/db/docs/ref/build_vxworks/faq.html @@ -0,0 +1,85 @@ +<!--$Id: faq.so,v 1.12 2000/12/21 18:33:43 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: VxWorks FAQ</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for VxWorks systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_vxworks/notes.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade/process.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>VxWorks FAQ</h1> +<p><ol> +<p><li><b>Can I run the test suite under VxWorks?</b> +<p>The test suite requires the Berkeley DB Tcl library. In turn, this library +requires Tcl 8.1 or greater. In order to run the test suite, you would +need to port Tcl 8.1 or greater to VxWorks. 
The Tcl shell included in +<i>windsh</i> is not adequate for two reasons. First, it is based on +Tcl 8.0. Second, it does not include the necessary Tcl components for +adding a Tcl extension. +<p><li><b>Are all Berkeley DB features available for VxWorks?</b> +<p>All Berkeley DB features are available for VxWorks with the exception of the +<a href="../../api_c/db_open.html#DB_TRUNCATE">DB_TRUNCATE</a> flag for <a href="../../api_c/db_open.html">DB->open</a>. The underlying mechanism +needed for that flag is not available consistently across different file +systems for VxWorks. +<p><li><b>Are there any constraints using particular file system drivers?</b> +<p>There are constraints using the dosFs file systems with Berkeley DB. Namely, +you must configure your dosFs file system to support long file names if +you are using Berkeley DB logging in your application. The VxWorks dosFs +1.0 file system, by default, uses the old MS-DOS 8.3 file naming +constraints, restricting to 8 character file names with a 3 character +extension. If you have configured with VxWorks' dosFs 2.0 you should +be compatible with Windows FAT32 filesystems, which support long +filenames. +<p><li><b>Are there any dependencies on particular file system drivers?</b> +<p>There is one dependency on specifics of file system drivers in the port +of Berkeley DB to VxWorks. Berkeley DB synchronizes data using the FIOSYNC function +to ioctl() (another option would have been to use the FIOFLUSH function +instead). The FIOSYNC function was chosen because the NFS client driver, +nfsDrv, only supports it and doesn't support FIOFLUSH. All local file +systems, as of VxWorks 5.4, support FIOSYNC with the exception of +rt11fsLib, which only supports FIOFLUSH. To use rt11fsLib, you will need +to modify the os/os_fsync.c file to use the FIOFLUSH function; note that +rt11fsLib cannot work with NFS clients. 
+<p><li><b>Are there any known file system problems?</b> +<p>During the course of our internal testing we came across two problems +with the dosFs 2.0 file system that warranted patches from Wind River Systems. +You should ask Wind River Systems for the patches to these +problems if you encounter them. +<p>The first problem is that files will seem to disappear. You should +look at <b>SPR 31480</b> in the Wind River Systems' Support pages for +a more detailed description of this problem. +<p>The second problem is a semaphore deadlock within the dosFs file system +code. Looking at a stack trace via CrossWind, you will see two or more of +your application's tasks waiting in semaphore code within dosFs. The patch +for this problem is under <b>SPR 33221</b> at Wind River Systems. +<p><li><b>Are there any file systems I cannot use?</b> +<p>The Target Server File System (TSFS) uses the netDrv driver. This driver +does not support any ioctl that allows flushing to the disk and therefore +cannot be used with Berkeley DB. +<p><li><b>Why aren't the utility programs part of the project?</b> +<p>The utility programs, in their Unix-style form, are not ported to VxWorks. +The reasoning is that the utility programs are essentially wrappers for the +specific Berkeley DB interface they call. Their interface and generic model +are not the appropriate paradigm for VxWorks. It is most likely that +specific applications will want to spawn tasks that call the appropriate +Berkeley DB function to perform the actions of some utility programs, using +VxWorks native functions. For example, an application that spawns several +tasks that all may operate on the same database would also want to spawn +a task that calls <a href="../../api_c/lock_detect.html">lock_detect</a> for deadlock detection, but specific +to the environment used for that application. 
+<p><li><b>What VxWorks primitives are used for mutual exclusion in Berkeley DB?</b> +<p>Mutexes inside of Berkeley DB use the basic binary semaphores in VxWorks. The +mutexes are created using the FIFO queue type. +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_vxworks/notes.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade/process.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_vxworks/intro.html b/db/docs/ref/build_vxworks/intro.html new file mode 100644 index 000000000..593b8a1e6 --- /dev/null +++ b/db/docs/ref/build_vxworks/intro.html @@ -0,0 +1,86 @@ +<!--$Id: intro.so,v 1.7 2000/08/10 17:54:49 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Building for VxWorks</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for VxWorks systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_win/faq.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_vxworks/notes.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Building for VxWorks</h1> +<p>The build_vxworks directory in the Berkeley DB distribution contains a workspace +and 
project files for Tornado 2.0. +<p><table border=1 align=center> +<tr><th>File</th><th>Description</th></tr> +<tr> <td align=center>Berkeley DB.wsp</td> <td align=center>Berkeley DB Workspace file</td> </tr> +<tr> <td align=center>Berkeley DB.wpj</td> <td align=center>Berkeley DB Project file</td> </tr> +<tr> <td align=center>ex_*/*.wpj</td> <td align=center>Example programs project files</td> </tr> +</table> +<h3>Building With Tornado 2.0</h3> +<p>Open the workspace <b>Berkeley DB.wsp</b>. The list of projects +in this workspace will be shown. These projects were created for +the x86 BSP for VxWorks. +<p>The remainder of this document assumes you already have a +VxWorks target and a target server, both up and running. +<p>First, you'll need to set the include directories. +To do this, go to the <i>Builds</i> tab for the workspace. +Open up <i>Berkeley DB Builds</i>. You will see several different +builds, containing different configurations. All of the projects +in the Berkeley DB workspace are created to be downloadable applications. +<p><table border=1 align=center> +<tr><th>Build</th><th>Description</th></tr> +<tr> <td align=left>PENTIUM_RPCdebug</td> <td align=left>x86 BSP with RPC and debugging</td> </tr> +<tr> <td align=left>PENTIUM_RPCnodebug</td> <td align=left>x86 BSP with RPC no debugging</td> </tr> +<tr> <td align=left>PENTIUM_debug</td> <td align=left>x86 BSP no RPC with debugging</td> </tr> +<tr> <td align=left>PENTIUM_nodebug</td> <td align=left>x86 BSP no RPC no debugging</td> </tr> +<tr> <td align=left>SIMSPARCSOLARISgnu</td> <td align=left>VxSim BSP no RPC with debugging</td> </tr> +</table> +<p>You will have to add a new build specification if you are using a +different BSP or wish to customize further. For instance, if you have +the Power PC (PPC) BSP, you will need to add a new build for the PPC tool +chain. To do so, select the "Builds" tab and then select the Berkeley DB +project name and right click. 
Choose the <i>New Build...</i> +selection and create the new build target. For your new build target, +you will need to decide if you want it configured to support RPC and +whether it should be built for debugging. See the properties of the +Pentium builds for how to configure for each case. After you have added +this build you still need to correctly configure the include directories +as described below. +<p>Select the build you are interested in and right click. Choose the +<i>Properties...</i> selection. At this point, a tabbed dialogue +should appear. In this new window, choose the <i>C/C++ compiler</i> +tab. In the edit box, you need to modify the full pathname of the +<i>build_vxworks</i> subdirectory of Berkeley DB, followed by the full +pathname of the <i>include</i> subdirectory of Berkeley DB. Then click +OK. +<p>If the architecture for this new build has the most significant byte +first, you will also need to edit the <i>db_config.h</i> file in +the build directory and define <b>WORDS_BIGENDIAN</b>. +<p>To build and download the Berkeley DB downloadable application for the first time +requires several steps: +<p><ol> +<p><li>Select the build you are interested in and right click. +Choose the <i>Set ... as Active Build</i> selection. +<p><li>Select the build you are interested in and right click. +Choose the <i>Dependencies ...</i> selection. +Run dependencies over all files in the Berkeley DB project. +<p><li>Select the build you are interested in and right click. +Choose the <i>Rebuild All (Berkeley DB.out)</i> selection. +<p><li>Select the Berkeley DB project name and right click. +Choose the <i>Download 'Berkeley DB.out'</i> selection. +</ol> +<p>You will need to repeat this procedure for +all builds you are interested in building, as well as for +all of the example project builds you wish to run. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/build_win/faq.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_vxworks/notes.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_vxworks/notes.html b/db/docs/ref/build_vxworks/notes.html new file mode 100644 index 000000000..83de25511 --- /dev/null +++ b/db/docs/ref/build_vxworks/notes.html @@ -0,0 +1,56 @@ +<!--$Id: notes.so,v 1.6 2000/08/09 15:45:52 sue Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: VxWorks notes</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for VxWorks systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_vxworks/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_vxworks/faq.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>VxWorks notes</h1> +<p>Berkeley DB currently disallows the DB_TRUNCATE flag to <a href="../../api_c/db_open.html">DB->open</a>. +The operations this flag represents are not fully supported under +VxWorks 5.4. +<p>The memory on VxWorks is always resident and fully shared among all tasks +running on the target. 
For this reason, the <a href="../../api_c/env_open.html#DB_SYSTEM_MEM">DB_SYSTEM_MEM</a> flag +is implied for any application that does not specify the +<a href="../../api_c/env_open.html#DB_PRIVATE">DB_PRIVATE</a> flag. Additionally, applications must use a +segment ID to ensure different applications do not overwrite each other's +database environments. +See the <a href="../../api_c/env_set_shm_key.html">DBENV->set_shm_key</a> function for more information. +Also, the <a href="../../api_c/env_open.html#DB_LOCKDOWN">DB_LOCKDOWN</a> flag has no effect. +<p>The <a href="../../api_c/db_sync.html">DB->sync</a> function is implemented using an ioctl call into the +file system driver with the FIOSYNC command. Most, but not all, file +system drivers support this call. Berkeley DB requires the use of a file system +supporting FIOSYNC. +<h3>Building and Running the Example Programs</h3> +<p>Each example program can be downloaded and run by calling the function +equivalent to the example's name. You may have to edit the pathname to +the environments and database names in the examples' sources. The +examples included are: +<p><table border=1 align=center> +<tr><th>Name</th><th>Description</th></tr> +<tr> <td align=left>ex_access</td> <td align=left>Simple access method example.</td> </tr> +<tr> <td align=left>ex_btrec</td> <td align=left>Example using Btree and record numbers.</td> </tr> +<tr> <td align=left>ex_dbclient</td> <td align=left>Example running an RPC client. Takes a hostname as an argument, e.g., +<i>ex_dbclient "myhost"</i>.</td> </tr> +<tr> <td align=left>ex_env</td> <td align=left>Example using an environment.</td> </tr> +<tr> <td align=left>ex_mpool</td> <td align=left>Example using mpools.</td> </tr> +<tr> <td align=left>ex_tpcb</td> <td align=left>Example using transactions. This example requires two invocations both +taking an integer identifier as an argument. This identifier allows for +multiple sets of databases to be used within the same environment. 
The +first is to initialize the databases, e.g., <i>ex_tpcb_init 1</i>. The +second is to run the program on those databases, e.g., <i>ex_tpcb 1</i>.</td> </tr> +</table> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_vxworks/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_vxworks/faq.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_win/faq.html b/db/docs/ref/build_win/faq.html new file mode 100644 index 000000000..2c185b6da --- /dev/null +++ b/db/docs/ref/build_win/faq.html @@ -0,0 +1,49 @@ +<!--$Id: faq.so,v 10.20 2000/06/28 15:43:27 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Windows FAQ</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for Windows systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_win/notes.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_vxworks/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Windows FAQ</h1> +<p><ol> +<p><li><b>My Win* C/C++ application crashes in the Berkeley DB library when Berkeley DB calls +fprintf (or some other standard C library 
function).</b> +<p>You should be using the "Debug Multithreaded DLL" compiler option in +your application when you link with the +build_win32/Debug/libdb32d.lib library (this .lib file +is actually a stub for libdb32d.DLL). To check this +setting in Visual C++, choose the "Project/Settings" menu item, and +under the tab marked "C/C++", select "Code Generation" and see the box +marked "Use runtime library". This should be set to "Debug +Multithreaded DLL". If your application is linked against the static +library, build_win32/Debug/libdb32sd.lib, then you +will want to set "Use runtime library" to "Debug Multithreaded". +<p>Setting this option incorrectly can cause multiple versions of the +standard libraries to be linked into your application (one on behalf +of your application, and one on behalf of the Berkeley DB library). That +violates assumptions made by these libraries, and traps can result. +<p><li><b>Why are the build options for DB_DLL marked as "Use MFC in a Shared DLL"? +Does Berkeley DB use MFC?</b> +<p>Berkeley DB does not use MFC at all. It does, however, call malloc, free and +other facilities provided by the Microsoft C runtime library. We've found +in our work that many applications and libraries are built assuming MFC, +and specifying this for Berkeley DB solves various interoperation issues and +guarantees that the right runtime libraries are selected. Note that since +we do not use MFC facilities, the MFC library DLL is not marked as a +dependency for libdb.dll, but the appropriate Microsoft C runtime is. 
+</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_win/notes.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_vxworks/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_win/intro.html b/db/docs/ref/build_win/intro.html new file mode 100644 index 000000000..6f5e0d4bb --- /dev/null +++ b/db/docs/ref/build_win/intro.html @@ -0,0 +1,143 @@ +<!--"@(#)intro.so 10.26 (Sleepycat) 11/18/99"--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Building for Win32</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for Win32 platforms</dl></h3></td> +<td width="1%"><a href="../../ref/build_unix/ultrix.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_win/test.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Building for Win32</h1> +<p>The build_win32 directory in the Berkeley DB distribution contains project files +for both MSVC 5.0 and 6.0: +<p><table border=1 align=center> +<tr><th>Project File</th><th>Description</th></tr> +<tr> <td align=center>Berkeley_DB.dsw</td> <td align=center>Visual C++ 5.0 project (compatible with 6.0)</td> </tr> +<tr> <td 
align=center>*.dsp</td> <td align=center>Visual C++ 5.0 subprojects (compatible with 6.0 +)</td> </tr> +</table> +<p>These project files can be used to build Berkeley DB for any Win32 platform: +Windows 2000, Windows NT, Windows 98 and Windows 95. +<h3>Building With Visual C++ 6.0</h3> +<p>Open the file <b>Berkeley_DB.dsw</b>. You will be told that the project +was generated by a previous version of Developer Studio, and asked if you +want to convert the project. Select Yes, and all projects will be +converted. Then continue on with the instructions for building with +Visual C++ 5.0. +<p>Note that when you build a release version, you may receive a warning +about an unknown compiler option <i>/Ob2</i>. This is apparently a +flaw in the project conversion for Visual C++ and can be ignored. +<p>Each release of Berkeley DB is built and tested with this procedure using +Microsoft Visual C++ 6.0, Standard Edition. +<h3>Building With Visual C++ 5.0</h3> +<p>Open the file <b>Berkeley_DB.dsw</b>. This workspace includes a number +of subprojects needed to build Berkeley DB. +<p>First, you'll need to set the include directories. To do this, select +<i>Options...</i> from the <i>Tools</i> pull-down menu. At this +point, a tabbed dialogue should appear. In this new window, choose the +<i>Directories</i> tab. For the <i>Platform</i>, select +<i>Win32</i> and for <i>Show directories for</i> select +<i>Include files</i>. Below these options in the list of directories, +you should add two directories: the full pathname of the +<i>build_win32</i> subdirectory of Berkeley DB, followed by the full +pathname of the <i>include</i> subdirectory of Berkeley DB. Then click OK. +<p>Then, select <i>Active Project Configuration</i> under the +<i>Build</i> pull-down menu. For a debug version of the libraries, +tools and examples, select <i>db_buildall - Win32 Debug</i>. +Results from this build are put into <b>build_win32/Debug</b>. 
+For a release version, select <i>db_buildall - Win32 Release</i>; +results are put into <b>build_win32/Release</b>. +For a debug version that has all tools and examples built with +static libraries, select <i>db_buildall - Win32 Debug Static</i>; +results are put into <b>build_win32/Debug_static</b>. +For a release version of the same, +select <i>db_buildall - Win32 Release Static</i>; +results are put into <b>build_win32/Release_static</b>. +Finally, to build, select <i>Build db_buildall.exe</i> under the +<i>Build</i> pull-down menu. +<p>When building your application, you should normally use compile options +"debug multithreaded dll" and link against +<b>build_win32/Debug/libdb32d.lib</b>. If you want +to link against a static (non-DLL) version of the library, use the +"debug multithreaded" compile options and link against +<b>build_win32/Debug_static/libdb32sd.lib</b>. You can +also build using a release version of the libraries and tools, which will be +placed in <b>build_win32/Release/libdb32.lib</b>. +The static version will be in +<b>build_win32/Release_static/libdb32s.lib</b>. +<p>Each release of Berkeley DB is maintained, built and tested using Microsoft +Visual C++ 5.0 and 6.0. +<h3>Including the C++ API</h3> +<p>C++ support is built automatically on Win32. +<h3>Including the Java API</h3> +<p>Java support is not built automatically. The following instructions +assume you have installed the Sun Java Development Kit in +<b>d:/java</b>. Of course, if you've installed elsewhere, or have +different Java software, you will need to adjust the pathnames +accordingly. First, use the instructions above for Visual C++ 5.0 or 6.0 +to open the Tools/Options tabbed dialog for adding include directories. +In addition to the directories specified above, add +<b>d:/java/include</b> and <b>d:/java/include/win32</b>. These are +the directories needed when including <b>jni.h</b>. Now, before +clicking OK, under <i>Show directories for</i>, choose +<i>Executable files</i>. 
Add <b>d:/java/bin</b>. That directory +is needed to find javac. Now select OK. +<p>Select <i>Active Project Configuration</i> under the +<i>Build</i> pull-down menu. Choose <i>db_java - Win32 +Release</i>. To build, select <i>Build +libdb_java32.dll</i> under the <i>Build</i> pull-down +menu. This builds the Java support library for Berkeley DB and compiles all +the Java files, placing the class files in the <b>java/classes</b> +subdirectory of Berkeley DB. Set your environment variable CLASSPATH to +include this directory, your environment variable PATH to include the +<b>build_win32/Release</b> subdirectory, and as a test, try running +the command: +<p><blockquote><pre>java com.sleepycat.examples.AccessExample</pre></blockquote> +<h3>Including the Tcl API</h3> +<p>Tcl support is not built automatically. See +<a href="../../ref/tcl/intro.html">Loading Berkeley DB with Tcl</a> for information +on sites from which you can download Tcl and which Tcl versions are +compatible with Berkeley DB. +<p>The Tcl library must be built with the same build type as the Berkeley DB +library (both Release or both Debug). We have found that the binary +release of Tcl can be used with the Release configuration of Berkeley DB, but +for the Debug configuration, you will need to build Tcl from +sources. Before building Tcl, you will need to modify its makefile to +make sure you are building a debug version, including thread support. +This is because the set of DLLs linked into the Tcl executable must +match the corresponding set of DLLs used by Berkeley DB. +<p>These notes assume Tcl is installed as <b>d:/tcl</b>, but you can +change that if you wish. If you run using a different version of Tcl +than the one currently being used by Sleepycat Software, you will need +to change the name of the Tcl library used in the build (e.g., +tcl83d.lib) to the appropriate name. See +Projects->Settings->Link in the db_tcl subproject. 
+<p>Use the instructions above for +Visual C++ 5.0 or 6.0 to open the <i>Tools/Options</i> tabbed dialog +for adding include directories. In addition to the directories specified +above, add <b>d:/tcl/include</b>. This is the directory that contains +<b>tcl.h</b>. +Then, in that same dialog, show directories for "Library Files". +Add <b>d:/tcl/lib</b> (or whatever directory contains +<b>tcl83d.lib</b> in your distribution) to the list. Now select OK. +<p>Select <i>Active Project Configuration</i> under the +<i>Build</i> pull-down menu. Choose <i>db_tcl - Win32 +Release</i>. To build, select <i>Build +libdb_tcl32.dll</i> under the <i>Build</i> pull-down +menu. This builds the Tcl support library for Berkeley DB, placing the result +into <b>build_win32/Release/libdb_tcl32.dll</b>. +Selecting an Active Configuration of <i>db_tcl - Win32 Debug</i> +will build a debug version, placing the result into +<b>build_win32/Debug/libdb_tcl32d.dll</b>. +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_unix/ultrix.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_win/test.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_win/notes.html b/db/docs/ref/build_win/notes.html new file mode 100644 index 000000000..483b101ec --- /dev/null +++ b/db/docs/ref/build_win/notes.html @@ -0,0 +1,56 @@ +<!--$Id: notes.so,v 10.17 2000/11/02 16:46:11 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Windows notes</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for Windows systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_win/test.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_win/faq.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Windows notes</h1> +<ul type=disc> +<li>Various Berkeley DB interfaces take a <b>mode</b> argument, intended to specify +the underlying file permissions for created files. Berkeley DB currently ignores +this argument on Windows systems. +<p>It would be possible to construct a set of security attributes to pass to +<b>CreateFile</b> that accurately represents the mode. In the worst +case, this would involve looking up user and all group names and creating +an entry for each. Alternatively, we could call the <b>_chmod</b> +(partial emulation) function after file creation, although this leaves us +with an obvious race. +<p>Practically speaking, however, these efforts would be largely meaningless +on FAT, the most common file system, which only has a "readable" and +"writeable" flag, applying to all users. +<li>When using the <a href="../../api_c/env_open.html#DB_SYSTEM_MEM">DB_SYSTEM_MEM</a> flag, Berkeley DB shared regions are +created without ACLs, which means that the regions are only accessible +to a single user. If wider sharing is appropriate (e.g., both user +applications and Windows/NT service applications need to access the +Berkeley DB regions), the Berkeley DB code will need to be modified to create the +shared regions with the correct ACLs. 
Alternatively, by not specifying +the <a href="../../api_c/env_open.html#DB_SYSTEM_MEM">DB_SYSTEM_MEM</a> flag, file-system backed regions will be +created instead, and the permissions on those files may be directly +specified through the <a href="../../api_c/env_open.html">DBENV->open</a> interface. +<li>On Windows/9X, files opened by multiple processes do not share data +correctly. For this reason, the <a href="../../api_c/env_open.html#DB_SYSTEM_MEM">DB_SYSTEM_MEM</a> flag is implied +for any application that does not specify the <a href="../../api_c/env_open.html#DB_PRIVATE">DB_PRIVATE</a> flag, +causing the system paging file to be used for sharing data. However, +paging file memory is freed on last close, implying that multiple +processes sharing an environment must arrange for at least one process +to always have the environment open, or, alternatively, that any process +joining the environment be prepared to re-create it. If a shared +environment is closed by all processes, a subsequent open without +specifying the <a href="../../api_c/env_open.html#DB_CREATE">DB_CREATE</a> flag will result in the return of a +system EAGAIN error code. 
+</ul> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_win/test.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_win/faq.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/build_win/test.html b/db/docs/ref/build_win/test.html new file mode 100644 index 000000000..e3230ca84 --- /dev/null +++ b/db/docs/ref/build_win/test.html @@ -0,0 +1,77 @@ +<!--$Id: test.so,v 10.29 2001/01/17 14:42:57 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Running the test suite under Windows</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Building Berkeley DB for Windows systems</dl></h3></td> +<td width="1%"><a href="../../ref/build_win/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_win/notes.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Running the test suite under Windows</h1> +<p>To build the test suite on Win32 platforms you will need to configure +Tcl support. You will also need sufficient main memory and disk. +About 100MB of disk space will be sufficient. For memory, 32MB is +too small; we recommend at least 64MB. 
+<h3>Building the software needed by the tests</h3> +<p>There exist bugs in some versions of Tcl that may cause the test suite +to hang on Windows/NT 4.0. Tcl version 8.4 (currently available as an +alpha release) has fixed the problem, or there are patches available +for Tcl 8.3.2 (see bug #119188 in the Tcl SourceForge database). Note +that if you want to run the test suite against a Debug version of Berkeley DB, +you need to build a debug version of Tcl. This involves building Tcl +from its source. +<p>To build, perform the following steps. Note that steps #1, #4 and #5 +are part of the normal build process for building Berkeley DB; #2, #3 are part +of including the Tcl API. +<p><ol> +<p><li>Open the <b>build_win32/Berkeley_DB.dsw</b> workspace. +<p><li>Add the pathname for the Tcl include subdirectory to your +include path. To do this, under the "Tools" menu item, select "Options". +In the dialog, select the "Directories" tab, and choose directories +for "Include Files". Add <b>d:/tcl/include</b> (or whatever directory +contains <b>tcl.h</b> in your distribution) to the list. +<p><li>Add the pathname for the Tcl library subdirectory to your +library path. To do this, under the "Tools" menu item, select "Options". +In the dialog, select the "Directories" tab, and choose directories for +"Library Files". Add <b>d:/tcl/lib</b> (or whatever directory contains +<b>tcl83d.lib</b> in your distribution) to the list. +<p><li>Set the active configuration to db_test -- Debug. To set an +active configuration, under the "Build" menu item in the IDE, select "Set +Active Configuration". Then choose "db_test -- Debug". +<p><li>Build. The IDE menu for this is called "build dbkill.exe", +even though dbkill is just one of the things that is built. +This step builds the base Berkeley DB .dll, tcl support, +and various tools that are needed by the test suite. 
+</ol> +<h3>Running the test suite under Windows</h3> +<p>Before running the tests for the first time, you must edit the file +<b>include.tcl</b> in your build directory and change the line +that reads: +<p><blockquote><pre>set tclsh_path SET_YOUR_TCLSH_PATH</pre></blockquote> +<p>You will want to use the location of the <b>tclsh</b> program. For +example, if Tcl is installed as <b>d:/tcl</b>, this line should be: +<p><blockquote><pre>set tclsh_path d:/tcl/bin/tclsh83d.exe</pre></blockquote> +<p>Then, in a shell of your choice enter the following commands: +<p><ol> +<p><li>cd build_win32 +<p><li>run <b>d:/tcl/bin/tclsh83d.exe</b>, or the equivalent name of +the Tcl shell for your distribution. +<p>You should get a "%" prompt. +<p><li>% source ../test/test.tcl. +<p>You should get a "%" prompt with no errors. +</ol> +<p>You are now ready to run tests in the test suite, see +<a href="../../ref/test/run.html">Running the test suite</a> for more +information. +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_win/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_win/notes.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/cam/intro.html b/db/docs/ref/cam/intro.html new file mode 100644 index 000000000..7a02ea87f --- /dev/null +++ b/db/docs/ref/cam/intro.html @@ -0,0 +1,72 @@ +<!--$Id: intro.so,v 10.21 2001/01/18 19:50:57 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Building Berkeley DB Concurrent Data Store applications</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Berkeley DB Concurrent Data Store Applications</dl></h3></td> +<td width="1%"><a href="../../ref/env/error.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Building Berkeley DB Concurrent Data Store applications</h1> +<p>It is often desirable to have concurrent read-write access to a database +when there is no need for full recoverability or transaction semantics. +For this class of applications, Berkeley DB provides an interface supporting +deadlock free, multiple-reader/single writer access to the database. +This means that, at any instant in time, there may be either multiple +readers accessing data or a single writer modifying data. The +application is entirely unaware of which is happening, and Berkeley DB +implements the necessary locking and blocking to ensure this behavior. +<p>In order to create Berkeley DB Concurrent Data Store applications, you must first initialize an +environment by calling <a href="../../api_c/env_open.html">DBENV->open</a>. You must specify the +<a href="../../api_c/env_open.html#DB_INIT_CDB">DB_INIT_CDB</a> and <a href="../../api_c/env_open.html#DB_INIT_MPOOL">DB_INIT_MPOOL</a> flags to that interface. +It is an error to specify any of the other <a href="../../api_c/env_open.html">DBENV->open</a> subsystem +or recovery configuration flags, e.g., <a href="../../api_c/env_open.html#DB_INIT_LOCK">DB_INIT_LOCK</a>, +<a href="../../api_c/env_open.html#DB_INIT_TXN">DB_INIT_TXN</a> or <a href="../../api_c/env_open.html#DB_RECOVER">DB_RECOVER</a>. 
+<p>All databases must, of course, be created in this environment, by using +the <a href="../../api_c/db_create.html">db_create</a> interface and specifying the correct environment +as an argument. +<p>The Berkeley DB access method calls used to support concurrent access are +unchanged from the normal access method calls, with one exception: the +<a href="../../api_c/db_cursor.html">DB->cursor</a> interface. In Berkeley DB Concurrent Data Store, each cursor must encapsulate +the idea of being used for read-only access or for read-write access. +There may only be one read-write cursor active at any one time. When your +application creates a cursor, if that cursor will ever be used for +writing, the <a href="../../api_c/db_cursor.html#DB_WRITECURSOR">DB_WRITECURSOR</a> flag must be specified when the cursor +is created. +<p>No deadlock detector needs to be run in a Berkeley DB Concurrent Data Store database environment. +<p>Only a single thread of control may write the database at a time. For +this reason care must be taken to ensure that applications do not +inadvertently block themselves causing the application to hang, unable +to proceed. Some common mistakes include: +<p><ol> +<p><li>Leaving a cursor open while issuing a <a href="../../api_c/db_put.html">DB->put</a> or <a href="../../api_c/db_del.html">DB->del</a> +access method call. +<p><li>Attempting to open a cursor for read-write access while already holding +a cursor open for read-write access. +<p><li>Not testing Berkeley DB error return codes (if any cursor operation returns an +unexpected error, that cursor should be closed). +<p><li>By default, Berkeley DB Concurrent Data Store does locking on a per-database basis. For this reason, +accessing multiple databases in different orders in different threads +or processes, or leaving cursors open on one database while accessing +another database, can cause an application to hang. 
If this behavior +is a requirement for the application, Berkeley DB can be configured to do +locking on an environment wide basis. See the <a href="../../api_c/env_set_flags.html#DB_CDB_ALLDB">DB_CDB_ALLDB</a> flag +of the <a href="../../api_c/env_set_flags.html">DBENV->set_flags</a> function for more information. +</ol> +<p>Note that it is correct operation for two different threads of control +(actual threads or processes) to have multiple read-write cursors open, +or for one thread to issue a <a href="../../api_c/db_put.html">DB->put</a> call while another thread +has a read-write cursor open, and it is only a problem if these things +are done within a single thread of control. +<table><tr><td><br></td><td width="1%"><a href="../../ref/env/error.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/debug/common.html b/db/docs/ref/debug/common.html new file mode 100644 index 000000000..6374307f1 --- /dev/null +++ b/db/docs/ref/debug/common.html @@ -0,0 +1,109 @@ +<!--$Id: common.so,v 10.13 2000/12/05 18:04:26 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Common errors</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Debugging Applications</dl></h3></td> +<td width="1%"><a 
href="../../ref/debug/printlog.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Common errors</h1> +<p>This page outlines some of the most common problems that people encounter +and some suggested courses of action. +<p><dl compact> +<p><dt><b>Symptom:</b><dd>Core dumps or garbage returns from random Berkeley DB operations. +<p><dt>Possible Cause:<dd>Failure to zero out DBT structure before issuing request. +<p><dt>Fix:<dd>Before using a <a href="../../api_c/dbt.html">DBT</a>, you must initialize all its elements +to 0 and then set the ones you are using explicitly. +<p><dt><b>Symptom:</b><dd>Random crashes and/or database corruption. +<p><dt>Possible Cause:<dd>Running multiple threads, but did not specify <a href="../../api_c/env_open.html#DB_THREAD">DB_THREAD</a> +to <a href="../../api_c/db_open.html">DB->open</a> or <a href="../../api_c/env_open.html">DBENV->open</a>. +<p><dt>Fix:<dd>Any time you are sharing a handle across multiple threads, you must +specify <a href="../../api_c/env_open.html#DB_THREAD">DB_THREAD</a> when you open that handle. +<p><dt><b>Symptom:</b><dd><a href="../../api_c/env_open.html">DBENV->open</a> returns EINVAL. +<p><dt>Possible Cause:<dd>The environment home directory is a remote mounted filesystem. +<p><dt>Fix:<dd>Use a locally mounted filesystem instead. +<p><dt><b>Symptom:</b><dd><a href="../../api_c/db_get.html">DB->get</a> calls are returning EINVAL. +<p><dt>Possible Cause:<dd>The application is running with threads, but did not specify the +<a href="../../api_c/dbt.html#DB_DBT_MALLOC">DB_DBT_MALLOC</a>, <a href="../../api_c/dbt.html#DB_DBT_REALLOC">DB_DBT_REALLOC</a> or <a href="../../api_c/dbt.html#DB_DBT_USERMEM">DB_DBT_USERMEM</a> +flags in the <a href="../../api_c/dbt.html">DBT</a> structures used in the call. 
+<p><dt>Fix:<dd>When running with threaded handles (i.e., specifying <a href="../../api_c/env_open.html#DB_THREAD">DB_THREAD</a> +to <a href="../../api_c/env_open.html">DBENV->open</a> or <a href="../../api_c/db_open.html">DB->open</a>), you must specify one of those +flags for all <a href="../../api_c/dbt.html">DBT</a> structures in which Berkeley DB is returning data. +<p><dt><b>Symptom:</b><dd>Running multiple threads or processes, and the database appears to be +getting corrupted. +<p><dt>Possible Cause:<dd>Locking is not enabled. +<p><dt>Fix:<dd>Make sure that you are acquiring locks in your access methods. You +must specify <a href="../../api_c/env_open.html#DB_INIT_LOCK">DB_INIT_LOCK</a> to your <a href="../../api_c/env_open.html">DBENV->open</a> call and then +pass that environment to <a href="../../api_c/db_open.html">DB->open</a>. +<p><dt><b>Symptom:</b><dd>Locks are accumulating or threads and/or processes are +deadlocking even though there is no concurrent access to the database. +<p><dt>Possible Cause:<dd>Failure to close a cursor. +<p><dt>Fix:<dd>Cursors retain locks between calls. Everywhere the application uses +a cursor, the cursor should be explicitly closed as soon as possible after +it is used. +<p><dt><b>Symptom:</b><dd>The system locks up. +<p><dt>Possible Cause:<dd>Application not checking for <a href="../../ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>. +<p><dt>Fix:<dd>Unless you are using the Concurrent Data Store product, whenever you +have multiple threads and/or processes and at least one of them is +writing, you have the potential for deadlock. As a result, you must +test for the <a href="../../ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a> return on every Berkeley DB call. In +general, updates should take place in a transaction or you might leave +the database in an inconsistent state. Reads may take place outside +the context of a transaction under common conditions. 
+<p>Whenever you get a <a href="../../ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a> return, you should: +<p><ol> +<p><li>If you are running in a transaction, close any cursors opened in the +transaction and then abort the transaction. +<p><li>If you are not running in a transaction, simply close the cursor that got +the <a href="../../ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a> (if it was a cursor operation) and retry. +</ol> +<p>See <a href="../../ref/transapp/put.html">Recoverability and deadlock +avoidance</a> for further information. +<p><dt><b>Symptom:</b><dd>An inordinately high number of deadlocks. +<p><dt>Possible Cause:<dd>Read-Modify-Write pattern without using the DB_RMW flag. +<p><dt>Fix:<dd>If you frequently read a piece of data, modify it and then write +it, you may be inadvertently causing a large number of deadlocks. Try +specifying the <a href="../../api_c/dbc_get.html#DB_RMW">DB_RMW</a> flag on your get calls. +<p>Or, if the application is doing a large number of updates in a small +database, turning off reverse Btree splits may help (see <a href="../../api_c/db_set_flags.html#DB_REVSPLITOFF">DB_REVSPLITOFF</a> +for more information). +<p><dt><b>Symptom:</b><dd>I run recovery and it exits cleanly, but my database changes are missing. +<p><dt>Possible Cause:<dd>Failure to enable logging and transactions in the database environment, +failure to specify the DB_ENV handle when creating the DB handle, +failure to pass the transaction handle to the Berkeley DB interface, or failure to commit +the transaction. +<p><dt>Fix:<dd>Make sure that the environment and database handles are properly +created, that the application passes the transaction handle returned +by <a href="../../api_c/txn_begin.html">txn_begin</a> to the appropriate Berkeley DB interfaces, and that each +transaction is eventually committed. +<p><dt><b>Symptom:</b><dd>Recovery fails. 
+<p><dt>Possible Cause:<dd>A database was updated in a transactional environment both with and +without transaction handles. +<p><dt>Fix:<dd>If any database write operation is done using a transaction handle, +every write operation must be done in the context of a transaction. +<p><dt><b>Symptom:</b><dd>A database environment locks up, sometimes gradually. +<p><dt>Possible Cause:<dd>A thread of control exited unexpectedly while holding Berkeley DB resources. +<p><dt>Fix:<dd>Whenever a thread of control exits holding Berkeley DB resources, all threads +of control must exit the database environment, and recovery must be run. +<p><dt><b>Symptom:</b><dd>A database environment locks up, sometimes gradually. +<p><dt>Possible Cause:<dd>Cursors are not being closed before transaction abort. +<p><dt>Fix:<dd>Before an application aborts a transaction, any cursors opened within +the context of that transaction must be closed. +<p><dt><b>Symptom:</b><dd>Transaction abort or recovery fails, or database corruption occurs. +<p><dt>Possible Cause:<dd>Log files were removed before it was safe to do so. +<p><dt>Fix:<dd>Do not remove any log files from a database environment until Berkeley DB +declares it safe. 
+</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/debug/printlog.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/build_unix/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/debug/compile.html b/db/docs/ref/debug/compile.html new file mode 100644 index 000000000..504d5d3ec --- /dev/null +++ b/db/docs/ref/debug/compile.html @@ -0,0 +1,43 @@ +<!--$Id: compile.so,v 10.10 2000/12/01 20:15:25 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Compile-time configuration</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Debugging</dl></h3></td> +<td width="1%"><a href="../../ref/debug/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/debug/runtime.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Compile-time configuration</h1> +<p>There are two compile-time configuration options that assist in debugging +Berkeley DB and Berkeley DB applications. +<p><dl compact> +<p><dt>--enable-debug<dd>If you want to build Berkeley DB with <b>-g</b> as the C and C++ compiler +flag, enter --enable-debug as an argument to configure. 
This will create +Berkeley DB with debugging symbols, as well as load various Berkeley DB routines +that can be called directly from a debugger to display database page +content, cursor queues and so forth. (Note that the <b>-O</b> +optimization flag will still be specified. To compile with only the +<b>-g</b>, explicitly set the <b>CFLAGS</b> environment variable +before configuring.) +<p><dt>--enable-diagnostic<dd>If you want to build Berkeley DB with debugging run-time sanity checks and with +DIAGNOSTIC #defined during compilation, enter --enable-diagnostic as an +argument to configure. This will cause a number of special checks to be +performed when Berkeley DB is running. This flag should not be defined when +configuring to build production binaries, as it degrades performance. +<p>In addition, when compiling Berkeley DB for use in run-time memory consistency +checkers, in particular, programs that look for reads and writes of +uninitialized memory, use --enable-diagnostic as an argument to configure. +This guarantees that Berkeley DB will completely initialize allocated pages +rather than only initializing the minimum necessary amount. 
+</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/debug/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/debug/runtime.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/debug/intro.html b/db/docs/ref/debug/intro.html new file mode 100644 index 000000000..0ea0afcfb --- /dev/null +++ b/db/docs/ref/debug/intro.html @@ -0,0 +1,58 @@ +<!--$Id: intro.so,v 10.15 2000/12/04 18:05:41 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Introduction</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Debugging Applications</dl></h3></td> +<td width="1%"><a href="../../ref/install/file.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/debug/compile.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Introduction</h1> +<p>As Berkeley DB is an embedded library, debugging applications that use Berkeley DB +is both harder and easier than debugging a separate server. Debugging +can be harder, because, when a problem arises, it is not always readily +apparent whether the problem is in the application, in the database +library, or is a result of an unexpected interaction between the two. 
+Debugging can be easier, as it is easier to track down a problem when +you can review a stack trace rather than deciphering inter-process +communication messages. This chapter is intended to assist you in +debugging applications and in reporting bugs to us in a manner such that +we can provide you with the correct answer or fix as quickly as +possible. +<p>When you encounter a problem, there are a few general actions you can +take: +<p><dl compact> +<p><dt>Review the Berkeley DB error output<dd>If an error output mechanism has been configured in the Berkeley DB +environment, additional run-time error messages are made available to +the application. If you are not using an environment, it is well worth +modifying your application to create one so that you can get more +detailed error messages. See <a href="runtime.html">Run-time error +information</a> for more information on configuring Berkeley DB to output these +error messages. +<p><dt>Review <a href="../../api_c/env_set_verbose.html">DBENV->set_verbose</a><dd>Check the list of flags for the <a href="../../api_c/env_set_verbose.html">DBENV->set_verbose</a> function, and +see if any of them will produce additional information that might help +you understand the problem. +<p><dt>Add run-time diagnostics<dd>You can configure and build Berkeley DB to perform run-time diagnostics. +(These checks are not done by default as they can seriously impact +performance.) See <a href="compile.html">Compile-time configuration</a> for more +information. +<p><dt>Apply all available patches<dd>Before reporting a problem to Sleepycat Software, please upgrade to the +latest Sleepycat Software release of Berkeley DB, if possible, or at least +make sure you have applied any updates available for your release from +the <a href="http://www.sleepycat.com/update/index.html">Sleepycat +Software web site</a>. 
+<p><dt>Run the test suite<dd>If you are seeing repeated failures, or failures of simple test cases, +run the Berkeley DB test suite to determine if the distribution of Berkeley DB you +are using was built and configured correctly. +</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/install/file.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/debug/compile.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/debug/printlog.html b/db/docs/ref/debug/printlog.html new file mode 100644 index 000000000..e533a88d2 --- /dev/null +++ b/db/docs/ref/debug/printlog.html @@ -0,0 +1,160 @@ +<!--$Id: printlog.so,v 10.20 2000/12/01 20:15:25 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Reviewing Berkeley DB log files</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Debugging Applications</dl></h3></td> +<td width="1%"><a href="../../ref/debug/runtime.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/debug/common.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Reviewing Berkeley DB log files</h1> +<p>If you are running with transactions and logging, the <a href="../../utility/db_printlog.html">db_printlog</a> +utility can 
be a useful debugging aid. The <a href="../../utility/db_printlog.html">db_printlog</a> utility +will display the contents of your log files in a human readable (and +machine-processable) format. +<p>The <a href="../../utility/db_printlog.html">db_printlog</a> utility will attempt to display any and all +logfiles present in a designated db_home directory. For each log record, +<a href="../../utility/db_printlog.html">db_printlog</a> will display a line of the form: +<p><blockquote><pre>[22][28]db_big: rec: 43 txnid 80000963 prevlsn [21][10483281]</pre></blockquote> +<p>The opening numbers in square brackets are the log sequence number (LSN) +of the log record being displayed. The first number indicates the log +file in which the record appears, and the second number indicates the +offset in that file of the record. +<p>The first character string identifies the particular log operation being +reported. The log records corresponding to particular operations are +described below. The rest of the line consists of name/value pairs. +<p>The rec field indicates the record type (this is used to dispatch records +in the log to appropriate recovery functions). +<p>The txnid field identifies the transaction for which this record was +written. A txnid of 0 means that the record was written outside the +context of any transaction. You will see these most frequently for +checkpoints. +<p>Finally, the prevlsn contains the LSN of the last record for this +transaction. By following prevlsn fields, you can accumulate all the +updates for a particular transaction. During normal abort processing, +this field is used to quickly access all the records for a particular +transaction. +<p>After the initial line identifying the record type, each field of the log +record is displayed, one item per line. There are several fields that +appear in many different records and a few fields that appear only in +some records. 
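<p>The header line described above can be picked apart mechanically. The following is a minimal sketch in Python, assuming every header line has exactly the shape of the sample line shown earlier; the names HEADER_RE and parse_header are hypothetical and are not part of Berkeley DB or of the awk scripts shipped with it.

```python
import re

# Pattern for a db_printlog header line shaped like the sample:
#   [22][28]db_big: rec: 43 txnid 80000963 prevlsn [21][10483281]
# (Assumption: real output always matches this shape.)
HEADER_RE = re.compile(
    r"^\[(?P<file>\d+)\]\[(?P<offset>\d+)\]"
    r"(?P<op>\w+): rec: (?P<rec>\d+) "
    r"txnid (?P<txnid>[0-9a-fA-F]+) "
    r"prevlsn \[(?P<prev_file>\d+)\]\[(?P<prev_offset>\d+)\]\s*$"
)


def parse_header(line):
    """Return the fields of a header line as a dict, or None if no match."""
    m = HEADER_RE.match(line)
    if m is None:
        return None
    return {
        # LSN = (log file number, offset of the record in that file)
        "lsn": (int(m.group("file")), int(m.group("offset"))),
        "op": m.group("op"),
        "rec": int(m.group("rec")),
        # Kept as text so no numeric base is assumed for the ID.
        "txnid": m.group("txnid"),
        # LSN of the previous record written for the same transaction
        "prevlsn": (int(m.group("prev_file")), int(m.group("prev_offset"))),
    }


if __name__ == "__main__":
    sample = "[22][28]db_big: rec: 43 txnid 80000963 prevlsn [21][10483281]"
    print(parse_header(sample))
```

<p>Collecting the prevlsn values returned for one txnid gives the same backward chain of records that the text describes for abort processing.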
+<p>The table below lists each log record type currently produced, with a brief +description of the operation it describes. +<!--START LOG RECORD TYPES--> +<p><table border=1> +<tr><th>Log Record Type</th><th>Description</th></tr> +<tr><td>bam_adj</td><td>Used when we insert/remove an index into/from the page header of a Btree page.</td></tr> +<tr><td>bam_cadjust</td><td>Keeps track of record counts in a Btree or Recno database.</td></tr> +<tr><td>bam_cdel</td><td>Used to mark a record on a page as deleted.</td></tr> +<tr><td>bam_curadj</td><td>Used to adjust a cursor location when a nearby record changes in a Btree database.</td></tr> +<tr><td>bam_pg_alloc</td><td>Indicates that we allocated a page to a Btree.</td></tr> +<tr><td>bam_pg_free</td><td>Indicates that we freed a page in the Btree (freed pages are added to a freelist and reused).</td></tr> +<tr><td>bam_rcuradj</td><td>Used to adjust a cursor location when a nearby record changes in a Recno database.</td></tr> +<tr><td>bam_repl</td><td>Describes a replace operation on a record.</td></tr> +<tr><td>bam_root</td><td>Describes an assignment of a root page.</td></tr> +<tr><td>bam_rsplit</td><td>Describes a reverse page split.</td></tr> +<tr><td>bam_split</td><td>Describes a page split.</td></tr> +<tr><td>crdel_delete</td><td>Describes the removal of a Berkeley DB file.</td></tr> +<tr><td>crdel_fileopen</td><td>Describes a Berkeley DB file create attempt.</td></tr> +<tr><td>crdel_metapage</td><td>Describes the creation of a meta-data page for a new file.</td></tr> +<tr><td>crdel_metasub</td><td>Describes the creation of a meta-data page for a subdatabase.</td></tr> +<tr><td>crdel_rename</td><td>Describes a file rename operation.</td></tr> +<tr><td>db_addrem</td><td>Add or remove an item from a page of duplicates.</td></tr> +<tr><td>db_big</td><td>Add an item to an overflow page (overflow pages contain items too large to place on the main page).</td></tr> +<tr><td>db_debug</td><td>Log debugging
message.</td></tr> +<tr><td>db_noop</td><td>This marks an operation that did nothing but update the LSN on a page.</td></tr> +<tr><td>db_ovref</td><td>Increment or decrement the reference count for a big item.</td></tr> +<tr><td>db_relink</td><td>Fix prev/next chains on duplicate pages because a page was added or removed.</td></tr> +<tr><td>ham_chgpg</td><td>Used to adjust a cursor location when a Hash page is removed, and its elements are moved to a different Hash page.</td></tr> +<tr><td>ham_copypage</td><td>Used when we empty a bucket page, but there are overflow pages for the bucket; one needs to be copied back into the actual bucket.</td></tr> +<tr><td>ham_curadj</td><td>Used to adjust a cursor location when a nearby record changes in a Hash database.</td></tr> +<tr><td>ham_groupalloc</td><td>Allocate some number of contiguous pages to the Hash database.</td></tr> +<tr><td>ham_insdel</td><td>Insert/Delete an item on a Hash page.</td></tr> +<tr><td>ham_metagroup</td><td>Update the metadata page to reflect the allocation of a sequence of contiguous pages.</td></tr> +<tr><td>ham_newpage</td><td>Adds or removes overflow pages from a Hash bucket.</td></tr> +<tr><td>ham_replace</td><td>Handle updates to records that are on the main page.</td></tr> +<tr><td>ham_splitdata</td><td>Record the page data for a split.</td></tr> +<tr><td>log_register</td><td>Records an open of a file (mapping the file name to a log-id that is used in subsequent log operations).</td></tr> +<tr><td>qam_add</td><td>Describes the actual addition of a new record to a Queue.</td></tr> +<tr><td>qam_del</td><td>Delete a record in a Queue.</td></tr> +<tr><td>qam_delete</td><td>Remove a Queue extent file.</td></tr> +<tr><td>qam_inc</td><td>Increments the maximum record number allocated in a Queue indicating that we've allocated another space in the file.</td></tr> +<tr><td>qam_incfirst</td><td>Increments the record number that refers to the first record in the database.</td></tr> 
+<tr><td>qam_mvptr</td><td>Indicates that we changed the reference to either or both of the first and current records in the file.</td></tr> +<tr><td>qam_rename</td><td>Rename a Queue extent file.</td></tr> +<tr><td>txn_child</td><td>Commit a child transaction.</td></tr> +<tr><td>txn_ckp</td><td>Transaction checkpoint.</td></tr> +<tr><td>txn_regop</td><td>Logs a regular (non-child) transaction commit.</td></tr> +<tr><td>txn_xa_regop</td><td>Logs a prepare message.</td></tr> +</table> +<!--END LOG RECORD TYPES--> +<h3>Augmenting the Log for Debugging</h3> +<p>When debugging applications, it is sometimes useful to log, not only the +actual operations that modify pages, but also the underlying Berkeley DB +functions being executed. This form of logging can add significant bulk +to your log, but can permit debugging application errors that are almost +impossible to find any other way. To turn on these log messages, specify +the --enable-debug_rop and --enable-debug_wop configuration options when +configuring Berkeley DB. See <a href="../../ref/build_unix/conf.html">Configuring +Berkeley DB</a> for more information. +<h3>Extracting Committed Transactions and Transaction Status</h3> +<p>Sometimes it is useful to use the human-readable log output to determine +which transactions committed and aborted. The awk script, commit.awk, +found in the db_printlog directory of the Berkeley DB distribution allows you +to do just that. The command: +<p><blockquote><pre>awk -f commit.awk log_output</pre></blockquote> +where log_output is the output of db_printlog will display a list of +the transaction IDs of all committed transactions found in the log. +<p>If you need a complete list of both committed and aborted transactions, +then the script status.awk will produce that. 
The syntax is: +<p><blockquote><pre>awk -f status.awk log_output</pre></blockquote> +<h3>Extracting Transaction Histories</h3> +<p>Another useful debugging aid is to print out the complete history of a +transaction. The awk script txn.awk allows you to do that. The +command line: +<p><blockquote><pre>awk -f txn.awk TXN=txnlist log_output</pre></blockquote> +where log_output is the output of <a href="../../utility/db_printlog.html">db_printlog</a> and txnlist is +a comma-separated list of transaction IDs, will display all log records +associated with the designated transaction IDs. +<h3>Extracting File Histories</h3> +<p>The awk script fileid.awk allows you to extract all log records that +affect particular files. The syntax for the fileid.awk script is: +<p><blockquote><pre>awk -f fileid.awk PGNO=fids log_output</pre></blockquote> +<p>where log_output is the output of db_printlog and fids is a +comma-separated list of file IDs. The script will output all log +records that reference the designated files. +<h3>Extracting Page Histories</h3> +<p>The awk script pgno.awk allows you to extract all log records that +affect particular pages. As currently designed, however, it will +extract records of all files with the designated page number, so this +script is most useful in conjunction with the fileid script. The syntax +for the pgno.awk script is: +<p><blockquote><pre>awk -f pgno.awk PGNO=pgnolist log_output</pre></blockquote> +<p>where log_output is the output of db_printlog and pgnolist is a +comma-separated list of page numbers. The script will output all log +records that reference the designated page numbers. +<h3>Other Log Processing Tools</h3> +<p>The awk script count.awk will print out the number of log records +encountered that belonged to some transaction (that is, the number of log +records excluding those for checkpoints and non-transaction-protected +operations). +<p>The script range.awk will extract a subset of a log. 
This is useful +when the output of <a href="../../utility/db_printlog.html">db_printlog</a> is too large to be reasonably +manipulated with an editor or other tool. +<p>The syntax for range.awk is: +<p><blockquote><pre>awk -f range.awk START_FILE=sf START_OFFSET=so END_FILE=ef END_OFFSET=eo log_output</pre></blockquote> +<p>where the <b>sf</b> and <b>so</b> represent the log sequence number +(LSN) of the beginning of the sublog you wish to extract and <b>ef</b> +and <b>eo</b> represent the LSN of the end of the sublog you wish to +extract. +<table><tr><td><br></td><td width="1%"><a href="../../ref/debug/runtime.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/debug/common.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/debug/runtime.html b/db/docs/ref/debug/runtime.html new file mode 100644 index 000000000..40fec7e82 --- /dev/null +++ b/db/docs/ref/debug/runtime.html @@ -0,0 +1,47 @@ +<!--$Id: runtime.so,v 10.16 2000/12/01 20:15:25 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Run-time error information</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Debugging</dl></h3></td> +<td width="1%"><a href="../../ref/debug/compile.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a 
href="../../ref/debug/printlog.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Run-time error information</h1> +<p>Normally, when an error occurs in the Berkeley DB library, an integer value +(either a Berkeley DB-specific value, or a system <b>errno</b> value) is +returned by the function. In some cases, however, this value may be +insufficient to completely describe the cause of the error, especially +during initial application debugging. +<p>There are four interfaces intended to provide applications with +additional run-time error information. They are +<a href="../../api_c/env_set_errcall.html">DBENV->set_errcall</a>, <a href="../../api_c/env_set_errfile.html">DBENV->set_errfile</a>, +<a href="../../api_c/env_set_errpfx.html">DBENV->set_errpfx</a> and <a href="../../api_c/env_set_verbose.html">DBENV->set_verbose</a>. +<p>If the environment is configured with these interfaces, many Berkeley DB errors +will result in additional information being written to a file or passed +as an argument to an application function. +<p>The Berkeley DB error reporting facilities do not slow performance or +significantly increase application size, and may be run during normal +operation as well as during debugging. Where possible, we recommend that +these options always be configured and the output saved in the filesystem. +We have found that this often saves time when debugging installation +or other system integration problems. +<p>In addition, there are three routines to assist applications in +displaying their own error messages: <a href="../../api_c/env_strerror.html">db_strerror</a>, +<a href="../../api_c/db_err.html">DBENV->err</a> and <a href="../../api_c/db_err.html">DBENV->errx</a>. The first is a superset of +the ANSI C strerror interface, and returns a descriptive string for +any error return from the Berkeley DB library. 
The <a href="../../api_c/db_err.html">DBENV->err</a> and +<a href="../../api_c/db_err.html">DBENV->errx</a> functions use the error message configuration options +described above to format and display error messages to appropriate +output devices. +<table><tr><td><br></td><td width="1%"><a href="../../ref/debug/compile.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/debug/printlog.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/distrib/layout.html b/db/docs/ref/distrib/layout.html new file mode 100644 index 000000000..b851f62a0 --- /dev/null +++ b/db/docs/ref/distrib/layout.html @@ -0,0 +1,74 @@ +<!--$Id: layout.so,v 10.25 2000/12/22 15:35:32 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Source code layout</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Distribution</dl></h3></td> +<td width="1%"><a href="../../ref/test/faq.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/refs/refs.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Source code layout</h1> +<p><table border=1 align=center> +<tr><th>Directory</th><th>Description</th></tr> +<tr><td>LICENSE</td><td>Berkeley DB 
Copyright</td></tr> +<tr><td>btree</td><td>Btree access method source code.</td></tr> +<tr><td>build_unix</td><td>UNIX build directory.</td></tr> +<tr><td>build_vxworks</td><td>VxWorks build directory.</td></tr> +<tr><td>build_win32</td><td>Windows build directory.</td></tr> +<tr><td>clib</td><td>C library replacement functions.</td></tr> +<tr><td>common</td><td>Common Berkeley DB functions.</td></tr> +<tr><td>cxx</td><td>C++ API.</td></tr> +<tr><td>db</td><td>Berkeley DB database interfaces.</td></tr> +<tr><td>db185</td><td>Berkeley DB version 1.85 compatibility API</td></tr> +<tr><td>db_archive</td><td>The db_archive utility.</td></tr> +<tr><td>db_checkpoint</td><td>The db_checkpoint utility.</td></tr> +<tr><td>db_deadlock</td><td>The db_deadlock utility.</td></tr> +<tr><td>db_dump</td><td>The db_dump utility.</td></tr> +<tr><td>db_dump185</td><td>The db_dump185 utility.</td></tr> +<tr><td>db_load</td><td>The db_load utility.</td></tr> +<tr><td>db_printlog</td><td>The db_printlog debugging utility.</td></tr> +<tr><td>db_recover</td><td>The db_recover utility.</td></tr> +<tr><td>db_stat</td><td>The db_stat utility.</td></tr> +<tr><td>db_upgrade</td><td>The db_upgrade utility.</td></tr> +<tr><td>db_verify</td><td>The db_verify utility.</td></tr> +<tr><td>dbm</td><td>The dbm/ndbm compatibility APIs.</td></tr> +<tr><td>dist</td><td>Berkeley DB administration/distribution tools.</td></tr> +<tr><td>docs</td><td>Documentation.</td></tr> +<tr><td>env</td><td>Berkeley DB environment interfaces.</td></tr> +<tr><td>examples_c</td><td>C API example programs.</td></tr> +<tr><td>examples_cxx</td><td>C++ API example programs.</td></tr> +<tr><td>examples_java</td><td>Java API example programs.</td></tr> +<tr><td>hash</td><td>Hash access method.</td></tr> +<tr><td>hsearch</td><td>The hsearch compatibility API.</td></tr> +<tr><td>include</td><td>Include files.</td></tr> +<tr><td>java</td><td>Java API.</td></tr> +<tr><td>libdb_java</td><td>The libdb_java shared library.</td></tr> 
+<tr><td>lock</td><td>Lock manager.</td></tr> +<tr><td>log</td><td>Log manager.</td></tr> +<tr><td>mp</td><td>Shared memory buffer pool.</td></tr> +<tr><td>mutex</td><td>Mutexes.</td></tr> +<tr><td>os</td><td>POSIX 1003.1 operating-system specific functionality.</td></tr> +<tr><td>os_vxworks</td><td>VxWorks operating-system specific functionality.</td></tr> +<tr><td>os_win32</td><td>Windows operating-system specific functionality.</td></tr> +<tr><td>perl.BerkeleyDB</td><td>BerkeleyDB Perl module.</td></tr> +<tr><td>perl.DB_File</td><td>DB_File Perl module.</td></tr> +<tr><td>qam</td><td>Queue access method source code.</td></tr> +<tr><td>rpc_client</td><td>RPC client interface.</td></tr> +<tr><td>rpc_server</td><td>RPC server utility.</td></tr> +<tr><td>tcl</td><td>Tcl API.</td></tr> +<tr><td>test</td><td>Test suite.</td></tr> +<tr><td>txn</td><td>Transaction manager.</td></tr> +<tr><td>xa</td><td>X/Open Distributed Transaction Processing XA interface.</td></tr> +</table> +<table><tr><td><br></td><td width="1%"><a href="../../ref/test/faq.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/refs/refs.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/dumpload/format.html b/db/docs/ref/dumpload/format.html new file mode 100644 index 000000000..fd52e530a --- /dev/null +++ b/db/docs/ref/dumpload/format.html @@ -0,0 +1,69 @@ +<!--$Id: format.so,v 10.14 2000/03/22 21:56:11 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Dump output formats</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Dumping and Reloading</dl></h3></td> +<td width="1%"><a href="../../ref/dumpload/utility.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/dumpload/text.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Dump output formats</h1> +<p>There are two output formats used by <a href="../../utility/db_dump.html">db_dump</a> and <a href="../../utility/db_dump.html">db_dump185</a>. +<p>In both output formats, the first few lines of the output contain header +information describing the underlying access method, filesystem page size +and other bookkeeping information. +<p>The header information starts with a single line VERSION=N, where N is +the version number of the dump output format. +<p>The header information is then output in name=value pairs, where name may +be any of the keywords listed in the <a href="../../utility/db_load.html">db_load</a> manual page, and +value will be its value. While this header information can be manually +edited before the database is reloaded, there is rarely any reason to do +so, as all of this information can also be specified or overridden by +command-line arguments to <a href="../../utility/db_load.html">db_load</a>. +<p>The header information ends with a single line HEADER=END. +<p>Following the header information are the key/data pairs from the database. +If the database being dumped is of type Btree or Hash, or if the +<b>-k</b> option has been specified, the output will be paired lines of +text, where the first line of the pair is the key item, and the second +line of the pair is its corresponding data item. 
If the database being +dumped is of type Queue or Recno and the <b>-k</b> option has not been +specified, the output will be lines of text, where each line is the next +data item for the database. Each of these lines will be preceded by a +single space. +<p>If the <b>-p</b> option to <a href="../../utility/db_dump.html">db_dump</a> or <a href="../../utility/db_dump.html">db_dump185</a> was +specified, the key/data lines will consist of single characters +representing any characters from the database that are <i>printing +characters</i> and backslash (<b>\</b>) escaped characters +for any that were not. Backslash characters appearing in the output mean +one of two things: if the backslash character precedes another backslash +character, it means that a literal backslash character occurred in the +key or data item. If the backslash character precedes any other +character, the next two characters must be interpreted as a hexadecimal +specification of a single character, e.g., <b>\0a</b> is +a newline character in the ASCII character set. +<p>Although some care should be exercised, it is perfectly reasonable to use +standard text editors and tools to edit databases dumped using the +<b>-p</b> option before re-loading them using the <a href="../../utility/db_load.html">db_load</a> +utility. +<p>Note that the definition of a printing character may vary from system to +system, and so database representations created using the <b>-p</b> +option may be less portable than those created without it. +<p>If the <b>-p</b> option to <a href="../../utility/db_dump.html">db_dump</a> or <a href="../../utility/db_dump.html">db_dump185</a> is +not specified, each output line will consist of paired hexadecimal values, +e.g., the line <b>726f6f74</b> is the string <b>root</b> in the ASCII +character set. +<p>In all output formats, the key and data items are ended by a single line +DATA=END. 
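+<p>As an illustration of the two encodings described above, the following is a minimal sketch in C of how a single key or data line might be decoded. The function names <b>decode_printable</b> and <b>decode_hex</b> are hypothetical and are not part of Berkeley DB or of the db_load sources. The sketch assumes the caller has already removed the trailing newline and any leading space, and it accepts only the escapes documented here.

```c
#include <stddef.h>

/* Return the value of one hexadecimal digit, or -1 if c is not one. */
static int hex_val(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Decode one line written with the -p option (hypothetical helper).
 * A doubled backslash is a literal backslash; a backslash followed by
 * two hexadecimal digits is a single byte; anything else is literal.
 * Returns the number of bytes stored in out, or -1 if the line is
 * malformed or out is too small.
 */
static long decode_printable(const char *line, unsigned char *out, size_t outlen)
{
    size_t n = 0;

    while (*line != '\0') {
        unsigned char byte;
        int hi, lo;

        if (*line != '\\') {
            byte = (unsigned char)*line;
            line += 1;
        } else if (line[1] == '\\') {
            byte = '\\';
            line += 2;
        } else {
            /* Short-circuit evaluation keeps us from reading past the end. */
            if ((hi = hex_val((unsigned char)line[1])) < 0 ||
                (lo = hex_val((unsigned char)line[2])) < 0)
                return -1;
            byte = (unsigned char)(hi * 16 + lo);
            line += 3;
        }
        if (n == outlen)
            return -1;
        out[n++] = byte;
    }
    return (long)n;
}

/*
 * Decode one line written without the -p option (hypothetical helper):
 * every byte is a pair of hexadecimal digits.
 * Returns the number of bytes stored in out, or -1 on error.
 */
static long decode_hex(const char *line, unsigned char *out, size_t outlen)
{
    size_t n = 0;

    while (*line != '\0') {
        int hi = hex_val((unsigned char)line[0]);
        int lo = hi < 0 ? -1 : hex_val((unsigned char)line[1]);

        if (hi < 0 || lo < 0 || n == outlen)
            return -1;
        out[n++] = (unsigned char)(hi * 16 + lo);
        line += 2;
    }
    return (long)n;
}
```

+<p>With these helpers, <b>decode_hex</b> applied to the line <b>726f6f74</b> stores the four bytes of the string <b>root</b>, and <b>decode_printable</b> turns the three characters <b>\0a</b> into a single newline byte. Both return -1 for a truncated escape or an odd number of hexadecimal digits.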
+<p>Where multiple databases have been dumped from a file, the overall output +will repeat, i.e., a new set of headers and a new set of data items. +<table><tr><td><br></td><td width="1%"><a href="../../ref/dumpload/utility.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/dumpload/text.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/dumpload/text.html b/db/docs/ref/dumpload/text.html new file mode 100644 index 000000000..569980a19 --- /dev/null +++ b/db/docs/ref/dumpload/text.html @@ -0,0 +1,32 @@ +<!--$Id: text.so,v 10.14 2000/12/04 20:49:18 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Loading text into databases</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Dumping and Reloading</dl></h3></td> +<td width="1%"><a href="../../ref/dumpload/format.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/install/file.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Loading text into databases</h1> +<p>The <a href="../../utility/db_load.html">db_load</a> utility can be used to load text into databases. 
+The <b>-T</b> option permits non-database applications to create +flat-text files that are then loaded into databases for fast, +highly-concurrent access. For example, the following command loads the +standard UNIX <b>/etc/passwd</b> file into a database, with the login +name as the key item and the entire password entry as the data item: +<p><blockquote><pre>awk -F: '{print $1; print $0}' < /etc/passwd |\ + sed 's/\\/\\\\/g' | db_load -T -t hash passwd.db</pre></blockquote> +<p>Note that backslash characters naturally occurring in the text are escaped +to avoid interpretation as escape characters by <a href="../../utility/db_load.html">db_load</a>. +<table><tr><td><br></td><td width="1%"><a href="../../ref/dumpload/format.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/install/file.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/dumpload/utility.html b/db/docs/ref/dumpload/utility.html new file mode 100644 index 000000000..f9cb51c11 --- /dev/null +++ b/db/docs/ref/dumpload/utility.html @@ -0,0 +1,45 @@ +<!--$Id: utility.so,v 10.15 2000/12/04 20:49:18 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: The db_dump and db_load utilities</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Dumping and Reloading</dl></h3></td> +<td 
width="1%"><a href="../../ref/sendmail/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/dumpload/format.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>The db_dump and db_load utilities</h1> +<p>There are three utilities used for dumping and loading Berkeley DB +databases: <a href="../../utility/db_dump.html">db_dump</a>, <a href="../../utility/db_dump.html">db_dump185</a> and <a href="../../utility/db_load.html">db_load</a>. +<p>The <a href="../../utility/db_dump.html">db_dump</a> and <a href="../../utility/db_dump.html">db_dump185</a> utilities dump Berkeley DB +databases into a flat-text representation of the data that can +be read by <a href="../../utility/db_load.html">db_load</a>. The only difference between them +is that <a href="../../utility/db_dump.html">db_dump</a> reads Berkeley DB version 2 and greater +database formats, while <a href="../../utility/db_dump.html">db_dump185</a> reads Berkeley DB version +1.85 and 1.86 database formats. +<p>The <a href="../../utility/db_load.html">db_load</a> utility reads either the output format used +by the dump utilities or, optionally, a flat-text representation +created using other tools, and stores it into a Berkeley DB database. +<p>Dumping and reloading Hash databases that use user-defined hash functions +will result in new databases that use the default hash function. While +using the default hash function may not be optimal for the new database, +it will continue to work correctly. +<p>Dumping and reloading Btree databases that use user-defined prefix or +comparison functions will result in new databases that use the default +prefix and comparison functions. In that case, it is quite likely that +applications will be unable to retrieve records, and it is possible that the +load process itself will fail. 
+<p>The only available workaround for either Hash or Btree databases is to +modify the sources for the <a href="../../utility/db_load.html">db_load</a> utility to load the database +using the correct hash, prefix and comparison functions. +<table><tr><td><br></td><td width="1%"><a href="../../ref/sendmail/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/dumpload/format.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/env/create.html b/db/docs/ref/env/create.html new file mode 100644 index 000000000..374c7a6e0 --- /dev/null +++ b/db/docs/ref/env/create.html @@ -0,0 +1,73 @@ +<!--$Id: create.so,v 10.23 2000/12/04 18:05:41 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Creating an Environment</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Environment</dl></h3></td> +<td width="1%"><a href="../../ref/env/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/env/naming.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Creating an Environment</h1> +<p>The <a href="../../api_c/env_open.html">DBENV->open</a> function is the standard function for creating or +joining a database 
environment. Transaction-protected or multi-process +applications should call <a href="../../api_c/env_open.html">DBENV->open</a> before making any other calls +to the Berkeley DB library. Applications must obtain an environment handle from +the <a href="../../api_c/env_create.html">db_env_create</a> function before calling <a href="../../api_c/env_open.html">DBENV->open</a>. +There are a large number of options that you can set to customize +<a href="../../api_c/env_open.html">DBENV->open</a> for your environment. These options fall into four +broad categories: +<p><dl compact> +<p><dt>Subsystem Initialization:<dd>These flags indicate which Berkeley DB subsystems will be initialized for the +environment, and, what operations will happen automatically when +databases are accessed within the environment. The flags include +<a href="../../api_c/env_open.html#DB_JOINENV">DB_JOINENV</a>, <a href="../../api_c/env_open.html#DB_INIT_CDB">DB_INIT_CDB</a>, <a href="../../api_c/env_open.html#DB_INIT_LOCK">DB_INIT_LOCK</a>, +<a href="../../api_c/env_open.html#DB_INIT_LOG">DB_INIT_LOG</a>, <a href="../../api_c/env_open.html#DB_INIT_MPOOL">DB_INIT_MPOOL</a> and <a href="../../api_c/env_open.html#DB_INIT_TXN">DB_INIT_TXN</a>. +The <a href="../../api_c/env_open.html#DB_INIT_CDB">DB_INIT_CDB</a> flag does initialization for Berkeley DB Concurrent Data Store +applications, see <a href="../../ref/cam/intro.html">Building Berkeley DB Concurrent Data Store +applications</a> for more information. The rest of the flags initialize +a single subsystem, e.g., when <a href="../../api_c/env_open.html#DB_INIT_LOCK">DB_INIT_LOCK</a> is specified, +applications reading and writing databases opened in this environment +will be using locking to ensure that they do not overwrite each other's +changes. 
+<p><dt>Recovery options:<dd>These flags indicate what recovery is to be performed on the environment +before it is opened for normal use, and include <a href="../../api_c/env_open.html#DB_RECOVER">DB_RECOVER</a> and +<a href="../../api_c/env_open.html#DB_RECOVER_FATAL">DB_RECOVER_FATAL</a>. +<p><dt>Naming options:<dd>These flags modify how file naming happens in the environment, and include +<a href="../../api_c/env_open.html#DB_USE_ENVIRON">DB_USE_ENVIRON</a> and <a href="../../api_c/env_open.html#DB_USE_ENVIRON_ROOT">DB_USE_ENVIRON_ROOT</a>. +<p><dt>Miscellaneous:<dd>Finally, there are a number of miscellaneous flags such as <a href="../../api_c/env_open.html#DB_CREATE">DB_CREATE</a> +which causes underlying files to be created as necessary. See the +<a href="../../api_c/env_open.html">DBENV->open</a> manual pages for further information. +</dl> +<p>Most applications either specify only the <a href="../../api_c/env_open.html#DB_INIT_MPOOL">DB_INIT_MPOOL</a> flag or +they specify all four subsystem initialization flags +(<a href="../../api_c/env_open.html#DB_INIT_MPOOL">DB_INIT_MPOOL</a>, <a href="../../api_c/env_open.html#DB_INIT_LOCK">DB_INIT_LOCK</a>, <a href="../../api_c/env_open.html#DB_INIT_LOG">DB_INIT_LOG</a> and +<a href="../../api_c/env_open.html#DB_INIT_TXN">DB_INIT_TXN</a>). The former configuration is for applications that +simply want to use the basic Access Method interfaces with a shared +underlying buffer pool, but don't care about recoverability after +application or system failure. The latter is for applications that need +recoverability. There are situations where other combinations of the +initialization flags make sense, but they are rare. 
+<p>The <a href="../../api_c/env_open.html#DB_RECOVER">DB_RECOVER</a> flag is specified by applications that want to +perform any necessary database recovery when they start running, i.e., if +there was a system or application failure the last time they ran, they +want the databases to be made consistent before they start running again. +It is not an error to specify this flag when no recovery needs to be +done. +<p>The <a href="../../api_c/env_open.html#DB_RECOVER_FATAL">DB_RECOVER_FATAL</a> flag is more special-purpose. It performs +catastrophic database recovery, and normally requires that some initial +arrangements be made, i.e., archived log files be brought back into the +filesystem. Applications should not normally specify this flag. Instead, +under these rare conditions, the <a href="../../utility/db_recover.html">db_recover</a> utility should be +used. +<table><tr><td><br></td><td width="1%"><a href="../../ref/env/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/env/naming.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/env/error.html b/db/docs/ref/env/error.html new file mode 100644 index 000000000..1a79d8fe5 --- /dev/null +++ b/db/docs/ref/env/error.html @@ -0,0 +1,57 @@ +<!--$Id: error.so,v 10.13 2001/01/11 15:23:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Error support</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body 
bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Environment</dl></h3></td> +<td width="1%"><a href="../../ref/env/open.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/cam/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Error support</h1> +<p>Berkeley DB offers programmatic support for displaying error return values. +The <a href="../../api_c/env_strerror.html">db_strerror</a> interface returns a pointer to the error +message corresponding to any Berkeley DB error return, similar to the ANSI C +strerror interface, but able to handle both system error returns and +Berkeley DB specific return values. +<p>For example: +<p><blockquote><pre>int ret; +if ((ret = dbenv->set_cachesize(dbenv, 0, 32 * 1024)) != 0) { + fprintf(stderr, "set_cachesize failed: %s\n", db_strerror(ret)); + return (1); +}</pre></blockquote> +<p>There are also two additional error functions, <a href="../../api_c/db_err.html">DBENV->err</a> and +<a href="../../api_c/db_err.html">DBENV->errx</a>. These functions work like the ANSI C printf +interface, taking a printf-style format string and argument list, and +writing a message constructed from the format string and arguments. +<p>The <a href="../../api_c/db_err.html">DBENV->err</a> function appends the standard error string to the +constructed message and the <a href="../../api_c/db_err.html">DBENV->errx</a> function does not. +<p>Error messages can be configured always to include a prefix (e.g., the +program name) using the <a href="../../api_c/env_set_errpfx.html">DBENV->set_errpfx</a> interface. 
+<p>These functions provide simpler ways of displaying Berkeley DB error messages: +<p><blockquote><pre>int ret; +dbenv->set_errpfx(dbenv, argv0); +if ((ret = dbenv->open(dbenv, home, NULL, + DB_CREATE | DB_INIT_LOG | DB_INIT_TXN | DB_USE_ENVIRON)) + != 0) { + dbenv->err(dbenv, ret, "open: %s", home); + dbenv->errx(dbenv, + "contact your system administrator: session ID was %d", + session_id); + return (1); +}</pre></blockquote> +<p>For example, if the program was called "my_app", attempting to open an +environment home directory in "/tmp/home", and the open call returned a +permission error, the error messages shown would look like: +<p><blockquote><pre>my_app: open: /tmp/home: Permission denied. +my_app: contact your system administrator: session ID was 2</pre></blockquote> +<table><tr><td><br></td><td width="1%"><a href="../../ref/env/open.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/cam/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/env/intro.html b/db/docs/ref/env/intro.html new file mode 100644 index 000000000..a555c2f0f --- /dev/null +++ b/db/docs/ref/env/intro.html @@ -0,0 +1,56 @@ +<!--$Id: intro.so,v 10.25 2000/03/18 21:43:12 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Introduction</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference 
Guide:<dd>Environment</dl></h3></td> +<td width="1%"><a href="../../ref/arch/utilities.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/env/create.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Introduction</h1> +<p>A Berkeley DB environment is an encapsulation of one or more databases, log +files and shared information about the database environment such as shared +memory buffer cache pages. +<p>The simplest way to administer a Berkeley DB application environment is to +create a single <b>home</b> directory that stores the files for the +applications that will share the environment. The environment home +directory must be created before any Berkeley DB applications are run. Berkeley DB +itself never creates the environment home directory. The environment can +then be identified by the name of that directory. +<p>An environment may be shared by any number of applications as well as by +any number of threads within the applications. It is possible for an +environment to include resources from other directories on the system, +and applications often choose to distribute resources to other directories +or disks for performance or other reasons. However, by default, the +databases, shared regions (the locking, logging, memory pool, and +transaction shared memory areas) and log files will be stored in a single +directory hierarchy. +<p>It is important to realize that all applications sharing a database +environment implicitly trust each other. They have access to each other's +data as it resides in the shared regions and they will share resources +such as buffer space and locks. At the same time, any applications using +the same databases <b>must</b> share an environment if consistency is +to be maintained between them. 
+<p>The Berkeley DB environment is created and described by the <a href="../../api_c/env_create.html">db_env_create</a> +and <a href="../../api_c/env_open.html">DBENV->open</a> interfaces. In situations where customization is +desired, such as storing log files on a separate disk drive, applications +must describe the customization by either creating an environment +configuration file in the environment home directory or by arguments +passed to the <a href="../../api_c/env_open.html">DBENV->open</a> interface. See the documentation on that +function for details on this procedure. +<p>Once an environment has been created, database files specified using +relative pathnames will be named relative to the home directory. Using +pathnames relative to the home directory allows the entire environment +to be easily moved to facilitate restoring and recovering a database in +a different directory or on a different system. +<table><tr><td><br></td><td width="1%"><a href="../../ref/arch/utilities.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/env/create.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/env/naming.html b/db/docs/ref/env/naming.html new file mode 100644 index 000000000..fd5753962 --- /dev/null +++ b/db/docs/ref/env/naming.html @@ -0,0 +1,145 @@ +<!--$Id: naming.so,v 10.36 2001/01/09 15:36:10 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: File naming</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Environment</dl></h3></td> +<td width="1%"><a href="../../ref/env/create.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/env/security.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>File naming</h1> +<p>The most important task of the environment is to structure file naming +within Berkeley DB. +<p>Each of the locking, logging, memory pool and transaction subsystems of +Berkeley DB require shared memory regions, backed by the filesystem. Further, +cooperating applications (or multiple invocations of the same application) +must agree on the location of the shared memory regions and other files +used by the Berkeley DB subsystems, the log files used by the logging subsystem, +and, of course, the data files. Although it is possible to specify full +pathnames to all Berkeley DB functions, this is cumbersome and requires +that applications be recompiled when database files are moved. +<p>Applications are normally expected to specify a single directory home for +their database. This can be done easily in the call to <a href="../../api_c/env_open.html">DBENV->open</a> +by specifying a value for the <b>db_home</b> argument. There are more +complex configurations where it may be desirable to override +<b>db_home</b> or provide supplementary path information. +<h3>Specifying file naming to Berkeley DB</h3> +<p>The following describes the possible ways in which file naming information +may be specified to the Berkeley DB library. The specific circumstances and +order in which these ways are applied are described in a subsequent +paragraph. 
+<p><dl compact> +<p><dt><b><a name="db_home">db_home</a></b><dd>If the <b>db_home</b> argument to <a href="../../api_c/env_open.html">DBENV->open</a> is non-NULL, its +value may be used as the database home, and files named relative to its +path. +<p><dt><a name="DB_HOME">DB_HOME</a><dd>If the DB_HOME environment variable is set when <a href="../../api_c/env_open.html">DBENV->open</a> is +called, its value may be used as the database home, and files named +relative to its path. +<p>The DB_HOME environment variable is intended to permit users and system +administrators to override application and installation defaults, e.g.: +<p><blockquote><pre>env DB_HOME=/database/my_home application</pre></blockquote> +<p>Application writers are encouraged to support the <b>-h</b> option +found in the supporting Berkeley DB utilities to let users specify a database +home. +<p><dt>DB_ENV methods<dd>There are three DB_ENV methods that affect file naming. The +<a href="../../api_c/env_set_data_dir.html">DBENV->set_data_dir</a> function specifies a directory to search for database +files. The <a href="../../api_c/env_set_lg_dir.html">DBENV->set_lg_dir</a> function specifies a directory in which to +create logging files. The <a href="../../api_c/env_set_tmp_dir.html">DBENV->set_tmp_dir</a> function specifies a +directory in which to create backing temporary files. These methods are +intended to permit applications to customize file location for a database. +For example, an application writer can place data files and log files in +different directories, or instantiate a new log directory each time the +application runs. +<p><dt><a name="DB_CONFIG">DB_CONFIG</a><dd>The same information specified to the above DB_ENV methods may also +be specified using a configuration file. 
If an environment home directory +has been specified (either by the application specifying a non-NULL +<b>db_home</b> argument to <a href="../../api_c/env_open.html">DBENV->open</a>, or by the application +setting the DB_USE_ENVIRON or DB_USE_ENVIRON_ROOT flags and the DB_HOME +environment variable being set), any file named <b>DB_CONFIG</b> in the +database home directory will be read for lines of the format <b>NAME +VALUE</b>. +<p>The characters delimiting the two parts of the entry may be one or more +whitespace characters, and trailing whitespace characters are discarded. +All empty lines or lines whose first character is a whitespace or hash +(<b>#</b>) character will be ignored. Each line must specify both +the NAME and the VALUE of the pair. The specific NAME VALUE pairs are +documented in the manual <a href="../../api_c/env_set_data_dir.html">DBENV->set_data_dir</a>, +<a href="../../api_c/env_set_lg_dir.html">DBENV->set_lg_dir</a> and <a href="../../api_c/env_set_tmp_dir.html">DBENV->set_tmp_dir</a> pages. +<p>The DB_CONFIG configuration file is intended to permit systems to +customize file location for an environment independent of applications +using that database. For example, a database administrator can move the +database log and data files to a different location without application +recompilation. +</dl> +<h3>File name resolution in Berkeley DB</h3> +<p>The following describes the specific circumstances and order in which the +different ways of specifying file naming information are applied. Berkeley DB +file name processing proceeds sequentially through the following steps: +<p><dl compact> +<p><dt>absolute pathnames<dd>If the file name specified to a Berkeley DB function is an absolute pathname, +that file name is used without modification by Berkeley DB. +<p>On UNIX systems, an absolute pathname is defined as any pathname that +begins with a leading slash (<b>/</b>). 
+<p>On Windows systems, an absolute pathname is any pathname that begins with +a leading slash or leading backslash (<b>\</b>), or any +pathname beginning with a single alphabetic character, a colon and a +leading slash or backslash, e.g., <b>C:/tmp</b>. +<p><dt>DB_ENV methods, DB_CONFIG<dd>If a relevant configuration string (e.g., set_data_dir) is specified, +either by calling a DB_ENV method or as a line in the DB_CONFIG +configuration file, the VALUE from the <b>NAME VALUE</b> pair is +prepended to the current file name. If the resulting file name is an +absolute pathname, the file name is used without further modification by +Berkeley DB. +<p><dt><b>db_home</b><dd>If the application specified a non-NULL <b>db_home</b> argument to +<a href="../../api_c/env_open.html">DBENV->open</a>, its value is prepended to the current file name. If +the resulting file name is an absolute pathname, the file name is used +without further modification by Berkeley DB. +<p><dt>DB_HOME<dd>If the <b>db_home</b> argument is NULL, the DB_HOME environment variable +was set and the application has specified the DB_USE_ENVIRON or +DB_USE_ENVIRON_ROOT flag, the value of DB_HOME is prepended to the +current file name. If the resulting file name is an absolute pathname, +the file name is used without further modification by Berkeley DB. +<p><dt>(nothing)<dd>Finally, all file names are interpreted relative to the current working +directory of the process. +</dl> +<p>The common model for a Berkeley DB environment is one where only the DB_HOME +environment variable, or the <b>db_home</b> argument, is specified. In +this case, all data file names are relative to that directory, and all +files created by the Berkeley DB subsystems will be created in that directory. 
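+<p>The resolution steps above can be sketched in C as follows. This is an illustrative sketch only, not the library's implementation: <b>resolve_name</b>, <b>prepend</b> and <b>is_absolute</b> are hypothetical names, only the UNIX definition of an absolute pathname is used, and a single configured directory is assumed. The caller is assumed to pass the value of the DB_HOME environment variable (or NULL) in <b>env_home</b>, and a non-zero <b>use_environ</b> if DB_USE_ENVIRON or DB_USE_ENVIRON_ROOT was specified.

```c
#include <stdio.h>
#include <string.h>

/* UNIX rule from the text: an absolute pathname begins with a slash. */
static int is_absolute(const char *path)
{
    return path[0] == '/';
}

/* Prepend dir to the name held in buf. Returns 0, or -1 if it does not fit. */
static int prepend(char *buf, size_t buflen, const char *dir)
{
    char tmp[1024];
    int len;

    len = snprintf(tmp, sizeof(tmp), "%s/%s", dir, buf);
    if (len < 0 || (size_t)len >= sizeof(tmp) || (size_t)len >= buflen)
        return -1;
    memcpy(buf, tmp, (size_t)len + 1);
    return 0;
}

/*
 * Hypothetical sketch of the file name resolution order.
 *   config_dir: directory from a DB_ENV method or DB_CONFIG line, or NULL
 *   db_home:    the db_home argument given to the open call, or NULL
 *   env_home:   value of the DB_HOME environment variable, or NULL
 * Returns 0 with the resolved name in out, or -1 if out is too small.
 */
static int resolve_name(const char *name, const char *config_dir,
    const char *db_home, const char *env_home, int use_environ,
    char *out, size_t outlen)
{
    if (strlen(name) >= outlen)
        return -1;
    strcpy(out, name);

    /* Absolute pathnames are used without modification. */
    if (is_absolute(out))
        return 0;

    /* DB_ENV methods, DB_CONFIG: prepend the configured directory. */
    if (config_dir != NULL) {
        if (prepend(out, outlen, config_dir) != 0)
            return -1;
        if (is_absolute(out))
            return 0;
    }

    if (db_home != NULL) {
        /* db_home: a non-NULL argument is prepended. */
        if (prepend(out, outlen, db_home) != 0)
            return -1;
    } else if (env_home != NULL && use_environ) {
        /* DB_HOME: used only when db_home is NULL and the flag is set. */
        if (prepend(out, outlen, env_home) != 0)
            return -1;
    }

    /* Anything still relative is relative to the working directory. */
    return 0;
}
```

+<p>For instance, resolving <b>a.db</b> with a configured directory of <b>datadir</b> and a <b>db_home</b> of <b>/a/database</b> gives <b>/a/database/datadir/a.db</b>, while a configured directory of <b>/b/data2</b> gives <b>/b/data2/a.db</b> because the name is already absolute after the second step.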
+<p>The more complex model for a transaction environment might be one where +a database home is specified, using either the DB_HOME environment +variable or the <b>db_home</b> argument to <a href="../../api_c/env_open.html">DBENV->open</a>, and then +the data directory and logging directory are set to the relative path +names of directories underneath the environment home. +<h3>Examples</h3> +Store all files in the directory <b>/a/database</b>: +<p><blockquote><pre>DBENV->open(DBENV, "/a/database", ...);</pre></blockquote> +Create temporary backing files in <b>/b/temporary</b>, and all other files +in <b>/a/database</b>: +<p><blockquote><pre>DBENV->set_tmp_dir(DBENV, "/b/temporary"); +DBENV->open(DBENV, "/a/database", ...);</pre></blockquote> +Store data files in <b>/a/database/datadir</b>, log files in +<b>/a/database/logdir</b>, and all other files in the directory +<b>/a/database</b>: +<p><blockquote><pre>DBENV->set_lg_dir(DBENV, "logdir"); +DBENV->set_data_dir(DBENV, "datadir"); +DBENV->open(DBENV, "/a/database", ...);</pre></blockquote> +<p>Store data files in <b>/a/database/data1</b> and <b>/b/data2</b>, and +all other files in the directory <b>/a/database</b>. 
Any data files +that are created will be created in <b>/b/data2</b>, because it is the +first DB_DATA_DIR directory specified: +<p><blockquote><pre>DBENV->set_data_dir(DBENV, "/b/data2"); +DBENV->set_data_dir(DBENV, "data1"); +DBENV->open(DBENV, "/a/database", ...);</pre></blockquote> +<table><tr><td><br></td><td width="1%"><a href="../../ref/env/create.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/env/security.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/env/open.html b/db/docs/ref/env/open.html new file mode 100644 index 000000000..f13675c73 --- /dev/null +++ b/db/docs/ref/env/open.html @@ -0,0 +1,30 @@ +<!--$Id: open.so,v 10.14 2000/03/18 21:43:12 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Opening databases within the environment</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Environment</dl></h3></td> +<td width="1%"><a href="../../ref/env/remote.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/env/error.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Opening databases within the environment</h1> +<p>Once the environment has been created, database handles may be created +and then opened within the 
environment. This is done by calling the +<a href="../../api_c/db_create.html">db_create</a> interface and specifying the appropriate environment +as an argument. +<p>File naming, database operations and error handling will all be done as +specified for the environment, e.g., if the <a href="../../api_c/env_open.html#DB_INIT_LOCK">DB_INIT_LOCK</a> or +<a href="../../api_c/env_open.html#DB_INIT_CDB">DB_INIT_CDB</a> flags were specified when the environment was created +or joined, database operations will automatically perform all necessary +locking operations for the application. +<table><tr><td><br></td><td width="1%"><a href="../../ref/env/remote.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/env/error.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/env/region.html b/db/docs/ref/env/region.html new file mode 100644 index 000000000..0dfa19672 --- /dev/null +++ b/db/docs/ref/env/region.html @@ -0,0 +1,66 @@ +<!--$Id: region.so,v 10.23 2000/08/09 15:45:52 sue Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Shared Memory Regions</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Environment</dl></h3></td> +<td width="1%"><a href="../../ref/env/security.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" 
alt="Ref"></a><a href="../../ref/env/remote.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Shared Memory Regions</h1> +<p>Each of the Berkeley DB subsystems within an environment is described by one or +more regions. The regions contain all of the per-process and per-thread +shared information, including mutexes, that comprise a Berkeley DB environment. +These regions are created in one of three areas, depending on the flags +specified to the <a href="../../api_c/env_open.html">DBENV->open</a> function: +<p><ol> +<p><li>If the <a href="../../api_c/env_open.html#DB_PRIVATE">DB_PRIVATE</a> flag is specified to <a href="../../api_c/env_open.html">DBENV->open</a>, regions +are created in per-process heap memory, i.e., memory returned by +<b>malloc</b>(3). In this case, the Berkeley DB environment may only be +accessed by a single process, although that process may be +multi-threaded. +<p><li>If the <a href="../../api_c/env_open.html#DB_SYSTEM_MEM">DB_SYSTEM_MEM</a> flag is specified to <a href="../../api_c/env_open.html">DBENV->open</a>, +regions are created in system memory. When regions are created in system +memory, the Berkeley DB environment may be accessed by both multiple processes +and multiple threads within processes. +<p>The system memory used by Berkeley DB is potentially useful past the lifetime +of any particular process. Therefore, additional cleanup may be necessary +after an application fails, as there may be no way for Berkeley DB to ensure +that system resources backing the shared memory regions are returned to +the system. +<p>The system memory that is used is architecture-dependent. For example, +on systems supporting X/Open-style shared memory interfaces, e.g., UNIX +systems, the <b>shmget</b>(2) and related System V IPC interfaces are +used. Additionally, VxWorks systems use system memory. 
+In these cases, an initial segment ID must be specified by the +application to ensure that applications do not overwrite each other's +database environments, and so that the number of segments created does +not grow without bound. See the <a href="../../api_c/env_set_shm_key.html">DBENV->set_shm_key</a> function for more +information. +<p><li>If no memory-related flags are specified to <a href="../../api_c/env_open.html">DBENV->open</a>, then +memory backed by the filesystem is used to store the regions. On UNIX +systems, the Berkeley DB library will use the POSIX mmap interface. If mmap is +not available, the UNIX shmget interfaces will be used, assuming they are +available. +</ol> +<p>Any files created in the filesystem to back the regions are created in +the environment home directory specified to the <a href="../../api_c/env_open.html">DBENV->open</a> call. +These files are named __db.###, e.g., __db.001, __db.002 and so on. +When region files are backed by the filesystem, one file per region is +created. When region files are backed by system memory, a single file +will still be created, as there must be a well-known name in the +filesystem so that multiple processes can locate the system shared memory +that is being used by the environment. +<p>Statistics about the shared memory regions in the environment can be +displayed using the <b>-e</b> option to the <a href="../../utility/db_stat.html">db_stat</a> utility. 
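As a small illustration of the region file naming convention described above, the following sketch builds the path of a region backing file inside an environment home. The helper name `region_file_path` is hypothetical and is not part of the Berkeley DB API; the three-digit limit is an assumption taken from the `__db.###` pattern. Applications normally never need to construct these names themselves.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of the Berkeley DB API: build the path of
 * the id'th region backing file (__db.001, __db.002, ...) under "home".
 * Assumes the three-digit numbering suggested by the __db.### pattern.
 * Returns 0 on success, -1 if id is out of range or buf is too small.
 */
int
region_file_path(char *buf, size_t len, const char *home, int id)
{
	int n;

	if (id < 1 || id > 999)
		return (-1);
	n = snprintf(buf, len, "%s/__db.%03d", home, id);
	return (n < 0 || (size_t)n >= len ? -1 : 0);
}
```

With a home of `/a/database` and an id of 1, the buffer would hold `/a/database/__db.001`.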
+<table><tr><td><br></td><td width="1%"><a href="../../ref/env/security.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/env/remote.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/env/remote.html b/db/docs/ref/env/remote.html new file mode 100644 index 000000000..3cd44a539 --- /dev/null +++ b/db/docs/ref/env/remote.html @@ -0,0 +1,48 @@ +<!--$Id: remote.so,v 11.5 2000/03/18 21:43:12 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Remote filesystems</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Environment</dl></h3></td> +<td width="1%"><a href="../../ref/env/region.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/env/open.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Remote filesystems</h1> +<p>When regions are backed by the filesystem, it is a common error to attempt +to create Berkeley DB environments backed by remote file systems such as the +Network File System (NFS) or the Andrew File System (AFS). Remote +filesystems rarely support mapping files into process memory, and even +more rarely support correct semantics for mutexes after the attempt +succeeds. 
For this reason, we strongly recommend that the database +environment directory reside in a local filesystem. +<p>For remote file systems that do allow system files to be mapped into +process memory, home directories accessed via remote file systems cannot +be used simultaneously from multiple clients. None of the commercial +remote file systems available today implements coherent, distributed shared +memory for remote-mounted files. As a result, different machines will +see different versions of these shared regions and the system behavior is +undefined. +<p>Databases, log files and temporary files may be placed on remote +filesystems, <b>as long as the remote filesystem fully supports +standard POSIX filesystem semantics</b>, although the application may incur +a performance penalty for doing so. NFS-mounted databases +cannot be accessed from more than one Berkeley DB environment (and therefore +from more than one system) at a time, because no Berkeley DB database may be +accessed from more than one Berkeley DB environment at a time. +<p><dl compact> +<p><dt>Linux note:<dd>Some Linux releases are known not to support complete semantics for the +POSIX fsync call on NFS-mounted filesystems. No Berkeley DB files should be +placed on NFS-mounted filesystems on these systems. 
+</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/env/region.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/env/open.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/env/security.html b/db/docs/ref/env/security.html new file mode 100644 index 000000000..84dab59b2 --- /dev/null +++ b/db/docs/ref/env/security.html @@ -0,0 +1,54 @@ +<!--$Id: security.so,v 10.15 2000/05/23 21:12:06 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Security</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Environment</dl></h3></td> +<td width="1%"><a href="../../ref/env/naming.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/env/region.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Security</h1> +<p>The following are security issues that should be considered when writing +Berkeley DB applications: +<p><dl compact> +<p><dt>Database environment permissions<dd>The directory used as the Berkeley DB database environment should have its +permissions set to ensure that files in the environment are not accessible +to users without appropriate permissions. 
Applications which add to the +user's permissions (e.g., UNIX setuid or setgid applications), must be +carefully checked to not permit illegal use of those permissions such +as general file access in the environment directory. +<p><dt>Environment variables<dd>Setting the <a href="../../api_c/env_open.html#DB_USE_ENVIRON">DB_USE_ENVIRON</a> and <a href="../../api_c/env_open.html#DB_USE_ENVIRON_ROOT">DB_USE_ENVIRON_ROOT</a> flags +and allowing the use of environment variables during file naming can be +dangerous. Setting those flags in Berkeley DB applications with additional +permissions (e.g., UNIX setuid or setgid applications) could potentially +allow users to read and write databases to which they would not normally +have access. +<p><dt>File permissions<dd>By default, Berkeley DB always creates files readable and writeable by the owner +and the group (i.e., S_IRUSR, S_IWUSR, S_IRGRP and S_IWGRP, or octal mode +0660 on historic UNIX systems). The group ownership of created files is +based on the system and directory defaults, and is not further specified +by Berkeley DB. +<p><dt>Temporary backing files<dd>If an unnamed database is created and the cache is too small to hold the +database in memory, Berkeley DB will create a temporary physical file to enable +it to page the database to disk as needed. In this case, environment +variables such as <b>TMPDIR</b> may be used to specify the location of +that temporary file. While temporary backing files are created readable +and writeable by the owner only (i.e., S_IRUSR and S_IWUSR, or octal mode +0600 on historic UNIX systems), some filesystems may not sufficiently +protect temporary files created in random directories from improper +access. Applications storing sensitive data in unnamed databases should +use the <a href="../../api_c/env_set_tmp_dir.html">DBENV->set_tmp_dir</a> method to specify a temporary directory +with known permissions, to be absolutely safe. 
+</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/env/naming.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/env/region.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/install/file.html b/db/docs/ref/install/file.html new file mode 100644 index 000000000..2ecb240e2 --- /dev/null +++ b/db/docs/ref/install/file.html @@ -0,0 +1,37 @@ +<!--$Id: file.so,v 10.16 2000/12/04 18:05:42 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: File utility /etc/magic information</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>System Installation Notes</dl></h3></td> +<td width="1%"><a href="../../ref/dumpload/text.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/debug/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>File utility /etc/magic information</h1> +<p>The <b>file</b>(1) utility is a UNIX utility that examines and +classifies files, based on information found in its database of file +types, the /etc/magic file. 
The following information may be added +to your system's /etc/magic file to enable <b>file</b>(1) to +correctly identify Berkeley DB database files. +<p>The <b>file</b>(1) utility <b>magic</b>(5) information for the +standard System V UNIX implementation of the <b>file</b>(1) utility +is included in the Berkeley DB distribution for both +<a href="magic.s5.be.txt">big-endian</a> (e.g., Sparc) and +<a href="magic.s5.le.txt">little-endian</a> (e.g., x86) architectures. +<p>The <b>file</b>(1) utility <b>magic</b>(5) information for +Release 3.X of Ian Darwin's implementation of the file utility (as +distributed by FreeBSD and most Linux distributions) is included in the +Berkeley DB distribution. This <a href="magic.txt">magic.txt</a> information +is correct for both big-endian and little-endian architectures. +<table><tr><td><br></td><td width="1%"><a href="../../ref/dumpload/text.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/debug/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/install/magic.s5.be.txt b/db/docs/ref/install/magic.s5.be.txt new file mode 100644 index 000000000..1b8fcc108 --- /dev/null +++ b/db/docs/ref/install/magic.s5.be.txt @@ -0,0 +1,87 @@ +# Berkeley DB +# $Id: magic.s5.be.txt,v 10.4 2000/07/07 21:02:22 krinsky Exp $ +# +# System V /etc/magic files: big-endian version. +# +# Hash 1.85/1.86 databases store metadata in network byte order. +# Btree 1.85/1.86 databases store the metadata in host byte order. +# Hash and Btree 2.X and later databases store the metadata in host byte order. 
+ +0 long 0x00053162 Berkeley DB 1.85/1.86 (Btree, +>4 long 0x00000002 version 2, +>4 long 0x00000003 version 3, +>0 long 0x00053162 native byte-order) + +0 long 0x62310500 Berkeley DB 1.85/1.86 (Btree, +>4 long 0x02000000 version 2, +>4 long 0x03000000 version 3, +>0 long 0x62310500 little-endian) + +12 long 0x00053162 Berkeley DB (Btree, +>16 long 0x00000004 version 4, +>16 long 0x00000005 version 5, +>16 long 0x00000006 version 6, +>16 long 0x00000007 version 7, +>16 long 0x00000008 version 8, +>16 long 0x00000009 version 9, +>12 long 0x00053162 native byte-order) + +12 long 0x62310500 Berkeley DB (Btree, +>16 long 0x04000000 version 4, +>16 long 0x05000000 version 5, +>16 long 0x06000000 version 6, +>16 long 0x07000000 version 7, +>16 long 0x08000000 version 8, +>16 long 0x09000000 version 9, +>12 long 0x62310500 little-endian) + +0 long 0x00061561 Berkeley DB +>4 long >2 1.86 +>4 long <3 1.85 +>0 long 0x00061561 (Hash, +>4 long 2 version 2, +>4 long 3 version 3, +>8 long 0x000004D2 little-endian) +>8 long 0x000010E1 native byte-order) + +12 long 0x00061561 Berkeley DB (Hash, +>16 long 0x00000004 version 4, +>16 long 0x00000005 version 5, +>16 long 0x00000006 version 6, +>16 long 0x00000007 version 7, +>16 long 0x00000008 version 8, +>16 long 0x00000009 version 9, +>12 long 0x00061561 native byte-order) + +12 long 0x61150600 Berkeley DB (Hash, +>16 long 0x04000000 version 4, +>16 long 0x05000000 version 5, +>16 long 0x06000000 version 6, +>16 long 0x07000000 version 7, +>16 long 0x08000000 version 8, +>16 long 0x09000000 version 9, +>12 long 0x61150600 little-endian) + +12 long 0x00042253 Berkeley DB (Queue, +>16 long 0x00000001 version 1, +>16 long 0x00000002 version 2, +>16 long 0x00000003 version 3, +>16 long 0x00000004 version 4, +>16 long 0x00000005 version 5, +>16 long 0x00000006 version 6, +>16 long 0x00000007 version 7, +>16 long 0x00000008 version 8, +>16 long 0x00000009 version 9, +>12 long 0x00042253 native byte-order) + +12 long 0x53220400 Berkeley 
DB (Queue, +>16 long 0x01000000 version 1, +>16 long 0x02000000 version 2, +>16 long 0x03000000 version 3, +>16 long 0x04000000 version 4, +>16 long 0x05000000 version 5, +>16 long 0x06000000 version 6, +>16 long 0x07000000 version 7, +>16 long 0x08000000 version 8, +>16 long 0x09000000 version 9, +>12 long 0x53220400 little-endian) diff --git a/db/docs/ref/install/magic.s5.le.txt b/db/docs/ref/install/magic.s5.le.txt new file mode 100644 index 000000000..c8871fedf --- /dev/null +++ b/db/docs/ref/install/magic.s5.le.txt @@ -0,0 +1,87 @@ +# Berkeley DB +# $Id: magic.s5.le.txt,v 10.4 2000/07/07 21:02:22 krinsky Exp $ +# +# System V /etc/magic files: little-endian version. +# +# Hash 1.85/1.86 databases store metadata in network byte order. +# Btree 1.85/1.86 databases store the metadata in host byte order. +# Hash and Btree 2.X and later databases store the metadata in host byte order. + +0 long 0x00053162 Berkeley DB 1.85/1.86 (Btree, +>4 long 0x00000002 version 2, +>4 long 0x00000003 version 3, +>0 long 0x00053162 native byte-order) + +0 long 0x62310500 Berkeley DB 1.85/1.86 (Btree, +>4 long 0x02000000 version 2, +>4 long 0x03000000 version 3, +>0 long 0x62310500 big-endian) + +12 long 0x00053162 Berkeley DB (Btree, +>16 long 0x00000004 version 4, +>16 long 0x00000005 version 5, +>16 long 0x00000006 version 6, +>16 long 0x00000007 version 7, +>16 long 0x00000008 version 8, +>16 long 0x00000009 version 9, +>12 long 0x00053162 native byte-order) + +12 long 0x62310500 Berkeley DB (Btree, +>16 long 0x04000000 version 4, +>16 long 0x05000000 version 5, +>16 long 0x06000000 version 6, +>16 long 0x07000000 version 7, +>16 long 0x08000000 version 8, +>16 long 0x09000000 version 9, +>12 long 0x62310500 big-endian) + +0 long 0x61150600 Berkeley DB +>4 long >0x02000000 1.86 +>4 long <0x03000000 1.85 +>0 long 0x00061561 (Hash, +>4 long 0x02000000 version 2, +>4 long 0x03000000 version 3, +>8 long 0xD2040000 native byte-order) +>8 long 0xE1100000 big-endian) + +12 long 
0x00061561 Berkeley DB (Hash, +>16 long 0x00000004 version 4, +>16 long 0x00000005 version 5, +>16 long 0x00000006 version 6, +>16 long 0x00000007 version 7, +>16 long 0x00000008 version 8, +>16 long 0x00000009 version 9, +>12 long 0x00061561 native byte-order) + +12 long 0x61150600 Berkeley DB (Hash, +>16 long 0x04000000 version 4, +>16 long 0x05000000 version 5, +>16 long 0x06000000 version 6, +>16 long 0x07000000 version 7, +>16 long 0x08000000 version 8, +>16 long 0x09000000 version 9, +>12 long 0x61150600 big-endian) + +12 long 0x00042253 Berkeley DB (Queue, +>16 long 0x00000001 version 1, +>16 long 0x00000002 version 2, +>16 long 0x00000003 version 3, +>16 long 0x00000004 version 4, +>16 long 0x00000005 version 5, +>16 long 0x00000006 version 6, +>16 long 0x00000007 version 7, +>16 long 0x00000008 version 8, +>16 long 0x00000009 version 9, +>12 long 0x00042253 native byte-order) + +12 long 0x53220400 Berkeley DB (Queue, +>16 long 0x01000000 version 1, +>16 long 0x02000000 version 2, +>16 long 0x03000000 version 3, +>16 long 0x04000000 version 4, +>16 long 0x05000000 version 5, +>16 long 0x06000000 version 6, +>16 long 0x07000000 version 7, +>16 long 0x08000000 version 8, +>16 long 0x09000000 version 9, +>12 long 0x53220400 big-endian) diff --git a/db/docs/ref/install/magic.txt b/db/docs/ref/install/magic.txt new file mode 100644 index 000000000..c28329f40 --- /dev/null +++ b/db/docs/ref/install/magic.txt @@ -0,0 +1,56 @@ +# Berkeley DB +# $Id: magic.txt,v 10.10 2000/07/07 21:02:22 krinsky Exp $ +# +# Ian Darwin's file /etc/magic files: big/little-endian version. +# +# Hash 1.85/1.86 databases store metadata in network byte order. +# Btree 1.85/1.86 databases store the metadata in host byte order. +# Hash and Btree 2.X and later databases store the metadata in host byte order. 
+ +0 long 0x00061561 Berkeley DB +>8 belong 4321 +>>4 belong >2 1.86 +>>4 belong <3 1.85 +>>4 belong >0 (Hash, version %d, native byte-order) +>8 belong 1234 +>>4 belong >2 1.86 +>>4 belong <3 1.85 +>>4 belong >0 (Hash, version %d, little-endian) + +0 belong 0x00061561 Berkeley DB +>8 belong 4321 +>>4 belong >2 1.86 +>>4 belong <3 1.85 +>>4 belong >0 (Hash, version %d, big-endian) +>8 belong 1234 +>>4 belong >2 1.86 +>>4 belong <3 1.85 +>>4 belong >0 (Hash, version %d, native byte-order) + +0 long 0x00053162 Berkeley DB 1.85/1.86 +>4 long >0 (Btree, version %d, native byte-order) +0 belong 0x00053162 Berkeley DB 1.85/1.86 +>4 belong >0 (Btree, version %d, big-endian) +0 lelong 0x00053162 Berkeley DB 1.85/1.86 +>4 lelong >0 (Btree, version %d, little-endian) + +12 long 0x00061561 Berkeley DB +>16 long >0 (Hash, version %d, native byte-order) +12 belong 0x00061561 Berkeley DB +>16 belong >0 (Hash, version %d, big-endian) +12 lelong 0x00061561 Berkeley DB +>16 lelong >0 (Hash, version %d, little-endian) + +12 long 0x00053162 Berkeley DB +>16 long >0 (Btree, version %d, native byte-order) +12 belong 0x00053162 Berkeley DB +>16 belong >0 (Btree, version %d, big-endian) +12 lelong 0x00053162 Berkeley DB +>16 lelong >0 (Btree, version %d, little-endian) + +12 long 0x00042253 Berkeley DB +>16 long >0 (Queue, version %d, native byte-order) +12 belong 0x00042253 Berkeley DB +>16 belong >0 (Queue, version %d, big-endian) +12 lelong 0x00042253 Berkeley DB +>16 lelong >0 (Queue, version %d, little-endian) diff --git a/db/docs/ref/intro/data.html b/db/docs/ref/intro/data.html new file mode 100644 index 000000000..e9d6ead06 --- /dev/null +++ b/db/docs/ref/intro/data.html @@ -0,0 +1,54 @@ +<!--$Id: data.so,v 10.1 2000/09/22 18:23:58 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: An introduction to data management</title> +<meta name="description" 
content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Introduction</dl></h3></td> +<td width="1%"><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/intro/terrain.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>An introduction to data management</h1> +<p>Cheap, powerful computing and networking have created countless new +applications that could not have existed a decade ago. The advent of +the World-Wide Web, and its influence in driving the Internet into homes +and businesses, is one obvious example. Equally important, though, is +the shift from large, general-purpose desktop and server computers +toward smaller, special-purpose devices with built-in processing and +communications services. +<p>As computer hardware has spread into virtually every corner of our +lives, of course, software has followed. Software developers today are +building applications not just for conventional desktop and server +environments, but also for handheld computers, home appliances, +networking hardware, cars and trucks, factory floor automation systems, +and more. +<p>While these operating environments are diverse, the problems that +software engineers must solve in them are often strikingly similar. Most +systems must deal with the outside world, whether that means +communicating with users or controlling machinery. As a result, most +need some sort of I/O system. Even a simple, single-function system +generally needs to handle multiple tasks, and so needs some kind of +operating system to schedule and manage control threads. 
Also, many +computer systems must store and retrieve data to track history, record +configuration settings, or manage access. +<p>Data management can be very simple. In some cases, just recording +configuration in a flat text file is enough. More often, though, +programs need to store and search a large amount of data, or +structurally complex data. Database management systems are tools that +programmers can use to do this work quickly and efficiently using +off-the-shelf software. +<p>Of course, database management systems have been around for a long time. +Data storage is a problem dating back to the earliest days of computing. +Software developers can choose from hundreds of good, +commercially-available database systems. The problem is selecting the +one that best solves the problems that their applications face. +<table><tr><td><br></td><td width="1%"><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/intro/terrain.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/intro/dbis.html b/db/docs/ref/intro/dbis.html new file mode 100644 index 000000000..10c4abd95 --- /dev/null +++ b/db/docs/ref/intro/dbis.html @@ -0,0 +1,159 @@ +<!--$Id: dbis.so,v 10.5 2001/01/19 17:30:29 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: What is Berkeley DB?</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Introduction</dl></h3></td> +<td 
width="1%"><a href="../../ref/intro/terrain.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/intro/dbisnot.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>What is Berkeley DB?</h1> +<p>So far, we've discussed database systems in general terms. It's time +now to consider Berkeley DB in particular and see how it fits into the +framework we have introduced. The key question is, what kinds of +applications should use Berkeley DB? +<p>Berkeley DB is an open source embedded database library that provides +scalable, high-performance, transaction-protected data management +services to applications. Berkeley DB provides a simple function-call API +for data access and management. +<p>By "open source," we mean that Berkeley DB is distributed under a license that +conforms to the <a href="http://www.opensource.org/osd.html">Open +Source Definition</a>. This license guarantees that Berkeley DB is freely +available for use and redistribution in other open source products. +<a href="http://www.sleepycat.com">Sleepycat Software</a> sells +commercial licenses for redistribution in proprietary applications, but +in all cases the complete source code for Berkeley DB is freely available for +download and use. +<p>Berkeley DB is embedded because it links directly into the application. It +runs in the same address space as the application. As a result, no +inter-process communication, either over the network or between +processes on the same machine, is required for database operations. +Berkeley DB provides a simple function-call API for a number of programming +languages, including C, C++, Java, Perl, Tcl, Python, and PHP. All +database operations happen inside the library. Multiple processes, or +multiple threads in a single process, can all use the database at the +same time as each uses the Berkeley DB library. 
Low-level services like +locking, transaction logging, shared buffer management, memory +management, and so on are all handled transparently by the library. +<p>The library is extremely portable. It runs under almost all UNIX and +Linux variants, Windows, and a number of embedded real-time operating +systems. It runs on both 32-bit and 64-bit systems. +It has been deployed on high-end +Internet servers, desktop machines, and on palmtop computers, set-top +boxes, in network switches, and elsewhere. Once Berkeley DB is linked into +the application, the end user generally does not know that there's a +database present at all. +<p>Berkeley DB is scalable in a number of respects. The database library itself +is quite compact (under 300 kilobytes of text space on common +architectures), but it can manage databases up to 256 terabytes in size. +It also supports high concurrency, with thousands of users operating on +the same database at the same time. Berkeley DB is small enough to run in +tightly constrained embedded systems, but can take advantage of +gigabytes of memory and terabytes of disk on high-end server machines. +<p>Berkeley DB generally outperforms relational and object-oriented database +systems in embedded applications for a couple of reasons. First, because +the library runs in the same address space, no inter-process +communication is required for database operations. The cost of +communicating between processes on a single machine, or among machines +on a network, is much higher than the cost of making a function call. +Second, because Berkeley DB uses a simple function-call interface for all +operations, there is no query language to parse, and no execution plan +to produce. +<h3>Data Access Services</h3> +<p>Berkeley DB applications can choose the storage structure that best suits the +application. Berkeley DB supports hash tables, B-trees, simple +record-number-based storage, and persistent queues. 
Programmers can +create tables using any of these storage structures, and can mix +operations on different kinds of tables in a single application. +<p>Hash tables are generally good for very large databases that need +predictable search and update times for random-access records. Hash +tables allow users to ask, "Does this key exist?" or to fetch a record +with a known key. Hash tables do not allow users to ask for records +with keys that are close to a known key. +<p>B-trees are better for range-based searches, as when the application +needs to find all records with keys between some starting and ending +value. B-trees also do a better job of exploiting <i>locality +of reference</i>. If the application is likely to touch keys near each +other at the same time, B-trees work well. The tree structure keeps +keys that are close together near one another in storage, so fetching +nearby values usually doesn't require a disk access. +<p>Record-number-based storage is natural for applications that need +to store and fetch records, but that do not have a simple way to +generate keys of their own. In a record number table, the record +number is the key for the record. Berkeley DB can generate these +record numbers automatically. +<p>Queues are well-suited for applications that create records, and then +must deal with those records in creation order. A good example is +on-line purchasing systems. Orders can enter the system at any time, +but should generally be filled in the order in which they were placed. +<h3>Data Management Services</h3> +<p>Berkeley DB offers important data management services, including concurrency, +transactions, and recovery. All of these services work on all of the +storage structures. +<p>Many users can work on the same database concurrently. Berkeley DB handles +locking transparently, ensuring that two users working on the same +record do not interfere with one another. +<p>The library provides strict ACID transaction semantics. 
Some systems +allow the user to relax, for example, the isolation guarantees that the +database system makes. Berkeley DB ensures that all applications can see only +committed updates. +<p>Multiple operations can be grouped into a single transaction, and can +be committed or rolled back atomically. Berkeley DB uses a technique called +<i>two-phase locking</i> to be sure that concurrent transactions +are isolated from one another, and a technique called +<i>write-ahead logging</i> to guarantee that committed changes +survive application, system, or hardware failures. +<p>When an application starts up, it can ask Berkeley DB to run recovery. +Recovery restores the database to a clean state, with all committed +changes present, even after a crash. The database is guaranteed to be +consistent and all committed changes are guaranteed to be present when +recovery completes. +<p>An application can specify, when it starts up, which data management +services it will use. Some applications need fast, +single-user, non-transactional B-tree data storage. In that case, the +application can disable the locking and transaction systems, and will +not incur the overhead of locking or logging. If an application needs +to support multiple concurrent users, but doesn't need transactions, it +can turn on locking without transactions. Applications that need +concurrent, transaction-protected database access can enable all of the +subsystems. +<p>In all these cases, the application uses the same function-call API to +fetch and update records. +<h3>Design</h3> +<p>Berkeley DB was designed to provide industrial-strength database services to +application developers, without requiring them to become database +experts. It is a classic C-library style <i>toolkit</i>, providing +a broad base of functionality to application writers. 
Berkeley DB was designed +by programmers, for programmers: its modular design surfaces simple, +orthogonal interfaces to core services, and it provides mechanism (for +example, good thread support) without imposing policy (for example, the +use of threads is not required). Just as importantly, Berkeley DB allows +developers to balance performance against the need for crash recovery +and concurrent use. An application can use the storage structure that +provides the fastest access to its data and can request only the degree +of logging and locking that it needs. +<p>Because of the tool-based approach and separate interfaces for each +Berkeley DB subsystem, you can support a complete transaction environment for +other system operations. Berkeley DB even allows you to wrap transactions +around the standard UNIX file read and write operations! Further, Berkeley DB +was designed to interact correctly with the native system's toolset, a +feature no other database package offers. For example, Berkeley DB supports +hot backups (database backups while the database is in use), using +standard UNIX system utilities, e.g., dump, tar, cpio, pax or even cp. +<p>Finally, because scripting language interfaces are available for Berkeley DB +(notably Tcl and Perl), application writers can build incredibly powerful +database engines with little effort. You can build transaction-protected +database applications using your favorite scripting languages, an +increasingly important feature in a world using CGI scripts to deliver +HTML. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/intro/terrain.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/intro/dbisnot.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/intro/dbisnot.html b/db/docs/ref/intro/dbisnot.html new file mode 100644 index 000000000..a55fa7176 --- /dev/null +++ b/db/docs/ref/intro/dbisnot.html @@ -0,0 +1,146 @@ +<!--$Id: dbisnot.so,v 10.3 2000/12/14 20:52:03 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: What is Berkeley DB not?</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Introduction</dl></h3></td> +<td width="1%"><a href="../../ref/intro/dbis.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/intro/need.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>What is Berkeley DB not?</h1> +<p>In contrast to most other database systems, Berkeley DB provides relatively +simple data access services. +<p>Records in Berkeley DB are (<i>key</i>, <i>value</i>) pairs. Berkeley DB +supports only a few logical operations on records. They are: +<ul type=disc> +<li>Insert a record in a table. +<li>Delete a record from a table. +<li>Find a record in a table by looking up its key. 
+<li>Update a record that has already been found. +</ul> +<p>Notice that Berkeley DB never operates on the value part of a record. +Values are simply payload, to be +stored with keys and reliably delivered back to the application on +demand. +<p>Both keys and values can be arbitrary bit strings, either fixed-length +or variable-length. As a result, programmers can put native programming +language data structures into the database without converting them to +a foreign record format first. Storage and retrieval are very simple, +but the application needs to know what the structure of a key and a +value is in advance. It cannot ask Berkeley DB, because Berkeley DB doesn't know. +<p>This is an important feature of Berkeley DB, and one worth considering more +carefully. On the one hand, Berkeley DB cannot provide the programmer with +any information on the contents or structure of the values that it +stores. The application must understand the keys and values that it +uses. On the other hand, there is literally no limit to the data types +that can be stored in a Berkeley DB database. The application never needs to +convert its own program data into the data types that Berkeley DB supports. +Berkeley DB is able to operate on any data type the application uses, no +matter how complex. +<p>Because both keys and values can be up to four gigabytes in length, a +single record can store images, audio streams, or other large data +values. Large values are not treated specially in Berkeley DB. They are +simply broken into page-sized chunks, and reassembled on demand when +the application needs them. Unlike some other database systems, Berkeley DB +offers no special support for binary large objects (BLOBs). +<h3>Not a relational database</h3> +<p>Berkeley DB is not a relational database. +<p>First, Berkeley DB does not support SQL queries. All access to data is through +the Berkeley DB API. Developers must learn a new set of interfaces in order +to work with Berkeley DB.
Although the interfaces are fairly simple, they are +non-standard. +<p>SQL support is a double-edged sword. One big advantage of relational +databases is that they allow users to write simple declarative queries +in a high-level language. The database system knows everything about +the data and can carry out the command. This means that it's simple to +search for data in new ways, and to ask new questions of the database. +No programming is required. +<p>On the other hand, if a programmer can predict in advance how an +application will access data, then writing a low-level program to get +and store records can be faster. It eliminates the overhead of query +parsing, optimization, and execution. The programmer must understand +the data representation, and must write the code to do the work, but +once that's done, the application can be very fast. +<p>Second, Berkeley DB has no notion of <i>schema</i> in the way that +relational systems do. Schema is the structure of records in tables, +and the relationships among the tables in the database. For example, in +a relational system the programmer can create a record from a fixed menu +of data types. Because the record types are declared to the system, the +relational engine can reach inside records and examine individual values +in them. In addition, programmers can use SQL to declare relationships +among tables, and to create indexes on tables. Relational engines +usually maintain these relationships and indexes automatically. +<p>In Berkeley DB, the key and value in a record are opaque +to Berkeley DB. They may have a rich +internal structure, but the library is unaware of it. As a result, Berkeley DB +cannot decompose the value part of a record into its constituent parts, +and cannot use those parts to find values of interest. Only the +application, which knows the data structure, can do that. +<p>Berkeley DB does allow programmers to create indexes on tables, and to use +those indexes to speed up searches. 
However, the programmer has no way +to tell the library how different tables and indexes are related. The +application needs to make sure that they all stay consistent. In the +case of indexes in particular, if the application puts a new record into +a table, it must also put a new record in the index for it. It's +generally simple to write a single function to make the required +updates, but it is work that relational systems do automatically. +<p>Berkeley DB is not a relational system. Relational database systems are +semantically rich and offer high-level database access. Compared to such +systems, Berkeley DB is a high-performance, transactional library for record +storage. It's possible to build a relational system on top of Berkeley DB. In +fact, the popular MySQL relational system uses Berkeley DB for +transaction-protected table management, and takes care of all the SQL +parsing and execution. It uses Berkeley DB for the storage level, and provides +the semantics and access tools. +<h3>Not an object-oriented database</h3> +<p>Object-oriented databases are designed for very tight integration with +object-oriented programming languages. Berkeley DB is written entirely in the +C programming language. It includes language bindings for C++, Java, +and other languages, but the library has no information about the +objects created in any object-oriented application. Berkeley DB never makes +method calls on any application object. It has no idea what methods are +defined on user objects, and cannot see the public or private members +of any instance. The key and value part of all records are opaque to +Berkeley DB. +<p>Berkeley DB cannot automatically page in referenced objects, as some +object-oriented databases do. The object-oriented application programmer +must decide what records are required, and must fetch them by making +method calls on Berkeley DB objects. 
+<h3>Not a network database</h3> +<p>Berkeley DB does not support network-style navigation among records, as +network databases do. Records in a Berkeley DB table may move around over +time, as new records are added to the table and old ones are deleted. +Berkeley DB is able to do fast searches for records based on keys, but there +is no way to create a persistent physical pointer to a record. +Applications can only refer to records by key, not by address. +<h3>Not a database server</h3> +<p>Berkeley DB is not a standalone database server. It is a library, and runs in +the address space of the application that uses it. If more than one +application links in Berkeley DB, then all can use the same database at the +same time; the library handles coordination among the applications, and +guarantees that they do not interfere with one another. +<p>Recent releases of Berkeley DB allow programmers to compile the library as a +standalone process, and to use RPC stubs to connect to it and to carry +out operations. However, there are some important limitations to this +feature. The RPC stubs provide exactly the same API that the library +itself does. There is no higher-level access provided by the standalone +process. Tuning the standalone process is difficult, since Berkeley DB does +no threading in the library (applications can be threaded, but the +library never creates a thread on its own). +<p>It is possible to build a server application that uses Berkeley DB for data +management. For example, many commercial and open source Lightweight +Directory Access Protocol (LDAP) servers use Berkeley DB for record storage. +LDAP clients connect to these servers over the network. Individual +servers make calls through the Berkeley DB API to find records and return them +to clients. On its own, however, Berkeley DB is not a server. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/intro/dbis.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/intro/need.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/intro/distrib.html b/db/docs/ref/intro/distrib.html new file mode 100644 index 000000000..a5ff52263 --- /dev/null +++ b/db/docs/ref/intro/distrib.html @@ -0,0 +1,28 @@ +<!--$Id: distrib.so,v 10.16 2000/09/22 18:23:58 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: What does the Berkeley DB distribution include?</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Introduction</dl></h3></td> +<td width="1%"><a href="../../ref/intro/what.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/intro/where.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>What does the Berkeley DB distribution include?</h1> +<p>The Berkeley DB distribution includes complete source code for the Berkeley DB +library, including all three Berkeley DB products and their supporting +utilities, as well as complete documentation in HTML format. +<p>The distribution does not include pre-built binaries or libraries, or +hard-copy documentation. 
Pre-built libraries and binaries for some +architecture/compiler combinations are available as part of Sleepycat +Software's Berkeley DB support services. +<table><tr><td><br></td><td width="1%"><a href="../../ref/intro/what.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/intro/where.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/intro/need.html b/db/docs/ref/intro/need.html new file mode 100644 index 000000000..771dd9890 --- /dev/null +++ b/db/docs/ref/intro/need.html @@ -0,0 +1,60 @@ +<!--$Id: need.so,v 10.2 2000/12/08 23:59:06 mao Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Do you need Berkeley DB?</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Introduction</dl></h3></td> +<td width="1%"><a href="../../ref/intro/dbisnot.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/intro/what.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Do you need Berkeley DB?</h1> +<p>Berkeley DB is an ideal database system for applications that need fast, +scalable, and reliable embedded database management. For applications +that need different services, however, it can be a poor choice. 
+<p>First, do you need the ability to access your data in ways you cannot +predict in advance? If your users want to be able to enter SQL +queries to perform +complicated searches that you cannot program into your application to +begin with, then you should consider a relational engine instead. Berkeley DB +requires a programmer to write code in order to run a new kind of query. +<p>On the other hand, if you can predict your data access patterns up front +-- and in particular if you need fairly simple key/value lookups -- then +Berkeley DB is a good choice. The queries can be coded up once, and will then +run very quickly because there is no SQL to parse and execute. +<p>Second, are there political arguments for or against a standalone +relational server? If you're building an application for your own use +and have a relational system installed with administrative support +already, it may be simpler to use that than to build and learn Berkeley DB. +On the other hand, if you'll be shipping many copies of your application +to customers, and don't want your customers to have to buy, install, +and manage a separate database system, then Berkeley DB may be a better +choice. +<p>Third, are there any technical advantages to an embedded database? If +you're building an application that will run unattended for long periods +of time, or for end users who are not sophisticated administrators, then +a separate server process may be too big a burden. It will require +separate installation and management, and if it creates new ways for +the application to fail, or new complexities to master in the field, +then Berkeley DB may be a better choice. +<p>The fundamental question is, how closely do your requirements match the +Berkeley DB design? Berkeley DB was conceived and built to provide fast, reliable, +transaction-protected record storage. 
The library itself was never +intended to provide interactive query support, graphical reporting +tools, or similar services that some other database systems provide. We +have tried always to err on the side of minimalism and simplicity. By +keeping the library small and simple, we create fewer opportunities for +bugs to creep in, and we guarantee that the database system stays fast, +because there is very little code to execute. If your application needs +that set of features, then Berkeley DB is almost certainly the best choice +for you. +<table><tr><td><br></td><td width="1%"><a href="../../ref/intro/dbisnot.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/intro/what.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/intro/products.html b/db/docs/ref/intro/products.html new file mode 100644 index 000000000..ce04135f0 --- /dev/null +++ b/db/docs/ref/intro/products.html @@ -0,0 +1,69 @@ +<!--$Id: products.so,v 10.13 2000/12/04 18:05:42 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Sleepycat Software's Berkeley DB products</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Introduction</dl></h3></td> +<td width="1%"><a href="../../ref/intro/where.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img 
src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Sleepycat Software's Berkeley DB products</h1> +<p>Sleepycat Software licenses three different products that use the Berkeley DB +technology. Each product offers a distinct level of database support. +It is not possible to mix-and-match products, that is, each application +or group of applications must use the same Berkeley DB product. +<p>All three products are included in the single Open Source distribution of +Berkeley DB from Sleepycat Software, and building that distribution +automatically builds all three products. Each product adds services, and +new interfaces, to the product that precedes it in the list. As a result, +developers can download Berkeley DB and build an application that does only +single-user, read-only database access, and later add support for more +users and more complex database access patterns. +<p>Users who distribute Berkeley DB must ensure that they are licensed for the +Berkeley DB interfaces they use. Information on licensing is available directly +from Sleepycat Software. +<h3>Berkeley DB Data Store</h3> +<p>The Berkeley DB Data Store product is an embeddable, high-performance data store. It +supports multiple concurrent threads of control to read information +managed by Berkeley DB. When updates are required, only a single process may +be using the database. That process may be multi-threaded, but only one +thread of control should be allowed to update the database at any time. +The Berkeley DB Data Store does no locking, and so provides no guarantees of correct +behavior if more than one thread of control is updating the database at +a time. +<p>The Berkeley DB Data Store product includes the <a href="../../api_c/db_create.html">db_create</a> interface, the +DB handle methods, and the methods returned by <a href="../../api_c/db_cursor.html">DB->cursor</a>. 
+<p>The Berkeley DB Data Store is intended for use in single-user or read-only applications +that can guarantee that no more than one thread of control will ever +update the database at any time. +<h3>Berkeley DB Concurrent Data Store</h3> +<p>The Berkeley DB Concurrent Data Store product adds multiple-reader, single writer capabilities to +the Berkeley DB Data Store product, supporting applications that need concurrent updates +and do not want to implement their own locking protocols. The additional +interfaces included with the Berkeley DB Concurrent Data Store product are <a href="../../api_c/env_create.html">db_env_create</a>, the +<a href="../../api_c/env_open.html">DBENV->open</a> method (using the <a href="../../api_c/env_open.html#DB_INIT_CDB">DB_INIT_CDB</a> flag), and the +<a href="../../api_c/env_close.html">DBENV->close</a> method. +<p>Berkeley DB Concurrent Data Store is intended for applications that require occasional write access +to a database that is largely used for reading. +<h3>Berkeley DB Transactional Data Store</h3> +<p>The Berkeley DB Transactional Data Store product adds full transactional support and recoverability +to the Berkeley DB Data Store product. This product includes all of the interfaces +in the Berkeley DB library. +<p>Berkeley DB Transactional Data Store is intended for applications that require industrial-strength +database services, including good performance under high-concurrency +workloads with a mixture of readers and writers, the ability to commit +or roll back multiple changes to the database at a single instant, and +the guarantee that even in the event of a catastrophic system or hardware +failure, any committed database changes will be preserved. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/intro/where.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/intro/terrain.html b/db/docs/ref/intro/terrain.html new file mode 100644 index 000000000..f2a708913 --- /dev/null +++ b/db/docs/ref/intro/terrain.html @@ -0,0 +1,248 @@ +<!--$Id: terrain.so,v 10.3 2000/12/14 20:52:03 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Mapping the terrain: theory and practice</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Introduction</dl></h3></td> +<td width="1%"><a href="../../ref/intro/data.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/intro/dbis.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Mapping the terrain: theory and practice</h1> +<p>The first step in selecting a database system is figuring out what the +choices are. Decades of research and real-world deployment have produced +countless systems. We need to organize them somehow to reduce the number +of options. +<p>One obvious way to group systems is to use the common labels that +vendors apply to them. 
The buzzwords here include "network," +"relational," "object-oriented," and "embedded," with some +cross-fertilization like "object-relational" and "embedded network". +Understanding the buzzwords is important. Each has some grounding in +theory, but has also evolved into a practical label for categorizing +systems that work in a certain way. +<p>All database systems, regardless of the buzzwords that apply to them, +provide a few common services. All of them store data, for example. +We'll begin by exploring the common services that all systems provide, +and then examine the differences among the different kinds of systems. +<h3>Data access and data management</h3> +<p>Fundamentally, database systems provide two services. +<p>The first service is <i>data access</i>. Data access means adding +new data to the database (inserting), finding data of interest +(searching), changing data already stored (updating), and removing data +from the database (deleting). All databases provide these services. How +they work varies from category to category, and depends on the record +structure that the database supports. +<p>Each record in a database is a collection of values. For example, the +record for a Web site customer might include a name, email address, +shipping address, and payment information. Records are usually stored +in tables. Each table holds records of the same kind. For example, the +<b>customer</b> table at an e-commerce Web site might store the +customer records for every person who shopped at the site. Often, +database records have a different structure from the structures or +instances supported by the programming language in which an application +is written. As a result, working with records can mean: +<ul type=disc> +<li>using database operations like searches and updates on records; and +<li>converting between programming language structures and database record +types in the application. +</ul> +<p>The second service is <i>data management</i>. 
Data management is +more complicated than data access. Providing good data management +services is the hard part of building a database system. When you +choose a database system to use in an application you build, making sure +it supports the data management services you need is critical. +<p>Data management services include allowing multiple users to work on the +database simultaneously (concurrency), allowing multiple records to be +changed instantaneously (transactions), and surviving application and +system crashes (recovery). Different database systems offer different +data management services. Data management services are entirely +independent of the data access services listed above. For example, +nothing about relational database theory requires that the system +support transactions, but most commercial relational systems do. +<p>Concurrency means that multiple users can operate on the database at +the same time. Support for concurrency ranges from none (single-user +access only) to complete (many readers and writers working +simultaneously). +<p>Transactions permit users to make multiple changes appear at once. For +example, a transfer of funds between bank accounts needs to be a +transaction because the balance in one account is reduced and the +balance in the other increases. If the reduction happened before the +increase, then a poorly-timed system crash could leave the customer +poorer; if the bank used the opposite order, then the same system crash +could make the customer richer. Obviously, both the customer and the +bank are best served if both operations happen at the same instant. +<p>Transactions have well-defined properties in database systems. They are +<i>atomic</i>, so that the changes happen all at once or not at all. +They are <i>consistent</i>, so that the database is in a legal state +when the transaction begins and when it ends.
They are typically +<i>isolated</i>, which means that any other users in the database +cannot interfere with them while they are in progress. And they are +<i>durable</i>, so that if the system or application crashes after +a transaction finishes, the changes are not lost. Together, the +properties of <i>atomicity</i>, <i>consistency</i>, +<i>isolation</i>, and <i>durability</i> are known as the ACID +properties. +<p>As is the case for concurrency, support for transactions varies among +databases. Some offer atomicity without making guarantees about +durability. Some ignore isolation, especially in single-user +systems; there's no need to isolate other users from the effects of +changes when there are no other users. +<p>Another important data management service is recovery. Strictly +speaking, recovery is a procedure that the system carries out when it +starts up. The purpose of recovery is to guarantee that the database is +complete and usable. This is most important after a system or +application crash, when the database may have been damaged. The recovery +process guarantees that the internal structure of the database is good. +Recovery usually means that any completed transactions are checked, and +any lost changes are reapplied to the database. At the end of the +recovery process, applications can use the database as if there had been +no interruption in service. +<p>Finally, there are a number of data management services that permit +copying of data. For example, most database systems are able to import +data from other sources, and to export it for use elsewhere. Also, most +systems provide some way to back up databases and to restore them in the +event of a system failure that damages the database. Many commercial +systems allow <i>hot backups</i>, so that users can back up +databases while they are in use. Many applications must run without +interruption, and cannot be shut down for backups.
+<p>A particular database system may provide other data management services. +Some provide browsers that show database structure and contents. Some +include tools that enforce data integrity rules, such as the rule that +no employee can have a negative salary. These data management services +are not common to all systems, however. Concurrency, recovery, and +transactions are the data management services that most database vendors +support. +<p>Deciding what kind of database to use means understanding the data +access and data management services that your application needs. Berkeley DB +is an embedded database that supports fairly simple data access with a +rich set of data management services. To highlight its strengths and +weaknesses, we can compare it to other database system categories. +<h3>Relational databases</h3> +<p>Relational databases are probably the best-known database variant, +because of the success of companies like Oracle. Relational databases +are based on the mathematical field of set theory. The term "relation" +is really just a synonym for "set" -- a relation is just a set of +records or, in our terminology, a table. One of the main innovations in +early relational systems was to insulate the programmer from the +physical organization of the database. Rather than walking through +arrays of records or traversing pointers, programmers make statements +about tables in a high-level language, and the system executes those +statements. +<p>Relational databases operate on <i>tuples</i>, or records, composed +of values of several different data types, including integers, character +strings, and others. Operations include searching for records whose +values satisfy some criteria, updating records, and so on. +<p>Virtually all relational databases use the Structured Query Language, +or SQL. This language permits people and computer programs to work with +the database by writing simple statements. 
The database engine reads +those statements and determines how to satisfy them on the tables in +the database. +<p>SQL is the main practical advantage of relational database systems. +Rather than writing a computer program to find records of interest, the +relational system user can just type a query in a simple syntax, and +let the engine do the work. This gives users enormous flexibility; they +do not need to decide in advance what kind of searches they want to do, +and they do not need expensive programmers to find the data they need. +Learning SQL requires some effort, but it's much simpler than a +full-blown high-level programming language for most purposes. And there +are a lot of programmers who have already learned SQL. +<h3>Object-oriented databases</h3> +<p>Object-oriented databases are less common than relational systems, but +are still fairly widespread. Most object-oriented databases were +originally conceived as persistent storage systems closely wedded to +particular high-level programming languages like C++. With the spread +of Java, most now support more than one programming language, but +object-oriented database systems fundamentally provide the same class +and method abstractions as do object-oriented programming languages. +<p>Many object-oriented systems allow applications to operate on objects +uniformly, whether they are in memory or on disk. These systems create +the illusion that all objects are in memory all the time. The advantage +to object-oriented programmers who simply want object storage and +retrieval is clear. They need never be aware of whether an object is in +memory or not. The application simply uses objects, and the database +system moves them between disk and memory transparently. All of the +operations on an object, and all its behavior, are determined by the +programming language. +<p>Object-oriented databases aren't nearly as widely deployed as relational +systems. 
In order to attract developers who understand relational +systems, many of the object-oriented systems have added support for +query languages very much like SQL. In practice, though, object-oriented +databases are mostly used for persistent storage of objects in C++ and +Java programs. +<h3>Network databases</h3> +<p>The "network model" is a fairly old technique for managing and +navigating application data. Network databases are designed to make +pointer traversal very fast. Every record stored in a network database +is allowed to contain pointers to other records. These pointers are +generally physical addresses, so fetching the referenced record just +means reading it from disk by its disk address. +<p>Network database systems generally permit records to contain integers, +floating point numbers, and character strings, as well as references to +other records. An application can search for records of interest. After +retrieving a record, the application can fetch any referenced record +quickly. +<p>Pointer traversal is fast because most network systems use physical disk +addresses as pointers. When the application wants to fetch a record, +the database system uses the address to fetch exactly the right string +of bytes from the disk. This requires only a single disk access in all +cases. Other systems, by contrast, often must do more than one disk read +to find a particular record. +<p>The key advantage of the network model is also its main drawback. The +fact that pointer traversal is so fast means that applications that do +it will run well. On the other hand, storing pointers all over the +database makes it very hard to reorganize the database. In effect, once +you store a pointer to a record, it is difficult to move that record +elsewhere. Some network databases handle this by leaving forwarding +pointers behind, but this defeats the speed advantage of doing a single +disk access in the first place. 
Other network databases find, and fix, +all the pointers to a record when it moves, but this makes +reorganization very expensive. Reorganization is often necessary in +databases, since adding and deleting records over time will consume +space that cannot be reclaimed without reorganizing. Without periodic +reorganization to compact network databases, they can end up with a +considerable amount of wasted space. +<h3>Clients and servers</h3> +<p>Database vendors have two choices for system architecture. They can +build a server to which remote clients connect, and do all the database +management inside the server. Alternatively, they can provide a module +that links directly into the application, and does all database +management locally. In either case, the application developer needs +some way of communicating with the database (generally, an Application +Programming Interface (API) that does work in the process or that +communicates with a server to get work done). +<p>Almost all commercial database products are implemented as servers, and +applications connect to them as clients. Servers have several features +that make them attractive. +<p>First, because all of the data is managed by a separate process, and +possibly on a separate machine, it's easy to isolate the database server +from bugs and crashes in the application. +<p>Second, because some database products (particularly relational engines) +are quite large, splitting them off as separate server processes keeps +applications small, which uses less disk space and memory. Relational +engines include code to parse SQL statements, to analyze them and +produce plans for execution, to optimize the plans, and to execute +them. +<p>Finally, by storing all the data in one place and managing it with a +single server, it's easier for organizations to back up, protect, and +set policies on their databases. 
The enterprise databases for large +companies often have several full-time administrators caring for them, +making certain that applications run quickly, granting and denying +access to users, and making backups. +<p>However, centralized administration can be a disadvantage in some cases. +In particular, if a programmer wants to build an application that uses +a database for storage of important information, then shipping and +supporting the application is much harder. The end user needs to install +and administer a separate database server, and the programmer must +support not just one product, but two. Adding a server process to the +application creates new opportunity for installation mistakes and +run-time problems. +<table><tr><td><br></td><td width="1%"><a href="../../ref/intro/data.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/intro/dbis.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/intro/what.html b/db/docs/ref/intro/what.html new file mode 100644 index 000000000..c8d12069a --- /dev/null +++ b/db/docs/ref/intro/what.html @@ -0,0 +1,53 @@ +<!--$Id: what.so,v 10.22 2000/09/22 18:23:59 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: What other services does Berkeley DB provide?</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Introduction</dl></h3></td> 
+<td width="1%"><a href="../../ref/intro/need.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/intro/distrib.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>What other services does Berkeley DB provide?</h1> +<p>Berkeley DB also provides core database services to developers. These +services include: +<p><dl compact> +<p><dt>Page cache management:<dd>The page cache provides fast access to a cache of database pages, +handling the I/O associated with the cache to ensure that dirty pages +are written back to the file system and that new pages are allocated on +demand. Applications may use the Berkeley DB shared memory buffer manager to +serve their own files and pages. +<p><dt>Transactions and logging:<dd>The transaction and logging systems provide recoverability and atomicity +for multiple database operations. The transaction system uses two-phase +locking and write-ahead logging protocols to ensure that database +operations may be undone or redone in the case of application or system +failure. Applications may use Berkeley DB transaction and logging subsystems +to protect their own data structures and operations from application or +system failure. +<p><dt>Locking:<dd>The locking system provides multiple reader or single writer access to +objects. The Berkeley DB access methods use the locking system to acquire +the right to read or write database pages. Applications may use the +Berkeley DB locking subsystem to support their own locking needs. +</dl> +<p>By combining the page cache, transaction, locking, and logging systems, +Berkeley DB provides the same services found in much larger, more complex and +more expensive database systems. 
Berkeley DB supports multiple simultaneous +readers and writers and guarantees that all changes are recoverable, even +in the case of a catastrophic hardware failure during a database update. +<p>Developers may select some or all of the core database services for any +access method or database. Therefore, it is possible to choose the +appropriate storage structure and the right degrees of concurrency and +recoverability for any application. In addition, some of the systems +(e.g., the locking subsystem) can be called separately from the Berkeley DB +access method. As a result, developers can integrate non-database +objects into their transactional applications using Berkeley DB. +<table><tr><td><br></td><td width="1%"><a href="../../ref/intro/need.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/intro/distrib.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/intro/where.html b/db/docs/ref/intro/where.html new file mode 100644 index 000000000..45d0dc3ae --- /dev/null +++ b/db/docs/ref/intro/where.html @@ -0,0 +1,39 @@ +<!--$Id: where.so,v 10.27 2000/12/04 18:05:42 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Where does Berkeley DB run?</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Introduction</dl></h3></td> +<td width="1%"><a 
href="../../ref/intro/distrib.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/intro/products.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Where does Berkeley DB run?</h1> +<p>Berkeley DB requires only underlying IEEE/ANSI Std 1003.1 (POSIX) system calls and can be +ported easily to new architectures by adding stub routines to connect +the native system interfaces to the Berkeley DB POSIX-style system calls. +<p>Berkeley DB will autoconfigure and run on almost any modern UNIX system, and +even on most historical UNIX platforms. See +<a href="../../ref/build_unix/intro.html">Building for UNIX systems</a> for +more information. +<p>The Berkeley DB distribution includes support for QNX Neutrino. See +<a href="../../ref/build_unix/intro.html">Building for UNIX systems</a> for +more information. +<p>The Berkeley DB distribution includes support for VxWorks, via a workspace +and project files for Tornado 2.0. See +<a href="../../ref/build_vxworks/intro.html">Building for VxWorks</a> for more +information. +<p>The Berkeley DB distribution includes support for Windows/95, Windows/98, +Windows/NT and Windows/2000, via the MSVC 5 and 6 development +environments. See <a href="../../ref/build_win/intro.html">Building for +Windows systems</a> for more information. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/intro/distrib.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/intro/products.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/java/compat.html b/db/docs/ref/java/compat.html new file mode 100644 index 000000000..4619ec557 --- /dev/null +++ b/db/docs/ref/java/compat.html @@ -0,0 +1,34 @@ +<!--$Id: compat.so,v 10.11 2000/12/04 18:05:42 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Compatibility</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Java API</dl></h3></td> +<td width="1%"><a href="../../ref/java/conf.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/java/program.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Compatibility</h1> +<p>The Berkeley DB Java API has been tested with the +<a href="http://www.javasoft.com">Sun Microsystems JDK 1.1.3</a> on SunOS +5.5, and Sun's JDK 1.1.7, JDK 1.2.2 and JDK 1.3.0 on Linux and +Windows/NT. It should work with any JDK 1.1, 1.2 or 1.3 (the latter +two are known as Java 2) compatible environment. IBM's VM 1.3.0 has +also been tested on Linux. 
+<p>The primary requirement of the Berkeley DB Java API is that the target Java +environment supports JNI (Java Native Interface), rather than another +method for allowing native C/C++ code to interface to Java. The JNI was +new in JDK 1.1, but is the most likely interface to be implemented across +multiple platforms. However, using the JNI means that Berkeley DB will not be +compatible with Microsoft Visual J++. +<table><tr><td><br></td><td width="1%"><a href="../../ref/java/conf.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/java/program.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/java/conf.html b/db/docs/ref/java/conf.html new file mode 100644 index 000000000..b7eedcaed --- /dev/null +++ b/db/docs/ref/java/conf.html @@ -0,0 +1,82 @@ +<!--$Id: conf.so,v 10.16 2000/12/04 21:21:51 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Configuration</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Java API</dl></h3></td> +<td width="1%"><a href="../../ref/rpc/server.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/java/compat.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 
align=center>Configuration</h1> +<p>Building the Berkeley DB Java classes, the examples and the native support +library is integrated into the normal build process. See +<a href="../../ref/build_unix/conf.html#--enable-java">Configuring +Berkeley DB</a> and <a href="../../ref/build_win/intro.html">Building for Windows</a> +for more information. +<p>We expect that you've already installed the Java JDK or equivalent on +your system. For the sake of discussion, we'll assume the Berkeley DB +distribution is in a directory called db-VERSION; for example, you extracted Berkeley DB version 2.3.12 +and did not change the top-level directory name. The files related +to Java are in two subdirectories of db-VERSION: java, the Java source +files, and libdb_java, the C++ files that provide the "glue" between +Java and Berkeley DB. The directory tree looks like this:
+<p><blockquote><pre>            db-VERSION
+            /        \
+         java      libdb_java
+          |            |
+         src          ...
+          |
+         com
+          |
+       sleepycat
+        /     \
+      db     examples
+      |         |
+     ...       ...
+</pre></blockquote>
+<p>This naming conforms to the emerging standard for naming Java packages. +When the Java code is built, it is placed into a <b>classes</b> +subdirectory that is parallel to the <b>src</b> subdirectory. +<p>For your application to use Berkeley DB successfully, you must set your +CLASSPATH environment variable to include db-VERSION/java/classes as +well as the classes in your Java distribution. On UNIX, CLASSPATH is +a colon-separated list of directories; on Windows it is separated by +semicolons. Alternatively, you can set your CLASSPATH to include +db-VERSION/java/classes/db.jar, which is created as a result of the +build. The db.jar file contains the classes in com.sleepycat.db; it +does not contain any classes in com.sleepycat.examples. 
+<p>On Windows, you will want to set your PATH variable to include: +<p><blockquote><pre>db-VERSION\build_win32\Release</pre></blockquote> +<p>On UNIX, you will want to set the LD_LIBRARY_PATH environment variable +to include the Berkeley DB library installation directory. Of course, the +standard install directory may have been changed for your site; see your +system administrator for details. Regardless, if you get a: +<p><blockquote><pre>java.lang.UnsatisfiedLinkError</pre></blockquote> +<p>exception when you run, chances are you do not have the library search +path configured correctly. Different Java interpreters provide +different error messages if the CLASSPATH value is incorrect; a typical +error is: +<p><blockquote><pre>java.lang.NoClassDefFoundError</pre></blockquote> +<p>To ensure that everything is running correctly, you may want to try a +simple test from the example programs in: +<p><blockquote><pre>db-VERSION/java/src/com/sleepycat/examples</pre></blockquote> +<p>For example, the sample program: +<p><blockquote><pre>% java com.sleepycat.examples.AccessExample</pre></blockquote> +<p>will prompt for text input lines, which are then stored in a Btree +database named "access.db" in your current directory. Try giving it a +few lines of input text and then end-of-file. Before it exits, you +should see a list of the lines you entered displayed with their data items. +This is a simple check to make sure the fundamental configuration is +working correctly. 
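+<p>The two failure modes described above (a class missing from CLASSPATH, and a native library missing from the library search path) can also be checked separately with a small diagnostic. The following is a sketch, not part of the Berkeley DB distribution: the class name BdbConfigCheck is hypothetical, and the native library name db_java is an assumption based on the libdb_java directory; the installed library may carry a different or versioned name.

```java
// Hypothetical diagnostic sketch; not part of the Berkeley DB distribution.
final class BdbConfigCheck {

    // Returns true if the named class can be found through the CLASSPATH.
    static boolean classVisible(String className) {
        try {
            // initialize=false: locate the class without running its static
            // initializers, so no native library is loaded as a side effect.
            Class.forName(className, false, BdbConfigCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // Includes NoClassDefFoundError.
            return false;
        }
    }

    // Returns true if the JVM can locate and load the named native library.
    static boolean nativeLibraryVisible(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false;
        } catch (SecurityException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.class.path   = " + System.getProperty("java.class.path"));
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
        // com.sleepycat.db.Db comes from the package layout described above.
        System.out.println("Db class visible:       " + classVisible("com.sleepycat.db.Db"));
        // "db_java" is an assumed library name; adjust it to match your installation.
        System.out.println("native library visible: " + nativeLibraryVisible("db_java"));
    }
}
```

+<p>If the class check fails, look at CLASSPATH first. If only the library check fails, look at PATH on Windows or LD_LIBRARY_PATH on UNIX.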
+<table><tr><td><br></td><td width="1%"><a href="../../ref/rpc/server.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/java/compat.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/java/faq.html b/db/docs/ref/java/faq.html new file mode 100644 index 000000000..75b9e9f3b --- /dev/null +++ b/db/docs/ref/java/faq.html @@ -0,0 +1,31 @@ +<!--$Id: faq.so,v 1.2 2001/01/09 20:55:54 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Frequently Asked Questions</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Java API</dl></h3></td> +<td width="1%"><a href="../../ref/java/program.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/perl/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Frequently Asked Questions</h1> +<p><ol> +<p><li><b>During one of the first calls to the Berkeley DB Java API, a +DbException is thrown with a "Bad file number" or "Bad file descriptor" +message.</b> +<p>There are known large-file support bugs under JNI in various releases +of the JDK. 
Please upgrade to the latest release of the JDK, and, if +that does not help, disable big file support using the --disable-bigfile +configuration option. +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/java/program.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/perl/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/java/program.html b/db/docs/ref/java/program.html new file mode 100644 index 000000000..c454a0910 --- /dev/null +++ b/db/docs/ref/java/program.html @@ -0,0 +1,72 @@ +<!--$Id: program.so,v 10.21 2001/01/09 18:57:28 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Java Programming Notes</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Programmer Notes</dl></h3></td> +<td width="1%"><a href="../../ref/java/compat.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/java/faq.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Java Programming Notes</h1> +<p>The Java API closely parallels the Berkeley DB C++ and C interfaces. If you +are currently using either of those APIs, there will be very little to +surprise you in the Java API. 
We have even taken care to make the names +of classes, constants, methods and arguments identical, where possible, +across all three APIs. +<p><ol> +<p><li>The Java runtime does not automatically close Berkeley DB objects on +finalization. There are a couple of reasons for this. The first is that +finalization is generally run only when garbage collection occurs and +there is no guarantee that this occurs at all, even on exit. Allowing +specific Berkeley DB actions to occur in ways that cannot be replicated seems +wrong. The second is that finalization of objects may happen in an arbitrary +order, so we would have to do extra bookkeeping to make sure everything +was closed in the proper order. The best advice is to always +call close() to match every open() call. Specifically, the Berkeley DB +package requires that you explicitly call close on each individual +<a href="../../api_java/db_class.html">Db</a> and <a href="../../api_java/dbc_class.html">Dbc</a> object that you opened. Your database +activity may not be synchronized to disk unless you do so. +<p><li>Some methods in the Java API have no return value, and throw a +<a href="../../api_java/except_class.html">DbException</a> when a severe error arises. There are some notable +methods that do have a return value, and can also throw an exception. +<a href="../../api_java/db_get.html">Db.get</a> and <a href="../../api_java/dbc_get.html">Dbc.get</a> both return 0 when a get succeeds, +<a href="../../ref/program/errorret.html#DB_NOTFOUND">Db.DB_NOTFOUND</a> when the key is not found, and throw an exception when +there is a severe error. This approach allows the programmer to check +for typical data-driven errors by watching return values without +special-casing exceptions. +<p>An object of type <a href="../../api_java/deadlock_class.html">DbDeadlockException</a> is thrown when a deadlock +would occur. 
+<p>An object of type <a href="../../api_java/mem_class.html">DbMemoryException</a> is thrown when the system +cannot provide enough memory to complete the operation (the ENOMEM +system error on UNIX). +<p>An object of type <a href="../../api_java/runrec_class.html">DbRunRecoveryException</a>, a subclass of +<a href="../../api_java/except_class.html">DbException</a>, is thrown when there is an error that requires a +recovery of the database, using <a href="../../utility/db_recover.html">db_recover</a>. +<p><li>There is no class corresponding to the C++ DbMpoolFile class in the Berkeley DB +Java API. There is a subset of the memp_XXX methods in the <a href="../../api_java/dbenv_class.html">DbEnv</a> +class. This has been provided to allow you to perform certain +administrative actions on underlying memory pools opened as a consequence +of <a href="../../api_java/env_open.html">DbEnv.open</a>. Direct access to other memory pool functionality +is not appropriate for the Java environment. +<p><li>Berkeley DB always turns on the <a href="../../api_java/env_open.html#DB_THREAD">Db.DB_THREAD</a> flag since threads +are expected in Java. +<p><li>If there are embedded null strings in the <b>curslist</b> argument for +<a href="../../api_java/db_join.html">Db.join</a>, they will be treated as the end of the list of +cursors, even though you may have allocated a longer array. Fill in +all the strings in your array unless you intend to cut it short. +<p><li>The callback installed for <a href="../../api_java/env_set_errcall.html">DbEnv.set_errcall</a> will run in the same +thread as the caller to <a href="../../api_java/env_set_errcall.html">DbEnv.set_errcall</a>. Make sure that thread +remains running until your application exits or <a href="../../api_java/env_close.html">DbEnv.close</a> is +called. 
+</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/java/compat.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/java/faq.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/lock/am_conv.html b/db/docs/ref/lock/am_conv.html new file mode 100644 index 000000000..7dbe3e73d --- /dev/null +++ b/db/docs/ref/lock/am_conv.html @@ -0,0 +1,129 @@ +<!--$Id: am_conv.so,v 10.16 2000/03/18 21:43:13 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Access method locking conventions</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/lock/twopl.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/cam_conv.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Access method locking conventions</h1> +<p>All the Berkeley DB access methods follow the same conventions for locking +database objects. Applications that do their own locking and also do +locking via the access methods must be careful to adhere to these +conventions. +<p>Whenever a Berkeley DB database is opened, the DB handle is +assigned a unique locker ID. 
Unless transactions are specified, +that ID is used as the locker for all calls that the Berkeley DB methods +make to the lock subsystem. In order to lock a file, pages in +the file, or records in the file, we must create a unique ID that +can be used as the object to be locked in calls to the lock manager. +Under normal operation, that object is a 28-byte value, created by +the concatenation of a unique file identifier, a page or record number, +and an object type (page or record). +<p>In a transaction-protected environment, database create and delete +operations are recoverable and single-threaded. This single-threading is +achieved using a single lock for the entire environment that must be +acquired before beginning a create or delete operation. In this case, +the object on which Berkeley DB will lock is a 32-bit unsigned integer with a +value of 0. +<p>If applications are using the lock subsystem directly while they are also +using locking via the access methods, they must take care not to +inadvertently lock objects that happen to be equal to the unique file IDs +used to lock files. This is most easily accomplished by using a lock +object of a different length than the values used by Berkeley DB. +<p>All of the access methods other than Queue use a simple +multiple-reader/single-writer page locking scheme. The standard +read/write locks (<b>DB_LOCK_READ</b> and <b>DB_LOCK_WRITE</b>) and +conflict matrix, as described in <a href="../../ref/lock/stdmode.html">Standard lock modes</a>, are used. An operation that returns data (e.g., +<a href="../../api_c/db_get.html">DB->get</a>, <a href="../../api_c/dbc_get.html">DBcursor->c_get</a>) obtains a read lock on all the pages +accessed while locating the requested record. When an update operation +is requested (e.g., <a href="../../api_c/db_put.html">DB->put</a>, <a href="../../api_c/dbc_del.html">DBcursor->c_del</a>), the page containing +the updated (or new) data is write-locked. 
As read-modify-write cycles +are quite common and are deadlock prone under normal circumstances, the +Berkeley DB interfaces allow the application to specify the <a href="../../api_c/dbc_get.html#DB_RMW">DB_RMW</a> flag, +which causes operations to immediately obtain a write lock, even though +they are only reading the data. While this may reduce concurrency +somewhat, it reduces the probability of deadlock. +<p>The Queue access method does not hold long-term page locks. +Instead, page locks are held only long enough to locate records or to change +metadata on a page, and record locks are held for the appropriate duration. +In the presence of transactions, record locks are held until transaction +commit. +For Berkeley DB operations, record locks are held until operation +completion and for DBC operations, record locks are held until +subsequent records are returned or the cursor is closed. +<p>Under non-transaction operation, the access methods do not normally hold +locks across calls to the Berkeley DB interfaces. The one exception to this +rule is when cursors are used. As cursors maintain a position in a file, +they must hold locks across calls and will, in fact, hold locks until the +cursor is closed. Furthermore, each cursor is assigned its own unique +locker ID when it is created, so cursor operations can conflict with one +another. (Each cursor is assigned its own locker ID because Berkeley DB handles +may be shared by multiple threads of control. The Berkeley DB library cannot +identify which operations are performed by which threads of control, and +it must ensure that two different threads of control are not +simultaneously modifying the same data structure. By assigning each +cursor its own locker, two threads of control sharing a handle cannot +inadvertently interfere with each other.) +<p>This has important implications. 
If a single thread of control opens two +cursors or uses a combination of cursor and non-cursor operations, these +operations are performed on behalf of different lockers. Conflicts that +arise between these different lockers may not cause actual deadlocks, but +can, in fact, permanently block the thread of control. For example, +assume that an application creates a cursor and uses it to read record A. +Now assume a second cursor is opened and the application attempts to write +record A using the second cursor. Unfortunately, the first cursor has a +read lock so the second cursor cannot obtain its write lock. However, +that read lock is held by the same thread of control, so if we block +waiting for the write lock, the read lock can never be released. This +might appear to be a deadlock from the application's perspective, but +Berkeley DB cannot identify it as such because it has no knowledge of which +lockers belong to which threads of control. For this reason, application +designers are encouraged to close cursors as soon as they are done with +them. +<p>Complicated operations that require multiple cursors (or combinations of +cursor and non-cursor operations) can be performed in two ways. First, +they may be performed within a transaction, in which case all operations +lock on behalf of the designated locker ID. Alternatively, the +<a href="../../api_c/dbc_dup.html">DBcursor->c_dup</a> function duplicates a cursor, using the same locker ID as +the originating cursor. There is no way to achieve this duplication +functionality through the DB handle calls, but any DB call can be +implemented by one or more calls through a cursor. +<p>When the access methods use transactions, many of these problems disappear. +The transaction ID is used as the locker ID for all operations performed +on behalf of the transaction. This means that the application may open +multiple cursors on behalf of the same transaction and these cursors will +all share a common locker ID. 
This is safe because transactions cannot +span threads of control, so the library knows that two cursors in the same +transaction cannot modify the database concurrently. +<p>As mentioned earlier, most of the Berkeley DB access methods use page level +locking. During Btree traversal, lock-coupling is used to traverse the +tree. Note that the tree traversal that occurs during an update operation +can also use lock-coupling; it is not necessary to retain locks on +internal Btree pages even if the item finally referenced will be updated. +Even in the presence of transactions, locks obtained on internal pages of +the Btree may be safely released as the traversal proceeds. This greatly +improves concurrency. The only time internal locks become crucial is when +internal pages are split or merged. When traversing duplicate data items +for a key, the lock on the key value also acts as a lock on all duplicates +of that key. Therefore, two conflicting threads of control cannot access +the same duplicate set simultaneously. +<p>The Recno access method uses a Btree as its underlying data +representation and follows similar locking conventions. However, as the +Recno access method must keep track of the number of children for all +internal pages, it must obtain write locks on all internal pages during +read and write operations. In the presence of transactions, these locks +are not released until transaction commit. 
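<p>The cursor self-blocking case described above can be sketched in C. This is a minimal model, not Berkeley DB code: every <b>sketch_</b> name is hypothetical, and it assumes only the multiple-reader/single-writer rule and that a locker never conflicts with itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the read and write lock modes. */
enum sketch_mode { SKETCH_READ, SKETCH_WRITE };

struct sketch_lock {
    unsigned locker;            /* locker ID holding or requesting the lock */
    enum sketch_mode mode;
};

/*
 * Multiple-reader/single-writer rule for two locks on the same page:
 * they conflict when they belong to different lockers and at least one
 * of them is a write lock. A locker never conflicts with itself.
 */
bool
sketch_conflicts(const struct sketch_lock *held, const struct sketch_lock *wanted)
{
    assert(held != NULL && wanted != NULL);
    if (held->locker == wanted->locker)
        return false;
    return held->mode == SKETCH_WRITE || wanted->mode == SKETCH_WRITE;
}
```

With two cursors opened outside a transaction (say lockers 1 and 2), a write request by locker 2 conflicts with the read lock held by locker 1, even if one thread owns both cursors. When both cursors run under one transaction they share a locker ID and the request is granted.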
+<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/twopl.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/cam_conv.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/lock/cam_conv.html b/db/docs/ref/lock/cam_conv.html new file mode 100644 index 000000000..b37914890 --- /dev/null +++ b/db/docs/ref/lock/cam_conv.html @@ -0,0 +1,53 @@ +<!--$Id: cam_conv.so,v 10.10 2000/03/18 21:43:13 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Berkeley DB Concurrent Data Store locking conventions</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/lock/am_conv.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/dead.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Berkeley DB Concurrent Data Store locking conventions</h1> +<p>The Berkeley DB Concurrent Data Store product has a different set of conventions for locking. It +provides multiple reader/single writer semantics, but not per-page locking +or transaction recoverability. As such, it does its locking entirely at +the interface to the access methods. 
+<p>The object it locks is the file, identified by its unique file number. +The locking matrix is not one of the two standard lock modes; instead, +we use a four-lock set, consisting of: +<p><dl compact> +<p><dt>DB_LOCK_NG<dd>not granted (always 0) +<dt>DB_LOCK_READ<dd>read (shared) +<dt>DB_LOCK_WRITE<dd>write (exclusive) +<dt>DB_LOCK_IWRITE<dd>intention-to-write (shared with NG and READ, but conflicts with WRITE and IWRITE) +</dl> +<p>The IWRITE lock is used for cursors that will be used for updating (IWRITE +locks are implicitly obtained for write operations through the Berkeley DB +handles, e.g., <a href="../../api_c/db_put.html">DB->put</a>, <a href="../../api_c/db_del.html">DB->del</a>). While the cursor is +reading, the IWRITE lock is held, but as soon as the cursor is about to +modify the database, the IWRITE is upgraded to a WRITE lock. This upgrade +blocks until all readers have exited the database. Because only one +IWRITE lock is allowed at any one time, no two cursors can ever try to +upgrade to a WRITE lock at the same time, and therefore deadlocks are +prevented, which is essential as Berkeley DB Concurrent Data Store does not include deadlock +detection and recovery. +<p>Applications that need to lock compatibly with Berkeley DB Concurrent Data Store must obey the +following rules: +<p><ol> +<p><li>Use only lock modes DB_LOCK_NG, DB_LOCK_READ, DB_LOCK_WRITE, +DB_LOCK_IWRITE. +<p><li>Never attempt to acquire a WRITE lock on an object that is +already locked with a READ lock. 
+</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/am_conv.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/dead.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/lock/config.html b/db/docs/ref/lock/config.html new file mode 100644 index 000000000..cc0b52481 --- /dev/null +++ b/db/docs/ref/lock/config.html @@ -0,0 +1,46 @@ +<!--$Id: config.so,v 10.15 2000/12/08 20:43:16 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Configuring locking</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/lock/dead.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/max.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Configuring locking</h1> +<p>The <a href="../../api_c/env_set_lk_detect.html">DBENV->set_lk_detect</a> function specifies that the deadlock detector +should be run whenever a lock blocks. This option provides for rapid +detection of deadlocks at the expense of potentially frequent +invocations of the deadlock detector. 
On a fast processor with a highly +contentious application, where response time is critical, this is a good +choice. An argument to the <a href="../../api_c/env_set_lk_detect.html">DBENV->set_lk_detect</a> function indicates which +transaction to abort when a deadlock is detected. It can take on any +one of the following values: +<p><dl compact> +<p><dt><a href="../../api_c/env_set_lk_detect.html#DB_LOCK_YOUNGEST">DB_LOCK_YOUNGEST</a><dd>Abort the most recently started transaction. +<dt><a href="../../api_c/env_set_lk_detect.html#DB_LOCK_OLDEST">DB_LOCK_OLDEST</a><dd>Abort the longest-lived transaction. +<dt><a href="../../api_c/env_set_lk_detect.html#DB_LOCK_RANDOM">DB_LOCK_RANDOM</a><dd>Abort whatever transaction the deadlock detector happens to find first. +<dt><a href="../../api_c/env_set_lk_detect.html#DB_LOCK_DEFAULT">DB_LOCK_DEFAULT</a><dd>Use the default policy (currently DB_LOCK_RANDOM). +</dl> +<p>In general, <a href="../../api_c/env_set_lk_detect.html#DB_LOCK_DEFAULT">DB_LOCK_DEFAULT</a> is probably the correct choice. If +an application has long-running transactions, then +<a href="../../api_c/env_set_lk_detect.html#DB_LOCK_YOUNGEST">DB_LOCK_YOUNGEST</a> will guarantee that transactions eventually +complete, but it may do so at the expense of a large number of aborts. +<p>The alternative to using the <a href="../../api_c/env_set_lk_detect.html">DBENV->set_lk_detect</a> interface is +to run the deadlock detector manually, using the Berkeley DB +<a href="../../api_c/lock_detect.html">lock_detect</a> interface. +<p>The <a href="../../api_c/env_set_lk_conflicts.html">DBENV->set_lk_conflicts</a> function allows you to specify your own locking +conflicts matrix. This is an advanced configuration option, and is rarely +necessary. 
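<p>The effect of the three concrete abort policies can be illustrated with a small C sketch. It is only a model of the choice being made, not the library's implementation: the <b>sketch_</b> names are hypothetical, and the use of start timestamps to decide which transaction is youngest or oldest is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy names mirroring the youngest/oldest/random choices. */
enum sketch_policy { SKETCH_YOUNGEST, SKETCH_OLDEST, SKETCH_RANDOM };

/*
 * start[i] is an assumed start timestamp for the i-th transaction taking
 * part in the deadlock (a larger value means it started more recently).
 * Returns the index of the transaction that would be aborted.
 */
size_t
sketch_pick_victim(enum sketch_policy policy, const unsigned long *start, size_t n)
{
    size_t i, victim = 0;

    assert(start != NULL && n > 0);
    if (policy == SKETCH_RANDOM)
        return 0;               /* whichever participant was found first */
    for (i = 1; i < n; i++) {
        if (policy == SKETCH_YOUNGEST ? start[i] > start[victim]
                                      : start[i] < start[victim])
            victim = i;
    }
    return victim;
}
```

Aborting the youngest participant protects long-running transactions from being chosen repeatedly, which is why that policy lets them complete eventually, at the cost of more aborts among newer transactions.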
+<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/dead.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/max.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/lock/dead.html b/db/docs/ref/lock/dead.html new file mode 100644 index 000000000..bb77e9822 --- /dev/null +++ b/db/docs/ref/lock/dead.html @@ -0,0 +1,93 @@ +<!--$Id: dead.so,v 10.13 2000/03/18 21:43:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Deadlocks and deadlock avoidance</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/lock/cam_conv.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/config.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Deadlocks and deadlock avoidance</h1> +<p>Practically any application that uses locking may deadlock. +In nearly all cases, in order to recover from a deadlock, transactions +must be used, so that an operation that deadlocks mid-way through can +be undone, leaving the database in a consistent state. 
+As the access methods may perform updates on multiple pages during a +single API call, transactions are necessary even when the application +makes only single update calls into the database. +The only exception to this rule is when all the threads accessing +the database are doing so read-only or when the Concurrent Data Store +product is used; this product guarantees deadlock-free operation at the +expense of reduced concurrency. +Since deadlocks cannot be prevented, Berkeley DB provides the ability to detect +deadlocks and recover from them gracefully. +<p>Deadlocks occur when two or more threads of control are blocked waiting +on each other's forward progress. Consider two transactions, each of +which wants to modify items A and B. Assume that transaction 1 modifies +first A and then B, but transaction 2 modifies B then A. Now, assume +that transaction 1 obtains its writelock on A, but before it obtains its +writelock on B, it is descheduled and transaction 2 runs. Transaction 2 +successfully acquires its writelock on B, but then blocks when it tries +to obtain its writelock on A, because transaction 1 already holds a +writelock on it. This is a deadlock. Transaction 1 cannot make forward +progress until Transaction 2 releases its lock on B, but Transaction 2 +cannot make forward progress until Transaction 1 releases its lock on A. +<p>The <a href="../../api_c/lock_detect.html">lock_detect</a> function runs an instance of the Berkeley DB deadlock +detector. The <a href="../../utility/db_deadlock.html">db_deadlock</a> utility performs deadlock detection by +calling <a href="../../api_c/lock_detect.html">lock_detect</a> at regular intervals. When a deadlock exists +in the system, all of the threads of control involved in the deadlock are, +by definition, waiting on a lock. The deadlock detector examines the +state of the lock manager and identifies a deadlock, and selects one of +the participants to abort. 
(See <a href="../../ref/lock/config.html">Configuring locking</a> for a discussion of how a participant is selected.) +The lock on which the selected participant is waiting is identified such +that the <a href="../../api_c/lock_get.html">lock_get</a> (or <a href="../../api_c/lock_vec.html">lock_vec</a>) call in which that lock +was requested will receive an error return of <a href="../../ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>. +In the access methods, this error return is propagated back through the +Berkeley DB interface as DB_LOCK_DEADLOCK. +<p>When an application receives a DB_LOCK_DEADLOCK return, the correct action is +to abort the current transaction, and optionally retry it. Transaction +support is necessary for recovery from deadlocks. When a deadlock occurs, +the database may be left in an inconsistent or corrupted state, and any +database changes already accomplished must be undone before the +application can proceed further. +<p>The deadlock detector identifies deadlocks by looking for a cycle in what +is commonly referred to as its "waits-for" graph. More precisely, the +deadlock detector reads through the lock table and finds each object +currently locked. Each object has a list of transactions or operations +(hereafter called lockers) that currently hold locks on the object and +possibly a list of waiting lockers, waiting on the lockers holding it. +Each object creates one or more partial orderings of lockers. That is, +for a particular object, every waiting locker comes after every holding +locker, because that holding locker must release its lock before the +waiting locker can make forward progress. Conceptually, after each object +has been examined, the partial orderings are topologically sorted (see +tsort). If this topological sort reveals any cycles, then the lockers +forming the cycle are involved in a deadlock. One of the lockers is +selected for abortion. 
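<p>The search for a cycle in the waits-for graph can be sketched as follows. This is an illustration under stated assumptions, not the detector's actual code: it uses a depth-first search in place of an explicit topological sort (both find the same cycles), and the fixed-size matrix and all <b>sketch_</b> names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_LOCKERS 8

/* g[a][b] is true when locker a waits on a lock held by locker b. */
typedef bool sketch_graph[SKETCH_MAX_LOCKERS][SKETCH_MAX_LOCKERS];

/* Depth-first search; state: 0 = unvisited, 1 = on current path, 2 = done. */
static bool
sketch_visit(bool g[][SKETCH_MAX_LOCKERS], int n, int node, int *state)
{
    int next;

    state[node] = 1;
    for (next = 0; next < n; next++) {
        if (!g[node][next])
            continue;
        if (state[next] == 1)
            return true;        /* back edge: a cycle, hence a deadlock */
        if (state[next] == 0 && sketch_visit(g, n, next, state))
            return true;
    }
    state[node] = 2;
    return false;
}

/* Returns true if any group of the first n lockers waits in a cycle. */
bool
sketch_has_deadlock(bool g[][SKETCH_MAX_LOCKERS], int n)
{
    int state[SKETCH_MAX_LOCKERS] = { 0 };
    int i;

    assert(n >= 0 && n <= SKETCH_MAX_LOCKERS);
    for (i = 0; i < n; i++)
        if (state[i] == 0 && sketch_visit(g, n, i, state))
            return true;
    return false;
}
```

In the two-transaction example earlier in this section, transaction 1 waits on transaction 2 and transaction 2 waits on transaction 1, so the graph holds both edges and the search reports a cycle.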
+<p>It is possible that aborting a single transaction involved in a deadlock +is not enough to allow other transactions to make forward progress. +In this case, the deadlock detector will be called repeatedly. +Unfortunately, at the time a transaction is selected for abortion, +there is not enough information available to determine if aborting +that single transaction will allow forward progress or not. Since +most applications have few deadlocks, Berkeley DB takes the conservative +approach, aborting as few transactions as may be necessary to resolve +the existing deadlocks. In particular, for each unique cycle found +in the waits-for graph described in the previous paragraph, only one +transaction is selected for abortion. However, if there are multiple +cycles, then one transaction from each cycle is selected for abortion. +Only after the aborting transactions have received the deadlock return +and aborted their transactions, can it be determined if it is necessary +to abort other transactions in order to allow forward progress. 
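<p>The abort-and-retry handling described in this section can be sketched as a small loop. The sketch deliberately avoids the Berkeley DB API: <b>SKETCH_DEADLOCK</b> stands in for the DB_LOCK_DEADLOCK return, the work function is assumed to begin its own transaction and abort it before reporting a deadlock, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes standing in for success and DB_LOCK_DEADLOCK. */
#define SKETCH_OK        0
#define SKETCH_DEADLOCK  (-1)

/* A unit of work: begins a transaction, does its updates, commits or aborts. */
typedef int (*sketch_txn_fn)(void *arg);

/*
 * Run the work; if it reports a deadlock (having aborted its transaction),
 * try again, up to max_attempts times in total.
 */
int
sketch_run_with_retry(sketch_txn_fn work, void *arg, int max_attempts)
{
    int attempt, ret = SKETCH_DEADLOCK;

    assert(work != NULL && max_attempts > 0);
    for (attempt = 0; attempt < max_attempts; attempt++) {
        ret = work(arg);
        if (ret != SKETCH_DEADLOCK)
            break;      /* success, or an error that retrying will not fix */
    }
    return ret;
}

/* Example work function: reports a deadlock until the counter reaches zero. */
int
sketch_flaky_work(void *arg)
{
    int *deadlocks_left = arg;

    if (*deadlocks_left > 0) {
        (*deadlocks_left)--;
        return SKETCH_DEADLOCK;
    }
    return SKETCH_OK;
}
```

Bounding the number of attempts is a design choice of this sketch. Because several rounds of detection may be needed before every transaction can proceed, an application might retry more than once before giving up.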
+<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/cam_conv.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/config.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/lock/intro.html b/db/docs/ref/lock/intro.html new file mode 100644 index 000000000..b5c85af05 --- /dev/null +++ b/db/docs/ref/lock/intro.html @@ -0,0 +1,89 @@ +<!--$Id: intro.so,v 10.16 2000/03/18 21:43:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Berkeley DB and locking</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/program/runtime.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/page.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Berkeley DB and locking</h1> +<p>The lock subsystem provides interprocess and intraprocess concurrency +control mechanisms. While the locking system is used extensively by the +Berkeley DB access methods and transaction system, it may also be used as a +stand-alone subsystem to provide concurrency control to any set of +designated resources. 
+<p>The lock subsystem is created, initialized, and opened by calls to +<a href="../../api_c/env_open.html">DBENV->open</a> with the <a href="../../api_c/env_open.html#DB_INIT_LOCK">DB_INIT_LOCK</a> or <a href="../../api_c/env_open.html#DB_INIT_CDB">DB_INIT_CDB</a> +flags specified. +<p>The <a href="../../api_c/lock_detect.html">lock_detect</a> function provides the programmatic interface to +the Berkeley DB deadlock detector. Whenever two threads of control issue lock +requests that are not carefully ordered or that require upgrading locks +(obtaining write locks on objects that are already read-locked), the +possibility for deadlock arises. A deadlock occurs when two or more +threads of control are blocked, waiting for actions that another one of +these blocked threads must take. For example, assume that threads one +and two have each obtained read locks on object A. Now suppose that both +threads wish to obtain write locks on object A. Neither thread can be +granted its writelock (because of the other thread's readlock). Both +threads block and will never unblock because the event for which they are +waiting can never happen. +<p>The deadlock detector examines all the locks held in the environment and +identifies situations where no thread can make forward progress. It then +selects one of the participants in the deadlock (according to the argument +that was specified to <a href="../../api_c/env_set_lk_detect.html">DBENV->set_lk_detect</a>) and forces it to return +the value DB_LOCK_DEADLOCK, which indicates that a deadlock occurred. +The thread receiving such an error should abort its current transaction, +or simply release all its locks if it is not running in a transaction, +and retry the operation. +<p>The <a href="../../api_c/lock_vec.html">lock_vec</a> interface is used to acquire and release locks. +<p>Two additional interfaces, <a href="../../api_c/lock_get.html">lock_get</a> and <a href="../../api_c/lock_put.html">lock_put</a>, are +provided. 
These interfaces are simpler front-ends to the <a href="../../api_c/lock_vec.html">lock_vec</a> +functionality, where <a href="../../api_c/lock_get.html">lock_get</a> acquires a lock, and +<a href="../../api_c/lock_put.html">lock_put</a> releases a lock that was acquired using <a href="../../api_c/lock_get.html">lock_get</a> +or <a href="../../api_c/lock_vec.html">lock_vec</a>. +<p>It is up to the application to specify lockers and objects appropriately. +When used with the Berkeley DB access methods, these lockers and objects are +handled completely internally, but an application using the lock manager +directly must either use the same conventions as the access methods or +define its own convention to which it adheres. If the application is +using the access methods with locking at the same time that it is calling +the lock manager directly, the application must follow a convention that +is compatible with the access methods' use of the locking subsystem. See +<a href="../../ref/lock/am_conv.html">Access method locking conventions</a> +for more information. +<p>The <a href="../../api_c/lock_id.html">lock_id</a> function returns a unique ID which may safely be used +as the locker parameter to the <a href="../../api_c/lock_vec.html">lock_vec</a> interface. The access +methods use <a href="../../api_c/lock_id.html">lock_id</a> to generate unique lockers for the cursors +associated with a database. +<p>The <a href="../../api_c/lock_vec.html">lock_vec</a> function performs any number of lock operations +atomically. It also provides the ability to release all locks held by a +particular locker and release all the locks on a particular object. +Performing multiple lock operations atomically is useful in performing +Btree traversals where you want to acquire a lock on a child page and once +acquired, immediately release the lock on its parent (this is +traditionally referred to as "lock-coupling"). 
Using <a href="../../api_c/lock_vec.html">lock_vec</a> +instead of separate calls to <a href="../../api_c/lock_put.html">lock_put</a> and <a href="../../api_c/lock_get.html">lock_get</a> reduces +the synchronization overhead between multiple threads or processes. +<p>The three interfaces, <a href="../../api_c/lock_get.html">lock_get</a>, <a href="../../api_c/lock_put.html">lock_put</a> and <a href="../../api_c/lock_vec.html">lock_vec</a>, +are fully compatible, and may be used interchangeably. +<p>All locks explicitly requested by an application should be released via +calls to <a href="../../api_c/lock_put.html">lock_put</a> or <a href="../../api_c/lock_vec.html">lock_vec</a>. +<p>The <a href="../../api_c/lock_stat.html">lock_stat</a> function returns information about the status of +the lock subsystem. It is the programmatic interface used by the +<a href="../../utility/db_stat.html">db_stat</a> utility. +<p>The locking subsystem is closed by the call to <a href="../../api_c/env_close.html">DBENV->close</a>. +<p>Finally, the entire locking subsystem may be discarded using the +<a href="../../api_c/env_remove.html">DBENV->remove</a> interface. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/program/runtime.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/page.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/lock/max.html b/db/docs/ref/lock/max.html new file mode 100644 index 000000000..236229090 --- /dev/null +++ b/db/docs/ref/lock/max.html @@ -0,0 +1,88 @@ +<!--$Id: max.so,v 10.2 2000/12/21 19:11:28 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Configuring locking: sizing the system</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/lock/config.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/nondb.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Configuring locking: sizing the system</h1> +<p>The lock system is sized using the following three functions: +<p><blockquote><pre><a href="../../api_c/env_set_lk_max_locks.html">DBENV->set_lk_max_locks</a> +<a href="../../api_c/env_set_lk_max_lockers.html">DBENV->set_lk_max_lockers</a> +<a href="../../api_c/env_set_lk_max_objects.html">DBENV->set_lk_max_objects</a></pre></blockquote> +<p>The <a 
href="../../api_c/env_set_lk_max_locks.html">DBENV->set_lk_max_locks</a>, <a href="../../api_c/env_set_lk_max_lockers.html">DBENV->set_lk_max_lockers</a> +and <a href="../../api_c/env_set_lk_max_objects.html">DBENV->set_lk_max_objects</a> functions specify, respectively, the +maximum number of locks, lockers and locked objects supported by the +lock subsystem. The maximum number of locks is the number of locks that +can be simultaneously requested in the system. The maximum number of +lockers is the number of lockers that can simultaneously request locks +in the system. The maximum number of lock objects is the number of +objects that can simultaneously be locked in the system. Selecting +appropriate values requires an understanding of your application and +its databases. If the values are too small, then requests for locks in +an application will fail. If the values are too large, then the locking +subsystem will consume more resources than is necessary. It is better +to err in the direction of allocating too many locks, lockers and +objects as increasing the number of locks does not require large amounts +of additional resources. +<p>The recommended algorithm for selecting the maximum number of locks, +lockers and lock objects, is to run the application under stressful +conditions and then review the lock system's statistics to determine +the maximum number of locks, lockers and lock objects that were used. +Then, double these values for safety. However, in some large +applications, finer granularity of control is necessary in order to +minimize the size of the lock subsystem. +<p>The maximum number of lockers can be estimated as follows: +<ul type=disc> +<li>If the +database environment is configured to use transactions, then the maximum +number of lockers needed is the number of simultaneously active +transactions and child transactions (where a child transaction is active +until its parent commits or aborts, not until it commits or aborts). 
+<li>If the database environment is not configured to use transactions, then +the maximum number of lockers needed is the number of simultaneous +non-cursor operations plus an additional locker for every simultaneously +open cursor. +</ul> +<p>The maximum number of lock objects needed can be estimated as follows: +<ul type=disc> +<li>For Btree and Recno access methods, you will need, at a minimum, one +lock object per level of the database tree. (Unless keys are quite +large with respect to the page size, neither Recno nor Btree database +trees should ever be deeper than five levels.) Then, you will need one +lock object for each leaf page of the database tree that will be +simultaneously accessed. +<li>For the Queue access method you will need one lock object per record +that is simultaneously accessed. To this, add one lock object per page +that will be simultaneously accessed. (Since the Queue access method +uses fixed-length records, and the database page size is known, it is +possible to calculate the number of pages and therefore, lock objects, +required.) Deleted records skipped by a <a href="../../api_c/dbc_get.html#DB_NEXT">DB_NEXT</a> or +<a href="../../api_c/dbc_get.html#DB_PREV">DB_PREV</a> operation do not require a separate lock object. +Further, if your application is using transactions, then no database +operation will ever use more than three lock objects at any time. +<li>For the Hash access method you only need a single lock object. +</ul> +<p>For all access methods, you should then add an additional lock object +per database, for the database's metadata page. +<p>The maximum number of locks required by an application cannot be easily +estimated. It is possible to calculate a maximum number of locks by +multiplying the maximum number of lockers, times the maximum number of +lock objects, times two (two for the two possible lock modes for each +object, read and write). 
However, this is a pessimal value, and real +applications are unlikely to actually need that many locks. Review of +the lock subsystem statistics is the best way to determine this value. +<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/config.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/nondb.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/lock/nondb.html b/db/docs/ref/lock/nondb.html new file mode 100644 index 000000000..4fb37d6d7 --- /dev/null +++ b/db/docs/ref/lock/nondb.html @@ -0,0 +1,50 @@ +<!--$Id: nondb.so,v 10.10 2000/12/08 20:43:16 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Locking and non-Berkeley DB applications</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/lock/max.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/log/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Locking and non-Berkeley DB applications</h1> +<p>The locking subsystem is useful outside the context of Berkeley DB. 
It can be +used to manage concurrent access to any collection of either ephemeral or +persistent objects. That is, the lock region can persist across +invocations of an application, so it can be used to provide long-term +locking (e.g., conference room scheduling). +<p>In order to use the locking subsystem in such a general way, the +applications must adhere to a convention for naming objects and lockers. +Consider the conference room scheduling problem described above. Assume +there are three conference rooms and that we wish to schedule them in +half-hour intervals. +<p>The scheduling application must then select a way to identify each +conference room/time slot combination. In this case, we could describe +the objects being locked as byte strings consisting of the conference room +name, the date on which it is needed, and the beginning of the appropriate +half-hour slot. +<p>Lockers are 32-bit numbers, so we might choose to use the User ID of the +individual running the scheduling program. To schedule half-hour slots, +all the application need do is issue a <a href="../../api_c/lock_get.html">lock_get</a> call for the +appropriate locker/object pair. To schedule a longer slot, the +application would issue a <a href="../../api_c/lock_vec.html">lock_vec</a> call with one <a href="../../api_c/lock_get.html">lock_get</a> +operation per half-hour up to the total length. If the <a href="../../api_c/lock_vec.html">lock_vec</a> +call fails, the application would have to release the parts of the time +slot that were obtained. +<p>To cancel a reservation, the application would make the appropriate +<a href="../../api_c/lock_put.html">lock_put</a> calls. To reschedule a reservation, the <a href="../../api_c/lock_get.html">lock_get</a> +and <a href="../../api_c/lock_put.html">lock_put</a> calls could all be made inside of a single +<a href="../../api_c/lock_vec.html">lock_vec</a> call. 
The output of <a href="../../api_c/lock_stat.html">lock_stat</a> could be +post-processed into a human-readable schedule of conference room use. +<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/max.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/log/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/lock/notxn.html b/db/docs/ref/lock/notxn.html new file mode 100644 index 000000000..16b00cf66 --- /dev/null +++ b/db/docs/ref/lock/notxn.html @@ -0,0 +1,46 @@ +<!--$Id: notxn.so,v 10.10 2000/03/18 21:43:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Locking without transactions</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/lock/stdmode.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/twopl.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Locking without transactions</h1> +<p>If an application runs with locking specified, but not transactions (e.g., +<a href="../../api_c/env_open.html">DBENV->open</a> is called with <a href="../../api_c/env_open.html#DB_INIT_LOCK">DB_INIT_LOCK</a> or +<a 
href="../../api_c/env_open.html#DB_INIT_CDB">DB_INIT_CDB</a> specified, but not <a href="../../api_c/env_open.html#DB_INIT_TXN">DB_INIT_TXN</a>), locks are +normally acquired during each Berkeley DB operation and released before the +operation returns to the caller. The only exception is in the case of +cursor operations. As cursors identify a particular position in a file, +a cursor must retain a read-lock across cursor calls to make sure that +that position is uniquely identifiable during the next cursor call, +because an operation using <a href="../../api_c/dbc_get.html#DB_CURRENT">DB_CURRENT</a> must reference the same +record as the previous cursor call. Such cursor locks cannot be released +until either the cursor is reset using the <a href="../../api_c/db_get.html#DB_GET_BOTH">DB_GET_BOTH</a>, +<a href="../../api_c/dbc_get.html#DB_SET">DB_SET</a>, <a href="../../api_c/dbc_get.html#DB_SET_RANGE">DB_SET_RANGE</a>, <a href="../../api_c/dbc_put.html#DB_KEYFIRST">DB_KEYFIRST</a>, or +<a href="../../api_c/dbc_put.html#DB_KEYLAST">DB_KEYLAST</a> functionality, in which case a new cursor lock is +established, or the cursor is closed. As a result, application designers +are encouraged to close cursors as soon as possible. +<p>It is important to realize that concurrent applications that use locking +must ensure that two concurrent threads do not interfere with each other. +However, as Btree and Hash access method page splits can occur at any +time, there is virtually no way to guarantee that an application which +writes the database cannot deadlock. Applications running without the +protection of transactions may deadlock, and when they do so, can leave +the database in an inconsistent state. Applications that need concurrent +access, but not transactions, are more safely implemented using the Berkeley DB Concurrent Data Store +Product. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/stdmode.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/twopl.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/lock/page.html b/db/docs/ref/lock/page.html new file mode 100644 index 000000000..a7e43b3af --- /dev/null +++ b/db/docs/ref/lock/page.html @@ -0,0 +1,62 @@ +<!--$Id: page.so,v 10.12 2000/03/18 21:43:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Page locks</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/lock/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/stdmode.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Page locks</h1> +<p>Under normal operation, the access methods use page locking. The pagesize +of a database is set when the database is created and may be specified by +calling the <a href="../../api_c/db_set_pagesize.html">DB->set_pagesize</a> function. 
If not specified, the Berkeley DB +package tries to select a pagesize that will provide the best I/O +performance by setting the page size equal to the block size of the +underlying file system. +<p>In the Btree access method, Berkeley DB uses a technique called lock coupling +to improve concurrency. The traversal of a Btree requires reading a page, +searching that page to determine which page to search next and then +repeating this process on the next page. Once a page has been searched, +it will never be accessed again for this operation, unless a page split +is required. To improve concurrency in the tree, once the next page to +read/search has been determined, that page is locked, and then atomically +(i.e., without relinquishing control of the lock manager) the original +page lock is released. +<p>As the Recno access method is built upon Btree, it too uses lock coupling +for read operations. However, as the Recno access method must maintain +a count of records on its internal pages, it cannot lock couple during +write operations. Instead, it retains write locks on all internal pages +during every update operation. For this reason, it is not possible to +have high concurrency in the Recno access method in the presence of write +operations. +<p>The Queue access method only uses short-term page locks. That is, a page +lock is released prior to requesting another page lock. Record locks are +used for transaction isolation. This provides a high degree of concurrency +for write operations. A metadata page is used to keep track of the head +and tail of the queue. This page is never locked during other locking or +I/O operations. +<p>The Hash access method does not have such traversal issues, but because +it implements dynamic hashing, it must always refer to its metadata while +computing a hash function. This metadata is stored on a special page in +the hash database. This page must therefore be read locked on every +operation. 
Fortunately, it need only be write locked when new pages are +allocated to the file, which happens in three cases: 1) a hash bucket +becomes full and needs to split, 2) a key or data item is too large to +fit on a normal page, and 3) the number of duplicate items for a fixed +key becomes sufficiently large that they are moved to an auxiliary page. +In this case, the access method must obtain a write lock on the metadata +page, thus requiring that all readers be blocked from entering the tree +until the update completes. +<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/stdmode.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/lock/stdmode.html b/db/docs/ref/lock/stdmode.html new file mode 100644 index 000000000..ca1cd6b0b --- /dev/null +++ b/db/docs/ref/lock/stdmode.html @@ -0,0 +1,61 @@ +<!--$Id: stdmode.so,v 10.20 2000/03/18 21:43:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Standard lock modes</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/lock/page.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a 
href="../../ref/lock/notxn.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Standard lock modes</h1> +<p>The Berkeley DB locking protocol is described by a conflict matrix. A conflict +matrix is an n x n array where n is the number of different lock modes +supported, and the (i, j)th entry of the array indicates whether a lock of +mode i conflicts with a lock of mode j. +<p>The Berkeley DB include files declare two commonly used conflict arrays: +<p><dl compact> +<p><dt>const u_int8_t db_rw_conflicts[ ];<dd>This is a conflict matrix for a simple scheme using shared and exclusive +lock modes. +<p><dt>const u_int8_t db_riw_conflicts[ ];<dd>This is a conflict matrix that involves various intent lock modes (e.g., +intent shared) that are used for multigranularity locking. +</dl> +<p>The number of modes associated with each matrix are DB_LOCK_RW_N and +DB_LOCK_RIW_N, respectively. +<p>In addition, the Berkeley DB include file defines the type <b>db_lockmode_t</b>, +which is the type of the lock modes used with the standard tables above: +<p><dl compact> +<p><dt>DB_LOCK_NG<dd>not granted (always 0) +<p><dt>DB_LOCK_READ<dd>read (shared) +<p><dt>DB_LOCK_WRITE<dd>write (exclusive) +</dl> +<p>As an example, consider the basic multiple-reader/single writer conflict +matrix described by <b>db_rw_conflicts</b>. In the following +example (and in the appropriate file), a 1 represents a conflict (i.e., +do not grant the lock if the indicated lock is held) and a 0 indicates +that it is OK to grant the lock. +<p>The rows indicate the lock that is held and the columns indicate the lock +that is requested. +<p><blockquote><pre> Notheld Read Write +Notheld 0 0 0 +Read* 0 0 1 +Write** 0 1 1 +</pre></blockquote> +<p><dl compact> +<p><dt>*<dd>In this case, suppose that there is a read lock held on an object. A new +request for a read lock would be granted, but a request for a write lock +would not. 
+<p><dt>**<dd>In this case, suppose that there is a write lock held on an object. A +new request for either a read or write lock would be denied. +</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/page.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/notxn.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/lock/twopl.html b/db/docs/ref/lock/twopl.html new file mode 100644 index 000000000..6cf112c09 --- /dev/null +++ b/db/docs/ref/lock/twopl.html @@ -0,0 +1,50 @@ +<!--$Id: twopl.so,v 10.7 2000/03/18 21:43:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Locking with transactions: two-phase locking</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/lock/notxn.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/am_conv.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Locking with transactions: two-phase locking</h1> +<p>Berkeley DB uses a locking protocol called two-phase locking. This is the +traditional protocol used in conjunction with lock-based transaction +systems. 
+<p>In a two-phase locking (2PL) system, transactions are broken up into two +distinct phases. During the first phase, the transaction only acquires +locks. During the second phase, the transaction only releases locks. +More formally, once a transaction releases a lock, it may not acquire any +additional locks. Practically, this translates into a system where locks +are acquired as they are needed throughout a transaction and retained +until the transaction ends, either by committing or aborting. In Berkeley DB, +locks are released during <a href="../../api_c/txn_abort.html">txn_abort</a> or <a href="../../api_c/txn_commit.html">txn_commit</a>. The +only exception to this protocol occurs when we use lock-coupling to +traverse a data structure. If the locks are held only for traversal +purposes, then the locks may be released before transaction commit or +abort. +<p>For applications, the implications of 2PL are that long-running +transactions will hold locks for a long time. When designing +applications, lock contention should be considered. In order to reduce +the probability of deadlock and achieve the best level of concurrency +possible, the following guidelines are helpful. +<p><ol> +<p><li>When accessing multiple databases, design all transactions so +that they access the files in the same order. +<p><li>If possible, access your most hotly contested resources last +(so that their locks are held for the shortest time possible). +<p><li>If possible, use nested transactions to protect the parts of +your transaction most likely to deadlock. 
+</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/notxn.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/am_conv.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/log/config.html b/db/docs/ref/log/config.html new file mode 100644 index 000000000..f3c948893 --- /dev/null +++ b/db/docs/ref/log/config.html @@ -0,0 +1,40 @@ +<!--$Id: config.so,v 10.16 2001/01/18 20:31:37 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Configuring logging</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Logging Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/log/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/log/limits.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Configuring logging</h1> +<p>The two aspects of logging that may be configured are the size of log +files on disk and the size of the log buffer in memory. The +<a href="../../api_c/env_set_lg_max.html">DBENV->set_lg_max</a> interface specifies the individual log file +size for all of the applications sharing the Berkeley DB environment. 
Setting +the log file size is largely a matter of convenience, and a reflection +of the application's preferences in backup media and frequency. +However, setting the log file size too low can potentially cause +problems as it would be possible to run out of log sequence numbers, +which requires a full archival and application restart to reset. See +the <a href="../../ref/log/limits.html">Log file limits</a> section for more +information. +<p>The <a href="../../api_c/env_set_lg_bsize.html">DBENV->set_lg_bsize</a> interface specifies the size of the +in-memory log buffer, in bytes. Log information is stored in memory +until the buffer fills up or transaction commit forces the buffer to be +written to disk. Larger buffer sizes can significantly increase +throughput in the presence of long running transactions, highly +concurrent applications, or transactions producing large amounts of +data. By default, the buffer is 32KB. +<table><tr><td><br></td><td width="1%"><a href="../../ref/log/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/log/limits.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/log/intro.html b/db/docs/ref/log/intro.html new file mode 100644 index 000000000..0c41c17ef --- /dev/null +++ b/db/docs/ref/log/intro.html @@ -0,0 +1,58 @@ +<!--$Id: intro.so,v 10.16 2001/01/18 20:31:37 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Berkeley DB and logging</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Logging Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/lock/nondb.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/log/config.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Berkeley DB and logging</h1> +<p>The logging subsystem is the logging facility used by Berkeley DB. It is +largely Berkeley DB specific, although it is potentially useful outside of +the Berkeley DB package for applications wanting write-ahead logging support. +Applications wanting to use the log for purposes other than logging file +modifications based on a set of open file descriptors will almost +certainly need to make source code modifications to the Berkeley DB code +base. +<p>A log can be shared by any number of threads of control. The +<a href="../../api_c/env_open.html">DBENV->open</a> interface is used to open a log. When the log is no +longer in use, it should be closed, using the <a href="../../api_c/env_close.html">DBENV->close</a> +interface. +<p>Individual log entries are identified by log sequence numbers. Log +sequence numbers are stored in an opaque object, a <a href="../../api_c/db_lsn.html">DB_LSN</a>. +<p>The <a href="../../api_c/log_put.html">log_put</a> interface is used to append new log records to the +log. Optionally, the <a href="../../api_c/log_put.html#DB_CHECKPOINT">DB_CHECKPOINT</a> flag can be used to output +a checkpoint log record (indicating that the log is consistent to that +point and recoverable after a system or application failure), as well +as open-file information. 
The <a href="../../api_c/log_get.html">log_get</a> interface is used to +retrieve log records from the log. +<p>There are additional interfaces for integrating the log subsystem with a +transaction processing system: +<p><dl compact> +<p><dt><a href="../../api_c/log_register.html">log_register</a> and <a href="../../api_c/log_unregister.html">log_unregister</a><dd>These interfaces associate files with identification numbers. These +identification numbers are logged so that transactional recovery +correctly associates log records with the appropriate files. +<p><dt><a href="../../api_c/log_flush.html">log_flush</a><dd>Flushes the log up to a particular log sequence number. +<p><dt><a href="../../api_c/log_compare.html">log_compare</a><dd>Allows applications to compare any two log sequence numbers. +<p><dt><a href="../../api_c/log_file.html">log_file</a> <dd>Maps a log sequence number to the specific log file that contains it. +<p><dt><a href="../../api_c/log_archive.html">log_archive</a><dd>Returns various sets of log file names. This interface is used for +database administration, e.g., to determine if log files may safely be +removed from the system. +<p><dt><a href="../../api_c/log_stat.html">log_stat</a> <dd>The <a href="../../utility/db_stat.html">db_stat</a> utility uses the <a href="../../api_c/log_stat.html">log_stat</a> interface +to display statistics about the log. +<p><dt><a href="../../api_c/env_remove.html">DBENV->remove</a><dd>The log meta-information (but not the log files themselves) may be +removed using the <a href="../../api_c/env_remove.html">DBENV->remove</a> interface. 
+</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/nondb.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/log/config.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/log/limits.html b/db/docs/ref/log/limits.html new file mode 100644 index 000000000..d34e5a813 --- /dev/null +++ b/db/docs/ref/log/limits.html @@ -0,0 +1,47 @@ +<!--$Id: limits.so,v 10.23 2001/01/18 20:31:37 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Log file limits</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Logging Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/log/config.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/mp/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Log file limits</h1> +<p>Log file names and sizes impose a limit on how long databases may be +used in a Berkeley DB database environment. It is quite unlikely that an +application will reach this limit. If the limit is reached, however, +the Berkeley DB environment's databases must be dumped and reloaded. 
+<p>The log file name consists of <b>log.</b> followed by 10 digits, with +a maximum of 2,000,000,000 log files. Consider an application performing +6000 transactions per second, for 24 hours a day, logged into 10MB log +files, where each transaction is logging approximately 500 bytes of data. +The calculation: +<p><blockquote><pre>(10 * 2^20 * 2000000000) / (6000 * 500 * 365 * 60 * 60 * 24) = ~221</pre></blockquote> +<p>indicates that the system will run out of log file names in roughly 221 +years. +<p>There is no way to reset the log file name space in Berkeley DB. If your +application is reaching the end of its log file name space, you must: +<p><ol> +<p><li>Archive your databases as if to prepare for catastrophic failure (see +<a href="../../utility/db_archive.html">db_archive</a> for more information). +<p><li>Dump and re-load all your databases (see <a href="../../utility/db_dump.html">db_dump</a> and +<a href="../../utility/db_load.html">db_load</a> for more information). +<p><li>Remove all of the log files from the database environment. Note, this +is the only situation where all of the log files are removed from an +environment, in all other cases at least a single log file is +retained. +<p><li>Restart your application. 
+</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/log/config.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/mp/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/mp/config.html b/db/docs/ref/mp/config.html new file mode 100644 index 000000000..cf311516d --- /dev/null +++ b/db/docs/ref/mp/config.html @@ -0,0 +1,55 @@ +<!--$Id: config.so,v 10.17 2000/10/03 17:17:35 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Configuring the memory pool</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Memory Pool Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/mp/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/txn/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Configuring the memory pool</h1> +<p>There are two interfaces used for configuring the memory pool. +<p>The most important tuning parameter for almost all applications, including +Berkeley DB applications, is the size of the pool. There are two ways to +specify the pool size. 
First, calling the <a href="../../api_c/env_set_cachesize.html">DBENV->set_cachesize</a> function +specifies the pool size for all of the applications sharing the Berkeley DB +environment. Second, calling the <a href="../../api_c/db_set_cachesize.html">DB->set_cachesize</a> function specifies +a pool size for that specific database only. Note, it is +meaningless to call <a href="../../api_c/db_set_cachesize.html">DB->set_cachesize</a> for a database opened inside +of a Berkeley DB environment, since the environment pool size will override any +pool size specified for a single database. For information on tuning the +Berkeley DB cache size, see <a href="../../ref/am_conf/cachesize.html">Selecting +a cache size</a>. +<p>The second memory pool configuration interface specifies the maximum size +of backing files to map into the process address space instead of copying +pages through the local cache. Only read-only database files can be +mapped into process memory. Because of the requirements of the Berkeley DB +transactional implementation, log records describing database changes must +be written to disk before the actual database changes. As mapping +read-write database files into process memory would permit the underlying +operating system to write modified database changes at will, it is not +supported. +<p>Mapping files into the process address space can result in +better-than-usual performance, as available virtual memory is normally +much larger than the local cache, and page faults are faster than page +copying on many systems. However, in the presence of limited virtual +memory it can cause resource starvation, and in the presence of large +databases, it can result in immense process sizes. +<p>To specify that no files are to be mapped into the process address space, +specify the <a href="../../api_c/env_open.html#DB_NOMMAP">DB_NOMMAP</a> flag to the <a href="../../api_c/env_set_flags.html">DBENV->set_flags</a> interface. 
+To specify that any individual file should not be mapped into the process +address space, specify the <a href="../../api_c/env_open.html#DB_NOMMAP">DB_NOMMAP</a> flag to the +<a href="../../api_c/memp_fopen.html">memp_fopen</a> interface. To limit the size of files mapped into the +process address space, use the <a href="../../api_c/env_set_mp_mmapsize.html">DBENV->set_mp_mmapsize</a> function. +<table><tr><td><br></td><td width="1%"><a href="../../ref/mp/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/txn/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/mp/intro.html b/db/docs/ref/mp/intro.html new file mode 100644 index 000000000..2b52a5775 --- /dev/null +++ b/db/docs/ref/mp/intro.html @@ -0,0 +1,59 @@ +<!--$Id: intro.so,v 10.15 2001/01/18 20:31:37 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Berkeley DB and the memory pool</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Memory Pool Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/log/limits.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/mp/config.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Berkeley DB and the memory pool</h1> 
+<p>The memory pool subsystem is the general-purpose shared memory buffer pool +used by Berkeley DB. This module is useful outside of the Berkeley DB package for +processes that require page-oriented, cached, shared file access. +<p>A memory pool is a shared memory cache shared by any number of processes +and threads within processes. The <a href="../../api_c/env_open.html">DBENV->open</a> interface opens, and +optionally creates, a memory pool. When that pool is no longer in use, +it should be closed, using the <a href="../../api_c/env_close.html">DBENV->close</a> interface. +<p>The <a href="../../api_c/memp_fopen.html">memp_fopen</a> interface opens an underlying file within the +memory pool. When that file is no longer in use, it should be closed, +using the <a href="../../api_c/memp_fclose.html">memp_fclose</a> interface. The <a href="../../api_c/memp_fget.html">memp_fget</a> interface +is used to retrieve pages from files in the pool. All retrieved pages +must be subsequently returned using the <a href="../../api_c/memp_fput.html">memp_fput</a> interface. At +the time that pages are returned, they may be marked <b>dirty</b>, which +causes them to be written to the backing disk file before being discarded +from the pool. If there is insufficient room to bring a new page in the +pool, a page is selected to be discarded from the pool. If that page is +dirty, it is first written to the backing file. The page is selected +using a somewhat modified least-recently-used algorithm. Pages in files +may also be explicitly marked clean or dirty using the <a href="../../api_c/memp_fset.html">memp_fset</a> +interface. All dirty pages in the pool from any underlying file may also +be flushed as a group using the <a href="../../api_c/memp_fsync.html">memp_fsync</a> interface. 
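+<p>The get/put cycle and the write-back of dirty pages on eviction described above can be sketched with a toy, single-process pool. This is only an illustration under stated assumptions: the names <b>toy_pool</b>, <b>toy_get</b> and <b>toy_put</b> are hypothetical and are not part of the Berkeley DB API, the backing file is an in-memory array, plain least-recently-used selection stands in for the modified algorithm the library uses, and the pinning of pages between retrieval and return is omitted.

```c
#include <string.h>

#define POOL_SLOTS 2    /* pages the toy cache can hold at once */
#define PAGE_SIZE  8    /* bytes per page (tiny, for illustration) */
#define FILE_PAGES 4    /* pages in the toy backing "file" */

/* One cache slot: which page it holds, whether it is dirty, and when it was last used. */
struct slot {
    int pgno;                       /* -1 while the slot is empty */
    int dirty;
    unsigned long last_used;
    unsigned char data[PAGE_SIZE];
};

/* Hypothetical toy pool; the backing file is simulated by an array. */
struct toy_pool {
    unsigned char backing[FILE_PAGES][PAGE_SIZE];
    struct slot slots[POOL_SLOTS];
    unsigned long clock;            /* logical time, bumped on every access */
    int writes;                     /* number of pages written back so far */
};

static void toy_pool_init(struct toy_pool *p)
{
    int i;

    memset(p, 0, sizeof(*p));
    for (i = 0; i < POOL_SLOTS; i++)
        p->slots[i].pgno = -1;
}

/*
 * Return the cached copy of page pgno, or NULL if pgno is out of range.
 * On a miss, the least-recently-used slot is reused; if it holds a dirty
 * page, that page is written to the backing array first.
 */
static unsigned char *toy_get(struct toy_pool *p, int pgno)
{
    struct slot *victim = &p->slots[0];
    int i;

    if (pgno < 0 || pgno >= FILE_PAGES)
        return NULL;
    for (i = 0; i < POOL_SLOTS; i++) {
        struct slot *s = &p->slots[i];
        if (s->pgno == pgno) {              /* hit: refresh and return */
            s->last_used = ++p->clock;
            return s->data;
        }
        if (s->last_used < victim->last_used)
            victim = s;                     /* remember the oldest slot */
    }
    if (victim->pgno >= 0 && victim->dirty) {
        memcpy(p->backing[victim->pgno], victim->data, PAGE_SIZE);
        p->writes++;
    }
    memcpy(victim->data, p->backing[pgno], PAGE_SIZE);
    victim->pgno = pgno;
    victim->dirty = 0;
    victim->last_used = ++p->clock;
    return victim->data;
}

/* Return a page to the pool, marking it dirty if the caller modified it. */
static void toy_put(struct toy_pool *p, int pgno, int dirty)
{
    int i;

    for (i = 0; i < POOL_SLOTS; i++)
        if (p->slots[i].pgno == pgno && dirty)
            p->slots[i].dirty = 1;
}
```

+<p>In this sketch a modified page reaches the backing array only when its slot is reused, which mirrors the behavior described above: marking a page dirty defers the write until the page is discarded or explicitly flushed.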
+<p>There are additional interfaces for manipulating the entire memory pool: +<ul type=disc> +<li>It is possible to gradually flush buffers from the pool in order to +maintain a consistent percentage of clean buffers in the pool using the +<a href="../../api_c/memp_trickle.html">memp_trickle</a> interface. +<li>The <a href="../../utility/db_stat.html">db_stat</a> utility uses the <a href="../../api_c/memp_stat.html">memp_stat</a> interface to +display statistics about the efficiency of the pool. +<li>As some conversion may be necessary when pages are read or written, the +<a href="../../api_c/memp_register.html">memp_register</a> function allows applications to specify automatic +input and output processing in these cases. +<li>There is one additional interface that is intended for manipulating the +memory pool, but which is specific to database systems. The +<a href="../../api_c/memp_sync.html">memp_sync</a> interface flushes dirty pages from all files held in +the pool up to a specified database log sequence number. +<li>Finally, the entire pool may be discarded using the <a href="../../api_c/env_remove.html">DBENV->remove</a> +interface.
+</ul> +<table><tr><td><br></td><td width="1%"><a href="../../ref/log/limits.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/mp/config.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/perl/intro.html b/db/docs/ref/perl/intro.html new file mode 100644 index 000000000..da5d93a6a --- /dev/null +++ b/db/docs/ref/perl/intro.html @@ -0,0 +1,42 @@ +<!--$Id: intro.so,v 10.24 2001/01/09 18:57:28 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Using Berkeley DB with Perl</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Perl</dl></h3></td> +<td width="1%"><a href="../../ref/java/faq.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/tcl/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Using Berkeley DB with Perl</h1> +<p>The original Perl module for Berkeley DB was DB_File, which was written to +interface to Berkeley DB version 1.85. The newer Perl module for Berkeley DB is +BerkeleyDB, which was written to interface to version 2.0 and subsequent +releases. Because Berkeley DB version 2.X has a compatibility API for version +1.85, you can (and should!) 
build DB_File using version 2.X of Berkeley DB, +although DB_File will still only support the 1.85 functionality. +<p>DB_File is distributed with the standard Perl source distribution (look +in the directory "ext/DB_File"). You can find both DB_File and BerkeleyDB +on CPAN, the Comprehensive Perl Archive Network of mirrored FTP sites. +The master CPAN site is +<a href="ftp://ftp.funet.fi/">ftp://ftp.funet.fi/</a>. +<p>Versions of both BerkeleyDB and DB_File that are known to work correctly +with each release of Berkeley DB are included in the distributed Berkeley DB source +tree, in the subdirectories <b>perl.BerkeleyDB</b> and +<b>perl.DB_File</b>. Each of those directories contains a +<b>README</b> file with instructions on installing and using those +modules. +<p>The Perl interface is not maintained by Sleepycat Software. Questions +about the DB_File and BerkeleyDB modules are best asked on the Usenet +newsgroup comp.lang.perl.modules. +<table><tr><td><br></td><td width="1%"><a href="../../ref/java/faq.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/tcl/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/pindex.src b/db/docs/ref/pindex.src new file mode 100644 index 000000000..0e122ceb2 --- /dev/null +++ b/db/docs/ref/pindex.src @@ -0,0 +1,212 @@ +__APIREL__/ref/am/close.html#2 @closing a database +__APIREL__/ref/am/count.html#2 @counting data items for a key +__APIREL__/ref/am/curclose.html#2 @closing a cursor +__APIREL__/ref/am/curclose.html#3 closing a @cursor +__APIREL__/ref/am/curdel.html#2 @deleting records with a cursor +__APIREL__/ref/am/curdel.html#3 deleting records with a @cursor +__APIREL__/ref/am/curdup.html#2 @duplicating a cursor +__APIREL__/ref/am/curdup.html#3 duplicating a @cursor 
+__APIREL__/ref/am/curget.html#2 @retrieving records with a cursor +__APIREL__/ref/am/curget.html#3 retrieving records with a @cursor +__APIREL__/ref/am/curput.html#2 @storing records with a cursor +__APIREL__/ref/am/curput.html#3 storing records with a @cursor +__APIREL__/ref/am/cursor.html#2 database @cursors +__APIREL__/ref/am/delete.html#2 @deleting records +__APIREL__/ref/am/error.html#2 @error handling +__APIREL__/ref/am/get.html#2 @retrieving records +__APIREL__/ref/am/join.html#2 logical @join +__APIREL__/ref/am/open.html#2 @opening a database +__APIREL__/ref/am/partial.html#2 @partial record storage and retrieval +__APIREL__/ref/am/put.html#2 @storing records +__APIREL__/ref/am/stability.html#2 @cursor stability +__APIREL__/ref/am/stability.html#3 cursor @stability +__APIREL__/ref/am/stat.html#2 database @statistics +__APIREL__/ref/am/sync.html#2 flushing the database @cache +__APIREL__/ref/am/upgrade.html#2 @upgrading databases +__APIREL__/ref/am/verify.html#2 database @verification +__APIREL__/ref/am/verify.html#3 database @salvage +__APIREL__/ref/am/verify.html#4 recovering @corrupted databases +__APIREL__/ref/am_conf/bt_compare.html#2 specifying a Btree @comparison function +__APIREL__/ref/am_conf/bt_recnum.html#2 retrieving Btree records by @number +__APIREL__/ref/am_conf/byteorder.html#2 selecting a @byte order +__APIREL__/ref/am_conf/cachesize.html#2 selecting a @cache size +__APIREL__/ref/am_conf/dup.html#2 @duplicate data items +__APIREL__/ref/am_conf/extentsize.html#2 selecting a Queue @extent size +__APIREL__/ref/am_conf/h_ffactor.html#2 page @fill factor +__APIREL__/ref/am_conf/h_hash.html#2 specifying a database @hash +__APIREL__/ref/am_conf/h_nelem.html#2 @hash table size +__APIREL__/ref/am_conf/intro.html#2 @access methods +__APIREL__/ref/am_conf/logrec.html#2 logical @record numbers +__APIREL__/ref/am_conf/pagesize.html#2 selecting a @page size +__APIREL__/ref/am_conf/re_source.html#2 @text backing files +__APIREL__/ref/am_conf/recno.html#2 
managing @record-based databases +__APIREL__/ref/am_conf/renumber.html#2 logically renumbering @records +__APIREL__/ref/am_conf/select.html#2 selecting an @access method +__APIREL__/ref/arch/apis.html#2 programmatic @APIs +__APIREL__/ref/arch/utilities.html#2 @utilities +__APIREL__/ref/build_unix/aix.html#2 @AIX +__APIREL__/ref/build_unix/conf.html#2 @configuring Berkeley DB for UNIX systems +__APIREL__/ref/build_unix/conf.html#3 configuring Berkeley DB for @UNIX systems +__APIREL__/ref/build_unix/conf.html#4 configuring without large @file support +__APIREL__/ref/build_unix/conf.html#--disable-bigfile Configuring Berkeley DB@--disable-bigfile +__APIREL__/ref/build_unix/conf.html#5 configuring Berkeley DB @1.85 API compatibility +__APIREL__/ref/build_unix/conf.html#--enable-compat185 Configuring Berkeley DB@--enable-compat185 +__APIREL__/ref/build_unix/conf.html#6 configuring the @C++ API +__APIREL__/ref/build_unix/conf.html#--enable-cxx Configuring Berkeley DB@--enable-cxx +__APIREL__/ref/build_unix/conf.html#--enable-debug Configuring Berkeley DB@--enable-debug +__APIREL__/ref/build_unix/conf.html#--enable-debug_rop Configuring Berkeley DB@--enable-debug_rop +__APIREL__/ref/build_unix/conf.html#--enable-debug_wop Configuring Berkeley DB@--enable-debug_wop +__APIREL__/ref/build_unix/conf.html#--enable-diagnostic Configuring Berkeley DB@--enable-diagnostic +__APIREL__/ref/build_unix/conf.html#7 building a utility to dump Berkeley DB @1.85 databases +__APIREL__/ref/build_unix/conf.html#--enable-dump185 Configuring Berkeley DB@--enable-dump185 +__APIREL__/ref/build_unix/conf.html#8 configuring @shared libraries +__APIREL__/ref/build_unix/conf.html#9 configuring @dynamic shared libraries +__APIREL__/ref/build_unix/conf.html#--enable-dynamic Configuring Berkeley DB@--enable-dynamic +__APIREL__/ref/build_unix/conf.html#10 configuring the @Java API +__APIREL__/ref/build_unix/conf.html#--enable-java Configuring Berkeley DB@--enable-java 
+__APIREL__/ref/build_unix/conf.html#--enable-posixmutexes Configuring Berkeley DB@--enable-posixmutexes +__APIREL__/ref/build_unix/conf.html#11 configuring a @RPC client/server +__APIREL__/ref/build_unix/conf.html#--enable-rpc Configuring Berkeley DB@--enable-rpc +__APIREL__/ref/build_unix/conf.html#--enable-shared Configuring Berkeley DB@--enable-shared +__APIREL__/ref/build_unix/conf.html#12 configuring the @Tcl API +__APIREL__/ref/build_unix/conf.html#--enable-tcl Configuring Berkeley DB@--enable-tcl +__APIREL__/ref/build_unix/conf.html#13 configuring the @test suite +__APIREL__/ref/build_unix/conf.html#--enable-test Configuring Berkeley DB@--enable-test +__APIREL__/ref/build_unix/conf.html#--enable-uimutexes Configuring Berkeley DB@--enable-uimutexes +__APIREL__/ref/build_unix/conf.html#--enable-umrw Configuring Berkeley DB@--enable-umrw +__APIREL__/ref/build_unix/conf.html#--with-tcl=DIR Configuring Berkeley DB@--with-tcl=DIR +__APIREL__/ref/build_unix/flags.html#2 changing @compile or load options +__APIREL__/ref/build_unix/flags.html#3 changing compile or @load options +__APIREL__/ref/build_unix/freebsd.html#2 @FreeBSD +__APIREL__/ref/build_unix/hpux.html#2 @HP-UX +__APIREL__/ref/build_unix/install.html#2 @installing Berkeley DB for UNIX systems +__APIREL__/ref/build_unix/intro.html#2 @building for UNIX +__APIREL__/ref/build_unix/irix.html#2 @IRIX +__APIREL__/ref/build_unix/linux.html#2 @Linux +__APIREL__/ref/build_unix/notes.html#2 @building for UNIX FAQ +__APIREL__/ref/build_unix/notes.html#3 building for @UNIX FAQ +__APIREL__/ref/build_unix/osf1.html#2 @OSF/1 +__APIREL__/ref/build_unix/qnx.html#2 @QNX +__APIREL__/ref/build_unix/sco.html#2 @SCO +__APIREL__/ref/build_unix/shlib.html#2 @shared libraries +__APIREL__/ref/build_unix/solaris.html#2 @Solaris +__APIREL__/ref/build_unix/sunos.html#2 @SunOS +__APIREL__/ref/build_unix/test.html#2 running the @test suite under UNIX +__APIREL__/ref/build_unix/ultrix.html#2 @Ultrix 
+__APIREL__/ref/build_vxworks/faq.html#2 @building for VxWorks FAQ +__APIREL__/ref/build_vxworks/faq.html#3 building for @VxWorks FAQ +__APIREL__/ref/build_vxworks/intro.html#2 @building for VxWorks +__APIREL__/ref/build_vxworks/notes.html#2 @VxWorks notes +__APIREL__/ref/build_win/faq.html#2 @building for Windows FAQ +__APIREL__/ref/build_win/faq.html#3 building for @Windows FAQ +__APIREL__/ref/build_win/intro.html#2 @building for Win32 +__APIREL__/ref/build_win/notes.html#2 @Windows notes +__APIREL__/ref/build_win/test.html#2 running the @test suite under Windows +__APIREL__/ref/build_win/test.html#3 running the test suite under @Windows +__APIREL__/ref/cam/intro.html#2 @Concurrent Data Store +__APIREL__/ref/debug/common.html#2 @debugging applications +__APIREL__/ref/distrib/layout.html#2 @source code layout +__APIREL__/ref/dumpload/text.html#2 loading @text into databases +__APIREL__/ref/dumpload/utility.html#2 dumping/loading @text to/from databases +__APIREL__/ref/env/create.html#2 database @environment +__APIREL__/ref/env/naming.html#2 file @naming +__APIREL__/ref/env/naming.html#db_home File naming@db_home +__APIREL__/ref/env/naming.html#DB_HOME File naming@DB_HOME +__APIREL__/ref/env/naming.html#DB_CONFIG File naming@DB_CONFIG +__APIREL__/ref/env/remote.html#2 remote @filesystems +__APIREL__/ref/env/security.html#2 @security +__APIREL__/ref/intro/products.html#2 Sleepycat Software's Berkeley DB @products +__APIREL__/ref/install/file.html#2 @/etc/magic +__APIREL__/ref/install/file.html#3 @file utility +__APIREL__/ref/java/compat.html#2 @Java compatibility +__APIREL__/ref/java/conf.html#2 @Java configuration +__APIREL__/ref/java/faq.html#2 Java @FAQ +__APIREL__/ref/java/faq.html#3 @Java FAQ +__APIREL__/ref/lock/am_conv.html#2 @locking conventions +__APIREL__/ref/lock/cam_conv.html#2 Berkeley DB Concurrent Data Store @locking conventions +__APIREL__/ref/lock/config.html#2 @locking configuration +__APIREL__/ref/lock/dead.html#2 @deadlocks 
+__APIREL__/ref/lock/intro.html#2 @locking introduction +__APIREL__/ref/lock/max.html#2 sizing the @locking subsystem +__APIREL__/ref/lock/nondb.html#2 @locking and non-Berkeley DB applications +__APIREL__/ref/lock/notxn.html#2 @locking without transactions +__APIREL__/ref/lock/page.html#2 page-level @locking +__APIREL__/ref/lock/stdmode.html#2 standard @lock modes +__APIREL__/ref/lock/twopl.html#2 two-phase @locking +__APIREL__/ref/log/config.html#2 @logging configuration +__APIREL__/ref/log/intro.html#2 @logging introduction +__APIREL__/ref/log/limits.html#2 @log file limits +__APIREL__/ref/mp/config.html#2 @memory pool configuration +__APIREL__/ref/perl/intro.html#2 @Perl +__APIREL__/ref/program/appsignals.html#2 application @signal handling +__APIREL__/ref/program/byteorder.html#2 @byte ordering +__APIREL__/ref/program/byteorder.html#3 byte @endian +__APIREL__/ref/program/compatible.html#2 @interface compatibility +__APIREL__/ref/program/dbsizes.html#2 database @limits +__APIREL__/ref/program/diskspace.html#2 @disk space requirements +__APIREL__/ref/program/environ.html#2 @environment variables +__APIREL__/ref/program/errorret.html#2 @error returns +__APIREL__/ref/program/errorret.html#3 @error name space +__APIREL__/ref/program/errorret.html#DB_NOTFOUND Error returns to applications@DB_NOTFOUND +__APIREL__/ref/program/errorret.html#DB_KEYEMPTY Error returns to applications@DB_KEYEMPTY +__APIREL__/ref/program/errorret.html#DB_LOCK_DEADLOCK Error returns to applications@DB_LOCK_DEADLOCK +__APIREL__/ref/program/errorret.html#DB_LOCK_NOTGRANTED Error returns to applications@DB_LOCK_NOTGRANTED +__APIREL__/ref/program/errorret.html#DB_RUNRECOVERY Error returns to applications@DB_RUNRECOVERY +__APIREL__/ref/program/mt.html#2 building @threaded applications +__APIREL__/ref/program/namespace.html#2 Berkeley DB library @name spaces +__APIREL__/ref/program/scope.html#2 Berkeley DB handle @scope +__APIREL__/ref/program/scope.html#3 Berkeley DB @free-threaded handles 
+__APIREL__/ref/rpc/client.html#2 @RPC client +__APIREL__/ref/rpc/server.html#2 @RPC server +__APIREL__/ref/sendmail/intro.html#2 @Sendmail +__APIREL__/ref/tcl/intro.html#2 loading Berkeley DB with @Tcl +__APIREL__/ref/tcl/faq.html#2 Tcl @FAQ +__APIREL__/ref/tcl/faq.html#3 @Tcl FAQ +__APIREL__/ref/tcl/program.html#2 @Tcl API programming notes +__APIREL__/ref/tcl/using.html#2 using Berkeley DB with @Tcl +__APIREL__/ref/test/run.html#2 running the @test suite +__APIREL__/ref/transapp/admin.html#2 administering @transaction protected applications +__APIREL__/ref/transapp/archival.html#2 archival in @transaction protected applications +__APIREL__/ref/transapp/archival.html#3 @catastrophic recovery +__APIREL__/ref/transapp/checkpoint.html#2 checkpoints in @transaction protected applications +__APIREL__/ref/transapp/deadlock.html#2 deadlock detection in @transaction protected applications +__APIREL__/ref/transapp/filesys.html#2 recovery and @filesystem operations +__APIREL__/ref/transapp/intro.html#2 @Transactional Data Store +__APIREL__/ref/transapp/logfile.html#2 @log file removal +__APIREL__/ref/transapp/reclimit.html#2 Berkeley DB @recoverability +__APIREL__/ref/transapp/recovery.html#2 recovery in @transaction protected applications +__APIREL__/ref/transapp/throughput.html#2 @transaction throughput +__APIREL__/ref/txn/config.html#2 @transaction configuration +__APIREL__/ref/txn/intro.html#2 Berkeley DB and @transactions +__APIREL__/ref/txn/limits.html#2 @transaction limits +__APIREL__/ref/txn/nested.html#2 nested @transactions +__APIREL__/ref/upgrade.2.0/intro.html#2 Upgrading to release @2.0 +__APIREL__/ref/upgrade.3.0/intro.html#2 Upgrading to release @3.0 +__APIREL__/ref/upgrade.3.1/intro.html#2 Upgrading to release @3.1 +__APIREL__/ref/upgrade.3.2/intro.html#2 Upgrading to release @3.2 +__APIREL__/ref/xa/config.html#2 configuring Berkeley DB with the @Tuxedo System +__APIREL__/ref/xa/intro.html#2 @XA Resource Manager +__APIREL__/utility/berkeley_db_svc.html#2 
@berkeley_db_svc +__APIREL__/utility/berkeley_db_svc.html#3 utility to support @RPC client/server +__APIREL__/utility/db_archive.html#2 @db_archive +__APIREL__/utility/db_archive.html#3 utility to @archive log files +__APIREL__/utility/db_checkpoint.html#2 @db_checkpoint +__APIREL__/utility/db_checkpoint.html#3 utility to take @checkpoints +__APIREL__/utility/db_deadlock.html#2 @db_deadlock +__APIREL__/utility/db_deadlock.html#3 utility to detect @deadlocks +__APIREL__/utility/db_dump.html#2 @db_dump +__APIREL__/utility/db_dump.html#3 utility to @dump databases as text files +__APIREL__/utility/db_load.html#2 @db_load +__APIREL__/utility/db_load.html#3 utility to @load text files into databases +__APIREL__/utility/db_printlog.html#2 @db_printlog +__APIREL__/utility/db_printlog.html#3 utility to display @log files as text +__APIREL__/utility/db_recover.html#2 @db_recover +__APIREL__/utility/db_recover.html#3 utility to @recover database environments +__APIREL__/utility/db_stat.html#2 @db_stat +__APIREL__/utility/db_stat.html#3 utility to display database and environment @statistics +__APIREL__/utility/db_upgrade.html#2 @db_upgrade +__APIREL__/utility/db_upgrade.html#3 utility to upgrade @database files +__APIREL__/utility/db_upgrade.html#4 utility to @upgrade database files +__APIREL__/utility/db_verify.html#2 @db_verify +__APIREL__/utility/db_verify.html#3 utility to verify @database files +__APIREL__/utility/db_verify.html#4 utility to @verify database files diff --git a/db/docs/ref/program/appsignals.html b/db/docs/ref/program/appsignals.html new file mode 100644 index 000000000..2b1d99bd6 --- /dev/null +++ b/db/docs/ref/program/appsignals.html @@ -0,0 +1,35 @@ +<!--$Id: appsignals.so,v 10.25 2000/07/15 15:49:07 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Application signal handling</title> +<meta name="description" content="Berkeley DB: An 
embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Programmer Notes</dl></h3></td> +<td width="1%"><a href="../../ref/xa/faq.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/errorret.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Application signal handling</h1> +<p>When applications using Berkeley DB receive signals, it is important that they +exit gracefully, discarding any Berkeley DB locks that they may hold. This is +normally done by setting a flag when a signal arrives, and then checking +for that flag periodically within the application. As Berkeley DB is not +reentrant, the signal handler should not attempt to release locks and/or +close the database handles itself. Reentering Berkeley DB is not guaranteed to +work correctly and the results are undefined. +<p>If an application exits holding a lock, the situation is no different +than if the application crashed, and all applications participating in +the database environment must be shut down, and then recovery must be +performed. If this is not done, databases may be left in an +inconsistent state, or locks the application held may cause unresolvable +deadlocks inside the environment, causing applications to hang.
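+<p>The flag-based approach described above can be sketched in portable C as follows. This is a minimal illustration, not code from the Berkeley DB distribution: the names <b>install_shutdown_handlers</b> and <b>shutdown_pending</b> are hypothetical, the choice of SIGINT and SIGTERM is an assumption, and ISO C <b>signal</b> is used for brevity where many POSIX applications would prefer <b>sigaction</b>.

```c
#include <signal.h>

/* Set by the handler, polled by the application's main loop. */
static volatile sig_atomic_t shutdown_flag = 0;

/*
 * The handler only records that a signal arrived. It makes no
 * Berkeley DB calls, because the library is not reentrant.
 */
static void on_signal(int signo)
{
    (void)signo;
    shutdown_flag = 1;
}

/* Install the handler for SIGINT and SIGTERM; returns 0 on success, -1 on failure. */
static int install_shutdown_handlers(void)
{
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (signal(SIGTERM, on_signal) == SIG_ERR)
        return -1;
    return 0;
}

/* Nonzero once a shutdown signal has been received. */
static int shutdown_pending(void)
{
    return shutdown_flag != 0;
}
```

+<p>An application's main loop would call <b>shutdown_pending</b> between operations and, once it returns nonzero, release its locks and close its database handles through the normal interfaces, outside the signal handler, before exiting.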
+<table><tr><td><br></td><td width="1%"><a href="../../ref/xa/faq.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/errorret.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/program/byteorder.html b/db/docs/ref/program/byteorder.html new file mode 100644 index 000000000..6569ba88b --- /dev/null +++ b/db/docs/ref/program/byteorder.html @@ -0,0 +1,31 @@ +<!--$Id: byteorder.so,v 10.20 2000/03/18 21:43:15 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Byte ordering</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Programmer Notes</dl></h3></td> +<td width="1%"><a href="../../ref/program/dbsizes.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/diskspace.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Byte ordering</h1> +<p>The database files created by Berkeley DB can be created in either little or +big-endian formats. By default, the native format of the machine on which +the database is created will be used. 
Any format database can be used on +a machine with a different native format, although it is possible that +the application will incur a performance penalty for the run-time +conversion. +<p>No user-specified data is converted in any way at all. Key or data items +stored on machines of one format will be returned to the application +exactly as stored on machines of another format. +<table><tr><td><br></td><td width="1%"><a href="../../ref/program/dbsizes.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/diskspace.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/program/compatible.html b/db/docs/ref/program/compatible.html new file mode 100644 index 000000000..72db97a5c --- /dev/null +++ b/db/docs/ref/program/compatible.html @@ -0,0 +1,32 @@ +<!--$Id: compatible.so,v 10.29 2000/07/25 16:31:19 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Compatibility with historic interfaces</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Programmer Notes</dl></h3></td> +<td width="1%"><a href="../../ref/program/diskspace.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/recimp.html"><img src="../../images/next.gif" alt="Next"></a> 
+</td></tr></table> +<p> +<h1 align=center>Compatibility with historic interfaces</h1> +<p>The Berkeley DB version 2 library provides backward compatible interfaces for +the historic UNIX <a href="../../api_c/dbm.html">dbm</a>, <a href="../../api_c/dbm.html">ndbm</a> and <a href="../../api_c/hsearch.html">hsearch</a> +interfaces. It also provides a backward compatible interface for the +historic Berkeley DB 1.85 release. +<p>Berkeley DB version 2 does not provide database compatibility for any of the +above interfaces, and existing databases must be converted manually. To +convert existing databases from the Berkeley DB 1.85 format to the Berkeley DB version +2 format, review the <a href="../../utility/db_dump.html">db_dump185</a> and <a href="../../utility/db_load.html">db_load</a> information. +No utilities are provided to convert UNIX <a href="../../api_c/dbm.html">dbm</a>, <a href="../../api_c/dbm.html">ndbm</a> or +<a href="../../api_c/hsearch.html">hsearch</a> databases. +<table><tr><td><br></td><td width="1%"><a href="../../ref/program/diskspace.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/recimp.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/program/copy.html b/db/docs/ref/program/copy.html new file mode 100644 index 000000000..80b6f942a --- /dev/null +++ b/db/docs/ref/program/copy.html @@ -0,0 +1,63 @@ +<!--$Id: copy.so,v 10.4 2000/03/18 21:43:15 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Copying databases</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Programmer Notes</dl></h3></td> +<td width="1%"><a href="../../ref/program/namespace.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/version.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Copying databases</h1> +<p>Because file identification cookies (e.g., file names, device and inode +numbers, volume and file IDs, etc.) are not necessarily unique or +maintained across system reboots, each Berkeley DB database file contains a +20-byte file identification bytestring that is stored in the first page +of the database at a page byte offset of 36 bytes. When multiple +processes or threads open the same database file in Berkeley DB, it is this +bytestring that is used to ensure that the same underlying pages are +updated in the shared memory buffer pool no matter which Berkeley DB handle is +used for the operation. +<p>It is usually a bad idea to physically copy a database to a new name. In +the few cases where copying is the best solution for your application, +you must guarantee there are never two different databases with the same +file identification bytestring in the memory pool at the same time. +Copying databases is further complicated by the fact that the shared +memory buffer pool does not discard all cached copies of pages for a +database when the database is logically closed, that is, when +<a href="../../api_c/db_close.html">DB->close</a> is called. Nor is there a Berkeley DB interface to explicitly +discard pages from the shared memory buffer pool for any particular +database. 
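+<p>As an illustration of the layout described above, the following sketch reads the 20-byte identification bytestring at byte offset 36 of the first page and compares it between two files, which is one way to check whether a physical copy still shares its identity with the original. The helper names <b>read_fileid</b> and <b>same_fileid</b> are hypothetical and use only the C standard library; the offset and length are taken from the text above and should be treated as specific to this release.

```c
#include <stdio.h>
#include <string.h>

#define FILEID_OFFSET 36    /* byte offset within the first page, per the text above */
#define FILEID_LEN    20    /* length of the identification bytestring */

/* Read the identification bytestring of path into id; returns 0 on success, -1 on error. */
static int read_fileid(const char *path, unsigned char id[FILEID_LEN])
{
    FILE *fp = fopen(path, "rb");
    int ok;

    if (fp == NULL)
        return -1;
    ok = fseek(fp, FILEID_OFFSET, SEEK_SET) == 0 &&
        fread(id, 1, FILEID_LEN, fp) == FILEID_LEN;
    fclose(fp);
    return ok ? 0 : -1;
}

/* Returns 1 if both files carry the same bytestring, 0 if they differ, -1 on error. */
static int same_fileid(const char *a, const char *b)
{
    unsigned char ida[FILEID_LEN], idb[FILEID_LEN];

    if (read_fileid(a, ida) != 0 || read_fileid(b, idb) != 0)
        return -1;
    return memcmp(ida, idb, FILEID_LEN) == 0;
}
```

+<p>A result of 1 for an original and its copy indicates that the two files must not be present in the same memory pool at the same time.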
+<p>Before copying a database, you must ensure that all modified pages have +been written from the memory pool cache to the backing database file. +This is done using the <a href="../../api_c/db_sync.html">DB->sync</a> or <a href="../../api_c/db_close.html">DB->close</a> interfaces. +<p>Before using a copy of a database from Berkeley DB, you must ensure that all +pages from any database with the same bytestring have been removed from +the memory pool cache. If the environment in which you intend to open +the copy of the database potentially has pages from files with identical +bytestrings to the copied database (which is likely to be the case), there +are a few possible solutions: +<p><ol> +<p><li>Remove the environment, either explicitly or by calling <a href="../../api_c/env_remove.html">DBENV->remove</a>. +Note, this will not allow you to access both the original and copy of the +database at the same time. +<p><li>Overwrite the bytestring in the copied database with a new bytestring. +This allows you to access both the original and copy of the database at +the same time. +<p><li>Create a new file that will have a new bytestring. The simplest +way to create a new file that will have a new bytestring is to call the +<a href="../../utility/db_dump.html">db_dump</a> utility to dump out the contents of the database, and then +use the <a href="../../utility/db_load.html">db_load</a> utility to load the dumped output into a new file +name. This allows you to access both the original and copy of the +database at the same time. 
+</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/program/namespace.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/version.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/program/dbsizes.html b/db/docs/ref/program/dbsizes.html new file mode 100644 index 000000000..69b45868d --- /dev/null +++ b/db/docs/ref/program/dbsizes.html @@ -0,0 +1,45 @@ +<!--$Id: dbsizes.so,v 10.22 2000/03/18 21:43:16 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Database limits</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Programmer Notes</dl></h3></td> +<td width="1%"><a href="../../ref/program/version.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/byteorder.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Database limits</h1> +<p>The largest database file that Berkeley DB can handle depends on the page size +selected by the application. Berkeley DB stores database file page numbers as +unsigned 32-bit numbers and database file page sizes as unsigned 16-bit +numbers. 
Using the maximum database page size of 65536 bytes, this results in +a maximum database file size of 2<sup>48</sup> bytes (256 terabytes). The +minimum database page size is 512 bytes, which lowers the maximum +database file size to 2<sup>41</sup> bytes (2 terabytes). +<p>The largest database file Berkeley DB can support is potentially further limited +if the host system does not have filesystem support for files larger than +2<sup>32</sup> bytes, including the ability to seek to absolute offsets within +those files. +<p>The largest key or data item that Berkeley DB can support is largely limited +by available memory. Specifically, while key and data byte strings may +be of essentially unlimited length, any one of them must fit into +available memory so that it can be returned to the application. As some +of the Berkeley DB interfaces return both key and data items to the application, +those interfaces will require that any key/data pair fit simultaneously +into memory. Further, as the access methods may need to compare key and +data items with other key and data items, it may be a requirement that +any two key or two data items fit into available memory. Finally, when +writing applications supporting transactions, it may be necessary to have +an additional copy of any data item in memory for logging purposes. +<p>The maximum Btree depth is 255. 
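+<p>The file-size limits above follow from multiplying the number of addressable pages by the page size. A minimal sketch of that arithmetic, assuming only what the text states (32-bit page numbers, page sizes from 512 to 65536 bytes); the function name <b>max_db_file_bytes</b> is hypothetical and is not part of the library:

```c
#include <assert.h>
#include <stdint.h>

/* Page numbers are unsigned 32-bit values: at most 2^32 pages per file. */
#define MAX_PAGES ((uint64_t)1 << 32)

/*
 * max_db_file_bytes --
 *	Largest database file, in bytes, for a given page size.
 *	Hypothetical helper illustrating the arithmetic in the text.
 */
uint64_t
max_db_file_bytes(uint32_t page_size)
{
	/* Page-size bounds as stated in the text. */
	assert(page_size >= 512 && page_size <= 65536);
	return (MAX_PAGES * page_size);
}
```

+<p>With a 65536-byte page this yields 2<sup>48</sup> bytes, and with a 512-byte page 2<sup>41</sup> bytes, matching the figures above. The host filesystem may impose a lower limit.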
+<table><tr><td><br></td><td width="1%"><a href="../../ref/program/version.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/byteorder.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/program/diskspace.html b/db/docs/ref/program/diskspace.html new file mode 100644 index 000000000..fb8425d8a --- /dev/null +++ b/db/docs/ref/program/diskspace.html @@ -0,0 +1,145 @@ +<!--$Id: diskspace.so,v 10.9 2000/03/22 21:56:11 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Disk space requirements</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Programmer Notes</dl></h3></td> +<td width="1%"><a href="../../ref/program/byteorder.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/compatible.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Disk space requirements</h1> +<p>It is possible to estimate the total database size based on the size of +the data. Simply put, the following calculations attempt to figure out +how many bytes you will need to hold a set of data and then how many pages +it will take to actually store it on disk. 
+<p>Space freed by deleting key/data pairs from a Btree or Hash database is +never returned to the filesystem, although it is reused where possible. +This means that the Btree and Hash databases are grow-only. If enough +keys are deleted from a database that shrinking the underlying file is +desirable, you should create a new database and insert the records from +the old one into it. +<p>The calculations below are rough estimates at best. For example, they do not take into +account overflow records, filesystem metadata information, or real-life +situations where the sizes of key and data items are wildly variable, and +the page-fill factor changes over time. +<h3>Btree</h3> +<p>The formulas for the Btree access method are as follows: +<p><blockquote><pre>useful-bytes-per-page = (page-size - page-overhead) * page-fill-factor +<p> +bytes-of-data = n-records * + (bytes-per-entry + page-overhead-for-two-entries) +<p> +n-pages-of-data = bytes-of-data / useful-bytes-per-page +<p> +total-bytes-on-disk = n-pages-of-data * page-size +</pre></blockquote> +<p>The <b>useful-bytes-per-page</b> is a measure of the bytes on each page +that will actually hold the application data. It is computed as the total +number of bytes on the page that are available to hold application data, +corrected by the percentage of the page that is likely to contain data. +The reason for this correction is that the percentage of a page that +contains application data can vary from close to 50% after a page split, +to almost 100% if the entries in the database were inserted in sorted +order. Obviously, the <b>page-fill-factor</b> can drastically alter +the amount of disk space required to hold any particular data set. The +page-fill factor of any existing database can be displayed using the +<a href="../../utility/db_stat.html">db_stat</a> utility. 
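+<p>The Btree formulas above can be sketched in C as follows. This is an illustration only: the structure and function names are hypothetical, the fill factor is passed as an integer percentage to avoid floating-point rounding, and divisions truncate, which is how the figures in this section are rounded.

```c
#include <assert.h>
#include <stdint.h>

/* Results of the estimate; field names mirror the terms in the formulas. */
struct btree_estimate {
	uint64_t useful_bytes_per_page;
	uint64_t bytes_of_data;
	uint64_t n_pages_of_data;
	uint64_t total_bytes_on_disk;	/* n-pages-of-data * page-size */
};

/*
 * estimate_btree --
 *	Apply the Btree size formulas. All sizes are in bytes;
 *	fill_percent is the page-fill factor as a percentage (1-100).
 *	Hypothetical helper, not a Berkeley DB interface.
 */
struct btree_estimate
estimate_btree(uint64_t page_size, uint64_t page_overhead,
    unsigned fill_percent, uint64_t n_records,
    uint64_t bytes_per_entry, uint64_t overhead_for_two_entries)
{
	struct btree_estimate e;

	assert(page_size > page_overhead);
	assert(fill_percent > 0 && fill_percent <= 100);

	e.useful_bytes_per_page =
	    (page_size - page_overhead) * fill_percent / 100;
	assert(e.useful_bytes_per_page > 0);

	e.bytes_of_data =
	    n_records * (bytes_per_entry + overhead_for_two_entries);
	e.n_pages_of_data = e.bytes_of_data / e.useful_bytes_per_page;
	e.total_bytes_on_disk = e.n_pages_of_data * page_size;
	return (e);
}
```

+<p>Like the formulas themselves, the result is a rough estimate: it ignores overflow records and changes in the page-fill factor over time.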
+<p>As an example, using an 8K page size, with a page overhead of 26 bytes and an 85% page-fill factor, there +are 6941 bytes of useful space on each page: +<p><blockquote><pre>6941 = (8192 - 26) * .85</pre></blockquote> +<p>The total <b>bytes-of-data</b> is an easy calculation: it is the number +of key/data pairs multiplied by the size of each pair plus the overhead +required to store each pair on a page. +The overhead to store a single item on a Btree page is 5 bytes. So, +assuming 60,000,000 key/data pairs, in which each key and each data item is 8 bytes long, there +are 1560000000 bytes, or roughly 1.45GB, of total data: +<p><blockquote><pre>1560000000 = 60000000 * ((8 * 2) + (5 * 2))</pre></blockquote> +<p>The total pages of data, <b>n-pages-of-data</b>, is the +<b>bytes-of-data</b> divided by the <b>useful-bytes-per-page</b>. In +the example, there are 224751 pages of data. +<p><blockquote><pre>224751 = 1560000000 / 6941</pre></blockquote> +<p>The total bytes of disk space for the database is <b>n-pages-of-data</b> +multiplied by the <b>page-size</b>. In the example, the result is +1841160192 bytes, or roughly 1.71GB. +<p><blockquote><pre>1841160192 = 224751 * 8192</pre></blockquote> +<h3>Hash</h3> +<p>The formulas for the Hash access method are as follows: +<p><blockquote><pre>useful-bytes-per-page = (page-size - page-overhead) +<p> +bytes-of-data = n-records * + (bytes-per-entry + page-overhead-for-two-entries) +<p> +n-pages-of-data = bytes-of-data / useful-bytes-per-page +<p> +total-bytes-on-disk = n-pages-of-data * page-size +</pre></blockquote> +<p>The <b>useful-bytes-per-page</b> is a measure of the bytes on each page +that will actually hold the application data. It is computed as the total +number of bytes on the page that are available to hold application data. +If the application has explicitly set a page fill factor, then pages will +not necessarily be kept full. For databases with a preset fill factor, +see the calculation below. The page-overhead for Hash databases is 26 +bytes and the page-overhead-for-two-entries is 6 bytes. 
+<p>As an example, using an 8K page size, there are 8166 bytes of useful space +on each page: +<p><blockquote><pre>8166 = (8192 - 26)</pre></blockquote> +<p>The total <b>bytes-of-data</b> is an easy calculation: it is the number +of key/data pairs multiplied by the size of each pair plus the overhead +required to store each pair on a page. +In this case the overhead is 6 bytes per pair. So, assuming 60,000,000 key/data +pairs, in which each key and each data item is 8 bytes long, there are 1320000000 bytes, or +roughly 1.23GB, of total data: +<p><blockquote><pre>1320000000 = 60000000 * (16 + 6)</pre></blockquote> +<p>The total pages of data, <b>n-pages-of-data</b>, is the +<b>bytes-of-data</b> divided by the <b>useful-bytes-per-page</b>. In +this example, there are 161646 pages of data. +<p><blockquote><pre>161646 = 1320000000 / 8166</pre></blockquote> +<p>The total bytes of disk space for the database is <b>n-pages-of-data</b> +multiplied by the <b>page-size</b>. In the example, the result is +1324204032 bytes, or roughly 1.23GB. +<p><blockquote><pre>1324204032 = 161646 * 8192</pre></blockquote> +<p>Now, let's assume that the application specified a fill factor explicitly. +The fill factor indicates the target number of items to place on a single +page (a fill factor might reduce the utilization of each page, but it can +be useful in avoiding splits and preventing buckets from becoming too +large). Using our estimates above, each item is 22 bytes (16 + 6) and +there are 8166 useful bytes on a page (8192 - 26). That means that, on +average, you can fit 371 pairs per page. +<p><blockquote><pre>371 = 8166 / 22</pre></blockquote> +<p>However, let's assume that the application designer knows that while most +items are 8 bytes, they can sometimes be as large as 10 bytes, and that it is very +important to avoid overflowing buckets and splitting. Then, the +application might specify a fill factor of 314. 
+<p><blockquote><pre>314 = 8166 / 26</pre></blockquote> +<p>With a fill factor of 314, the formula for computing database size +is: +<p><blockquote><pre>npages = npairs / pairs-per-page</pre></blockquote> +<p>or 191082. +<p><blockquote><pre>191082 = 60000000 / 314</pre></blockquote> +<p>At 191082 pages, the total database size would be 1565343744 bytes, or roughly 1.46GB. +<p><blockquote><pre>1565343744 = 191082 * 8192 </pre></blockquote> +<p>There are a few additional caveats with respect to Hash databases. First, this +discussion assumes that the hash function does a good job of evenly +distributing keys among hash buckets. If the function does not do this, +you may find your table growing significantly larger than you expected. +Second, in order to provide support for Hash databases co-existing with +other databases in a single file, pages within a Hash database are +allocated in power-of-2 chunks. That means that a Hash database with 65 +buckets will take up as much space as a Hash database with 128 buckets; +each time the Hash database grows beyond its current power-of-two number +of buckets, it allocates space for the next power-of-two buckets. This +space may be sparsely allocated in the file system, but the files will +appear to be their full size. Finally, because of this need for +contiguous allocation, overflow pages and duplicate pages can be allocated +only at specific points in the file, and this too can lead to sparse hash +tables. 
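+<p>The fill-factor arithmetic and the power-of-two bucket allocation described above can be sketched as follows. The function names are hypothetical and the code only reproduces the arithmetic in the text, with truncating division as in the examples; it does not model how the library actually lays out a Hash database.

```c
#include <assert.h>
#include <stdint.h>

/*
 * hash_fill_factor --
 *	Pairs that fit on a page: useful bytes divided by the bytes
 *	needed per pair (key + data + per-pair overhead).
 */
uint64_t
hash_fill_factor(uint64_t useful_bytes_per_page, uint64_t bytes_per_pair)
{
	assert(bytes_per_pair > 0);
	return (useful_bytes_per_page / bytes_per_pair);
}

/*
 * hash_pages --
 *	npages = npairs / pairs-per-page, for an explicit fill factor.
 */
uint64_t
hash_pages(uint64_t n_pairs, uint64_t fill_factor)
{
	assert(fill_factor > 0);
	return (n_pairs / fill_factor);
}

/*
 * next_power_of_two --
 *	Smallest power of two that is >= n: the number of buckets for
 *	which space is allocated once the database holds n buckets.
 */
uint64_t
next_power_of_two(uint64_t n)
{
	uint64_t p;

	assert(n > 0 && n <= ((uint64_t)1 << 63));
	for (p = 1; p < n; p <<= 1)
		;
	return (p);
}
```

+<p>For instance, <b>next_power_of_two(65)</b> is 128, which is why a 65-bucket database occupies as much space as a 128-bucket one.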
+<table><tr><td><br></td><td width="1%"><a href="../../ref/program/byteorder.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/compatible.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/program/environ.html b/db/docs/ref/program/environ.html new file mode 100644 index 000000000..7f56109b5 --- /dev/null +++ b/db/docs/ref/program/environ.html @@ -0,0 +1,33 @@ +<!--$Id: environ.so,v 10.17 2000/03/18 21:43:16 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Environment variables</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Programmer Notes</dl></h3></td> +<td width="1%"><a href="../../ref/program/errorret.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/mt.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Environment variables</h1> +<p>The Berkeley DB library uses the following environment variables: +<p><dl compact> +<p><dt>DB_HOME<dd>If the environment variable DB_HOME is set, it is used as part of +<a href="../../ref/env/naming.html">File Naming</a>. 
+Note, for the DB_HOME variable to take effect, either the +<a href="../../api_c/env_open.html#DB_USE_ENVIRON">DB_USE_ENVIRON</a> or <a href="../../api_c/env_open.html#DB_USE_ENVIRON_ROOT">DB_USE_ENVIRON_ROOT</a> flags must be +specified to <a href="../../api_c/env_open.html">DBENV->open</a>. +<p><dt>TMPDIR, TEMP, TMP, TempFolder<dd>The TMPDIR, TEMP, TMP and TempFolder environment variables are all +checked as locations in which to create temporary files. See +<a href="../../api_c/env_set_tmp_dir.html">DBENV->set_tmp_dir</a> for more information. +</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/program/errorret.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/mt.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/program/errorret.html b/db/docs/ref/program/errorret.html new file mode 100644 index 000000000..fc6ad650d --- /dev/null +++ b/db/docs/ref/program/errorret.html @@ -0,0 +1,108 @@ +<!--$Id: errorret.so,v 10.34 2000/12/31 19:26:21 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Error returns to applications</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Programmer Notes</dl></h3></td> +<td width="1%"><a href="../../ref/program/appsignals.html"><img src="../../images/prev.gif" alt="Prev"></a><a 
href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/environ.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Error returns to applications</h1> +<p>Except for the historic <a href="../../api_c/dbm.html">dbm</a>, <a href="../../api_c/dbm.html">ndbm</a> and <a href="../../api_c/hsearch.html">hsearch</a> +interfaces, Berkeley DB does not use the global variable <b>errno</b> to +return error values. The return values for all Berkeley DB functions are +grouped into three categories: +<p><dl compact> +<p><dt>0<dd>A return value of 0 indicates that the operation was successful. +<p><dt>> 0<dd>A return value that is greater than 0 indicates that there was a system +error. The <b>errno</b> value returned by the system is returned by +the function; for example, when a Berkeley DB function is unable to allocate memory, +the return value from the function will be ENOMEM. +<p><dt>< 0<dd>A return value that is less than 0 indicates a condition that was not +a system failure, but was not an unqualified success, either. For +example, a routine to retrieve a key/data pair from the database may +return DB_NOTFOUND when the key/data pair does not appear in +the database, as opposed to the value of 0, which would be returned if +the key/data pair were found in the database. +<p> <a name="3"><!--meow--></a> +All error values defined by Berkeley DB itself are less than 0 in order to avoid +conflict with possible values of <b>errno</b>. Specifically, Berkeley DB +reserves all values from -30,800 to -30,999 to itself as possible error +values. There are a few Berkeley DB interfaces where it is possible for an +application function to be called by a Berkeley DB function and subsequently +fail with an application-specific return. Such failure returns will be +passed back to the function that originally called a Berkeley DB interface. 
+To avoid ambiguity as to the cause of the error, error values separate +from the Berkeley DB error name space should be used. +</dl> +While possible error returns are specified by each individual function's +manual page, there are a few error returns that deserve special mention: +<h3><a name="DB_NOTFOUND">DB_NOTFOUND</a> and <a name="DB_KEYEMPTY">DB_KEYEMPTY</a></h3> +<p>There are two special return values that are similar in meaning, and that +are returned in similar situations, and therefore might be confused: +DB_NOTFOUND and DB_KEYEMPTY. +<p>The DB_NOTFOUND error return indicates that the requested key/data +pair did not exist in the database or that start- or end-of-file has been +reached. +<p>The DB_KEYEMPTY error return indicates that the requested key/data +pair logically exists but was never explicitly created by the application +(the Recno and Queue access methods will automatically create key/data +pairs under some circumstances; see <a href="../../api_c/db_open.html">DB->open</a> for more +information), or that the requested key/data pair was deleted and never +re-created. In addition, the Queue access method will return +DB_KEYEMPTY for records which were created as part of a +transaction which was later aborted, and never re-created. +<h3><a name="DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a></h3> +<p>When multiple threads of control are modifying the database, there is +normally the potential for deadlock. In Berkeley DB, deadlock is signified by +an error return from the Berkeley DB function of the value +DB_LOCK_DEADLOCK. Whenever a Berkeley DB function returns +DB_LOCK_DEADLOCK, the enclosing transaction should be aborted. +<p>Any Berkeley DB function that attempts to acquire locks can potentially return +DB_LOCK_DEADLOCK. Practically speaking, the safest way to deal +with applications that can deadlock is to handle a +DB_LOCK_DEADLOCK return from any Berkeley DB access method call. 
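+<p>One common way to act on the advice above is to wrap the transaction body in a loop that aborts and retries when deadlock is reported. The following is only a sketch of that control flow under stated assumptions: it does not call Berkeley DB at all, the callback types and <b>run_with_deadlock_retry</b> are hypothetical, and <b>SKETCH_LOCK_DEADLOCK</b> is a stand-in value, not the library's definition of DB_LOCK_DEADLOCK. Real code would include the Berkeley DB header, use the real constant, and abort the real transaction.

```c
#include <assert.h>
#include <stddef.h>

/*
 * STAND-IN value for illustration only: an arbitrary number inside the
 * reserved -30,800 to -30,999 range, NOT the library's DB_LOCK_DEADLOCK.
 */
#define SKETCH_LOCK_DEADLOCK (-30900)

/* Hypothetical callbacks: run the transaction body; abort the transaction. */
typedef int (*txn_body_fn)(void *ctx);
typedef void (*txn_abort_fn)(void *ctx);

/*
 * run_with_deadlock_retry --
 *	Call body(); if it reports deadlock, abort the transaction and
 *	try again, up to max_tries attempts. Returns the last value
 *	returned by body(); the caller decides how to handle other errors.
 */
int
run_with_deadlock_retry(txn_body_fn body, txn_abort_fn abort_txn,
    void *ctx, int max_tries)
{
	int ret, tries;

	assert(body != NULL && abort_txn != NULL && max_tries > 0);
	ret = SKETCH_LOCK_DEADLOCK;
	for (tries = 0; tries < max_tries; tries++) {
		ret = body(ctx);
		if (ret != SKETCH_LOCK_DEADLOCK)
			break;		/* Success, or a different error. */
		abort_txn(ctx);		/* Deadlock: abort before retrying. */
	}
	return (ret);
}

/* Stubs standing in for real transaction code in the usage example. */
struct demo {
	int fail;	/* Number of times the body should report deadlock. */
	int aborts;	/* Number of times the abort callback ran. */
};

static int
demo_body(void *p)
{
	struct demo *d = p;

	return (d->fail-- > 0 ? SKETCH_LOCK_DEADLOCK : 0);
}

static void
demo_abort(void *p)
{
	((struct demo *)p)->aborts++;
}
```

+<p>With a <b>struct demo</b> whose <b>fail</b> field is 2, calling <b>run_with_deadlock_retry(demo_body, demo_abort, &d, 5)</b> returns 0 after two aborts. Bounding the number of attempts is a design choice of this sketch; the text above only requires that the enclosing transaction be aborted.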
+<h3><a name="DB_LOCK_NOTGRANTED">DB_LOCK_NOTGRANTED</a></h3> +<p>When multiple threads of control are modifying the database, there is +normally the potential for deadlock. In order to avoid deadlock, +applications may specify, on a per-transaction basis, that if a lock is +unavailable, the Berkeley DB operation should return immediately instead of +waiting on the lock. The error return in this case will be +DB_LOCK_NOTGRANTED. Whenever a Berkeley DB function returns +DB_LOCK_NOTGRANTED, the enclosing transaction should be aborted. +<h3><a name="DB_RUNRECOVERY">DB_RUNRECOVERY</a></h3> +<p>There exists a class of errors that Berkeley DB considers fatal to an entire +Berkeley DB environment. An example of this type of error is a corrupted +database, or a log write failure because the disk is out of free space. +The only way to recover from these failures is to have all threads of +control exit the Berkeley DB environment, run recovery of the environment, and +re-enter Berkeley DB. (It is not strictly necessary that the processes exit, +although that is the only way to recover system resources, such as file +descriptors and memory, allocated by Berkeley DB.) +<p>When this type of error is encountered, the error value +DB_RUNRECOVERY is returned. This error can be returned by any +Berkeley DB interface. Once DB_RUNRECOVERY is returned by any +interface, it will be returned from all subsequent Berkeley DB calls made by +any threads or processes participating in the environment. +<p>Optionally, applications may also specify a fatal-error callback function +using the <a href="../../api_c/env_set_paniccall.html">DBENV->set_paniccall</a> function. This callback function will be +called with two arguments: a reference to the DB_ENV structure associated +with the environment, and the <b>errno</b> value associated with the +underlying error that caused the problem. 
+<p>Applications can handle such fatal errors in one of two ways: by checking +for DB_RUNRECOVERY as part of their normal Berkeley DB error return +checking, similarly to DB_LOCK_DEADLOCK or any other error, or, +in applications that have no cleanup processing of their own, by simply +exiting the application when the callback function is called. +<table><tr><td><br></td><td width="1%"><a href="../../ref/program/appsignals.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/environ.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/program/extending.html b/db/docs/ref/program/extending.html new file mode 100644 index 000000000..6f276d8dc --- /dev/null +++ b/db/docs/ref/program/extending.html @@ -0,0 +1,242 @@ +<!--$Id: extending.so,v 10.32 2000/07/25 16:31:19 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Application-specific logging and recovery</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Programmer Notes</dl></h3></td> +<td width="1%"><a href="../../ref/program/recimp.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/runtime.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Application-specific logging 
and recovery</h1> +<p>Berkeley DB includes tools to assist in the development of application-specific +logging and recovery. Specifically, given a description of the +information to be logged, these tools will automatically create logging +functions (functions that take the values as parameters and construct a +single record that is written to the log), read functions (functions that +read a log record and unmarshall the values into a structure that maps +onto the values you chose to log), a print function (for debugging), +templates for the recovery functions, and automatic dispatching to your +recovery functions. +<h3>Defining Application-Specific Operations</h3> +<p>Log records are described in files named XXX.src, where "XXX" is a +unique prefix. The prefixes currently used in the Berkeley DB package are +btree, crdel, db, hash, log, qam, and txn. These files contain interface +definition language descriptions for each type of log record that +is supported. +<p>All lines beginning with a hash character in <b>.src</b> files are +treated as comments. +<p>The first non-comment line in the file should begin with the keyword +PREFIX followed by a string that will be prepended to every function. +Frequently, the PREFIX is either identical or similar to the name of the +<b>.src</b> file. +<p>The rest of the file consists of one or more log record descriptions. +Each log record description begins with the line: +<p><blockquote><pre>BEGIN RECORD_NAME RECORD_NUMBER</pre></blockquote> +<p>and ends with the line: +<p><blockquote><pre>END</pre></blockquote> +<p>The RECORD_NAME keyword should be replaced with a unique record name for +this log record. Record names must only be unique within <b>.src</b> +files. +<p>The RECORD_NUMBER keyword should be replaced with a record number. Record +numbers must be unique for an entire application, that is, both +application-specific and Berkeley DB log records must have unique values. 
+Further, as record numbers are stored in log files, which often must be +portable across application releases, no record number should ever be +re-used. The record number space below 10,000 is reserved for Berkeley DB +itself; applications should choose record number values equal to or +greater than 10,000. +<p>Between the BEGIN and END statements, there should be one line for each +data item that will be logged in this log record. The format of these +lines is as follows: +<p><blockquote><pre>ARG | DBT | POINTER variable_name variable_type printf_format</pre></blockquote> +<p>The keyword ARG indicates that the argument is a simple parameter of the +type specified. The keyword DBT indicates that the argument is a DBT +containing a length and pointer. The keyword POINTER indicates that the +argument is a pointer to the data type specified and that the entire type +should be logged. +<p>The variable name is the field name within the structure that will be used +to reference this item. The variable type is the C type of the variable, +and the printf format should be "s", for string, "d" for signed integral +type, or "u" for unsigned integral type. +<h3>Automatically Generated Functions</h3> +<p>For each log record description found in the file, the following structure +declarations and #defines will be created in the file PREFIX_auto.h. +<p><blockquote><pre><p> +#define DB_PREFIX_RECORD_TYPE /* Integer ID number */ +<p> +typedef struct _PREFIX_RECORD_TYPE_args { + /* + * These three fields are generated for every record. + */ + u_int32_t type; /* Record type used for dispatch. */ +<p> + /* + * Transaction id that identifies the transaction on whose + * behalf the record is being logged. + */ + DB_TXN *txnid; +<p> + /* + * The LSN returned by the previous call to log for + * this transaction. + */ + DB_LSN *prev_lsn; +<p> + /* + * The rest of the structure contains one field for each of + * the entries in the record statement. 
+ */ +};</pre></blockquote> +<p>The DB_PREFIX_RECORD_TYPE will be described in terms of a value +DB_PREFIX_BEGIN, which should be specified by the application writer in +terms of the library provided DB_user_BEGIN macro (this is the value of +the first identifier available to users outside the access method system). +<p>In addition to the PREFIX_auto.h file, a file named PREFIX_auto.c is +created, containing the following functions for each record type: +<p><dl compact> +<p><dt>The log function, with the following parameters:<dd><p><dl compact> +<p><dt>dbenv<dd>The environment handle returned by <a href="../../api_c/env_create.html">db_env_create</a>. +<p><dt>txnid<dd>The transaction identifier returned by <a href="../../api_c/txn_begin.html">txn_begin</a>. +<p><dt>lsnp<dd>A pointer to storage for an LSN into which the LSN of the new log record +will be returned. +<p><dt>syncflag<dd>A flag indicating if the record must be written synchronously. Valid +values are 0 and <a href="../../api_c/log_put.html#DB_FLUSH">DB_FLUSH</a>. +</dl> +<p>The log function marshalls the parameters into a buffer and calls +<a href="../../api_c/log_put.html">log_put</a> on that buffer returning 0 on success and 1 on failure. +<p><dt>The read function with the following parameters:<dd> +<p><dl compact> +<p><dt>recbuf<dd>A buffer. +<p><dt>argp<dd>A pointer to a structure of the appropriate type. +</dl> +<p>The read function takes a buffer and unmarshalls its contents into a +structure of the appropriate type. It returns 0 on success and non-zero +on error. After the fields of the structure have been used, the pointer +returned from the read function should be freed. +<p><dt>The recovery function with the following parameters:<dd><p><dl compact> +<p><dt>dbenv<dd>The handle returned from the <a href="../../api_c/env_create.html">db_env_create</a> call which identifies +the environment in which recovery is running. +<p><dt>rec<dd>The <b>rec</b> parameter is the record being recovered. 
+<p><dt>lsn<dd>The log sequence number of the record being recovered. +<p><dt>op<dd>A parameter of type db_recops which indicates what operation is being run +(DB_TXN_OPENFILES, DB_TXN_ABORT, DB_TXN_BACKWARD_ROLL, DB_TXN_FORWARD_ROLL). +<p><dt>info<dd>A structure passed by the dispatch function. It is used to contain a list +of committed transactions and information about files that may have been +deleted. +</dl> +<p>The recovery function is called on each record read from the log during +system recovery or transaction abort. +<p>The recovery function is created in the file PREFIX_rtemp.c since it +contains templates for recovery functions. The actual recovery functions +must be written manually, but the templates usually provide a good starting +point. +<p><dt>The print function:<dd>The print function takes the same parameters as the recover function so +that it is simple to dispatch both to simple print functions as well as +to the actual recovery functions. This is useful for debugging purposes +and is used by the <a href="../../utility/db_printlog.html">db_printlog</a> utility to produce a human-readable +version of the log. All parameters except the <b>rec</b> and +<b>lsnp</b> parameters are ignored. The <b>rec</b> parameter contains +the record to be printed. +</dl> +One additional function, an initialization function, +is created for each <b>.src</b> file. +<p><dl compact> +<p><dt>The initialization function has the following parameters:<dd><p><dl compact> +<p><dt>dbenv<dd>The environment handle returned by <a href="../../api_c/env_create.html">db_env_create</a>. +</dl> +<p>The recovery initialization function registers each log record type +declared with the recovery system, so that the appropriate function is +called during recovery. 
+</dl> +<h3>Using Automatically Generated Routines</h3> +<p>Applications use the automatically generated functions as follows: +<p><ol> +<p><li>When the application starts, +call the <a href="../../api_c/env_set_rec_init.html">DBENV->set_recovery_init</a> with your recovery +initialization function so that the initialization function is called +at the appropriate time. +<p><li>Issue a <a href="../../api_c/txn_begin.html">txn_begin</a> call before any operations you wish +to be transaction protected. +<p><li>Before accessing any data, issue the appropriate lock call to +lock the data (either for reading or writing). +<p><li>Before modifying any data that is transaction protected, issue +a call to the appropriate log function. +<p><li>Issue a <a href="../../api_c/txn_commit.html">txn_commit</a> to save all of the changes or a +<a href="../../api_c/txn_abort.html">txn_abort</a> to cancel all of the modifications. +</ol> +<p>The recovery functions (described below) can be called in two cases: +<p><ol> +<p><li>From the recovery daemon upon system failure, with op set to +DB_TXN_FORWARD_ROLL or DB_TXN_BACKWARD_ROLL. +<p><li>From <a href="../../api_c/txn_abort.html">txn_abort</a>, if it is called to abort a transaction, with +op set to DB_TXN_ABORT. +</ol> +<p>For each log record type you declare, you must write the appropriate +function to undo and redo the modifications. The shell of these functions +will be generated for you automatically, but you must fill in the details. +<p>Your code should be able to detect whether the described modifications +have been applied to the data or not. The function will be called with +the "op" parameter set to DB_TXN_ABORT when a transaction that wrote the +log record aborts and with DB_TXN_FORWARD_ROLL and DB_TXN_BACKWARD_ROLL +during recovery. The actions for DB_TXN_ABORT and DB_TXN_BACKWARD_ROLL +should generally be the same. 
For example, in the access methods, each +page contains the log sequence number of the most recent log record that +describes a modification to the page. When the access method changes a +page, it writes a log record describing the change and including the +LSN that was on the page before the change. This LSN is referred to as +the previous LSN. The recovery functions read the page described by a +log record and compare the log sequence number (LSN) on the page to the +LSN they were passed. If the page LSN is less than the passed LSN and +the operation is undo, no action is necessary (because the modifications +have not been written to the page). If the page LSN is the same as the +previous LSN and the operation is redo, then the actions described are +reapplied to the page. If the page LSN is equal to the passed LSN and +the operation is undo, the actions are removed from the page; if the page +LSN is greater than the passed LSN and the operation is redo, no further +action is necessary. If the action is a redo and the LSN on the page is +less than the previous LSN in the log record, this is an error, since this +could only happen if some previous log record was not processed. +<p>Please refer to the internal recovery functions in the Berkeley DB library +(found in files named XXX_rec.c) for examples of how recovery functions +should work. +<h3>Non-conformant Logging</h3> +<p>If your application cannot conform to the default logging and recovery +structure, then you will have to create your own logging and recovery +functions explicitly. +<p>First, you must decide how you will dispatch your records. Encapsulate +this algorithm in a dispatch function that is passed to <a href="../../api_c/env_open.html">DBENV->open</a>. +The arguments for the dispatch function are as follows: +<p><dl compact> +<p><dt>dbenv<dd>The environment handle returned by <a href="../../api_c/env_create.html">db_env_create</a>. +<p><dt>rec<dd>The record being recovered. 
+<p><dt>lsn<dd>The log sequence number of the record to be recovered. +<p><dt>op<dd>Indicates what operation of recovery is needed (openfiles, abort, forward roll +or backward roll). +<p><dt>info<dd>An opaque value passed to your function during system recovery. +</dl> +<p>When you abort a transaction, <a href="../../api_c/txn_abort.html">txn_abort</a> will read the last log +record written for the aborting transaction and will then call your +dispatch function. It will continue looping, calling the dispatch +function on the record whose LSN appears in the lsn parameter of the +dispatch call (until a NULL LSN is placed in that field). The dispatch +function will be called with the op set to DB_TXN_ABORT. +<p>Your dispatch function can do any processing necessary. See the code +in db/db_dispatch.c for an example dispatch function (that is based on +the assumption that the transaction ID, previous LSN, and record type +appear in every log record written). +<p>If you do not use the default recovery system, you will need to construct +your own recovery process based on the recovery program provided in +db_recover/db_recover.c. Note that your recovery functions will need to +correctly process the log records produced by calls to <a href="../../api_c/txn_begin.html">txn_begin</a> +and <a href="../../api_c/txn_commit.html">txn_commit</a>. 
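+<p>The abort loop described above can be sketched as follows. This is only a minimal illustration in C, not the library's implementation: log sequence numbers are modeled as array indexes, and the names <b>sketch_rec</b>, <b>sketch_dispatch</b>, <b>sketch_abort</b> and <b>SKETCH_NULL_LSN</b> are hypothetical and do not appear in Berkeley DB.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical end-of-chain marker standing in for a NULL LSN. */
#define SKETCH_NULL_LSN ((size_t)-1)

/* Hypothetical log record: an LSN here is just an index into an array. */
struct sketch_rec {
	size_t prev_lsn;	/* LSN of this transaction's previous record */
	int slot;		/* which data item the record modified */
	int before;		/* value of the item before the modification */
};

/*
 * Hypothetical dispatch function: undo one record, then store the LSN of
 * the next record to visit into *lsnp, as the text above describes.
 */
static int
sketch_dispatch(const struct sketch_rec *rec, size_t *lsnp, int *data)
{
	data[rec->slot] = rec->before;
	*lsnp = rec->prev_lsn;
	return (0);
}

/*
 * Walk one transaction's records from newest to oldest, calling the
 * dispatch function on each until a NULL LSN is stored in the lsn field.
 * Returns the number of records undone, or -1 if dispatch fails.
 */
static int
sketch_abort(const struct sketch_rec *recs, size_t nrec, size_t last_lsn,
    int *data)
{
	size_t lsn = last_lsn;
	int undone = 0;

	while (lsn != SKETCH_NULL_LSN) {
		assert(lsn < nrec);	/* the chain must stay inside the log */
		if (sketch_dispatch(&recs[lsn], &lsn, data) != 0)
			return (-1);
		undone++;
	}
	return (undone);
}
```

+<p>In this sketch, calling sketch_abort with the LSN of a transaction's last record undoes that transaction's records newest-first and leaves records written by other transactions untouched, because only the previous-LSN chain is followed.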
+<table><tr><td><br></td><td width="1%"><a href="../../ref/program/recimp.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/runtime.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/program/mt.html b/db/docs/ref/program/mt.html new file mode 100644 index 000000000..31110920a --- /dev/null +++ b/db/docs/ref/program/mt.html @@ -0,0 +1,95 @@ +<!--$Id: mt.so,v 10.37 2000/12/04 18:05:42 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Building multi-threaded applications</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Programmer Notes</dl></h3></td> +<td width="1%"><a href="../../ref/program/environ.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/scope.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Building multi-threaded applications</h1> +<p>The Berkeley DB library is not itself multi-threaded. The library was +deliberately architected to not use threads internally because of the +portability problems that using threads within the library would +introduce. 
+<p>Berkeley DB supports multi-threaded applications with the caveat that it loads +and calls functions that are commonly available in C language environments. +Other than this usage, Berkeley DB has no static data and maintains no local +context between calls to Berkeley DB functions. +<p>Environment and database object handles returned from Berkeley DB library +functions are free-threaded. No other object handles returned from +the Berkeley DB library are free-threaded. +<p>The following rules should be observed when using threads to +access the Berkeley DB library: +<p><ol> +<p><li>The <a href="../../api_c/env_open.html#DB_THREAD">DB_THREAD</a> flag must be specified to the <a href="../../api_c/env_open.html">DBENV->open</a> +and <a href="../../api_c/db_open.html">DB->open</a> functions if the Berkeley DB handles returned by those interfaces +will be used in the context of more than one thread. Setting the +<a href="../../api_c/env_open.html#DB_THREAD">DB_THREAD</a> flag inconsistently may result in database corruption. +<p>Threading is assumed in the Java API, so no special flags are required, +and Berkeley DB functions will always behave as if the <a href="../../api_c/env_open.html#DB_THREAD">DB_THREAD</a> flag +was specified. +<p>Only a single thread may call the <a href="../../api_c/env_close.html">DBENV->close</a> or <a href="../../api_c/db_close.html">DB->close</a> functions +for a returned environment or database handle. +<p>No other Berkeley DB handles are free-threaded, for example, cursors and +transactions may not span threads as their returned handles are not +free-threaded. +<p><li>When using the non-cursor Berkeley DB calls to retrieve key/data items (e.g., +<a href="../../api_c/db_get.html">DB->get</a>), the memory referenced by the pointer stored into the +Dbt is only valid until the next call to Berkeley DB using the DB handle +returned by <a href="../../api_c/db_open.html">DB->open</a>. 
This includes any use of the returned +DB handle, including by another thread of control within the +process. +<p>For this reason, if the <a href="../../api_c/env_open.html#DB_THREAD">DB_THREAD</a> flag was specified to the +<a href="../../api_c/db_open.html">DB->open</a> function, either <a href="../../api_c/dbt.html#DB_DBT_MALLOC">DB_DBT_MALLOC</a>, <a href="../../api_c/dbt.html#DB_DBT_REALLOC">DB_DBT_REALLOC</a> +or <a href="../../api_c/dbt.html#DB_DBT_USERMEM">DB_DBT_USERMEM</a> must be specified in the <a href="../../api_c/dbt.html">DBT</a> when +performing any non-cursor key or data retrieval. +<p><li>The <a href="../../api_c/dbc_get.html#DB_CURRENT">DB_CURRENT</a>, <a href="../../api_c/dbc_get.html#DB_NEXT">DB_NEXT</a> and <a href="../../api_c/dbc_get.html#DB_PREV">DB_PREV</a> flags to the +<a href="../../api_c/log_get.html">log_get</a> function may not be used by a free-threaded handle. If +such calls are necessary, a thread should explicitly create a unique +environment handle by separately calling <a href="../../api_c/env_open.html">DBENV->open</a> without +specifying <a href="../../api_c/env_open.html#DB_THREAD">DB_THREAD</a>. +<p><li>Each database operation (i.e., any call to a function underlying the +handles returned by <a href="../../api_c/db_open.html">DB->open</a> and <a href="../../api_c/db_cursor.html">DB->cursor</a>) is normally +performed on behalf of a unique locker. If, within a single thread of +control, multiple calls on behalf of the same locker are desired, then +transactions must be used. For example, consider the case where a +cursor scan locates a record, and then, based on that record, accesses +some other item in the database. If these operations are done using +the default lockers for the handle, they may conflict. 
If the +application wishes to guarantee that the operations do not conflict, +locks must be obtained on behalf of a transaction, instead of the +default locker ID, and a transaction must be specified to subsequent +<a href="../../api_c/db_cursor.html">DB->cursor</a> and other Berkeley DB calls. +<p><li>Transactions may not span threads. Each transaction must begin and end +in the same thread, and each transaction may only be used by a single +thread. +<p>Cursors may not span transactions or threads. Each cursor must be +allocated and de-allocated within the same transaction and within +the same thread. +<p><li>User-level synchronization mutexes must have been implemented for the +compiler/architecture combination. Attempting to specify the DB_THREAD +flag will fail if fast mutexes are not available. +<p>If blocking mutexes are available, for example POSIX pthreads, they will +be used. Otherwise, the Berkeley DB library will make a system call to pause +for some amount of time when it is necessary to wait on a lock. This may +not be optimal, especially in a thread-only environment where it will be +more efficient to explicitly yield the processor to another thread. +<p>It is possible to specify a yield function on a per-application basis. +See <a href="../../api_c/set_func_yield.html">db_env_set_func_yield</a> for more information. +<p>It is possible to specify the number of attempts that will be made to +acquire the mutex before waiting. See <a href="../../api_c/env_set_tas_spins.html">db_env_set_tas_spins</a> for +more information. 
+</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/program/environ.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/scope.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/program/namespace.html b/db/docs/ref/program/namespace.html new file mode 100644 index 000000000..519f5f61c --- /dev/null +++ b/db/docs/ref/program/namespace.html @@ -0,0 +1,44 @@ +<!--$Id: namespace.so,v 10.14 2000/08/01 21:51:23 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Name spaces</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Programmer Notes</dl></h3></td> +<td width="1%"><a href="../../ref/program/scope.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/copy.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Name spaces</h1> +<p>The Berkeley DB library is careful to avoid C language programmer name spaces, +but there are a few potential areas for concern, mostly in the Berkeley DB +include file db.h. The db.h include file defines a number of types and +strings. Where possible, all of these types and strings are prefixed with +"DB_" or "db_". There are a few notable exceptions. 
+<p>The Berkeley DB library uses a macro named "__P" to configure for systems that +do not provide ANSI C function prototypes. This could potentially collide +with other systems using a "__P" macro for similar or different purposes. +<p>The Berkeley DB library needs information about specifically sized types for +each architecture. If they are not provided by the system, they are +typedef'd in the db.h include file. The types that may be typedef'd +by db.h include the following: u_int8_t, int16_t, u_int16_t, int32_t, +u_int32_t, u_char, u_short, u_int and u_long. +<p>The Berkeley DB library declares a number of external routines. All of these +routines are prefixed with the strings "db_", "lock_", "log_", "memp_" +or "txn_". All internal routines are prefixed with the strings "__db_", +"__lock_", "__log_", "__memp_" or "__txn_". +<p>Berkeley DB environments create or use some number of files in environment home +directories. These files are named <a href="../../ref/env/naming.html#DB_CONFIG">DB_CONFIG</a>, "log.NNNNNNNNNN" +(e.g., log.0000000003), or begin with the string prefix "__db" (e.g., __db.001). +Database files that match these names should not be created in the +environment directory. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/program/scope.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/copy.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/program/recimp.html b/db/docs/ref/program/recimp.html new file mode 100644 index 000000000..240eccd8b --- /dev/null +++ b/db/docs/ref/program/recimp.html @@ -0,0 +1,49 @@ +<!--$Id: recimp.so,v 11.2 2000/03/18 21:43:18 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Recovery implementation</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/filesys.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/reclimit.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Recovery implementation</h1> +<p>The physical recovery process works as follows: +<p>First, find the last checkpoint that completed. Since the system may +have crashed while writing a checkpoint, this implies finding the +second-to-last checkpoint in the log files. Read forward from this +checkpoint, opening any database files for which modifications are found +in the log. 
+<p>Then, read backward from the end of the log. For each commit record +encountered, record its transaction ID. For every other data update +record, find the transaction ID of the record. If that transaction ID +appears in the list of committed transactions, do nothing; if it does not +appear in the committed list, then call the appropriate recovery routine +to undo the operation. +<p>In the case of catastrophic recovery, this roll-backward pass continues +through all the present log files. In the case of normal recovery, this +pass continues until we find a checkpoint written before the second-to-last +checkpoint described above. +<p>When the roll-backward pass is complete, the roll-forward pass begins at +the point where the roll-backward pass ended. Each record is read, and if +its transaction ID is in the committed list, then the appropriate recovery +routine is called to redo the operation if necessary. +<p>In a distributed transaction environment, there may be transactions that +are prepared, but not yet committed. If these transactions are XA +transactions, then they are rolled forward to their current state, and an +active transaction corresponding to each is entered in the transaction table +so that the XA transaction manager may call either transaction abort or +commit, depending on the outcome of the overall transaction. If the +transaction is not an XA transaction, then it is aborted like any other +transaction would be. 
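+<p>The two passes described above can be sketched in C over a small in-memory log. This is an illustration under simplifying assumptions, not Berkeley DB's recovery code: the record layout and the names <b>sketch_log_rec</b>, <b>sketch_recover</b> and <b>SKETCH_MAX_TXNS</b> are hypothetical, checkpoints and prepared transactions are left out, and undo and redo are applied unconditionally rather than after comparing page LSNs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record kinds; these are not real Berkeley DB log types. */
enum sketch_kind { SKETCH_UPDATE, SKETCH_COMMIT };

struct sketch_log_rec {
	enum sketch_kind kind;
	int txnid;	/* transaction that wrote the record */
	int slot;	/* data item modified (updates only) */
	int before;	/* value before the update, restored by undo */
	int after;	/* value after the update, reapplied by redo */
};

#define SKETCH_MAX_TXNS 16

/* Return 1 if txnid is in the committed list, 0 otherwise. */
static int
sketch_is_committed(const int *committed, size_t n, int txnid)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (committed[i] == txnid)
			return (1);
	return (0);
}

static void
sketch_recover(const struct sketch_log_rec *recs, size_t nrec, int *data)
{
	int committed[SKETCH_MAX_TXNS];
	size_t ncommitted = 0, i;

	/*
	 * Roll-backward pass: read from the end of the log, remember the
	 * transaction ID of every commit record, and undo each update whose
	 * transaction is not in the committed list.
	 */
	for (i = nrec; i-- > 0;) {
		const struct sketch_log_rec *r = &recs[i];

		if (r->kind == SKETCH_COMMIT) {
			assert(ncommitted < SKETCH_MAX_TXNS);
			committed[ncommitted++] = r->txnid;
		} else if (!sketch_is_committed(committed, ncommitted, r->txnid))
			data[r->slot] = r->before;
	}

	/*
	 * Roll-forward pass: starting where the backward pass ended, redo
	 * each update whose transaction is in the committed list.
	 */
	for (i = 0; i < nrec; i++) {
		const struct sketch_log_rec *r = &recs[i];

		if (r->kind == SKETCH_UPDATE &&
		    sketch_is_committed(committed, ncommitted, r->txnid))
			data[r->slot] = r->after;
	}
}
```

+<p>Because commit records are seen before the updates they cover when reading backward, a single backward scan is enough to decide which updates to undo in this sketch.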
+<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/filesys.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/reclimit.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/program/runtime.html b/db/docs/ref/program/runtime.html new file mode 100644 index 000000000..a6f860bca --- /dev/null +++ b/db/docs/ref/program/runtime.html @@ -0,0 +1,57 @@ +<!--$Id: runtime.so,v 10.23 2000/12/04 18:05:42 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Run-time configuration</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Programmer Notes</dl></h3></td> +<td width="1%"><a href="../../ref/program/extending.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Run-time configuration</h1> +<p>There are a few interfaces that support run-time configuration of Berkeley DB. 
+First is a group of interfaces that allow applications to intercept +Berkeley DB requests for underlying library or system call functionality: +<p><blockquote><pre><a href="../../api_c/set_func_close.html">db_env_set_func_close</a> +<a href="../../api_c/set_func_dirfree.html">db_env_set_func_dirfree</a> +<a href="../../api_c/set_func_dirlist.html">db_env_set_func_dirlist</a> +<a href="../../api_c/set_func_exists.html">db_env_set_func_exists</a> +<a href="../../api_c/set_func_free.html">db_env_set_func_free</a> +<a href="../../api_c/set_func_fsync.html">db_env_set_func_fsync</a> +<a href="../../api_c/set_func_ioinfo.html">db_env_set_func_ioinfo</a> +<a href="../../api_c/set_func_malloc.html">db_env_set_func_malloc</a> +<a href="../../api_c/set_func_map.html">db_env_set_func_map</a> +<a href="../../api_c/set_func_open.html">db_env_set_func_open</a> +<a href="../../api_c/set_func_read.html">db_env_set_func_read</a> +<a href="../../api_c/set_func_realloc.html">db_env_set_func_realloc</a> +<a href="../../api_c/set_func_seek.html">db_env_set_func_seek</a> +<a href="../../api_c/set_func_sleep.html">db_env_set_func_sleep</a> +<a href="../../api_c/set_func_unlink.html">db_env_set_func_unlink</a> +<a href="../../api_c/set_func_unmap.html">db_env_set_func_unmap</a> +<a href="../../api_c/set_func_write.html">db_env_set_func_write</a> +<a href="../../api_c/set_func_yield.html">db_env_set_func_yield</a></pre></blockquote> +<p>These interfaces are only available from the Berkeley DB C language API. +<p>In addition, there are a few interfaces that allow applications to +re-configure, on an application-wide basis, Berkeley DB behaviors. 
+<p><blockquote><pre><a href="../../api_c/env_set_mutexlocks.html">DBENV->set_mutexlocks</a> +<a href="../../api_c/env_set_pageyield.html">db_env_set_pageyield</a> +<a href="../../api_c/env_set_panicstate.html">db_env_set_panicstate</a> +<a href="../../api_c/env_set_region_init.html">db_env_set_region_init</a> +<a href="../../api_c/env_set_tas_spins.html">db_env_set_tas_spins</a></pre></blockquote> +<p>These interfaces are available from all of the Berkeley DB programmatic APIs. +<p>A not-uncommon problem for applications is the new API in Solaris 2.6 +for manipulating large files. As this API was not part of Solaris 2.5, +it is difficult to create a single binary that takes advantage of the +large file functionality in Solaris 2.6 but which still runs on Solaris +2.5. <a href="solaris.txt">Example code</a> that supports this is +included in the Berkeley DB distribution. +<table><tr><td><br></td><td width="1%"><a href="../../ref/program/extending.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/program/scope.html b/db/docs/ref/program/scope.html new file mode 100644 index 000000000..198147932 --- /dev/null +++ b/db/docs/ref/program/scope.html @@ -0,0 +1,71 @@ +<!--$Id: scope.so,v 10.3 2000/08/10 17:54:49 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Berkeley DB handles</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access 
methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Programmer Notes</dl></h3></td> +<td width="1%"><a href="../../ref/program/mt.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/namespace.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Berkeley DB handles</h1> + <a name="3"><!--meow--></a> +<p>The Berkeley DB library has a number of object handles. The following table +lists those handles, their scope, and if they are free-threaded, that +is, if multiple threads within a process can share them. +<p><dl compact> +<p><dt>DB_ENV<dd>The DB_ENV handle is created by the <a href="../../api_c/env_create.html">db_env_create</a> function and +references a Berkeley DB database environment, a collection of +databases and Berkeley DB subsystems. DB_ENV handles are free-threaded +if the <a href="../../api_c/env_open.html#DB_THREAD">DB_THREAD</a> flag is specified to the <a href="../../api_c/env_open.html">DBENV->open</a> function +when the environment is opened. The handle should not be closed while +any other handle remains open that is using it as a reference +(e.g., DB or DB_TXN). Once either the <a href="../../api_c/env_close.html">DBENV->close</a> or +<a href="../../api_c/env_remove.html">DBENV->remove</a> functions are called, the handle may not be accessed again, +regardless of the function's return. +<p><dt>DB_TXN<dd>The DB_TXN handle is created by the <a href="../../api_c/txn_begin.html">txn_begin</a> function and +references a single transaction. The handle is not free-threaded, and +transactions may not span threads nor may transactions be used by more +than a single thread. 
+Once the +<a href="../../api_c/txn_abort.html">txn_abort</a> or <a href="../../api_c/txn_commit.html">txn_commit</a> functions are called, the handle may +not be accessed again, regardless of the function's return. +In addition, parent transactions may not issue +any Berkeley DB operations, except for <a href="../../api_c/txn_begin.html">txn_begin</a>, <a href="../../api_c/txn_abort.html">txn_abort</a> +and <a href="../../api_c/txn_commit.html">txn_commit</a>, while they have active child transactions (child +transactions that have not yet been committed or aborted). +<p><dt>DB_MPOOLFILE<dd>The DB_MPOOLFILE handle references an open file in the shared +memory buffer pool of the database environment. The handle is not +free-threaded. Once the <a href="../../api_c/memp_fclose.html">memp_fclose</a> function is called, the handle may +not be accessed again, regardless of the function's return. +<p><dt>DB<dd>The DB handle is created by the <a href="../../api_c/db_create.html">db_create</a> function and +references a single Berkeley DB database, which may or may not be part of a +database environment. DB handles are free-threaded if the +<a href="../../api_c/env_open.html#DB_THREAD">DB_THREAD</a> flag is specified to the <a href="../../api_c/db_open.html">DB->open</a> function when the +database is opened, or if the database environment in which the database +is opened is free-threaded. The handle should not be closed while any +other handle that references the database is in use, e.g., database +handles must not be closed while cursor handles into the database remain +open, or transactions which include operations on the database have not +yet been committed or aborted. Once the <a href="../../api_c/db_close.html">DB->close</a>, +<a href="../../api_c/db_remove.html">DB->remove</a> or <a href="../../api_c/db_rename.html">DB->rename</a> functions are called, the handle may +not be accessed again, regardless of the function's return. 
+<p><dt>DBC<dd>The DBC handle references a cursor into a Berkeley DB database. The +handle is not free-threaded and cursors may not span threads nor may +cursors be used by more than a single thread. If the cursor is to be +used to perform operations on behalf of a transaction, the cursor must +be opened and closed within the context of that single transaction. +Once <a href="../../api_c/dbc_close.html">DBcursor->c_close</a> has been called, the handle may not be accessed +again, regardless of the function's return. +</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/program/mt.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/namespace.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/program/solaris.txt b/db/docs/ref/program/solaris.txt new file mode 100644 index 000000000..d2ec31682 --- /dev/null +++ b/db/docs/ref/program/solaris.txt @@ -0,0 +1,213 @@ +#ifdef OS_solaris +/* + * This is all for Solaris 2.6. + * + * Sun defined a new API in Solaris2.6 to be used when manipulating large + * (>2Gbyte) files. This API isn't present in 2.5.x, so we can't simply + * call it -- that would mean two binaries, one for 2.5.x and the other for + * 2.6. Not pretty. So, what we do here is determine the OS on which we're + * running at runtime, and adjust the underlying Berkeley DB calls to use + * the new API if it's there. 
+ */ + +/* This must match the definition of stat64 in Solaris2.6 */ +struct our_stat64 { + dev_t st_dev; + long st_pad1[3]; /* reserve for dev expansion */ + u_longlong_t st_ino; + mode_t st_mode; + nlink_t st_nlink; + uid_t st_uid; + gid_t st_gid; + dev_t st_rdev; + long st_pad2[2]; + longlong_t st_size; + timestruc_t mst_atime; + timestruc_t mst_mtime; + timestruc_t mst_ctime; + long st_blksize; + longlong_t st_blocks; /* large file support */ + char st_fstype[_ST_FSTYPSZ]; + long st_pad4[8]; /* expansion area */ +}; + +#define MEGABYTE (1024 * 1024) + +typedef int (*open_fn)(const char *path, int flags, ...); +typedef longlong_t (*lseek64_fn)(int fildes, longlong_t offset, int whence); +typedef longlong_t (*fstat64_fn)(int fildes, struct our_stat64 *s); +typedef void* (*mmap64_fn)(void* addr, size_t len, int prot, int flags, +int filedes, longlong_t off); + +static fstat64_fn os_fstat64_fn = NULL; +static lseek64_fn os_lseek64_fn = NULL; +static mmap64_fn os_mmap64_fn = NULL; +static open_fn os_open64_fn = NULL; + +static int dblayer_load_largefile_fns() +{ + void *lib_handle = NULL; + void *function_found = NULL; + int ret = 0; + + lib_handle = dlopen(NULL, RTLD_NOW); + if (NULL == lib_handle) + return (-1); + + function_found = dlsym(lib_handle,"open64"); + if (NULL == function_found) + return (-1); + os_open64_fn = (open_fn)function_found; + + function_found = dlsym(lib_handle,"lseek64"); + if (NULL == function_found) + return (-1); + os_lseek64_fn = (lseek64_fn)function_found; + + function_found = dlsym(lib_handle,"fstat64"); + if (NULL == function_found) + return (-1); + os_fstat64_fn = (fstat64_fn)function_found; + + function_found = dlsym(lib_handle,"mmap64"); + if (NULL == function_found) + return (-1); + os_mmap64_fn = (mmap64_fn)function_found; + + return 0; +} + +/* Helper function for large seeks */ +static int dblayer_seek_fn_solaris(int fd, + size_t pgsize, db_pgno_t pageno, u_long relative, int whence) +{ + longlong_t offset = 0; + longlong_t ret 
= 0; + + if (NULL == os_lseek64_fn) { + return -1; + } + + offset = (longlong_t)pgsize * pageno + relative; + + ret = (*os_lseek64_fn)(fd,offset,whence); + + return (ret == -1) ? errno : 0; +} + +/* Helper function for large file mmap */ +static int dblayer_map_solaris(fd, len, is_private, is_rdonly, addr) + int fd, is_private, is_rdonly; + size_t len; + void **addr; +{ + void *p; + int flags, prot; + + /* Like the other helpers, fail if the 64-bit function was not loaded. */ + if (NULL == os_mmap64_fn) { + return -1; + } + + flags = is_private ? MAP_PRIVATE : MAP_SHARED; + prot = PROT_READ | (is_rdonly ? 0 : PROT_WRITE); + + if ((p = (*os_mmap64_fn)(NULL, + len, prot, flags, fd, (longlong_t)0)) == (void *)MAP_FAILED) + return (errno); + + *addr = p; + return (0); +} + +/* Helper function for large fstat */ +static int dblayer_ioinfo_solaris(const char *path, + int fd, u_int32_t *mbytesp, u_int32_t *bytesp, u_int32_t *iosizep) +{ + struct our_stat64 sb; + + if (NULL == os_fstat64_fn) { + return -1; + } + + if ((*os_fstat64_fn)(fd, &sb) == -1) + return (errno); + + /* Return the size of the file. */ + if (mbytesp != NULL) + *mbytesp = (u_int32_t) (sb.st_size / (longlong_t)MEGABYTE); + if (bytesp != NULL) + *bytesp = (u_int32_t) (sb.st_size % (longlong_t)MEGABYTE); + + /* + * Return the underlying filesystem blocksize, if available. Default + * to 8K on the grounds that most OS's use less than 8K as their VM + * page size. + */ + if (iosizep != NULL) + *iosizep = sb.st_blksize; + return (0); +} +#endif + +#ifdef irix +/* + * A similar mess to Solaris: a new API added in IRIX6.2 to support large + * files. We always build on 6.2 or later, so no need to do the same song + * and dance as on Solaris -- we always have the header files for the + * 64-bit API. + */ + +/* Helper function for large seeks */ +static int dblayer_seek_fn_irix(int fd, + size_t pgsize, db_pgno_t pageno, u_long relative, int whence) +{ + off64_t offset = 0; + off64_t ret = 0; + + offset = (off64_t)pgsize * pageno + relative; + + ret = lseek64(fd,offset,whence); + + return (ret == -1) ? 
errno : 0; +} + +/* Helper function for large fstat */ +static int dblayer_ioinfo_irix(const char *path, + int fd, u_int32_t *mbytesp, u_int32_t *bytesp, u_int32_t *iosizep) +{ + struct stat64 sb; + + if (fstat64(fd, &sb) == -1) { + return (errno); + } + + /* Return the size of the file. */ + if (mbytesp != NULL) + *mbytesp = (u_int32_t) (sb.st_size / (off64_t)MEGABYTE); + if (bytesp != NULL) + *bytesp = (u_int32_t) (sb.st_size % (off64_t)MEGABYTE); + + if (iosizep != NULL) + *iosizep = sb.st_blksize; + return (0); +} +#endif /* irix */ + +static int dblayer_override_libdb_functions(dblayer_private *priv) +{ +#if defined(OS_solaris) + int ret = 0; + + ret = dblayer_load_largefile_fns(); + if (0 != ret) { + Debug("Not Solaris2.6: no large file support enabled\n"); + } else { + /* Means we did get the XXX64 functions, so let's use them */ + db_jump_set((void*)os_open64_fn, DB_FUNC_OPEN); + db_jump_set((void*)dblayer_seek_fn_solaris, DB_FUNC_SEEK); + db_jump_set((void*)dblayer_ioinfo_solaris, DB_FUNC_IOINFO); + db_jump_set((void*)dblayer_map_solaris, DB_FUNC_MAP); + Debug("Solaris2.6: selected 64-bit file handling.\n"); + } +#else +#if defined (irix) + db_jump_set((void*)dblayer_seek_fn_irix, DB_FUNC_SEEK); + db_jump_set((void*)dblayer_ioinfo_irix, DB_FUNC_IOINFO); +#endif /* irix */ +#endif /* OS_solaris */ + return 0; +} diff --git a/db/docs/ref/program/version.html b/db/docs/ref/program/version.html new file mode 100644 index 000000000..d1b1254a1 --- /dev/null +++ b/db/docs/ref/program/version.html @@ -0,0 +1,45 @@ +<!--$Id: version.so,v 10.14 2000/03/18 21:43:16 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Library version information</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Programmer Notes</dl></h3></td> +<td width="1%"><a href="../../ref/program/copy.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/dbsizes.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Library version information</h1> +<p>Each release of the Berkeley DB library has a major version number, a minor +version number, and a patch number. +<p>The major version number changes only when major portions of the Berkeley DB +functionality have been changed. In this case, it may be necessary to +significantly modify applications in order to upgrade them to use the new +version of the library. +<p>The minor version number changes when Berkeley DB interfaces have changed, and +the new release is not entirely backward compatible with previous releases. +To upgrade applications to the new version, they must be recompiled, and +potentially, minor modifications made, (e.g., the order of arguments to a +function might have changed). +<p>The patch number changes on each release. If only the patch number +has changed in a release, applications do not need to be recompiled, +and they can be upgraded to the new version by simply installing a +new version of the shared library. +<p>Internal Berkeley DB interfaces may change at any time and during any release, +without warning. This means that the library must be entirely recompiled +and reinstalled when upgrading to new releases of the library, as there +is no guarantee that modules from the current version of the library will +interact correctly with modules from a previous release. 
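<p>The numbering rules above can be restated as a small C sketch that classifies how much work an upgrade between two releases is likely to require. The <tt>version_triple</tt> structure, the <tt>upgrade_effort</tt> enumeration, and the <tt>classify_upgrade</tt> function are hypothetical names invented for this illustration. They are not part of the Berkeley DB API, and the sketch assumes nothing beyond the major/minor/patch rules described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type for illustration; not a Berkeley DB structure. */
struct version_triple {
    int major;
    int minor;
    int patch;
};

/* Hypothetical classification of the work an upgrade may need. */
enum upgrade_effort {
    UPGRADE_NONE,       /* same release */
    UPGRADE_REINSTALL,  /* only the patch number changed */
    UPGRADE_RECOMPILE,  /* minor number changed: recompile, maybe small edits */
    UPGRADE_REWRITE     /* major number changed: significant changes possible */
};

/* Compare the most significant component first, as the rules above imply. */
enum upgrade_effort
classify_upgrade(const struct version_triple *from,
    const struct version_triple *to)
{
    assert(from != NULL && to != NULL);

    if (from->major != to->major)
        return UPGRADE_REWRITE;
    if (from->minor != to->minor)
        return UPGRADE_RECOMPILE;
    if (from->patch != to->patch)
        return UPGRADE_REINSTALL;
    return UPGRADE_NONE;
}
```

<p>An application could feed such a function the version it was built against and the version reported by the installed library. Because internal interfaces may change in any release, the sketch says nothing about mixing modules from different library versions.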
+<p>To retrieve the Berkeley DB version information, applications should use the +<a href="../../api_c/env_version.html">db_version</a> interface. In addition to the above information, the +<a href="../../api_c/env_version.html">db_version</a> interface returns a string encapsulating the version +information, suitable for display to a user. +<table><tr><td><br></td><td width="1%"><a href="../../ref/program/copy.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/dbsizes.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/refs/bdb_usenix.html b/db/docs/ref/refs/bdb_usenix.html new file mode 100644 index 000000000..58e82b573 --- /dev/null +++ b/db/docs/ref/refs/bdb_usenix.html @@ -0,0 +1,1120 @@ +<!--"@(#)usenix.html 1.2 4/26/99"--> +<html> +<head> +<title>Berkeley DB</title> +</head> +<body bgcolor="white"> +<center> +<h1> +Berkeley DB +</h1> +<p> +<i> +Michael A. Olson +<br> +Keith Bostic +<br> +Margo Seltzer +<br> +<br> +Sleepycat Software, Inc. +<br> +<br> +</i> +<b> +Abstract +</b> +</center> +<font size="-1"> +<blockquote> +<p> +Berkeley DB is an Open Source embedded database system with a number +of key advantages over comparable systems. It is simple to use, supports +concurrent access by multiple users, and provides industrial-strength +transaction support, including surviving system and disk crashes. This +paper describes the design and technical features of Berkeley DB, the +distribution, and its license. +</blockquote> +</font> +<h1> +Introduction +</h1> +<p> +The Berkeley Database (Berkeley DB) is an embedded database system +that can be used in applications requiring high-performance +concurrent storage and retrieval of key/value pairs. 
The software +is distributed as a library that can be linked directly into an +application. +It provides a variety of programmatic interfaces, +including callable APIs for C, C++, Perl, Tcl and Java. +Users may download Berkeley DB from Sleepycat Software's Web site, +at +<a href="http://www.sleepycat.com">www.sleepycat.com</a>. +<p> +Sleepycat distributes Berkeley DB as an Open Source product. The company +collects license fees for certain uses of the software and sells support +and services. +<h2> +History +</h2> +<p> +Berkeley DB began as a new implementation of a hash access method +to replace both +<tt>hsearch</tt> +and the various +<tt>dbm</tt> +implementations +(<tt>dbm</tt> from AT&T, +<tt>ndbm</tt> +from Berkeley, and +<tt>gdbm</tt> +from the GNU project). +In 1990 Seltzer and Yigit produced a package called Hash to do this +<a href="#Selt91">[Selt91]</a>. +<p> +The first general release of Berkeley DB, in 1991, +included some interface changes and a new B+tree access method. +At roughly the same time, Seltzer and Olson +developed a prototype transaction +system based on Berkeley DB, called LIBTP <a href="#Selt92">[Selt92]</a>, +but never released the code. +<p> +The 4.4BSD UNIX release included Berkeley DB 1.85 in 1992. +Seltzer and Bostic maintained the code in the early 1990s +in Berkeley and in Massachusetts. +Many users adopted the code during this period. +<p> +By mid-1996, +users wanted commercial support for the software. +In response, Bostic and Seltzer formed Sleepycat Software. +The company enhances, distributes, and +supports Berkeley DB and supporting software and documentation. +Sleepycat released version 2.1 of Berkeley DB in mid-1997 +with important new features, including +support for concurrent access to databases. +The company makes about three commercial releases a year, +and most recently shipped version 2.8. 
+<h2> +Overview of Berkeley DB +</h2> +<p> +The C interfaces in Berkeley DB permit +<tt>dbm</tt>-style +record management +for databases, +with significant extensions to handle duplicate data items elegantly, +to deal with concurrent access, and to provide transactional +support so that multiple changes can be simultaneously committed +(so that they are made permanent) or rolled back (so that the +database is restored to its state at the beginning of the transaction). +<p> +C++ and Java interfaces provide a small set of classes for +operating on a database. The main class in both cases is called +<tt>Db</tt>, +and provides methods that encapsulate the +<tt>dbm</tt>-style +interfaces that the C interfaces provide. +<p> +Tcl and Perl interfaces allow developers working in those languages +to use Berkeley DB in their applications. +Bindings for both languages are included in the distribution. +<p> +Developers may compile their applications and link in Berkeley DB +statically or dynamically. +<h2> +How Berkeley DB is used +</h2> +<p> +The Berkeley DB library supports concurrent access to databases. +It can be linked +into standalone applications, into a collection of cooperating applications, +or into servers that handle requests and do database operations on +behalf of clients. +<p> +Compared to using a standalone database management system, Berkeley +DB is easy to understand and simple to use. The +software stores and retrieves records, which consist of key/value pairs. +Keys are used to locate items and can be any data type or structure +supported by the programming language. +<p> +The programmer can provide the functions that Berkeley DB uses to +operate on keys. +For example, +B+trees can use a custom comparison function, +and the Hash access method can use a custom hash function. +Berkeley DB uses default functions if none are supplied. +Otherwise, Berkeley DB does not examine or interpret either keys +or values in any way. +Values may be arbitrarily long. 
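<p>
To make the idea of a programmer-supplied key function concrete, the following is a minimal sketch of a comparison function for keys that hold a native <tt>int</tt>. The <tt>key_bytes</tt> structure and the <tt>compare_int_keys</tt> function are hypothetical names used only for this illustration; the callback signature that Berkeley DB itself expects is not shown here and may differ between releases.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a key: an untyped pointer plus a length. */
struct key_bytes {
    const void *data;
    size_t size;
};

/*
 * Compare two keys that each hold one native int.
 * Returns a negative value, zero, or a positive value when the first key
 * sorts before, equal to, or after the second.
 */
int
compare_int_keys(const struct key_bytes *a, const struct key_bytes *b)
{
    int x, y;

    assert(a->size == sizeof(int) && b->size == sizeof(int));

    /* Copy out of the key buffers; they may not be aligned for int. */
    memcpy(&x, a->data, sizeof(x));
    memcpy(&y, b->data, sizeof(y));

    /* Avoid x - y, which can overflow for large magnitudes. */
    return (x > y) - (x < y);
}
```

<p>
A numeric comparison of this kind is useful when the byte-by-byte order of a key's in-memory representation does not match its numeric order, as can happen with integers on little-endian machines.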
+<p> +It is also important to understand what Berkeley DB is not. +It is not a database server that handles network requests. It is not an +SQL engine that executes queries. It is not a relational or object-oriented +database management system. +<p> +It is possible to build any of those on top of Berkeley DB, +but the package, as distributed, +is an embedded database engine. It has been designed +to be portable, small, fast, and reliable. +<h2> +Applications that use Berkeley DB +</h2> +<p> +Berkeley DB is embedded in a variety of proprietary and Open Source +software packages. +This section highlights a few of the products that use it. +<p> +Directory servers, which do data storage and retrieval using the +Lightweight Directory Access Protocol (LDAP), provide naming and directory +lookup service on local-area networks. +This service is, +essentially, +database query and update, +but uses a simple protocol rather than SQL or ODBC. +Berkeley DB is the embedded data manager in the majority of deployed +directory servers today, +including LDAP servers from Netscape, +MessageDirect (formerly Isode), +and others. +<p> +Berkeley DB is also embedded in a large number of mail servers. +Intermail, +from Software.com, +uses Berkeley DB as a message store +and as the backing store for its directory server. +The sendmail server +(including both the commercial Sendmail Pro offering from Sendmail, +Inc. and the version distributed by sendmail.org) +uses Berkeley DB to store aliases and other information. +Similarly, +Postfix (formerly VMailer) uses Berkeley DB +to store administrative information. +<p> +In addition, +Berkeley DB is embedded in a wide variety of other software products. +Example applications include managing access control lists, +storing user keys in a public-key infrastructure, +recording machine-to-network-address mappings in address servers, +and storing configuration and device information in video +post-production software. 
+<p> +Finally, +Berkeley DB is a part of many other Open Source software packages +available on the Internet. +For example, +the software is embedded in the Apache Web server and the Gnome desktop. +<h1> +Access Methods +</h1> +<p> +In database terminology, an access method is the disk-based structure +used to store data and the operations available on that structure. +For example, many database systems support a B+tree access method. +B+trees allow equality-based lookups (find keys equal to some constant), +range-based lookups (find keys between two constants) and record +insertion and deletion. +<p> +Berkeley DB supports three access methods: B+tree, +Extended Linear Hashing (Hash), +and Fixed- or Variable-length Records (Recno). +All three operate on records composed of a key and a data value. +In the B+tree and Hash access methods, keys can have arbitrary structure. +In the Recno access method, each record is assigned a record number, which +serves as the key. +In all the access methods, the +value can have arbitrary structure. +The programmer can supply comparison or hashing functions for keys, +and Berkeley DB stores and retrieves values without +interpreting them. +<p> +All of the access methods use the host filesystem as a backing store. +<h2> +Hash +</h2> +<p> +Berkeley DB includes a Hash access method that implements extended +linear hashing <a href="#Litw80">[Litw80]</a>. +Extended linear hashing adjusts the hash function as the hash +table grows, attempting to keep all buckets underfull in the steady +state. +<p> +The Hash access method supports insertion and deletion of records and +lookup by exact match only. Applications may iterate over all records +stored in a table, but the order in which they are returned is undefined. +<h2> +B+tree +</h2> +<p> +Berkeley DB includes a B+tree <a href="#Come79">[Come79]</a> access method. +B+trees store records of key/value pairs in leaf pages, +and pairs of (key, child page address) at internal nodes. 
+Keys in the tree are stored in sorted order, +where the order is determined by the comparison function supplied when the +database was created. +Pages at the leaf level of the tree include pointers +to their neighbors to simplify traversal. B+trees support lookup by +exact match (equality) or range (greater than or equal to a key). +Like Hash tables, B+trees support record insertion, +deletion, and iteration over all records in the tree. +<p> +As records are inserted and pages in the B+tree fill up, they are split, +with about half the keys going into a new peer page at the same level in +the tree. +Most B+tree implementations leave both nodes half-full after a split. +This leads to poor performance in a common case, where the caller inserts +keys in order. +To handle this case, Berkeley DB keeps track of the insertion order, +and splits pages unevenly to keep pages fuller. +This reduces tree size, yielding better search performance and smaller +databases. +<p> +On deletion, empty pages are coalesced by reverse splits +into single pages. +The access method does no other page balancing on insertion +or deletion. +Keys are not moved among pages at every update +to keep the tree well-balanced. While this could improve search times +in some cases, the additional code complexity leads to slower updates and +is prone to deadlocks. +<p> +For simplicity, Berkeley DB B+trees do no prefix compression of keys +at internal or leaf nodes. +<h2> +Recno +</h2> +<p> +Berkeley DB includes a fixed- or variable-length record access method, +called +<i>Recno</i>. +The Recno access method assigns logical record numbers to each +record, +and can search for and update records by record number. +Recno is able, +for example, +to load a text file into a database, +treating each line as a record. +This permits fast searches by line number for applications like +text editors <a href="#Ston82">[Ston82]</a>. 
+<p> +Recno is actually built +on top of the B+tree access method and provides a simple interface +for storing sequentially-ordered data values. +The Recno access method generates keys internally. +The programmer's view of the values is that +they are numbered sequentially from one. +Developers can choose to have records automatically renumbered +when lower-numbered records are added or deleted. +In this case, new keys can be inserted between existing keys. +<h1> +Features +</h1> +<p> +This section describes important features of Berkeley DB. +In general, +developers can choose which features are useful to them, +and use only those that are required by their application. +<p> +For example, +when an application opens a database, it can declare the degree of +concurrency and recovery that it requires. Simple stand-alone applications, +and in particular ports of applications that used +<tt>dbm</tt> +or one of its +variants, generally do not require concurrent access or crash recovery. +Other applications, such as enterprise-class database management systems +that store sales transactions or other critical data, need full +transactional service. Single-user operation is faster than multi-user +operation, since no overhead is incurred by locking. Running with +the recovery system disabled is faster than running with it enabled, +since log records need not be written when changes are made to the +database. +<p> +In addition, some core subsystems, including the locking system and +the logging facility, +can be used outside the context of the access methods as well. +Although few users have chosen to do so, it is possible to +use only the lock manager in Berkeley DB to control concurrency +in an application, without using any of the standard database services. +Alternatively, the caller can integrate locking of non-database resources +with Berkeley DB's transactional two-phase locking system, to impose +transaction semantics on objects outside the database. 
+<h2> +Programmatic interfaces +</h2> +<p> +Berkeley DB defines a simple API for database management. +The package does not include industry-standard +programmatic interfaces such as Open Database Connectivity (ODBC), +Object Linking and Embedding for Databases (OleDB), or Structured +Query Language (SQL). These interfaces, while useful, were +designed to promote interoperability of database systems, and not +simplicity or performance. +<p> +In response to customer demand, +Berkeley DB 2.5 introduced support for the XA standard <a href="#Open94">[Open94]</a>. +XA permits Berkeley DB to participate in distributed transactions +under a transaction processing monitor like Tuxedo from BEA Systems. +Like XA, other standard interfaces can be built on top of the +core system. +The standards do not belong inside Berkeley DB, +since not all applications need them. +<h2> +Working with records +</h2> +<p> +A database user may need to search for particular keys in a database, +or may simply want to browse available records. +Berkeley DB supports both keyed access, +to find one or more records with a given key, +and sequential access, +to retrieve all the records in the database one at a time. +The order of the records returned during sequential scans +depends on the access method. +B+tree and Recno databases return records in sort order, +and Hash databases return them in apparently random order. +<p> +Similarly, +Berkeley DB defines simple interfaces for inserting, +updating, +and deleting records in a database. +<h2> +Long keys and values +</h2> +<p> +Berkeley DB manages keys and values as large as +2<sup>32</sup> bytes. +Since the time required to copy a record is proportional to its size, +Berkeley DB includes interfaces that operate on partial records. +If an application requires only part of a large record, +it requests partial record retrieval, +and receives just the bytes that it needs. +The smaller copy saves both time and memory. 
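<p>
The effect of partial record retrieval can be sketched in a few lines of C. The <tt>partial_get</tt> function below is a hypothetical helper written for this illustration, not a Berkeley DB interface. It assumes the caller names a byte offset and a length, and it copies only the requested slice of a record that is already in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy at most len bytes of a record, starting at byte offset off, into out.
 * The request is clamped to the end of the record.
 * Returns the number of bytes actually copied (0 if off is past the end).
 */
size_t
partial_get(const unsigned char *record, size_t record_len,
    size_t off, size_t len, unsigned char *out)
{
    size_t n;

    assert(record != NULL && out != NULL);

    if (off >= record_len)
        return 0;

    n = record_len - off;       /* bytes available from off to the end */
    if (n > len)
        n = len;                /* never copy more than was requested */

    memcpy(out, record + off, n);
    return n;
}
```

<p>
The cost of the copy depends on the size of the slice rather than the size of the record, which is the saving the paragraph above describes.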
+<p> +Berkeley DB allows the programmer to define the data types of +keys and values. +Developers may use any type expressible in the programming language. +<h2> +Large databases +</h2> +<p> +A single database managed by Berkeley DB can be up to 2<sup>48</sup> +bytes, +or 256 terabytes, +in size. +Berkeley DB uses the host filesystem as the backing store +for the database, +so large databases require big file support from the operating system. +Sleepycat Software has customers using Berkeley DB +to manage single databases in excess of 100 gigabytes. +<h2> +Main memory databases +</h2> +<p> +Applications that do not require persistent storage can create +databases that exist only in main memory. +These databases bypass the overhead imposed by the I/O system +altogether. +<p> +Some applications do need to use disk as a backing store, +but run on machines with very large memory. +Berkeley DB is able to manage very large shared memory regions +for cached data pages, +log records, +and lock management. +For example, +the cache region used for data pages may be gigabytes in size, +reducing the likelihood that any read operation will need to +visit the disk in the steady state. +The programmer declares the size of the cache region at +startup. +<p> +Finally, many operating systems provide memory-mapped file services +that are much faster than their general-purpose file system +interfaces. +Berkeley DB can memory-map its database files for read-only database use. +The application operates on records stored directly on the pages, +with no cache management overhead. +Because the application gets pointers directly into the +Berkeley DB pages, +writes cannot be permitted. +Otherwise, +changes could bypass the locking and logging systems, +and software errors could corrupt the database. +Read-only applications can use Berkeley DB's memory-mapped +file service to improve performance on most architectures. 
+<h2> +Configurable page size +</h2> +<p> +Programmers declare the size of the pages used by their access +methods when they create a database. +Although Berkeley DB provides reasonable defaults, +developers may override them to control system performance. +Small pages reduce the number of records that fit on a single page. +Fewer records on a page means that fewer records are locked when +the page is locked, +improving concurrency. +The per-page overhead is proportionally higher with smaller pages, +of course, +but developers can trade off space for time as an application requires. +<h2> +Small footprint +</h2> +<p> +Berkeley DB is a compact system. +The full package, including all access methods, recoverability, +and transaction support +is roughly 175K of text space on common architectures. +<h2> +Cursors +</h2> +<p> +In database terminology, a cursor is a pointer into an access method +that can be called iteratively to return records in sequence. Berkeley +DB includes cursor interfaces for all access methods. This permits, +for example, users to traverse a B+tree and view records in order. +Pointers to records in cursors are persistent, so that once fetched, +a record may be updated in place. Finally, cursors support access to +chains of duplicate data items in the various access methods. +<h2> +Joins +</h2> +<p> +In database terminology, +a join is an operation that spans multiple separate +tables (or in the case of Berkeley DB, multiple separate DB files). +For example, a company may store information about its customers +in one table and information about sales in another. An application +will likely want to look up sales information by customer name; this +requires matching records in the two tables that share a common +customer ID field. +This combining of records from multiple tables is called a join. +<p> +Berkeley DB includes interfaces for joining two or more tables. 
+<h2> +Transactions +</h2> +<p> +Transactions have four properties <a href="#Gray93">[Gray93]</a>: +<ul> +<li> +They are atomic. That is, all of the changes made in a single +transaction must be applied at the same instant or not at all. +This permits, for example, the transfer of money between two +accounts to be accomplished, by making the reduction of the +balance in one account and the increase in the other into a +single, atomic action. +</li> +<li> +They must be consistent. That is, changes to the database +by any transaction cannot leave the database in an illegal +or corrupt state. +</li> +<li> +They must be isolatable. Regardless of the number of users +working in the database at the same time, every user must have +the illusion that no other activity is going on. +</li> +<li> +They must be durable. Even if the disk that stores the database +is lost, it must be possible to recover the database to its last +transaction-consistent state. +</li> +</ul> +<p> +This combination of properties -- atomicity, consistency, isolation, and +durability -- is referred to as ACIDity in the literature. Berkeley DB, +like most database systems, provides ACIDity using a collection of core +services. +<p> +Programmers can choose to use Berkeley DB's transaction services +for applications that need them. +<h3> +Write-ahead logging +</h3> +<p> +Programmers can enable the logging system when they start up Berkeley DB. +During a transaction, +the application makes a series of changes to the database. +Each change is captured in a log entry, +which holds the state of the database record +both before and after the change. +The log record is guaranteed +to be flushed to stable storage before any of the changed data pages +are written. +This behavior -- writing the log before the data pages -- is called +<i>write-ahead logging</i>. 
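<p>
The write-ahead rule can be illustrated with a toy model. The sketch below is not Berkeley DB code: the <tt>toy_log</tt>, <tt>toy_log_record</tt>, and <tt>toy_page</tt> structures and the functions that operate on them are hypothetical, a page is reduced to a single integer, and "disk" is simulated with ordinary fields. It shows only the ordering constraint: a changed page is never written until the log records that describe it have been flushed.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LOG_CAPACITY 16

/* One log entry: the state of the record before and after the change. */
struct toy_log_record {
    int before;
    int after;
};

struct toy_log {
    struct toy_log_record records[TOY_LOG_CAPACITY];
    size_t appended;    /* entries written to the in-memory log */
    size_t flushed;     /* entries known to be on stable storage */
};

struct toy_page {
    int value;          /* in-memory contents */
    int disk_value;     /* contents last written to the simulated disk */
    size_t lsn;         /* count of log entries up to the last change; 0 if none */
};

/* Change a page in memory and describe the change in the log. */
void
toy_update(struct toy_log *log, struct toy_page *page, int new_value)
{
    assert(log->appended < TOY_LOG_CAPACITY);
    log->records[log->appended].before = page->value;
    log->records[log->appended].after = new_value;
    log->appended++;
    page->value = new_value;
    page->lsn = log->appended;
}

/* Simulate forcing the first upto log entries to stable storage. */
void
toy_log_flush(struct toy_log *log, size_t upto)
{
    if (upto > log->flushed)
        log->flushed = upto;
}

/* Write a page: the log is flushed first, then the data. */
void
toy_page_write(struct toy_log *log, struct toy_page *page)
{
    toy_log_flush(log, page->lsn);
    assert(log->flushed >= page->lsn);  /* the write-ahead invariant */
    page->disk_value = page->value;
}
```

<p>
Because each entry keeps both the before and after values, the same log could be used to undo a change or to reapply it.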
+<p> +At any time during the transaction, +the application can +<i>commit</i>, +making the changes permanent, +or +<i>roll back</i>, +cancelling all changes and restoring the database to its +pre-transaction state. +If the application +rolls back the transaction, then the log holds the state of all +changed pages prior to the transaction, and Berkeley DB simply +restores that state. +If the application commits the transaction, +Berkeley DB writes the log records to disk. +In-memory copies of the data pages already reflect the changes, +and will be flushed as necessary during normal processing. +Since log writes are sequential, but data page +writes are random, this improves performance. +<h3> +Crashes and recovery +</h3> +<p> +Berkeley DB's write-ahead log is used by the transaction +system to commit or roll back transactions. +It also gives the recovery system the information that +it needs to protect against data loss or corruption +from crashes. +Berkeley DB is able to survive application crashes, +system crashes, +and even catastrophic failures like the loss of a hard +disk, +without losing any data. +<p> +Surviving crashes requires data stored in several different places. +During normal processing, +Berkeley DB has copies of active log records and recently-used +data pages in memory. +Log records are flushed to the log disk when transactions commit. +Data pages trickle out to the data disk as pages move through +the buffer cache. +Periodically, +the system administrator backs up the data disk, +creating a safe copy of the database at a particular instant. +When the database is backed up, +the log can be truncated. +For maximum robustness, +the log disk and data disk should be separate devices. +<p> +Different system failures can destroy memory, +the log disk, +or the data disk. +Berkeley DB is able to survive the loss of any one +of these repositories +without losing any committed transactions. 
+<p> +If the computer's memory is lost, +through an application or operating system crash, +then the log holds all committed transactions. +On restart, +the recovery system rolls the log forward against +the database, +reapplying any changes to on-disk pages that were in memory at the +time of the crash. +Since the log contains pre- and post-change state for +transactions, +the recovery system also uses the log to restore any pages to +their original state if they were modified by transactions +that never committed. +<p> +If the data disk is lost, +the system administrator can restore the most recent copy from backup. +The recovery system will roll the entire log forward against +the original database, +reapplying all committed changes. +When it finishes, +the database will contain every change made by every +transaction that ever committed. +<p> +If the log disk is lost, +then the recovery system can use the in-memory copies of +log entries to roll back any uncommitted transactions, +flush all in-memory database pages to the data disk, +and shut down gracefully. +At that point, +the system administrator can back up the database disk, +install a new log disk, +and restart the system. +<h3> +Checkpoints +</h3> +<p> +Berkeley DB includes a checkpointing service that interacts +with the recovery system. +During normal processing, +both the log and the database are changing continually. +At any given instant, +the on-disk versions of the two are not guaranteed to be consistent. +The log probably contains changes that are not yet in the database. +<p> +When an application makes a +<i>checkpoint</i>, +all committed changes in the log up to that point +are guaranteed to be present on the data disk, +too. +Checkpointing is moderately expensive during normal processing, +but limits the time spent recovering from crashes. +<p> +After an application or operating system crash, +the recovery system only needs to go back two checkpoints +to start rolling the log forward. 
+(One checkpoint is not far enough. +The recovery system cannot be sure that the most recent +checkpoint completed -- +it may have been interrupted by the crash that forced the +recovery system to run in the first place.) +Without checkpoints, +there is no way to be sure how long restarting after a crash will take. +With checkpoints, +the restart interval can be fixed by the programmer. +Recovery processing can be guaranteed to complete in a second or two. +<p> +Software crashes are much more common than disk failures. +Many developers want to guarantee that software bugs do not destroy data, +but are willing to restore from tape, +and to tolerate a day or two of lost work, +in the unlikely event of a disk crash. +With Berkeley DB, +programmers may truncate the log at checkpoints. +As long as the two most recent checkpoints are present, +the recovery system can guarantee that no committed transactions +are lost after a software crash. +In this case, +the recovery system does not require that the log and the +data be on separate devices, +although separating them can still improve performance +by spreading out writes. +<h3> +Two-phase locking +</h3> +<p> +Berkeley DB provides a service known as two-phase locking. +In order to reduce the likelihood of deadlocks and to guarantee ACID +properties, database systems manage locks in two phases. First, during +the operation of a transaction, they acquire locks, but never release +them. Second, at the end of the transaction, they release locks, but +never acquire them. In practice, most database systems, including Berkeley +DB, acquire locks on demand over the course of the transaction, then +flush the log, then release all locks. +<p> +Berkeley DB can lock entire database files, which correspond to tables, +or individual pages in them. +It does no record-level locking. +By shrinking the page size, +however, +developers can guarantee that every page holds only a small +number of records. +This reduces contention. 
+<p> +If locking is enabled, +then read and write operations on a database acquire two-phase locks, +which are held until the transaction completes. +Which objects are locked and the order of lock acquisition +depend on the workload for each transaction. +It is possible for two or more transactions to deadlock, +so that each is waiting for a lock that is held by another. +<p> +Berkeley DB detects deadlocks and automatically rolls back +one of the transactions. +This releases the locks that it held +and allows the other transactions to continue. +The caller is notified that its transaction did not complete, +and may restart it. +Developers can specify the deadlock detection interval +and the policy to use in choosing a transaction to roll back. +<p> +The two-phase locking interfaces are separately callable by applications +that link Berkeley DB, though few users have needed to use that facility +directly. +Using these interfaces, +Berkeley DB provides a fast, +platform-portable locking system for general-purpose use. +It also lets users include non-database objects in a database transaction, +by controlling access to them exactly as if they were inside the database. +<p> +The Berkeley DB two-phase locking facility is built on the fastest correct +locking primitives that are supported by the underlying architecture. +In the current implementation, this means that the locking system is +different on the various UNIX platforms, and is still more different +on Windows NT. In our experience, the most difficult aspect of performance +tuning is finding the fastest locking primitives that work correctly +on a particular architecture and then integrating the new +interface with the several that we already support. +<p> +The world would be a better place if the operating systems community +would uniformly implement POSIX locking primitives and would guarantee +that acquiring an uncontested lock was a fast operation. 
+Locks must work both among threads in a single process +and among processes. +<h2> +Concurrency +</h2> +<p> +Good performance under concurrent operation is a critical design point +for Berkeley DB. Although Berkeley DB is itself not multi-threaded, +it is thread-safe, and runs well in threaded applications. +Philosophically, +we view the use of threads and the choice of a threads package +as a policy decision, +and prefer to offer mechanism (the ability to run threaded or not), +allowing applications to choose their own policies. +<p> +The locking, logging, and buffer pool subsystems all use shared memory +or other OS-specific sharing facilities to communicate. Locks, buffer +pool fetches, and log writes behave in the same way across threads in +a single process as they do across different processes on a single +machine. +<p> +As a result, concurrent database applications may start up a new process +for every single user, may create a single server which spawns a new +thread for every client request, or may choose any policy in between. +<p> +Berkeley DB has been carefully designed to minimize contention +and maximize concurrency. +The cache manager allows all threads or processes to benefit from +I/O done by one. +Shared resources must sometimes be locked for exclusive access +by one thread of control. +We have kept critical sections small, +and are careful not to hold critical resource locks across +system calls that could deschedule the locking thread or process. +Sleepycat Software has customers with hundreds of concurrent +users working on a single database in production. +<h1> +Engineering Philosophy +</h1> +<p> +Fundamentally, Berkeley DB is a collection of access methods with +important facilities, like logging, locking, and transactional access +underlying them. In both the research and the commercial world, +the techniques for building systems like Berkeley DB have been well-known +for a long time. 
+<p> +The key advantage of Berkeley DB is the careful attention that has been +paid to engineering details throughout its life. We have carefully +designed the system so that the core facilities, like locking and I/O, +surface the right interfaces and are otherwise opaque to the caller. +As programmers, we understand the value of simplicity and have worked +hard to simplify the interfaces we surface to users of the +database system. +<p> +Berkeley DB avoids limits in the code. It places no practical limit +on the size of keys, values, or databases; they may grow to occupy +the available storage space. +<p> +The locking and logging subsystems have been carefully crafted to +reduce contention and improve throughput by shrinking or eliminating +critical sections, and reducing the sizes of locked regions and log +entries. +<p> +There is nothing in the design or implementation of Berkeley DB that +pushes the state of the art in database systems. Rather, we have been +very careful to get the engineering right. The result is a system that +is superior, as an embedded database system, to any other solution +available. +<p> +Most database systems trade off simplicity for correctness. Either the +system is easy to use, or it supports concurrent use and survives system +failures. Berkeley DB, because of its careful design and implementation, +offers both simplicity and correctness. +<p> +The system has a small footprint, +makes simple operations simple to carry out (inserting a new record takes +just a few lines of code), and behaves correctly in the face of heavy +concurrent use, system crashes, and even catastrophic failures like loss +of a hard disk. +<h1> +The Berkeley DB 2.x Distribution +</h1> +<p> +Berkeley DB is distributed in source code form from +<a href="http://www.sleepycat.com">www.sleepycat.com</a>. +Users are free to download and build the software, and to use it in +their applications. 
+<h2> +What is in the distribution +</h2> +<p> +The distribution is a compressed archive file. +It includes the source code for the Berkeley DB library, +as well as documentation, test suites, and supporting utilities. +<p> +The source code includes build support for all supported platforms. +On UNIX systems Berkeley DB uses the GNU autoconfiguration tool, +<tt>autoconf</tt>, +to identify the system and to build the library +and supporting utilities. +Berkeley DB includes specific build environments for other platforms, +such as VMS and Windows. +<h3> +Documentation +</h3> +<p> +The distributed system includes documentation in HTML format. +The documentation is in two parts: +a UNIX-style reference manual for use by programmers, +and a reference guide which is tutorial in nature. +<h3> +Test suite +</h3> +<p> +The software also includes a complete test suite, written in Tcl. +We believe that the test suite is a key advantage of Berkeley DB +over comparable systems. +<p> +First, the test suite allows users who download and build the software +to be sure that it is operating correctly. +<p> +Second, the test suite allows us, like other commercial developers +of database software, to exercise the system thoroughly at every +release. When we learn of new bugs, we add them to the test suite. +We run the test suite continually during development cycles, and +always prior to release. The result is a much more reliable system +by the time it reaches beta release. +<h2> +Binary distribution +</h2> +<p> +Sleepycat makes compiled libraries and general binary distributions available +to customers for a fee. +<h2> +Supported platforms +</h2> +<p> +Berkeley DB runs on any operating system with a +POSIX 1003.1 interface <a href="#IEEE96">[IEEE96]</a>, +which includes virtually every UNIX system. +In addition, +the software runs on VMS, +Windows/95, +Windows/98, +and Windows/NT. +Sleepycat Software no longer supports deployment on sixteen-bit +Windows systems. 
+<h1> +Berkeley DB 2.x Licensing +</h1> +<p> +Berkeley DB 2.x is distributed as an Open Source product. The software +is freely available from us at our Web site, and in other media. Users +are free to download the software and build applications with it. +<p> +The 1.x versions of Berkeley DB were covered by the UC Berkeley copyright +that covers software freely redistributable in source form. When +Sleepycat Software was formed, we needed to draft a license consistent +with the copyright governing the existing, older software. Because +of important differences between the UC Berkeley copyright and the GPL, +it was impossible for us to use the GPL. +A second copyright, with +terms contradictory to the first, simply would not have worked. +<p> +Sleepycat wanted to continue Open Source development of Berkeley DB +for several reasons. +We agree with Raymond <a href="#Raym98">[Raym98]</a> and others that Open +Source software is typically of higher quality than proprietary, +binary-only products. +Our customers benefit from a community of developers who +know and use Berkeley DB, +and can help with application design, +debugging, +and performance tuning. +Widespread distribution and use of the source code tends to +isolate bugs early, +and to get fixes back into the distributed system quickly. +As a result, +Berkeley DB is more reliable. +Just as importantly, +individual users are able to contribute new features +and performance enhancements, +to the benefit of everyone who uses Berkeley DB. +From a business perspective, +Open Source and free distribution of the +software creates share for us, and gives us a market into which +we can sell products and services. +Finally, making the source code +freely available reduces our support load, since customers can +find and fix bugs without recourse to us, in many cases. +<p> +To preserve the Open Source heritage of the older Berkeley DB code, +we drafted a new license governing the distribution of Berkeley DB +2.x. 
We adopted terms from the GPL that make it impossible to +turn our Open Source code into proprietary code owned by someone else. +<p> +Briefly, the terms governing the use and distribution of Berkeley DB +are: +<ul> +<li> +your application must be internal to your site, or +</li> +<li> +your application must be freely redistributable in source form, or +</li> +<li> +you must get a license from us. +</li> +</ul> +<p> +For customers who prefer not to distribute Open Source products, +we sell licenses to use and extend Berkeley DB at a reasonable cost. +<p> +We work hard to accommodate the needs of the Open Source community. +For example, +we have crafted special licensing arrangements with Gnome +to encourage its use and distribution of Berkeley DB. +<p> +Berkeley DB conforms to the Open Source definition <a href="#Open99">[Open99]</a>. +The license has +been carefully crafted to keep the product available as an Open Source +offering, +while providing enough of a return on our investment to fund continued +development and support of the product. The current license has +created a business capable of funding three years of development on +the software that simply would not have happened otherwise. +<h1> +Summary +</h1> +<p> +Berkeley DB offers a unique collection of features, targeted squarely +at software developers who need simple, reliable database management +services in their applications. Good design and implementation and +careful engineering throughout make the software better than many +other systems. +<p> +Berkeley DB is an Open Source product, available at +<a href="http://www.sleepycat.com">www.sleepycat.com</a> +for download. The distributed system includes everything needed to +build and deploy the software or to port it to new systems. +<p> +Sleepycat Software distributes Berkeley DB under a license agreement +that draws on both the UC Berkeley copyright and the GPL. 
The license +guarantees that Berkeley DB will remain an Open Source product and +provides Sleepycat with opportunities to make money to fund continued +development on the software. +<h1> +References +</h1> +<table border=0 cellpadding=4 cellspacing=2> +<tr> +<td valign="top"><a name="Come79">[Come79]</a></td> +<td> +<p> +Comer, D., +"The Ubiquitous B-tree," +<i>ACM Computing Surveys</i>, +Volume 11, number 2, +June 1979. +</td> +</tr> +<tr> +<td valign="top"> +<a name="Gray93">[Gray93]</a> +</td> +<td> +<p> +Gray, J., and Reuter, A., +<i>Transaction Processing: Concepts and Techniques</i>, +Morgan Kaufmann Publishers, +1993. +</td> +</tr> +<tr> +<td valign="top"> +<a name="IEEE96">[IEEE96]</a> +</td> +<td> +<p> +Institute of Electrical and Electronics Engineers, +<i>IEEE/ANSI Std 1003.1</i>, +1996 Edition. +</td> +</tr> +<tr> +<td valign="top"> +<a name="Litw80">[Litw80]</a> +</td> +<td> +<p> +Litwin, W., +"Linear Hashing: A New Tool for File and Table Addressing," +<i>Proceedings of the 6th International Conference on Very Large Databases (VLDB)</i>, +Montreal, Quebec, Canada, +October 1980. +</td> +</tr> +<tr> +<td valign="top"> +<a name="Open94">[Open94]</a> +</td> +<td> +<p> +The Open Group, +<i>Distributed TP: The XA+ Specification, Version 2</i>, +The Open Group, 1994. +</td> +</tr> +<tr> +<td valign="top"> +<a name="Open99">[Open99]</a> +</td> +<td> +<p> +Opensource.org, +"Open Source Definition," +<a href="http://www.opensource.org/osd.html"><i>www.opensource.org/osd.html</i></a>, +version 1.4, +1999. +</td> +</tr> +<tr> +<td valign="top"> +<a name="Raym98">[Raym98]</a> +</td> +<td> +<p> +Raymond, E.S., +"The Cathedral and the Bazaar," +<a href="http://www.tuxedo.org/~esr/writings/cathedral-bazaar/cathedral-bazaar.html"> +www.tuxedo.org/~esr/writings/cathedral-bazaar/cathedral-bazaar.html</a>, +January 1998. 
+</td> +</tr> +<tr> +<td valign="top"> +<a name="Selt91">[Selt91]</a> +</td> +<td> +<p> +Seltzer, M., and Yigit, O., +"A New Hashing Package for UNIX," +<i>Proceedings 1991 Winter USENIX Conference</i>, +Dallas, TX, +January 1991. +</td> +</tr> +<tr> +<td valign="top"> +<a name="Selt92">[Selt92]</a> +</td> +<td> +<p> +Seltzer, M., and Olson, M., +"LIBTP: Portable Modular Transactions for UNIX," +<i>Proceedings 1992 Winter USENIX Conference</i>, +San Francisco, CA, +January 1992. +</td> +</tr> +<tr> +<td valign="top"> +<a name="Ston82">[Ston82]</a> +</td> +<td> +<p> +Stonebraker, M., Stettner, H., Kalash, J., Guttman, A., and Lynn, N., +"Document Processing in a Relational Database System," +Memorandum No. UCB/ERL M82/32, +University of California at Berkeley, +Berkeley, CA, +May 1982. +</td> +</tr> +</table> +</body> +</html> diff --git a/db/docs/ref/refs/bdb_usenix.ps b/db/docs/ref/refs/bdb_usenix.ps new file mode 100644 index 000000000..82e678971 --- /dev/null +++ b/db/docs/ref/refs/bdb_usenix.ps @@ -0,0 +1,1441 @@ +%!PS-Adobe-3.0 +%%Creator: groff version 1.11 +%%CreationDate: Mon Apr 26 13:38:12 1999 +%%DocumentNeededResources: font Times-Bold +%%+ font Times-Roman +%%+ font Times-Italic +%%+ font Courier +%%DocumentSuppliedResources: procset grops 1.11 0 +%%Pages: 9 +%%PageOrder: Ascend +%%Orientation: Portrait +%%EndComments +%%BeginProlog +%%BeginResource: procset grops 1.11 0 +/setpacking where{ +pop +currentpacking +true setpacking +}if +/grops 120 dict dup begin +/SC 32 def +/A/show load def +/B{0 SC 3 -1 roll widthshow}bind def +/C{0 exch ashow}bind def +/D{0 exch 0 SC 5 2 roll awidthshow}bind def +/E{0 rmoveto show}bind def +/F{0 rmoveto 0 SC 3 -1 roll widthshow}bind def +/G{0 rmoveto 0 exch ashow}bind def +/H{0 rmoveto 0 exch 0 SC 5 2 roll awidthshow}bind def +/I{0 exch rmoveto show}bind def +/J{0 exch rmoveto 0 SC 3 -1 roll widthshow}bind def +/K{0 exch rmoveto 0 exch ashow}bind def +/L{0 exch rmoveto 0 exch 0 SC 5 2 roll awidthshow}bind def 
+/M{rmoveto show}bind def +/N{rmoveto 0 SC 3 -1 roll widthshow}bind def +/O{rmoveto 0 exch ashow}bind def +/P{rmoveto 0 exch 0 SC 5 2 roll awidthshow}bind def +/Q{moveto show}bind def +/R{moveto 0 SC 3 -1 roll widthshow}bind def +/S{moveto 0 exch ashow}bind def +/T{moveto 0 exch 0 SC 5 2 roll awidthshow}bind def +/SF{ +findfont exch +[exch dup 0 exch 0 exch neg 0 0]makefont +dup setfont +[exch/setfont cvx]cvx bind def +}bind def +/MF{ +findfont +[5 2 roll +0 3 1 roll +neg 0 0]makefont +dup setfont +[exch/setfont cvx]cvx bind def +}bind def +/level0 0 def +/RES 0 def +/PL 0 def +/LS 0 def +/MANUAL{ +statusdict begin/manualfeed true store end +}bind def +/PLG{ +gsave newpath clippath pathbbox grestore +exch pop add exch pop +}bind def +/BP{ +/level0 save def +1 setlinecap +1 setlinejoin +72 RES div dup scale +LS{ +90 rotate +}{ +0 PL translate +}ifelse +1 -1 scale +}bind def +/EP{ +level0 restore +showpage +}bind def +/DA{ +newpath arcn stroke +}bind def +/SN{ +transform +.25 sub exch .25 sub exch +round .25 add exch round .25 add exch +itransform +}bind def +/DL{ +SN +moveto +SN +lineto stroke +}bind def +/DC{ +newpath 0 360 arc closepath +}bind def +/TM matrix def +/DE{ +TM currentmatrix pop +translate scale newpath 0 0 .5 0 360 arc closepath +TM setmatrix +}bind def +/RC/rcurveto load def +/RL/rlineto load def +/ST/stroke load def +/MT/moveto load def +/CL/closepath load def +/FL{ +currentgray exch setgray fill setgray +}bind def +/BL/fill load def +/LW/setlinewidth load def +/RE{ +findfont +dup maxlength 1 index/FontName known not{1 add}if dict begin +{ +1 index/FID ne{def}{pop pop}ifelse +}forall +/Encoding exch def +dup/FontName exch def +currentdict end definefont pop +}bind def +/DEFS 0 def +/EBEGIN{ +moveto +DEFS begin +}bind def +/EEND/end load def +/CNT 0 def +/level1 0 def +/PBEGIN{ +/level1 save def +translate +div 3 1 roll div exch scale +neg exch neg exch translate +0 setgray +0 setlinecap +1 setlinewidth +0 setlinejoin +10 setmiterlimit +[]0 setdash 
+/setstrokeadjust where{ +pop +false setstrokeadjust +}if +/setoverprint where{ +pop +false setoverprint +}if +newpath +/CNT countdictstack def +userdict begin +/showpage{}def +}bind def +/PEND{ +clear +countdictstack CNT sub{end}repeat +level1 restore +}bind def +end def +/setpacking where{ +pop +setpacking +}if +%%EndResource +%%IncludeResource: font Times-Bold +%%IncludeResource: font Times-Roman +%%IncludeResource: font Times-Italic +%%IncludeResource: font Courier +grops begin/DEFS 1 dict def DEFS begin/u{.001 mul}bind def end/RES 72 +def/PL 792 def/LS false def/ENC0[/asciicircum/asciitilde/Scaron/Zcaron +/scaron/zcaron/Ydieresis/trademark/quotesingle/.notdef/.notdef/.notdef +/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef +/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef/.notdef +/.notdef/.notdef/space/exclam/quotedbl/numbersign/dollar/percent +/ampersand/quoteright/parenleft/parenright/asterisk/plus/comma/hyphen +/period/slash/zero/one/two/three/four/five/six/seven/eight/nine/colon +/semicolon/less/equal/greater/question/at/A/B/C/D/E/F/G/H/I/J/K/L/M/N/O +/P/Q/R/S/T/U/V/W/X/Y/Z/bracketleft/backslash/bracketright/circumflex +/underscore/quoteleft/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y +/z/braceleft/bar/braceright/tilde/.notdef/quotesinglbase/guillemotleft +/guillemotright/bullet/florin/fraction/perthousand/dagger/daggerdbl +/endash/emdash/ff/fi/fl/ffi/ffl/dotlessi/dotlessj/grave/hungarumlaut +/dotaccent/breve/caron/ring/ogonek/quotedblleft/quotedblright/oe/lslash +/quotedblbase/OE/Lslash/.notdef/exclamdown/cent/sterling/currency/yen +/brokenbar/section/dieresis/copyright/ordfeminine/guilsinglleft +/logicalnot/minus/registered/macron/degree/plusminus/twosuperior +/threesuperior/acute/mu/paragraph/periodcentered/cedilla/onesuperior +/ordmasculine/guilsinglright/onequarter/onehalf/threequarters +/questiondown/Agrave/Aacute/Acircumflex/Atilde/Adieresis/Aring/AE 
+/Ccedilla/Egrave/Eacute/Ecircumflex/Edieresis/Igrave/Iacute/Icircumflex +/Idieresis/Eth/Ntilde/Ograve/Oacute/Ocircumflex/Otilde/Odieresis +/multiply/Oslash/Ugrave/Uacute/Ucircumflex/Udieresis/Yacute/Thorn +/germandbls/agrave/aacute/acircumflex/atilde/adieresis/aring/ae/ccedilla +/egrave/eacute/ecircumflex/edieresis/igrave/iacute/icircumflex/idieresis +/eth/ntilde/ograve/oacute/ocircumflex/otilde/odieresis/divide/oslash +/ugrave/uacute/ucircumflex/udieresis/yacute/thorn/ydieresis]def +/Courier@0 ENC0/Courier RE/Times-Italic@0 ENC0/Times-Italic RE +/Times-Roman@0 ENC0/Times-Roman RE/Times-Bold@0 ENC0/Times-Bold RE +%%EndProlog +%%Page: 1 1 +%%BeginPageSetup +BP +%%EndPageSetup +/F0 14/Times-Bold@0 SF(Berk)275.358 100.8 Q(eley DB)-.14 E/F1 12 +/Times-Roman@0 SF(Michael A. Olson)270.372 129.6 Q -.3(Ke)283.182 144 S +(ith Bostic).3 E(Mar)279.15 158.4 Q(go Seltzer)-.216 E/F2 12 +/Times-Italic@0 SF(Sleepycat Softwar)255.492 174.24 Q .24 -.12(e, I) +-.444 H(nc.).12 E/F3 12/Times-Bold@0 SF(Abstract)290.874 210.24 Q/F4 10 +/Times-Roman@0 SF(Berk)79.2 226.44 Q(ele)-.1 E 2.925(yD)-.15 G 2.925(Bi) +-2.925 G 2.924(sa)-2.925 G 2.924(nO)-2.924 G .424 +(pen Source embedded database system with a number of k)-2.924 F .724 +-.15(ey a)-.1 H(dv).15 E .424(antages o)-.25 F -.15(ve)-.15 G 2.924(rc) +.15 G .424(omparable sys-)-2.924 F 3.102(tems. It)79.2 238.44 R .602(is simple to use, supports concurrent access by multiple users, and pro) +3.102 F .602(vides industrial-strength transaction)-.15 F 1.555 +(support, including survi)79.2 250.44 R 1.555 +(ving system and disk crashes.)-.25 F 1.554 +(This paper describes the design and technical features of)6.555 F(Berk) +79.2 262.44 Q(ele)-.1 E 2.5(yD)-.15 G(B, the distrib)-2.5 E +(ution, and its license.)-.2 E F3 3(1. 
Intr)79.2 286.44 R(oduction)-.216 +E F4 .691(The Berk)79.2 302.64 R(ele)-.1 E 3.191(yD)-.15 G .691 +(atabase \(Berk)-3.191 F(ele)-.1 E 3.191(yD)-.15 G .692 +(B\) is an embedded)-3.191 F .253 +(database system that can be used in applications requir)79.2 314.64 R +(-)-.2 E 1.636(ing high-performance concurrent storage and retrie)79.2 +326.64 R -.25(va)-.25 G(l).25 E 2.619(of k)79.2 338.64 R -.15(ey)-.1 G +(/v).15 E 2.619(alue pairs.)-.25 F 2.619(The softw)7.619 F 2.619 +(are is distrib)-.1 F 2.618(uted as a)-.2 F .057 +(library that can be link)79.2 350.64 R .058 +(ed directly into an application.)-.1 F(It)5.058 E(pro)79.2 362.64 Q +1.454(vides a v)-.15 F 1.453(ariety of programmatic interf)-.25 F 1.453 +(aces, includ-)-.1 F .237 +(ing callable APIs for C, C++, Perl, Tcl and Ja)79.2 374.64 R -.25(va) +-.2 G 5.237(.U).25 G(sers)-5.237 E .327(may do)79.2 386.64 R .327 +(wnload Berk)-.25 F(ele)-.1 E 2.827(yD)-.15 G 2.827(Bf)-2.827 G .326 +(rom Sleep)-2.827 F .326(ycat Softw)-.1 F(are')-.1 E(s)-.55 E -.8(We) +79.2 398.64 S 2.5(bs).8 G(ite, at)-2.5 E/F5 10/Times-Italic@0 SF(www)2.5 +E(.sleepycat.com)-.74 E F4(.)A(Sleep)79.2 414.84 Q 1.33(ycat distrib)-.1 +F 1.33(utes Berk)-.2 F(ele)-.1 E 3.83(yD)-.15 G 3.83(Ba)-3.83 G 3.83(sa) +-3.83 G 3.83(nO)-3.83 G 1.33(pen Source)-3.83 F 3.3(product. The)79.2 +426.84 R(compan)3.3 E 3.3(yc)-.15 G .8(ollects license fees for certain) +-3.3 F(uses of the softw)79.2 438.84 Q +(are and sells support and services.)-.1 E F3 3(1.1. 
History)79.2 468.84 +R F4(Berk)79.2 485.04 Q(ele)-.1 E 3.057(yD)-.15 G 3.057(Bb)-3.057 G +-2.25 -.15(eg a)-3.057 H 3.058(na).15 G 3.058(san)-3.058 G 1.058 -.25 +(ew i)-3.058 H .558(mplementation of a hash).25 F .843 +(access method to replace both)79.2 497.04 R/F6 10/Courier@0 SF(hsearch) +3.342 E F4 .842(and the v)3.342 F(ari-)-.25 E(ous)79.2 509.04 Q F6(dbm) +5.466 E F4 2.967(implementations \()5.466 F F6(dbm)A F4 2.967(from A) +5.467 F(T&T)-1.11 E(,)-.74 E F6(ndbm)5.467 E F4 1.334(from Berk)79.2 +521.04 R(ele)-.1 E 2.634 -.65(y, a)-.15 H(nd).65 E F6(gdbm)3.834 E F4 +1.334(from the GNU project\).)3.834 F(In)6.333 E .367 +(1990 Seltzer and Y)79.2 533.04 R .368 +(igit produced a package called Hash)-.55 F(to do this [Selt91].)79.2 +545.04 Q 3.106(The \214rst general release of Berk)79.2 561.24 R(ele)-.1 +E 5.606(yD)-.15 G 3.106(B, in 1991,)-5.606 F 3.038(included some interf) +79.2 573.24 R 3.039(ace changes and a ne)-.1 F 5.539(wB)-.25 G(+tree) +-5.539 E .887(access method.)79.2 585.24 R .886 +(At roughly the same time, Seltzer and)5.887 F 1.201(Olson de)79.2 +597.24 R -.15(ve)-.25 G 1.202 +(loped a prototype transaction system based).15 F 3.356(on Berk)79.2 +609.24 R(ele)-.1 E 5.856(yD)-.15 G 3.356(B, called LIBTP [Selt92], b) +-5.856 F 3.355(ut ne)-.2 F -.15(ve)-.25 G(r).15 E(released the code.) +79.2 621.24 Q .653(The 4.4BSD UNIX release included Berk)79.2 637.44 R +(ele)-.1 E 3.153(yD)-.15 G 3.153(B1)-3.153 G(.85)-3.153 E .602(in 1992.) +79.2 649.44 R .601(Seltzer and Bostic maintained the code in the)5.601 F +1.545(early 1990s in Berk)79.2 661.44 R(ele)-.1 E 4.046(ya)-.15 G 1.546 +(nd in Massachusetts.)-4.046 F(Man)6.546 E(y)-.15 E +(users adopted the code during this period.)79.2 673.44 Q .432 +(By mid-1996, users w)79.2 689.64 R .431 +(anted commercial support for the)-.1 F(softw)79.2 701.64 Q 7.033 +(are. In)-.1 F 4.533(response, Bostic and Seltzer formed)7.033 F(Sleep) +79.2 713.64 Q 10.128(ycat Softw)-.1 F 12.628(are. 
The)-.1 F(compan) +12.627 E 15.127(ye)-.15 G(nhances,)-15.127 E(distrib)323.2 286.44 Q +1.623(utes, and supports Berk)-.2 F(ele)-.1 E 4.123(yD)-.15 G 4.124(Ba) +-4.123 G 1.624(nd supporting)-4.124 F(softw)323.2 298.44 Q 2.2 +(are and documentation.)-.1 F(Sleep)7.2 E 2.2(ycat released v)-.1 F(er) +-.15 E(-)-.2 E 1.677(sion 2.1 of Berk)323.2 310.44 R(ele)-.1 E 4.177(yD) +-.15 G 4.178(Bi)-4.177 G 4.178(nm)-4.178 G 1.678(id-1997 with important) +-4.178 F(ne)323.2 322.44 Q 2.56(wf)-.25 G .06 +(eatures, including support for concurrent access to)-2.56 F 4.176 +(databases. The)323.2 334.44 R(compan)4.176 E 4.177(ym)-.15 G(ak)-4.177 +E 1.677(es about three commer)-.1 F(-)-.2 E .958(cial releases a year) +323.2 346.44 R 3.458(,a)-.4 G .957(nd most recently shipped v)-3.458 F +(ersion)-.15 E(2.8.)323.2 358.44 Q F3 3(1.2. Ov)323.2 388.44 R(er)-.12 E +(view of Berk)-.12 E(eley DB)-.12 E F4 3.094(The C interf)323.2 404.64 R +3.094(aces in Berk)-.1 F(ele)-.1 E 5.594(yD)-.15 G 5.595(Bp)-5.594 G +(ermit)-5.595 E F6(dbm)5.595 E F4(-style)A 4.586 +(record management for databases, with signi\214cant)323.2 416.64 R -.15 +(ex)323.2 428.64 S 1.273(tensions to handle duplicate data items ele).15 +F -.05(ga)-.15 G(ntly).05 E 3.773(,t)-.65 G(o)-3.773 E 2.427 +(deal with concurrent access, and to pro)323.2 440.64 R 2.427 +(vide transac-)-.15 F .71 +(tional support so that multiple changes can be simulta-)323.2 452.64 R +1.273(neously committed \(so that the)323.2 464.64 R 3.773(ya)-.15 G +1.273(re made permanent\))-3.773 F 1.848 +(or rolled back \(so that the database is restored to its)323.2 476.64 R +(state at the be)323.2 488.64 Q(ginning of the transaction\).)-.15 E +1.034(C++ and Ja)323.2 504.84 R 1.534 -.25(va i)-.2 H(nterf).25 E 1.033 +(aces pro)-.1 F 1.033(vide a small set of classes)-.15 F 1.961 +(for operating on a database.)323.2 516.84 R 1.961 +(The main class in both)6.961 F .587(cases is called)323.2 528.84 R F6 +(Db)3.086 E F4 3.086(,a)C .586(nd pro)-3.086 F .586 +(vides 
methods that encapsu-)-.15 F 1.128(late the)323.2 540.84 R F6 +(dbm)3.628 E F4 1.129(-style interf)B 1.129(aces that the C interf)-.1 F +1.129(aces pro-)-.1 F(vide.)323.2 552.84 Q 2.565(Tcl and Perl interf) +323.2 569.04 R 2.564(aces allo)-.1 F 5.064(wd)-.25 G -2.15 -.25(ev e) +-5.064 H 2.564(lopers w).25 F 2.564(orking in)-.1 F 1.716 +(those languages to use Berk)323.2 581.04 R(ele)-.1 E 4.216(yD)-.15 G +4.216(Bi)-4.216 G 4.217(nt)-4.216 G 1.717(heir applica-)-4.217 F 3.419 +(tions. Bindings)323.2 593.04 R .919 +(for both languages are included in the)3.419 F(distrib)323.2 605.04 Q +(ution.)-.2 E(De)323.2 621.24 Q -.15(ve)-.25 G 1.069 +(lopers may compile their applications and link in).15 F(Berk)323.2 +633.24 Q(ele)-.1 E 2.5(yD)-.15 G 2.5(Bs)-2.5 G(tatically or dynamically) +-2.5 E(.)-.65 E F3 3(1.3. Ho)323.2 663.24 R 3(wB)-.12 G(erk)-3 E +(eley DB is used)-.12 E F4 .655(The Berk)323.2 679.44 R(ele)-.1 E 3.155 +(yD)-.15 G 3.154(Bl)-3.155 G .654(ibrary supports concurrent access to) +-3.154 F 5.115(databases. 
It)323.2 691.44 R 2.616(can be link)5.115 F +2.616(ed into standalone applica-)-.1 F 1.487 +(tions, into a collection of cooperating applications, or)323.2 703.44 R +4.21(into serv)323.2 715.44 R 4.21 +(ers that handle requests and do database)-.15 F EP +%%Page: 2 2 +%%BeginPageSetup +BP +%%EndPageSetup +/F0 10/Times-Roman@0 SF(operations on behalf of clients.)79.2 84 Q .858 +(Compared to using a standalone database management)79.2 100.2 R .846 +(system, Berk)79.2 112.2 R(ele)-.1 E 3.346(yD)-.15 G 3.346(Bi)-3.346 G +3.346(se)-3.346 G .846(asy to understand and simple)-3.346 F 3.826 +(to use.)79.2 124.2 R 3.826(The softw)8.826 F 3.826 +(are stores and retrie)-.1 F -.15(ve)-.25 G 6.325(sr).15 G(ecords,) +-6.325 E 2.77(which consist of k)79.2 136.2 R -.15(ey)-.1 G(/v).15 E +2.77(alue pairs.)-.25 F -2.15 -.25(Ke y)7.77 H 5.27(sa).25 G 2.77 +(re used to)-5.27 F .698(locate items and can be an)79.2 148.2 R 3.198 +(yd)-.15 G .698(ata type or structure sup-)-3.198 F +(ported by the programming language.)79.2 160.2 Q .813 +(The programmer can pro)79.2 176.4 R .813(vide the functions that Berk) +-.15 F(e-)-.1 E(le)79.2 188.4 Q 3.264(yD)-.15 G 3.264(Bu)-3.264 G .763 +(ses to operate on k)-3.264 F -.15(ey)-.1 G 3.263(s. 
F).15 F .763(or e) +-.15 F .763(xample, B+trees)-.15 F 1.72 +(can use a custom comparison function, and the Hash)79.2 200.4 R .519 +(access method can use a custom hash function.)79.2 212.4 R(Berk)5.518 E +(e-)-.1 E(le)79.2 224.4 Q 5.222(yD)-.15 G 5.222(Bu)-5.222 G 2.722 +(ses def)-5.222 F 2.723(ault functions if none are supplied.)-.1 F .873 +(Otherwise, Berk)79.2 236.4 R(ele)-.1 E 3.373(yD)-.15 G 3.373(Bd)-3.373 +G .873(oes not e)-3.373 F .873(xamine or interpret)-.15 F .934(either k) +79.2 248.4 R -.15(ey)-.1 G 3.434(so).15 G 3.434(rv)-3.434 G .934 +(alues in an)-3.684 F 3.434(yw)-.15 G(ay)-3.534 E 5.934(.V)-.65 G .934 +(alues may be arbi-)-7.044 F(trarily long.)79.2 260.4 Q .69 +(It is also important to understand what Berk)79.2 276.6 R(ele)-.1 E +3.19(yD)-.15 G 3.19(Bi)-3.19 G(s)-3.19 E 4.365(not. It)79.2 288.6 R +1.865(is not a database serv)4.365 F 1.866(er that handles netw)-.15 F +(ork)-.1 E 2.797(requests. It)79.2 300.6 R .297 +(is not an SQL engine that e)2.797 F -.15(xe)-.15 G .296(cutes queries.) +.15 F 1.547(It is not a relational or object-oriented database man-)79.2 +312.6 R(agement system.)79.2 324.6 Q 1.101(It is possible to b)79.2 +340.8 R 1.101(uild an)-.2 F 3.601(yo)-.15 G 3.601(ft)-3.601 G 1.101 +(hose on top of Berk)-3.601 F(ele)-.1 E(y)-.15 E 2.116(DB, b)79.2 352.8 +R 2.116(ut the package, as distrib)-.2 F 2.117(uted, is an embedded)-.2 +F 1.444(database engine.)79.2 364.8 R 1.444 +(It has been designed to be portable,)6.444 F(small, f)79.2 376.8 Q +(ast, and reliable.)-.1 E/F1 12/Times-Bold@0 SF 3(1.4. 
A)79.2 406.8 R +(pplications that use Berk)-.3 E(eley DB)-.12 E F0(Berk)79.2 423 Q(ele) +-.1 E 4.248(yD)-.15 G 4.248(Bi)-4.248 G 4.249(se)-4.248 G 1.749 +(mbedded in a v)-4.249 F 1.749(ariety of proprietary)-.25 F 3.84 +(and Open Source softw)79.2 435 R 3.84(are packages.)-.1 F 3.84 +(This section)8.84 F(highlights a fe)79.2 447 Q 2.5(wo)-.25 G 2.5(ft) +-2.5 G(he products that use it.)-2.5 E 1.467(Directory serv)79.2 463.2 R +1.467(ers, which do data storage and retrie)-.15 F -.25(va)-.25 G(l).25 +E 2.823(using the Local Directory Access Protocol \(LD)79.2 475.2 R +(AP\),)-.4 E(pro)79.2 487.2 Q .956 +(vide naming and directory lookup service on local-)-.15 F 2.837 +(area netw)79.2 499.2 R 5.337(orks. This)-.1 F 2.837 +(service is, essentially)5.337 F 5.336(,d)-.65 G(atabase)-5.336 E .039 +(query and update, b)79.2 511.2 R .039 +(ut uses a simple protocol rather than)-.2 F 2.202(SQL or ODBC.)79.2 +523.2 R(Berk)7.201 E(ele)-.1 E 4.701(yD)-.15 G 4.701(Bi)-4.701 G 4.701 +(st)-4.701 G 2.201(he embedded data)-4.701 F 1.288 +(manager in the majority of deplo)79.2 535.2 R 1.289(yed directory serv) +-.1 F(ers)-.15 E(today)79.2 547.2 Q 4.855(,i)-.65 G 2.355(ncluding LD) +-4.855 F 2.355(AP serv)-.4 F 2.355(ers from Netscape, Mes-)-.15 F +(sageDirect \(formerly Isode\), and others.)79.2 559.2 Q(Berk)79.2 575.4 +Q(ele)-.1 E 4.385(yD)-.15 G 4.385(Bi)-4.385 G 4.385(sa)-4.385 G 1.886 +(lso embedded in a lar)-4.385 F 1.886(ge number of)-.18 F 5.302 +(mail serv)79.2 587.4 R 7.802(ers. Intermail,)-.15 F 5.302(from Softw) +7.802 F 5.302(are.com, uses)-.1 F(Berk)79.2 599.4 Q(ele)-.1 E 4.613(yD) +-.15 G 4.613(Ba)-4.613 G 4.613(sam)-4.613 G 2.114 +(essage store and as the backing)-4.613 F 3.597 +(store for its directory serv)79.2 611.4 R(er)-.15 E 8.597(.T)-.55 G +3.597(he sendmail serv)-8.597 F(er)-.15 E 1.175 +(\(including both the commercial Sendmail Pro of)79.2 623.4 R(fering) +-.25 E 3.283(from Sendmail, Inc. 
and the v)79.2 635.4 R 3.283 +(ersion distrib)-.15 F 3.282(uted by)-.2 F(sendmail.or)79.2 647.4 Q +2.304(g\) uses Berk)-.18 F(ele)-.1 E 4.804(yD)-.15 G 4.804(Bt)-4.804 G +4.804(os)-4.804 G 2.305(tore aliases and)-4.804 F 9.01 +(other information.)79.2 659.4 R(Similarly)14.01 E 11.51(,P)-.65 G 9.01 +(ost\214x \(formerly)-11.51 F 3.465(VMailer\) uses Berk)79.2 671.4 R +(ele)-.1 E 5.965(yD)-.15 G 5.965(Bt)-5.965 G 5.965(os)-5.965 G 3.465 +(tore administrati)-5.965 F -.15(ve)-.25 G(information.)79.2 683.4 Q +.134(In addition, Berk)79.2 699.6 R(ele)-.1 E 2.634(yD)-.15 G 2.633(Bi) +-2.634 G 2.633(se)-2.633 G .133(mbedded in a wide v)-2.633 F(ariety)-.25 +E 4.994(of other softw)79.2 711.6 R 4.994(are products.)-.1 F 4.994 +(Example applications)9.994 F .373 +(include managing access control lists, storing user k)323.2 84 R -.15 +(ey)-.1 G(s).15 E 2.75(in a public-k)323.2 96 R 3.05 -.15(ey i)-.1 H +2.75(nfrastructure, recording machine-to-).15 F(netw)323.2 108 Q .519 +(ork-address mappings in address serv)-.1 F .518(ers, and stor)-.15 F(-) +-.2 E .411(ing con\214guration and de)323.2 120 R .412 +(vice information in video post-)-.25 F(production softw)323.2 132 Q +(are.)-.1 E(Finally)323.2 148.2 Q 4.978(,B)-.65 G(erk)-4.978 E(ele)-.1 E +4.978(yD)-.15 G 4.978(Bi)-4.978 G 4.978(sap)-4.978 G 2.478(art of man) +-4.978 F 4.977(yo)-.15 G 2.477(ther Open)-4.977 F .005(Source softw) +323.2 160.2 R .005(are packages a)-.1 F -.25(va)-.2 G .006 +(ilable on the Internet.).25 F -.15(Fo)5.006 G(r).15 E -.15(ex)323.2 +172.2 S .604(ample, the softw).15 F .604 +(are is embedded in the Apache W)-.1 F(eb)-.8 E(serv)323.2 184.2 Q +(er and the Gnome desktop.)-.15 E F1 3(2. 
Access)323.2 214.2 R(Methods)3 +E F0 .828(In database terminology)323.2 230.4 R 3.329(,a)-.65 G 3.329 +(na)-3.329 G .829(ccess method is the disk-)-3.329 F 1.964 +(based structure used to store data and the operations)323.2 242.4 R -.2 +(av)323.2 254.4 S 6.053(ailable on that structure.)-.05 F -.15(Fo)11.053 +G 8.554(re).15 G 6.054(xample, man)-8.704 F(y)-.15 E 3.853 +(database systems support a B+tree access method.)323.2 266.4 R 1.203 +(B+trees allo)323.2 278.4 R 3.703(we)-.25 G 1.203 +(quality-based lookups \(\214nd k)-3.703 F -.15(ey)-.1 G 3.704(se).15 G +(qual)-3.704 E 4(to some constant\), range-based lookups \(\214nd k) +323.2 290.4 R -.15(ey)-.1 G(s).15 E 1.188(between tw)323.2 302.4 R 3.688 +(oc)-.1 G 1.189(onstants\) and record insertion and dele-)-3.688 F +(tion.)323.2 314.4 Q(Berk)323.2 330.6 Q(ele)-.1 E 4.729(yD)-.15 G 4.729 +(Bs)-4.729 G 2.228(upports three access methods: B+tree,)-4.729 F 1.553 +(Extended Linear Hashing \(Hash\), and Fix)323.2 342.6 R 1.553(ed- or V) +-.15 F(ari-)-1.11 E 3.639(able-length Records \(Recno\).)323.2 354.6 R +3.638(All three operate on)8.638 F 1.956(records composed of a k)323.2 +366.6 R 2.256 -.15(ey a)-.1 H 1.956(nd a data v).15 F 4.456(alue. In) +-.25 F(the)4.456 E 1.301(B+tree and Hash access methods, k)323.2 378.6 R +-.15(ey)-.1 G 3.801(sc).15 G 1.301(an ha)-3.801 F 1.601 -.15(ve a)-.2 H +(rbi-).15 E 3.595(trary structure.)323.2 390.6 R 3.596 +(In the Recno access method, each)8.595 F .266 +(record is assigned a record number)323.2 402.6 R 2.765(,w)-.4 G .265 +(hich serv)-2.765 F .265(es as the)-.15 F -.1(ke)323.2 414.6 S 4.106 +-.65(y. 
I)-.05 H 2.806(na).65 G .306(ll the access methods, the v)-2.806 +F .306(alue can ha)-.25 F .606 -.15(ve a)-.2 H(rbi-).15 E 1.417 +(trary structure.)323.2 426.6 R 1.417 +(The programmer can supply compari-)6.417 F 2.129 +(son or hashing functions for k)323.2 438.6 R -.15(ey)-.1 G 2.129 +(s, and Berk).15 F(ele)-.1 E 4.629(yD)-.15 G(B)-4.629 E +(stores and retrie)323.2 450.6 Q -.15(ve)-.25 G 2.5(sv).15 G +(alues without interpreting them.)-2.75 E 1.069 +(All of the access methods use the host \214lesystem as a)323.2 466.8 R +(backing store.)323.2 478.8 Q F1 3(2.1. Hash)323.2 508.8 R F0(Berk)323.2 +525 Q(ele)-.1 E 6.485(yD)-.15 G 6.485(Bi)-6.485 G 3.986 +(ncludes a Hash access method that)-6.485 F 9.863(implements e)323.2 537 +R 9.862(xtended linear hashing [Litw80].)-.15 F .017 +(Extended linear hashing adjusts the hash function as the)323.2 549 R +.507(hash table gro)323.2 561 R .506(ws, attempting to k)-.25 F .506 +(eep all b)-.1 F(uck)-.2 E .506(ets under)-.1 F(-)-.2 E +(full in the steady state.)323.2 573 Q 1.649 +(The Hash access method supports insertion and dele-)323.2 589.2 R .259 +(tion of records and lookup by e)323.2 601.2 R .259(xact match only)-.15 +F 5.258(.A)-.65 G(ppli-)-5.258 E .038(cations may iterate o)323.2 613.2 +R -.15(ve)-.15 G 2.538(ra).15 G .038(ll records stored in a table, b) +-2.538 F(ut)-.2 E(the order in which the)323.2 625.2 Q 2.5(ya)-.15 G +(re returned is unde\214ned.)-2.5 E F1 3(2.2. B+tr)323.2 655.2 R(ee) +-.216 E F0(Berk)323.2 671.4 Q(ele)-.1 E 7.184(yD)-.15 G 7.184(Bi)-7.184 +G 4.683(ncludes a B+tree [Come79] access)-7.184 F 2.502(method. B+trees) +323.2 683.4 R .002(store records of k)2.502 F -.15(ey)-.1 G(/v).15 E +.003(alue pairs in leaf)-.25 F .52(pages, and pairs of \(k)323.2 695.4 R +-.15(ey)-.1 G 3.02(,c)-.5 G .52(hild page address\) at internal)-3.02 F +5.384(nodes. 
K)323.2 707.4 R -.15(ey)-.25 G 5.384(si).15 G 5.384(nt) +-5.384 G 2.885(he tree are stored in sorted order)-5.384 F(,)-.4 E EP +%%Page: 3 3 +%%BeginPageSetup +BP +%%EndPageSetup +/F0 10/Times-Roman@0 SF .576 +(where the order is determined by the comparison func-)79.2 84 R .815 +(tion supplied when the database w)79.2 96 R .815(as created.)-.1 F -.15 +(Pa)5.815 G .815(ges at).15 F .389(the leaf le)79.2 108 R -.15(ve)-.25 G +2.889(lo).15 G 2.889(ft)-2.889 G .389 +(he tree include pointers to their neigh-)-2.889 F 1.444 +(bors to simplify tra)79.2 120 R -.15(ve)-.2 G 3.944(rsal. B+trees).15 F +1.445(support lookup by)3.944 F -.15(ex)79.2 132 S .068 +(act match \(equality\) or range \(greater than or equal to).15 F 2.891 +(ak)79.2 144 S -.15(ey)-2.991 G 2.891(\). Lik).15 F 2.891(eH)-.1 G .391 +(ash tables, B+trees support record inser)-2.891 F(-)-.2 E +(tion, deletion, and iteration o)79.2 156 Q -.15(ve)-.15 G 2.5(ra).15 G +(ll records in the tree.)-2.5 E .646 +(As records are inserted and pages in the B+tree \214ll up,)79.2 172.2 R +(the)79.2 184.2 Q 2.722(ya)-.15 G .223(re split, with about half the k) +-2.722 F -.15(ey)-.1 G 2.723(sg).15 G .223(oing into a ne)-2.723 F(w) +-.25 E 1.603(peer page at the same le)79.2 196.2 R -.15(ve)-.25 G 4.103 +(li).15 G 4.103(nt)-4.103 G 1.603(he tree.)-4.103 F 1.603(Most B+tree) +6.603 F .387(implementations lea)79.2 208.2 R .687 -.15(ve b)-.2 H .387 +(oth nodes half-full after a split.).15 F 2.763 +(This leads to poor performance in a common case,)79.2 220.2 R 1.522 +(where the caller inserts k)79.2 232.2 R -.15(ey)-.1 G 4.022(si).15 G +4.022(no)-4.022 G(rder)-4.022 E 6.522(.T)-.55 G 4.023(oh)-7.322 G 1.523 +(andle this)-4.023 F 1.643(case, Berk)79.2 244.2 R(ele)-.1 E 4.143(yD) +-.15 G 4.143(Bk)-4.143 G 1.642(eeps track of the insertion order)-4.243 +F(,)-.4 E 2.023(and splits pages une)79.2 256.2 R -.15(ve)-.25 G 2.024 +(nly to k).15 F 2.024(eep pages fuller)-.1 F 7.024(.T)-.55 G(his)-7.024 +E 2.3(reduces tree size, yielding 
better search performance)79.2 268.2 R +(and smaller databases.)79.2 280.2 Q 3.177 +(On deletion, empty pages are coalesced by re)79.2 296.4 R -.15(ve)-.25 +G(rse).15 E 2.03(splits into single pages.)79.2 308.4 R 2.03 +(The access method does no)7.03 F .347 +(other page balancing on insertion or deletion.)79.2 320.4 R -2.15 -.25 +(Ke y)5.348 H 2.848(sa).25 G(re)-2.848 E 1.927(not mo)79.2 332.4 R -.15 +(ve)-.15 G 4.427(da).15 G 1.927(mong pages at e)-4.427 F -.15(ve)-.25 G +1.926(ry update to k).15 F 1.926(eep the)-.1 F 2.206 +(tree well-balanced.)79.2 344.4 R 2.207(While this could impro)7.206 F +2.507 -.15(ve s)-.15 H(earch).15 E 2.341 +(times in some cases, the additional code comple)79.2 356.4 R(xity)-.15 +E(leads to slo)79.2 368.4 Q(wer updates and is prone to deadlocks.)-.25 +E -.15(Fo)79.2 384.6 S 2.948(rs).15 G(implicity)-2.948 E 2.948(,B)-.65 G +(erk)-2.948 E(ele)-.1 E 2.949(yD)-.15 G 2.949(BB)-2.949 G .449 +(+trees do no pre\214x com-)-2.949 F(pression of k)79.2 396.6 Q -.15(ey) +-.1 G 2.5(sa).15 G 2.5(ti)-2.5 G(nternal or leaf nodes.)-2.5 E/F1 12 +/Times-Bold@0 SF 3(2.3. 
Recno)79.2 426.6 R F0(Berk)79.2 442.8 Q(ele)-.1 +E 2.736(yD)-.15 G 2.736(Bi)-2.736 G .236(ncludes a \214x)-2.736 F .236 +(ed- or v)-.15 F .235(ariable-length record)-.25 F 5.075 +(access method, called)79.2 454.8 R/F2 10/Times-Italic@0 SF(Recno)7.575 +E F0 10.075(.T)C 5.075(he Recno access)-10.075 F .896 +(method assigns logical record numbers to each record,)79.2 466.8 R .978 +(and can search for and update records by record num-)79.2 478.8 R(ber) +79.2 490.8 Q 5.037(.R)-.55 G .037(ecno is able, for e)-5.037 F .037 +(xample, to load a te)-.15 F .036(xt \214le into a)-.15 F 1.514 +(database, treating each line as a record.)79.2 502.8 R 1.514 +(This permits)6.514 F -.1(fa)79.2 514.8 S 1.313 +(st searches by line number for applications lik).1 F 3.812(et)-.1 G +-.15(ex)-3.812 G(t).15 E(editors [Ston82].)79.2 526.8 Q 2.59 +(Recno is actually b)79.2 543 R 2.59(uilt on top of the B+tree access) +-.2 F 3.192(method and pro)79.2 555 R 3.191(vides a simple interf)-.15 F +3.191(ace for storing)-.1 F 3.14(sequentially-ordered data v)79.2 567 R +5.64(alues. The)-.25 F 3.14(Recno access)5.64 F 2.266 +(method generates k)79.2 579 R -.15(ey)-.1 G 4.766(si).15 G(nternally) +-4.766 E 7.266(.T)-.65 G 2.266(he programmer')-7.266 F(s)-.55 E(vie)79.2 +591 Q 4.102(wo)-.25 G 4.102(ft)-4.102 G 1.602(he v)-4.102 F 1.602 +(alues is that the)-.25 F 4.102(ya)-.15 G 1.603(re numbered sequen-) +-4.102 F .254(tially from one.)79.2 603 R(De)5.254 E -.15(ve)-.25 G .254 +(lopers can choose to ha).15 F .553 -.15(ve r)-.2 H(ecords).15 E 9 +(automatically renumbered when lo)79.2 615 R(wer)-.25 E(-numbered)-.2 E +.041(records are added or deleted.)79.2 627 R .041(In this case, ne) +5.041 F 2.541(wk)-.25 G -.15(ey)-2.641 G 2.541(sc).15 G(an)-2.541 E +(be inserted between e)79.2 639 Q(xisting k)-.15 E -.15(ey)-.1 G(s.).15 +E F1 3(3. F)79.2 669 R(eatur)-.3 E(es)-.216 E F0 1.827 +(This section describes important features of Berk)79.2 685.2 R(ele)-.1 +E(y)-.15 E 3.456(DB. 
In)79.2 697.2 R .956(general, de)3.456 F -.15(ve) +-.25 G .956(lopers can choose which features).15 F .488 +(are useful to them, and use only those that are required)79.2 709.2 R +(by their application.)323.2 84 Q -.15(Fo)323.2 100.2 S 3.529(re).15 G +1.029(xample, when an application opens a database, it)-3.679 F .101 +(can declare the de)323.2 112.2 R .101(gree of concurrenc)-.15 F 2.601 +(ya)-.15 G .102(nd reco)-2.601 F -.15(ve)-.15 G .102(ry that).15 F .049 +(it requires.)323.2 124.2 R .048 +(Simple stand-alone applications, and in par)5.049 F(-)-.2 E .491 +(ticular ports of applications that used)323.2 136.2 R/F3 10/Courier@0 +SF(dbm)2.991 E F0 .491(or one of its)2.991 F -.25(va)323.2 148.2 S 1.093 +(riants, generally do not require concurrent access or).25 F .975 +(crash reco)323.2 160.2 R -.15(ve)-.15 G(ry).15 E 5.975(.O)-.65 G .975 +(ther applications, such as enterprise-)-5.975 F 3.08 +(class database management systems that store sales)323.2 172.2 R 2.643 +(transactions or other critical data, need full transac-)323.2 184.2 R +3.93(tional service.)323.2 196.2 R 3.93(Single-user operation is f)8.93 +F 3.93(aster than)-.1 F 1.175(multi-user operation, since no o)323.2 +208.2 R -.15(ve)-.15 G 1.176(rhead is incurred by).15 F 3.156 +(locking. 
Running)323.2 220.2 R .656(with the reco)3.156 F -.15(ve)-.15 +G .655(ry system disabled is).15 F -.1(fa)323.2 232.2 S 1.732 +(ster than running with it enabled, since log records).1 F 2.703 +(need not be written when changes are made to the)323.2 244.2 R +(database.)323.2 256.2 Q .851 +(In addition, some core subsystems, including the lock-)323.2 272.4 R +.345(ing system and the logging f)323.2 284.4 R(acility)-.1 E 2.844(,c) +-.65 G .344(an be used outside)-2.844 F 1.772(the conte)323.2 296.4 R +1.772(xt of the access methods as well.)-.15 F(Although)6.773 E(fe)323.2 +308.4 Q 4.284(wu)-.25 G 1.784(sers ha)-4.284 F 2.084 -.15(ve c)-.2 H +1.784(hosen to do so, it is possible to use).15 F .939 +(only the lock manager in Berk)323.2 320.4 R(ele)-.1 E 3.439(yD)-.15 G +3.439(Bt)-3.439 G 3.439(oc)-3.439 G .939(ontrol con-)-3.439 F(currenc) +323.2 332.4 Q 4.743(yi)-.15 G 4.743(na)-4.743 G 4.743(na)-4.743 G 2.242 +(pplication, without using an)-4.743 F 4.742(yo)-.15 G 4.742(ft)-4.742 G +(he)-4.742 E .158(standard database services.)323.2 344.4 R(Alternati) +5.158 E -.15(ve)-.25 G(ly).15 E 2.658(,t)-.65 G .159(he caller can) +-2.658 F(inte)323.2 356.4 Q .07 +(grate locking of non-database resources with Berk)-.15 F(e-)-.1 E(le) +323.2 368.4 Q 5.201(yD)-.15 G(B')-5.201 E 5.201(st)-.55 G 2.702 +(ransactional tw)-5.201 F 2.702(o-phase locking system, to)-.1 F 2.892 +(impose transaction semantics on objects outside the)323.2 380.4 R +(database.)323.2 392.4 Q F1 3(3.1. Pr)323.2 422.4 R +(ogrammatic interfaces)-.216 E F0(Berk)323.2 438.6 Q(ele)-.1 E 4.008(yD) +-.15 G 4.008(Bd)-4.008 G 1.509(e\214nes a simple API for database man-) +-4.008 F 3.452(agement. 
The)323.2 450.6 R .952 +(package does not include industry-stan-)3.452 F 1.898 +(dard programmatic interf)323.2 462.6 R 1.898 +(aces such as Open Database)-.1 F(Connecti)323.2 474.6 Q .852 +(vity \(ODBC\), Object Linking and Embedding)-.25 F .817 +(for Databases \(OleDB\), or Structured Query Language)323.2 486.6 R +4.027(\(SQL\). These)323.2 498.6 R(interf)4.027 E 1.527 +(aces, while useful, were designed)-.1 F 2.477 +(to promote interoperability of database systems, and)323.2 510.6 R +(not simplicity or performance.)323.2 522.6 Q 3.192 +(In response to customer demand, Berk)323.2 538.8 R(ele)-.1 E 5.691(yD) +-.15 G 5.691(B2)-5.691 G(.5)-5.691 E .538 +(introduced support for the XA standard [Open94].)323.2 550.8 R(XA)5.539 +E .52(permits Berk)323.2 562.8 R(ele)-.1 E 3.02(yD)-.15 G 3.02(Bt)-3.02 +G 3.02(op)-3.02 G .52(articipate in distrib)-3.02 F .52(uted trans-)-.2 +F 3.373(actions under a transaction processing monitor lik)323.2 574.8 R +(e)-.1 E -.45(Tu)323.2 586.8 S -.15(xe).45 G 1.31(do from BEA Systems.) +.15 F(Lik)6.31 E 3.81(eX)-.1 G 1.31(A, other standard)-3.81 F(interf) +323.2 598.8 Q .99(aces can be b)-.1 F .99 +(uilt on top of the core system.)-.2 F(The)5.99 E .846 +(standards do not belong inside Berk)323.2 610.8 R(ele)-.1 E 3.346(yD) +-.15 G .846(B, since not)-3.346 F(all applications need them.)323.2 +622.8 Q F1 3(3.2. W)323.2 652.8 R(orking with r)-.9 E(ecords)-.216 E F0 +3.134(Ad)323.2 669 S .634 +(atabase user may need to search for particular k)-3.134 F -.15(ey)-.1 G +(s).15 E .908(in a database, or may simply w)323.2 681 R .908 +(ant to bro)-.1 F .907(wse a)-.25 F -.25(va)-.2 G(ilable).25 E 4.101 +(records. 
Berk)323.2 693 R(ele)-.1 E 4.101(yD)-.15 G 4.101(Bs)-4.101 G +1.601(upports both k)-4.101 F -.15(ey)-.1 G 1.602(ed access, to).15 F +.173(\214nd one or more records with a gi)323.2 705 R -.15(ve)-.25 G +2.673(nk).15 G -.15(ey)-2.773 G 2.673(,o)-.5 G 2.673(rs)-2.673 G +(equential)-2.673 E .53(access, to retrie)323.2 717 R .83 -.15(ve a)-.25 +H .53(ll the records in the database one at).15 F EP +%%Page: 4 4 +%%BeginPageSetup +BP +%%EndPageSetup +/F0 10/Times-Roman@0 SF 6.34(at)79.2 84 S 6.34(ime. The)-6.34 F 3.84 +(order of the records returned during)6.34 F .208 +(sequential scans depends on the access method.)79.2 96 R(B+tree)5.209 E +1.495(and Recno databases return records in sort order)79.2 108 R 3.995 +(,a)-.4 G(nd)-3.995 E .023 +(Hash databases return them in apparently random order)79.2 120 R(.)-.55 +E(Similarly)79.2 136.2 Q 4.959(,B)-.65 G(erk)-4.959 E(ele)-.1 E 4.959 +(yD)-.15 G 4.958(Bd)-4.959 G 2.458(e\214nes simple interf)-4.958 F 2.458 +(aces for)-.1 F +(inserting, updating, and deleting records in a database.)79.2 148.2 Q +/F1 12/Times-Bold@0 SF 3(3.3. Long)79.2 178.2 R -.12(ke)3 G(ys and v).12 +E(alues)-.12 E F0(Berk)79.2 194.4 Q(ele)-.1 E 3.553(yD)-.15 G 3.553(Bm) +-3.553 G 1.053(anages k)-3.553 F -.15(ey)-.1 G 3.553(sa).15 G 1.053 +(nd v)-3.553 F 1.053(alues as lar)-.25 F 1.054(ge as 2)-.18 F/F2 8 +/Times-Roman@0 SF(32)-5 I F0 3.192(bytes. 
Since)79.2 206.4 R .692 +(the time required to cop)3.192 F 3.192(yar)-.1 G .692(ecord is pro-) +-3.192 F 1.895(portional to its size, Berk)79.2 218.4 R(ele)-.1 E 4.396 +(yD)-.15 G 4.396(Bi)-4.396 G 1.896(ncludes interf)-4.396 F(aces)-.1 E +4.507(that operate on partial records.)79.2 230.4 R 4.507 +(If an application)9.507 F 1.273(requires only part of a lar)79.2 242.4 +R 1.274(ge record, it requests partial)-.18 F .026(record retrie)79.2 +254.4 R -.25(va)-.25 G .026(l, and recei).25 F -.15(ve)-.25 G 2.526(sj) +.15 G .025(ust the bytes that it needs.)-2.526 F(The smaller cop)79.2 +266.4 Q 2.5(ys)-.1 G -2.25 -.2(av e)-2.5 H 2.5(sb).2 G +(oth time and memory)-2.5 E(.)-.65 E(Berk)79.2 282.6 Q(ele)-.1 E 3.206 +(yD)-.15 G 3.206(Ba)-3.206 G(llo)-3.206 E .706 +(ws the programmer to de\214ne the data)-.25 F 2.72(types of k)79.2 +294.6 R -.15(ey)-.1 G 5.22(sa).15 G 2.72(nd v)-5.22 F 5.22(alues. De) +-.25 F -.15(ve)-.25 G 2.72(lopers use an).15 F 5.22(yt)-.15 G(ype)-5.22 +E -.15(ex)79.2 306.6 S(pressible in the programming language.).15 E F1 3 +(3.4. Lar)79.2 336.6 R(ge databases)-.12 E F0 3.255(As)79.2 352.8 S .755 +(ingle database managed by Berk)-3.255 F(ele)-.1 E 3.256(yD)-.15 G 3.256 +(Bc)-3.256 G .756(an be up)-3.256 F 1.716(to 2)79.2 364.8 R F2(48)-5 I +F0 1.716(bytes, or 256 terabytes, in size.)4.216 5 N(Berk)6.715 E(ele) +-.1 E 4.215(yD)-.15 G(B)-4.215 E 2.144 +(uses the host \214lesystem as the backing store for the)79.2 376.8 R +2.668(database, so lar)79.2 388.8 R 2.667 +(ge databases require big \214le support)-.18 F 3.113 +(from the operating system.)79.2 400.8 R(Sleep)8.113 E 3.114(ycat Softw) +-.1 F 3.114(are has)-.1 F 5.712(customers using Berk)79.2 412.8 R(ele) +-.1 E 8.212(yD)-.15 G 8.212(Bt)-8.212 G 8.211(om)-8.212 G 5.711 +(anage single)-8.211 F(databases in e)79.2 424.8 Q(xcess of 100 gig)-.15 +E(abytes.)-.05 E F1 3(3.5. 
Main)79.2 454.8 R(memory databases)3 E F0 +1.171(Applications that do not require persistent storage can)79.2 471 R +.119(create databases that e)79.2 483 R .119(xist only in main memory) +-.15 F 5.118(.T)-.65 G(hese)-5.118 E .542(databases bypass the o)79.2 +495 R -.15(ve)-.15 G .543(rhead imposed by the I/O sys-).15 F +(tem altogether)79.2 507 Q(.)-.55 E 2.144 +(Some applications do need to use disk as a backing)79.2 523.2 R 2.248 +(store, b)79.2 535.2 R 2.249(ut run on machines with v)-.2 F 2.249 +(ery lar)-.15 F 2.249(ge memory)-.18 F(.)-.65 E(Berk)79.2 547.2 Q(ele) +-.1 E 2.799(yD)-.15 G 2.799(Bi)-2.799 G 2.799(sa)-2.799 G .299 +(ble to manage v)-2.799 F .299(ery lar)-.15 F .299(ge shared mem-)-.18 F +.128(ory re)79.2 559.2 R .129 +(gions for cached data pages, log records, and lock)-.15 F 3.938 +(management. F)79.2 571.2 R 1.437(or e)-.15 F 1.437 +(xample, the cache re)-.15 F 1.437(gion used for)-.15 F .033 +(data pages may be gig)79.2 583.2 R .034 +(abytes in size, reducing the lik)-.05 F(eli-)-.1 E .639(hood that an) +79.2 595.2 R 3.139(yr)-.15 G .639 +(ead operation will need to visit the disk)-3.139 F 1.201 +(in the steady state.)79.2 607.2 R 1.201 +(The programmer declares the size)6.201 F(of the cache re)79.2 619.2 Q +(gion at startup.)-.15 E(Finally)79.2 635.4 Q 7.048(,m)-.65 G(an)-7.048 +E 7.048(yo)-.15 G 4.548(perating systems pro)-7.048 F 4.548 +(vide memory-)-.15 F 2.532(mapped \214le services that are much f)79.2 +647.4 R 2.533(aster than their)-.1 F 2.602 +(general-purpose \214le system interf)79.2 659.4 R 5.102(aces. Berk)-.1 +F(ele)-.1 E 5.102(yD)-.15 G(B)-5.102 E 5.118 +(can memory-map its database \214les for read-only)79.2 671.4 R 3.917 +(database use.)79.2 683.4 R 3.917(The application operates on records) +8.917 F 2.069(stored directly on the pages, with no cache manage-)79.2 +695.4 R 1.557(ment o)79.2 707.4 R -.15(ve)-.15 G 4.057(rhead. 
Because) +.15 F 1.556(the application gets pointers)4.057 F 1.265 +(directly into the Berk)323.2 84 R(ele)-.1 E 3.765(yD)-.15 G 3.765(Bp) +-3.765 G 1.265(ages, writes cannot be)-3.765 F 3.775 +(permitted. Otherwise,)323.2 96 R 1.275(changes could bypass the lock-) +3.775 F .23(ing and logging systems, and softw)323.2 108 R .23 +(are errors could cor)-.1 F(-)-.2 E 4.007(rupt the database.)323.2 120 R +4.006(Read-only applications can use)9.007 F(Berk)323.2 132 Q(ele)-.1 E +2.893(yD)-.15 G(B')-2.893 E 2.893(sm)-.55 G .393 +(emory-mapped \214le service to impro)-2.893 F -.15(ve)-.15 G +(performance on most architectures.)323.2 144 Q F1 3 +(3.6. Con\214gurable)323.2 174 R(page size)3 E F0 .111 +(Programmers declare the size of the pages used by their)323.2 190.2 R +.403(access methods when the)323.2 202.2 R 2.903(yc)-.15 G .403 +(reate a database.)-2.903 F(Although)5.403 E(Berk)323.2 214.2 Q(ele)-.1 +E 4.046(yD)-.15 G 4.046(Bp)-4.046 G(ro)-4.046 E 1.546 +(vides reasonable def)-.15 F 1.546(aults, de)-.1 F -.15(ve)-.25 G +(lopers).15 E 3.64(may o)323.2 226.2 R -.15(ve)-.15 G 3.64 +(rride them to control system performance.).15 F .793 +(Small pages reduce the number of records that \214t on a)323.2 238.2 R +.353(single page.)323.2 250.2 R(Fe)5.353 E .353 +(wer records on a page means that fe)-.25 F(wer)-.25 E .724 +(records are lock)323.2 262.2 R .724(ed when the page is lock)-.1 F .723 +(ed, impro)-.1 F(ving)-.15 E(concurrenc)323.2 274.2 Q 5.262 -.65(y. T) +-.15 H 1.462(he per).65 F 1.462(-page o)-.2 F -.15(ve)-.15 G 1.462 +(rhead is proportionally).15 F 2.29 +(higher with smaller pages, of course, b)323.2 286.2 R 2.29(ut de)-.2 F +-.15(ve)-.25 G(lopers).15 E(can trade of)323.2 298.2 Q 2.5(fs)-.25 G +(pace for time as an application requires.)-2.5 E F1 3(3.7. Small)323.2 +328.2 R -.3(fo)3 G(otprint).3 E F0(Berk)323.2 344.4 Q(ele)-.1 E 3.973 +(yD)-.15 G 3.973(Bi)-3.973 G 3.974(sac)-3.973 G 1.474(ompact system.) 
+-3.974 F 1.474(The full package,)6.474 F .832 +(including all access methods, reco)323.2 356.4 R -.15(ve)-.15 G +(rability).15 E 3.331(,a)-.65 G .831(nd trans-)-3.331 F 1.235 +(action support is roughly 175K of te)323.2 368.4 R 1.236 +(xt space on com-)-.15 F(mon architectures.)323.2 380.4 Q F1 3 +(3.8. Cursors)323.2 410.4 R F0 1.57(In database terminology)323.2 426.6 +R 4.07(,ac)-.65 G 1.57(ursor is a pointer into an)-4.07 F 1.806 +(access method that can be called iterati)323.2 438.6 R -.15(ve)-.25 G +1.807(ly to return).15 F 3.68(records in sequence.)323.2 450.6 R(Berk) +8.68 E(ele)-.1 E 6.18(yD)-.15 G 6.18(Bi)-6.18 G 3.68(ncludes cursor) +-6.18 F(interf)323.2 462.6 Q 2.814(aces for all access methods.)-.1 F +2.815(This permits, for)7.814 F -.15(ex)323.2 474.6 S .34 +(ample, users to tra).15 F -.15(ve)-.2 G .34(rse a B+tree and vie).15 F +2.84(wr)-.25 G .34(ecords in)-2.84 F(order)323.2 486.6 Q 6.233(.P)-.55 G +1.234(ointers to records in cursors are persistent, so)-6.233 F 1.779 +(that once fetched, a record may be updated in place.)323.2 498.6 R +(Finally)323.2 510.6 Q 4.438(,c)-.65 G 1.939 +(ursors support access to chains of duplicate)-4.438 F +(data items in the v)323.2 522.6 Q(arious access methods.)-.25 E F1 3 +(3.9. 
J)323.2 552.6 R(oins)-.18 E F0 2.703(In database terminology)323.2 +568.8 R 5.203(,aj)-.65 G 2.702(oin is an operation that)-5.203 F .616 +(spans multiple separate tables \(or in the case of Berk)323.2 580.8 R +(e-)-.1 E(le)323.2 592.8 Q 4.518(yD)-.15 G 2.018 +(B, multiple separate DB \214les\).)-4.518 F -.15(Fo)7.017 G 4.517(re) +.15 G 2.017(xample, a)-4.667 F(compan)323.2 604.8 Q 3.372(ym)-.15 G .873 +(ay store information about its customers in)-3.372 F 1.545 +(one table and information about sales in another)323.2 616.8 R 6.545 +(.A)-.55 G(n)-6.545 E 1.498(application will lik)323.2 628.8 R 1.499 +(ely w)-.1 F 1.499(ant to look up sales informa-)-.1 F .933 +(tion by customer name; this requires matching records)323.2 640.8 R +2.28(in the tw)323.2 652.8 R 4.78(ot)-.1 G 2.28 +(ables that share a common customer ID)-4.78 F 2.515(\214eld. This)323.2 +664.8 R .015(combining of records from multiple tables is)2.515 F +(called a join.)323.2 676.8 Q(Berk)323.2 693 Q(ele)-.1 E 5.561(yD)-.15 G +5.561(Bi)-5.561 G 3.061(ncludes interf)-5.561 F 3.062 +(aces for joining tw)-.1 F 5.562(oo)-.1 G(r)-5.562 E(more tables.)323.2 +705 Q EP +%%Page: 5 5 +%%BeginPageSetup +BP +%%EndPageSetup +/F0 12/Times-Bold@0 SF 3(3.10. 
T)79.2 84 R(ransactions)-.888 E/F1 10 +/Times-Roman@0 SF -.35(Tr)79.2 100.2 S(ansactions ha).35 E .3 -.15(ve f) +-.2 H(our properties [Gray93]:).15 E/F2 8/Times-Roman@0 SF<83>84.2 116.4 +Q F1(The)17.2 E 5.489(ya)-.15 G 2.989(re atomic.)-5.489 F 2.989 +(That is, all of the changes)7.989 F 1.475 +(made in a single transaction must be applied at)104.2 128.4 R 1.31 +(the same instant or not at all.)104.2 140.4 R 1.31(This permits, for) +6.31 F -.15(ex)104.2 152.4 S 3.565(ample, the transfer of mone).15 F +6.065(yb)-.15 G 3.565(etween tw)-6.065 F(o)-.1 E 3.68 +(accounts to be accomplished, by making the)104.2 164.4 R 1.27 +(reduction of the balance in one account and the)104.2 176.4 R +(increase in the other into a single, atomic action.)104.2 188.4 Q F2 +<83>84.2 204.6 Q F1(The)17.2 E 3.125(ym)-.15 G .625(ust be consistent.) +-3.125 F .625(That is, changes to the)5.625 F 3.628(database by an)104.2 +216.6 R 6.128(yt)-.15 G 3.628(ransaction cannot lea)-6.128 F 3.929 -.15 +(ve t)-.2 H(he).15 E(database in an ille)104.2 228.6 Q -.05(ga)-.15 G +2.5(lo).05 G 2.5(rc)-2.5 G(orrupt state.)-2.5 E F2<83>84.2 244.8 Q F1 +(The)17.2 E 3.006(ym)-.15 G .506(ust be isolatable.)-3.006 F(Re)5.506 E +-.05(ga)-.15 G .505(rdless of the num-).05 F .8(ber of users w)104.2 +256.8 R .8(orking in the database at the same)-.1 F 1.88(time, e)104.2 +268.8 R -.15(ve)-.25 G 1.88(ry user must ha).15 F 2.18 -.15(ve t)-.2 H +1.88(he illusion that no).15 F(other acti)104.2 280.8 Q +(vity is going on.)-.25 E F2<83>84.2 297 Q F1(The)17.2 E 5.54(ym)-.15 G +3.04(ust be durable.)-5.54 F(Ev)8.04 E 3.04(en if the disk that)-.15 F +.877(stores the database is lost, it must be possible to)104.2 309 R +(reco)104.2 321 Q -.15(ve)-.15 G 2.668(rt).15 G .168 +(he database to its last transaction-consis-)-2.668 F(tent state.)104.2 +333 Q 2.49(This combination of properties \212 atomicity)79.2 349.2 R +4.99(,c)-.65 G(onsis-)-4.99 E(tenc)79.2 361.2 Q 4.542 -.65(y, i)-.15 H +3.243(solation, and durability \212 is referred 
to as).65 F -.4(AC)79.2 +373.2 S 3.459(IDity in the literature.).4 F(Berk)8.459 E(ele)-.1 E 5.958 +(yD)-.15 G 3.458(B, lik)-5.958 F 5.958(em)-.1 G(ost)-5.958 E .993 +(database systems, pro)79.2 385.2 R .993(vides A)-.15 F .994 +(CIDity using a collection)-.4 F(of core services.)79.2 397.2 Q .257 +(Programmers can choose to use Berk)79.2 413.4 R(ele)-.1 E 2.757(yD)-.15 +G(B')-2.757 E 2.757(st)-.55 G(ransac-)-2.757 E +(tion services for applications that need them.)79.2 425.4 Q F0 3 +(3.10.1. Write-ahead)79.2 455.4 R(logging)3 E F1 .479 +(Programmers can enable the logging system when the)79.2 471.6 R(y)-.15 +E .918(start up Berk)79.2 483.6 R(ele)-.1 E 3.418(yD)-.15 G 3.418 +(B. During)-3.418 F 3.417(at)3.417 G .917(ransaction, the appli-)-3.417 +F .493(cation mak)79.2 495.6 R .493 +(es a series of changes to the database.)-.1 F(Each)5.494 E .552 +(change is captured in a log entry)79.2 507.6 R 3.052(,w)-.65 G .552 +(hich holds the state)-3.052 F .207 +(of the database record both before and after the change.)79.2 519.6 R +2.208(The log record is guaranteed to be \215ushed to stable)79.2 531.6 +R .871(storage before an)79.2 543.6 R 3.371(yo)-.15 G 3.371(ft)-3.371 G +.871(he changed data pages are writ-)-3.371 F 3.989(ten. 
This)79.2 555.6 +R(beha)3.989 E 1.489(vior \212 writing the log before the data)-.2 F +(pages \212 is called)79.2 567.6 Q/F3 10/Times-Italic@0 SF +(write-ahead lo)2.5 E -.1(gg)-.1 G(ing).1 E F1(.)A .835(At an)79.2 583.8 +R 3.335(yt)-.15 G .835(ime during the transaction, the application can) +-3.335 F F3(commit)79.2 595.8 Q F1 4.202(,m)C 1.702 +(aking the changes permanent, or)-4.202 F F3 -.45(ro)4.201 G 1.701 +(ll bac).45 F(k)-.2 E F1(,)A .852 +(cancelling all changes and restoring the database to its)79.2 607.8 R +1.57(pre-transaction state.)79.2 619.8 R 1.57 +(If the application rolls back the)6.57 F 1.003 +(transaction, then the log holds the state of all changed)79.2 631.8 R +.5(pages prior to the transaction, and Berk)79.2 643.8 R(ele)-.1 E 3(yD) +-.15 G 3(Bs)-3 G(imply)-3 E .226(restores that state.)79.2 655.8 R .226 +(If the application commits the trans-)5.226 F .538(action, Berk)79.2 +667.8 R(ele)-.1 E 3.038(yD)-.15 G 3.038(Bw)-3.038 G .538 +(rites the log records to disk.)-3.038 F(In-)5.537 E 2.312 +(memory copies of the data pages already re\215ect the)79.2 679.8 R +1.399(changes, and will be \215ushed as necessary during nor)79.2 691.8 +R(-)-.2 E 2.35(mal processing.)79.2 703.8 R 2.35 +(Since log writes are sequential, b)7.35 F(ut)-.2 E 8.732 +(data page writes are random, this impro)79.2 715.8 R -.15(ve)-.15 G(s) +.15 E(performance.)323.2 84 Q F0 3(3.10.2. 
Crashes)323.2 114 R(and r)3 E +(eco)-.216 E -.12(ve)-.12 G(ry).12 E F1(Berk)323.2 130.2 Q(ele)-.1 E +3.592(yD)-.15 G(B')-3.592 E 3.592(sw)-.55 G 1.093 +(rite-ahead log is used by the transac-)-3.592 F .415 +(tion system to commit or roll back transactions.)323.2 142.2 R .414 +(It also)5.414 F(gi)323.2 154.2 Q -.15(ve)-.25 G 3.23(st).15 G .73 +(he reco)-3.23 F -.15(ve)-.15 G .73 +(ry system the information that it needs).15 F .824(to protect ag)323.2 +166.2 R .824(ainst data loss or corruption from crashes.)-.05 F(Berk) +323.2 178.2 Q(ele)-.1 E 2.703(yD)-.15 G 2.703(Bi)-2.703 G 2.704(sa) +-2.703 G .204(ble to survi)-2.704 F .504 -.15(ve a)-.25 H .204 +(pplication crashes, sys-).15 F .408(tem crashes, and e)323.2 190.2 R +-.15(ve)-.25 G 2.908(nc).15 G .407(atastrophic f)-2.908 F .407 +(ailures lik)-.1 F 2.907(et)-.1 G .407(he loss)-2.907 F +(of a hard disk, without losing an)323.2 202.2 Q 2.5(yd)-.15 G(ata.)-2.5 +E(Survi)323.2 218.4 Q .538(ving crashes requires data stored in se)-.25 +F -.15(ve)-.25 G .539(ral dif).15 F(fer)-.25 E(-)-.2 E 2.52(ent places.) +323.2 230.4 R 2.52(During normal processing, Berk)7.52 F(ele)-.1 E 5.02 +(yD)-.15 G(B)-5.02 E .766(has copies of acti)323.2 242.4 R 1.066 -.15 +(ve l)-.25 H .766(og records and recently-used data).15 F 1.539 +(pages in memory)323.2 254.4 R 6.539(.L)-.65 G 1.539 +(og records are \215ushed to the log)-6.539 F .694 +(disk when transactions commit.)323.2 266.4 R .695 +(Data pages trickle out)5.694 F .008(to the data disk as pages mo)323.2 +278.4 R .308 -.15(ve t)-.15 H .008(hrough the b).15 F(uf)-.2 E .008 +(fer cache.)-.25 F(Periodically)323.2 290.4 Q 2.691(,t)-.65 G .191 +(he system administrator backs up the data)-2.691 F .278 +(disk, creating a safe cop)323.2 302.4 R 2.778(yo)-.1 G 2.778(ft)-2.778 +G .278(he database at a particular)-2.778 F 2.609(instant. When)323.2 +314.4 R .109(the database is back)2.609 F .109(ed up, the log can be)-.1 +F 3.838(truncated. 
F)323.2 326.4 R 1.337(or maximum rob)-.15 F 1.337 +(ustness, the log disk and)-.2 F(data disk should be separate de)323.2 +338.4 Q(vices.)-.25 E(Dif)323.2 354.6 Q 1.29(ferent system f)-.25 F 1.29 +(ailures can destro)-.1 F 3.79(ym)-.1 G(emory)-3.79 E 3.79(,t)-.65 G +1.29(he log)-3.79 F 1.106(disk, or the data disk.)323.2 366.6 R(Berk) +6.106 E(ele)-.1 E 3.606(yD)-.15 G 3.606(Bi)-3.606 G 3.606(sa)-3.606 G +1.106(ble to survi)-3.606 F -.15(ve)-.25 G .679(the loss of an)323.2 +378.6 R 3.179(yo)-.15 G .679(ne of these repositories without losing) +-3.179 F(an)323.2 390.6 Q 2.5(yc)-.15 G(ommitted transactions.)-2.5 E +1.372(If the computer')323.2 406.8 R 3.871(sm)-.55 G 1.371 +(emory is lost, through an applica-)-3.871 F 1.619 +(tion or operating system crash, then the log holds all)323.2 418.8 R +1.789(committed transactions.)323.2 430.8 R 1.788(On restart, the reco) +6.789 F -.15(ve)-.15 G 1.788(ry sys-).15 F .49(tem rolls the log forw) +323.2 442.8 R .49(ard ag)-.1 F .49(ainst the database, reapply-)-.05 F +.682(ing an)323.2 454.8 R 3.181(yc)-.15 G .681 +(hanges to on-disk pages that were in memory)-3.181 F .14 +(at the time of the crash.)323.2 466.8 R .14 +(Since the log contains pre- and)5.14 F .957 +(post-change state for transactions, the reco)323.2 478.8 R -.15(ve)-.15 +G .956(ry system).15 F 1.14(also uses the log to restore an)323.2 490.8 +R 3.64(yp)-.15 G 1.14(ages to their original)-3.64 F 1.615(state if the) +323.2 502.8 R 4.115(yw)-.15 G 1.615 +(ere modi\214ed by transactions that ne)-4.115 F -.15(ve)-.25 G(r).15 E +(committed.)323.2 514.8 Q 2.051 +(If the data disk is lost, the system administrator can)323.2 531 R .887 +(restore the most recent cop)323.2 543 R 3.386(yf)-.1 G .886 +(rom backup.)-3.386 F .886(The reco)5.886 F(v-)-.15 E 1.298 +(ery system will roll the entire log forw)323.2 555 R 1.298(ard ag)-.1 F +1.298(ainst the)-.05 F 2.64 +(original database, reapplying all committed changes.)323.2 567 R 4.363 +(When it \214nishes, the database will 
contain e)323.2 579 R -.15(ve) +-.25 G(ry).15 E .535(change made by e)323.2 591 R -.15(ve)-.25 G .534 +(ry transaction that e).15 F -.15(ve)-.25 G 3.034(rc).15 G(ommitted.) +-3.034 E .494(If the log disk is lost, then the reco)323.2 607.2 R -.15 +(ve)-.15 G .495(ry system can use).15 F 1.853 +(the in-memory copies of log entries to roll back an)323.2 619.2 R(y) +-.15 E .026(uncommitted transactions, \215ush all in-memory database) +323.2 631.2 R 1.659(pages to the data disk, and shut do)323.2 643.2 R +1.659(wn gracefully)-.25 F 6.658(.A)-.65 G(t)-6.658 E 2.204 +(that point, the system administrator can back up the)323.2 655.2 R .039 +(database disk, install a ne)323.2 667.2 R 2.539(wl)-.25 G .039 +(og disk, and restart the sys-)-2.539 F(tem.)323.2 679.2 Q EP +%%Page: 6 6 +%%BeginPageSetup +BP +%%EndPageSetup +/F0 12/Times-Bold@0 SF 3(3.10.3. Checkpoints)79.2 84 R/F1 10 +/Times-Roman@0 SF(Berk)79.2 100.2 Q(ele)-.1 E 6.085(yD)-.15 G 6.085(Bi) +-6.085 G 3.585(ncludes a checkpointing service that)-6.085 F .263 +(interacts with the reco)79.2 112.2 R -.15(ve)-.15 G .263(ry system.).15 +F .263(During normal pro-)5.263 F 2.415 +(cessing, both the log and the database are changing)79.2 124.2 R +(continually)79.2 136.2 Q 5.925(.A)-.65 G 3.425(ta)-5.925 G 1.224 -.15 +(ny g)-3.425 H -2.15 -.25(iv e).15 H 3.424(ni).25 G .924 +(nstant, the on-disk v)-3.424 F(ersions)-.15 E .414(of the tw)79.2 148.2 +R 2.914(oa)-.1 G .414(re not guaranteed to be consistent.)-2.914 F .414 +(The log)5.414 F 3.838 +(probably contains changes that are not yet in the)79.2 160.2 R +(database.)79.2 172.2 Q .085(When an application mak)79.2 188.4 R .086 +(es a)-.1 F/F2 10/Times-Italic@0 SF -.15(ch)2.586 G(ec).15 E(kpoint)-.2 +E F1 2.586(,a)C .086(ll committed)-2.586 F .443 +(changes in the log up to that point are guaranteed to be)79.2 200.4 R +.631(present on the data disk, too.)79.2 212.4 R .632 +(Checkpointing is moder)5.631 F(-)-.2 E .046(ately e)79.2 224.4 R +(xpensi)-.15 E .346 -.15(ve d)-.25 H 
.046(uring normal processing, b).15 +F .045(ut limits the)-.2 F(time spent reco)79.2 236.4 Q -.15(ve)-.15 G +(ring from crashes.).15 E 3.117 +(After an application or operating system crash, the)79.2 252.6 R(reco) +79.2 264.6 Q -.15(ve)-.15 G 7.419(ry system only needs to go back tw).15 +F(o)-.1 E(checkpoints)79.2 278.6 Q/F3 7/Times-Roman@0 SF(1)-4 I F1 1.376 +(to start rolling the log forw)3.876 4 N 3.875(ard. W)-.1 F(ithout)-.4 E +3.264(checkpoints, there is no w)79.2 290.6 R 3.265(ay to be sure ho)-.1 +F 5.765(wl)-.25 G(ong)-5.765 E .395(restarting after a crash will tak) +79.2 302.6 R 2.895(e. W)-.1 F .395(ith checkpoints, the)-.4 F .088 +(restart interv)79.2 314.6 R .089(al can be \214x)-.25 F .089 +(ed by the programmer)-.15 F 5.089(.R)-.55 G(eco)-5.089 E(v-)-.15 E .668 +(ery processing can be guaranteed to complete in a sec-)79.2 326.6 R +(ond or tw)79.2 338.6 Q(o.)-.1 E(Softw)79.2 354.8 Q 2.457 +(are crashes are much more common than disk)-.1 F -.1(fa)79.2 366.8 S +3.385(ilures. 
Man).1 F 3.385(yd)-.15 G -2.15 -.25(ev e)-3.385 H .884 +(lopers w).25 F .884(ant to guarantee that soft-)-.1 F -.1(wa)79.2 378.8 +S .158(re b).1 F .158(ugs do not destro)-.2 F 2.658(yd)-.1 G .158 +(ata, b)-2.658 F .158(ut are willing to restore)-.2 F .631 +(from tape, and to tolerate a day or tw)79.2 390.8 R 3.131(oo)-.1 G +3.131(fl)-3.131 G .63(ost w)-3.131 F .63(ork, in)-.1 F .89(the unlikle) +79.2 402.8 R 3.39(ye)-.15 G -.15(ve)-3.64 G .89(nt of a disk crash.).15 +F -.4(Wi)5.89 G .89(th Berk).4 F(ele)-.1 E 3.39(yD)-.15 G(B,)-3.39 E +1.093(programmers may truncate the log at checkpoints.)79.2 414.8 R(As) +6.092 E .09(long as the tw)79.2 426.8 R 2.59(om)-.1 G .09 +(ost recent checkpoints are present, the)-2.59 F(reco)79.2 438.8 Q -.15 +(ve)-.15 G .106(ry system can guarantee that no committed trans-).15 F +.611(actions are lost after a softw)79.2 450.8 R .611(are crash.)-.1 F +.611(In this case, the)5.611 F(reco)79.2 462.8 Q -.15(ve)-.15 G 1.439 +(ry system does not require that the log and the).15 F 1.328 +(data be on separate de)79.2 474.8 R 1.329 +(vices, although separating them)-.25 F(can still impro)79.2 486.8 Q .3 +-.15(ve p)-.15 H(erformance by spreading out writes.).15 E F0 3 +(3.10.4. T)79.2 516.8 R -.12(wo)-.888 G(-phase locking).12 E F1(Berk) +79.2 533 Q(ele)-.1 E 4.416(yD)-.15 G 4.416(Bp)-4.416 G(ro)-4.416 E 1.916 +(vides a service kno)-.15 F 1.915(wn as tw)-.25 F(o-phase)-.1 E 3.017 +(locking. In)79.2 545 R .517(order to reduce the lik)3.017 F .518 +(elihood of deadlocks)-.1 F 2.547(and to guarantee A)79.2 557 R 2.546 +(CID properties, database systems)-.4 F .063(manage locks in tw)79.2 569 +R 2.564(op)-.1 G 2.564(hases. First,)-2.564 F .064(during the operation) +2.564 F 1.574(of a transaction, the)79.2 581 R 4.074(ya)-.15 G 1.574 +(cquire locks, b)-4.074 F 1.573(ut ne)-.2 F -.15(ve)-.25 G 4.073(rr).15 +G(elease)-4.073 E 6.147(them. 
Second,)79.2 593 R 3.648 +(at the end of the transaction, the)6.147 F(y)-.15 E .235 +(release locks, b)79.2 605 R .235(ut ne)-.2 F -.15(ve)-.25 G 2.735(ra) +.15 G .235(cquire them.)-2.735 F .235(In practice, most)5.235 F 4.69 +(database systems, including Berk)79.2 617 R(ele)-.1 E 7.19(yD)-.15 G +4.69(B, acquire)-7.19 F 2.314(locks on demand o)79.2 629 R -.15(ve)-.15 +G 4.814(rt).15 G 2.314(he course of the transaction,)-4.814 F +(then \215ush the log, then release all locks.)79.2 641 Q .32 LW 83.2 +650.6 79.2 650.6 DL 87.2 650.6 83.2 650.6 DL 91.2 650.6 87.2 650.6 DL +95.2 650.6 91.2 650.6 DL 99.2 650.6 95.2 650.6 DL 103.2 650.6 99.2 650.6 +DL 107.2 650.6 103.2 650.6 DL 111.2 650.6 107.2 650.6 DL 115.2 650.6 +111.2 650.6 DL 119.2 650.6 115.2 650.6 DL 123.2 650.6 119.2 650.6 DL +127.2 650.6 123.2 650.6 DL 131.2 650.6 127.2 650.6 DL 135.2 650.6 131.2 +650.6 DL 139.2 650.6 135.2 650.6 DL 143.2 650.6 139.2 650.6 DL 147.2 +650.6 143.2 650.6 DL 151.2 650.6 147.2 650.6 DL 155.2 650.6 151.2 650.6 +DL 159.2 650.6 155.2 650.6 DL 163.2 650.6 159.2 650.6 DL 167.2 650.6 +163.2 650.6 DL 171.2 650.6 167.2 650.6 DL 175.2 650.6 171.2 650.6 DL +179.2 650.6 175.2 650.6 DL 183.2 650.6 179.2 650.6 DL 187.2 650.6 183.2 +650.6 DL 191.2 650.6 187.2 650.6 DL 195.2 650.6 191.2 650.6 DL 199.2 +650.6 195.2 650.6 DL 203.2 650.6 199.2 650.6 DL 207.2 650.6 203.2 650.6 +DL 211.2 650.6 207.2 650.6 DL 215.2 650.6 211.2 650.6 DL 219.2 650.6 +215.2 650.6 DL 223.2 650.6 219.2 650.6 DL/F4 5/Times-Roman@0 SF(1)100.8 +661 Q/F5 8/Times-Roman@0 SF .338(One checkpoint is not f)2.338 3.2 N +.338(ar enough.)-.08 F .338(The reco)4.338 F -.12(ve)-.12 G .338 +(ry system can-).12 F .211 +(not be sure that the most recent checkpoint completed \212 it may ha) +79.2 673.8 R -.12(ve)-.16 G .734 +(been interrupted by the crash that forced the reco)79.2 683.4 R -.12 +(ve)-.12 G .734(ry system to run).12 F(in the \214rst place.)79.2 693 Q +F1(Berk)323.2 84 Q(ele)-.1 E 3.306(yD)-.15 G 3.306(Bc)-3.306 G .806 +(an lock 
entire database \214les, which cor)-3.306 F(-)-.2 E .845 +(respond to tables, or indi)323.2 96 R .844(vidual pages in them.)-.25 F +.844(It does)5.844 F 2.141(no record-le)323.2 108 R -.15(ve)-.25 G 4.641 +(ll).15 G 4.641(ocking. By)-4.641 F 2.142(shrinking the page size,)4.641 +F(ho)323.2 120 Q(we)-.25 E -.15(ve)-.25 G 4.427 -.4(r, d).15 H -2.15 +-.25(ev e).4 H 3.627(lopers can guarantee that e).25 F -.15(ve)-.25 G +3.626(ry page).15 F 2.101(holds only a small number of records.)323.2 +132 R 2.102(This reduces)7.102 F(contention.)323.2 144 Q .388 +(If locking is enabled, then read and write operations on)323.2 160.2 R +5.317(ad)323.2 172.2 S 2.817(atabase acquire tw)-5.317 F 2.817 +(o-phase locks, which are held)-.1 F 3.635 +(until the transaction completes.)323.2 184.2 R 3.635(Which objects are) +8.635 F(lock)323.2 196.2 Q .738 +(ed and the order of lock acquisition depend on the)-.1 F -.1(wo)323.2 +208.2 S .503(rkload for each transaction.).1 F .502 +(It is possible for tw)5.502 F 3.002(oo)-.1 G(r)-3.002 E 1.315 +(more transactions to deadlock, so that each is w)323.2 220.2 R(aiting) +-.1 E(for a lock that is held by another)323.2 232.2 Q(.)-.55 E(Berk) +323.2 248.4 Q(ele)-.1 E 3.307(yD)-.15 G 3.307(Bd)-3.307 G .807 +(etects deadlocks and automatically rolls)-3.307 F 1.825 +(back one of the transactions.)323.2 260.4 R 1.825 +(This releases the locks)6.825 F 1.926(that it held and allo)323.2 272.4 +R 1.925(ws the other transactions to con-)-.25 F 3.346(tinue. 
The)323.2 +284.4 R .847(caller is noti\214ed that its transaction did not)3.346 F +1.747(complete, and may restart it.)323.2 296.4 R(De)6.747 E -.15(ve) +-.25 G 1.747(lopers can specify).15 F .646 +(the deadlock detection interv)323.2 308.4 R .647(al and the polic)-.25 +F 3.147(yt)-.15 G 3.147(ou)-3.147 G .647(se in)-3.147 F +(choosing a transaction to roll back.)323.2 320.4 Q 6.686(The tw)323.2 +336.6 R 6.686(o-phase locking interf)-.1 F 6.686(aces are separately)-.1 +F .927(callable by applications that link Berk)323.2 348.6 R(ele)-.1 E +3.427(yD)-.15 G .928(B, though)-3.427 F(fe)323.2 360.6 Q 5.64(wu)-.25 G +3.14(sers ha)-5.64 F 3.44 -.15(ve n)-.2 H 3.14(eeded to use that f).15 F +3.14(acility directly)-.1 F(.)-.65 E 2.211(Using these interf)323.2 +372.6 R 2.211(aces, Berk)-.1 F(ele)-.1 E 4.711(yD)-.15 G 4.712(Bp)-4.711 +G(ro)-4.712 E 2.212(vides a f)-.15 F(ast,)-.1 E 2.4 +(platform-portable locking system for general-purpose)323.2 384.6 R +2.917(use. It)323.2 396.6 R .418 +(also lets users include non-database objects in a)2.917 F 3.497 +(database transaction, by controlling access to them)323.2 408.6 R -.15 +(ex)323.2 420.6 S(actly as if the).15 E 2.5(yw)-.15 G +(ere inside the database.)-2.5 E .583(The Berk)323.2 436.8 R(ele)-.1 E +3.083(yD)-.15 G 3.084(Bt)-3.083 G -.1(wo)-3.084 G .584(-phase locking f) +.1 F .584(acility is b)-.1 F .584(uilt on)-.2 F .609(the f)323.2 448.8 R +.609(astest correct locking primiti)-.1 F -.15(ve)-.25 G 3.108(st).15 G +.608(hat are supported)-3.108 F 1.967(by the underlying architecture.) 
+323.2 460.8 R 1.967(In the current imple-)6.967 F .593 +(mentation, this means that the locking system is dif)323.2 472.8 R(fer) +-.25 E(-)-.2 E 1.709(ent on the v)323.2 484.8 R 1.709 +(arious UNIX platforms, and is still more)-.25 F(dif)323.2 496.8 Q .695 +(ferent on W)-.25 F(indo)-.4 E .695(ws NT)-.25 F 5.695(.I)-.74 G 3.195 +(no)-5.695 G .695(ur e)-3.195 F .695(xperience, the most)-.15 F(dif) +323.2 508.8 Q 2.634 +(\214cult aspect of performance tuning is \214nding the)-.25 F -.1(fa) +323.2 520.8 S .883(stest locking primiti).1 F -.15(ve)-.25 G 3.383(st) +.15 G .883(hat w)-3.383 F .882(ork correctly on a par)-.1 F(-)-.2 E 1.26 +(ticular architecture and then inte)323.2 532.8 R 1.26(grating the ne) +-.15 F 3.76(wi)-.25 G(nter)-3.76 E(-)-.2 E -.1(fa)323.2 544.8 S +(ce with the se).1 E -.15(ve)-.25 G(ral that we already support.).15 E +.536(The w)323.2 561 R .536(orld w)-.1 F .536 +(ould be a better place if the operating sys-)-.1 F 2.096 +(tems community w)323.2 573 R 2.096(ould uniformly implement POSIX)-.1 F +1.31(locking primiti)323.2 585 R -.15(ve)-.25 G 3.81(sa).15 G 1.31(nd w) +-3.81 F 1.31(ould guarantee that acquiring)-.1 F 1.085 +(an uncontested lock w)323.2 597 R 1.085(as a f)-.1 F 1.085 +(ast operation.)-.1 F 1.085(Locks must)6.085 F -.1(wo)323.2 609 S 3.641 +(rk both among threads in a single process and).1 F(among processes.) +323.2 621 Q F0 3(3.11. Concurr)323.2 651 R(ency)-.216 E F1 .383 +(Good performance under concurrent operation is a crit-)323.2 667.2 R +.766(ical design point for Berk)323.2 679.2 R(ele)-.1 E 3.266(yD)-.15 G +3.265(B. 
Although)-3.266 F(Berk)3.265 E(ele)-.1 E(y)-.15 E 1.961 +(DB is itself not multi-threaded, it is thread-safe, and)323.2 691.2 R +.547(runs well in threaded applications.)323.2 703.2 R(Philosophically) +5.546 E 3.046(,w)-.65 G(e)-3.046 E(vie)323.2 715.2 Q 4.764(wt)-.25 G +2.264(he use of threads and the choice of a threads)-4.764 F EP +%%Page: 7 7 +%%BeginPageSetup +BP +%%EndPageSetup +/F0 10/Times-Roman@0 SF .066(package as a polic)79.2 84 R 2.566(yd)-.15 +G .065(ecision, and prefer to of)-2.566 F .065(fer mecha-)-.25 F .042 +(nism \(the ability to run threaded or not\), allo)79.2 96 R .043 +(wing appli-)-.25 F(cations to choose their o)79.2 108 Q(wn policies.) +-.25 E 1.947(The locking, logging, and b)79.2 124.2 R(uf)-.2 E 1.947 +(fer pool subsystems all)-.25 F .711 +(use shared memory or other OS-speci\214c sharing f)79.2 136.2 R(acili-) +-.1 E 1.713(ties to communicate.)79.2 148.2 R 1.713(Locks, b)6.713 F(uf) +-.2 E 1.713(fer pool fetches, and)-.25 F 1.061(log writes beha)79.2 +160.2 R 1.361 -.15(ve i)-.2 H 3.561(nt).15 G 1.061(he same w)-3.561 F +1.061(ay across threads in a)-.1 F .033(single process as the)79.2 172.2 +R 2.532(yd)-.15 G 2.532(oa)-2.532 G .032(cross dif)-2.532 F .032 +(ferent processes on a)-.25 F(single machine.)79.2 184.2 Q .896 +(As a result, concurrent database applications may start)79.2 200.4 R +1.651(up a ne)79.2 212.4 R 4.151(wp)-.25 G 1.651(rocess for e)-4.151 F +-.15(ve)-.25 G 1.651(ry single user).15 F 4.151(,m)-.4 G 1.651 +(ay create a)-4.151 F 2.848(single serv)79.2 224.4 R 2.848(er which spa) +-.15 F 2.849(wns a ne)-.15 F 5.349(wt)-.25 G 2.849(hread for e)-5.349 F +-.15(ve)-.25 G(ry).15 E(client request, or may choose an)79.2 236.4 Q +2.5(yp)-.15 G(olic)-2.5 E 2.5(yi)-.15 G 2.5(nb)-2.5 G(etween.)-2.5 E +(Berk)79.2 252.6 Q(ele)-.1 E 3.629(yD)-.15 G 3.629(Bh)-3.629 G 1.128 +(as been carefully designed to minimize)-3.629 F .07 +(contention and maximize concurrenc)79.2 264.6 R 3.87 -.65(y. 
T)-.15 H +.07(he cache man-).65 F .57(ager allo)79.2 276.6 R .57 +(ws all threads or processes to bene\214t from I/O)-.25 F 2.917 +(done by one.)79.2 288.6 R 2.917(Shared resources must sometimes be) +7.917 F(lock)79.2 300.6 Q 1.804(ed for e)-.1 F(xclusi)-.15 E 2.104 -.15 +(ve a)-.25 H 1.804(ccess by one thread of control.).15 F 1.757 -.8(We h) +79.2 312.6 T -2.25 -.2(av e).8 H -.1(ke)2.857 G .158 +(pt critical sections small, and are careful not).1 F 1.199 +(to hold critical resource locks across system calls that)79.2 324.6 R +.538(could deschedule the locking thread or process.)79.2 336.6 R +(Sleep-)5.539 E .979(ycat Softw)79.2 348.6 R .979 +(are has customers with hundreds of concur)-.1 F(-)-.2 E(rent users w) +79.2 360.6 Q(orking on a single database in production.)-.1 E/F1 12 +/Times-Bold@0 SF 3(4. Engineering)79.2 390.6 R(Philosoph)3 E(y)-.18 E F0 +(Fundamentally)79.2 406.8 Q 3.998(,B)-.65 G(erk)-3.998 E(ele)-.1 E 3.998 +(yD)-.15 G 3.998(Bi)-3.998 G 3.999(sac)-3.998 G 1.499 +(ollection of access)-3.999 F .19(methods with important f)79.2 418.8 R +.19(acilities, lik)-.1 F 2.69(el)-.1 G .19(ogging, locking,)-2.69 F +1.251(and transactional access underlying them.)79.2 430.8 R 1.252 +(In both the)6.252 F .992(research and the commercial w)79.2 442.8 R +.991(orld, the techniques for)-.1 F -.2(bu)79.2 454.8 S 2.727 +(ilding systems lik).2 F 5.227(eB)-.1 G(erk)-5.227 E(ele)-.1 E 5.227(yD) +-.15 G 5.227(Bh)-5.227 G -2.25 -.2(av e)-5.227 H 2.728(been well-)5.427 +F(kno)79.2 466.8 Q(wn for a long time.)-.25 E .443(The k)79.2 483 R .743 +-.15(ey a)-.1 H(dv).15 E .442(antage of Berk)-.25 F(ele)-.1 E 2.942(yD) +-.15 G 2.942(Bi)-2.942 G 2.942(st)-2.942 G .442(he careful atten-)-2.942 +F 1.059(tion that has been paid to engineering details through-)79.2 495 +R 1.039(out its life.)79.2 507 R 2.639 -.8(We h)6.039 H -2.25 -.2(av e) +.8 H 1.039(carefully designed the system so)3.739 F .452 +(that the core f)79.2 519 R .452(acilities, lik)-.1 F 2.952(el)-.1 G +.452(ocking and 
I/O, surf)-2.952 F .453(ace the)-.1 F .972(right interf) +79.2 531 R .971(aces and are otherwise opaque to the caller)-.1 F(.)-.55 +E .294(As programmers, we understand the v)79.2 543 R .295 +(alue of simplicity)-.25 F .206(and ha)79.2 555 R .506 -.15(ve w)-.2 H +(ork).05 E .206(ed hard to simplify the interf)-.1 F .205(aces we sur) +-.1 F(-)-.2 E -.1(fa)79.2 567 S(ce to users of the database system.).1 E +(Berk)79.2 583.2 Q(ele)-.1 E 4.531(yD)-.15 G 4.531(Ba)-4.531 G -.2(vo) +-4.731 G 2.031(ids limits in the code.).2 F 2.031(It places no)7.031 F +.474(practical limit on the size of k)79.2 595.2 R -.15(ey)-.1 G .473 +(s, v).15 F .473(alues, or databases;)-.25 F(the)79.2 607.2 Q 2.5(ym) +-.15 G(ay gro)-2.5 E 2.5(wt)-.25 G 2.5(oo)-2.5 G(ccup)-2.5 E 2.5(yt)-.1 +G(he a)-2.5 E -.25(va)-.2 G(ilable storage space.).25 E 1.857 +(The locking and logging subsystems ha)79.2 623.4 R 2.157 -.15(ve b)-.2 +H 1.858(een care-).15 F .184 +(fully crafted to reduce contention and impro)79.2 635.4 R .484 -.15 +(ve t)-.15 H(hrough-).15 E 2.16 +(put by shrinking or eliminating critical sections, and)79.2 647.4 R +(reducing the sizes of lock)79.2 659.4 Q(ed re)-.1 E +(gions and log entries.)-.15 E 2.238 +(There is nothing in the design or implementation of)79.2 675.6 R(Berk) +79.2 687.6 Q(ele)-.1 E 2.818(yD)-.15 G 2.818(Bt)-2.818 G .318 +(hat pushes the state of the art in database)-2.818 F 3.545 +(systems. Rather)79.2 699.6 R 3.545(,w)-.4 G 3.545(eh)-3.545 G -2.25 -.2 +(av e)-3.545 H 1.044(been v)3.745 F 1.044(ery careful to get the)-.15 F +4.321(engineering right.)79.2 711.6 R 4.321 +(The result is a system that is)9.321 F(superior)323.2 84 Q 2.867(,a)-.4 +G 2.867(sa)-2.867 G 2.866(ne)-2.867 G .366 +(mbedded database system, to an)-2.866 F 2.866(yo)-.15 G(ther)-2.866 E +(solution a)323.2 96 Q -.25(va)-.2 G(ilable.).25 E .811 +(Most database systems trade of)323.2 112.2 R 3.312(fs)-.25 G .812 +(implicity for correct-)-3.312 F 4.151(ness. 
Either)323.2 124.2 R 1.651 +(the system is easy to use, or it supports)4.151 F 1.17 +(concurrent use and survi)323.2 136.2 R -.15(ve)-.25 G 3.67(ss).15 G +1.17(ystem f)-3.67 F 3.67(ailures. Berk)-.1 F(ele)-.1 E(y)-.15 E 1.013 +(DB, because of its careful design and implementation,)323.2 148.2 R(of) +323.2 160.2 Q(fers both simplicity and correctness.)-.25 E .759 +(The system has a small footprint, mak)323.2 176.4 R .759 +(es simple opera-)-.1 F 1.012 +(tions simple to carry out \(inserting a ne)323.2 188.4 R 3.512(wr)-.25 +G 1.012(ecord tak)-3.512 F(es)-.1 E 1.16(just a fe)323.2 200.4 R 3.66 +(wl)-.25 G 1.16(ines of code\), and beha)-3.66 F -.15(ve)-.2 G 3.66(sc) +.15 G 1.16(orrectly in the)-3.66 F -.1(fa)323.2 212.4 S .528(ce of hea) +.1 F .527(vy concurrent use, system crashes, and e)-.2 F -.15(ve)-.25 G +(n).15 E(catastrophic f)323.2 224.4 Q(ailures lik)-.1 E 2.5(el)-.1 G +(oss of a hard disk.)-2.5 E F1 3(5. The)323.2 254.4 R(Berk)3 E +(eley DB 2.x Distrib)-.12 E(ution)-.24 E F0(Berk)323.2 270.6 Q(ele)-.1 E +4.171(yD)-.15 G 4.171(Bi)-4.171 G 4.171(sd)-4.171 G(istrib)-4.171 E +1.671(uted in source code form from)-.2 F/F2 10/Times-Italic@0 SF(www) +323.2 282.6 Q(.sleepycat.com)-.74 E F0 7.322(.U)C 2.322 +(sers are free to do)-7.322 F 2.321(wnload and)-.25 F -.2(bu)323.2 294.6 +S(ild the softw).2 E(are, and to use it in their applications.)-.1 E F1 +3(5.1. What)323.2 324.6 R(is in the distrib)3 E(ution)-.24 E F0 4.827 +(The distrib)323.2 340.8 R 4.827(ution is a compressed archi)-.2 F 5.127 +-.15(ve \214)-.25 H 7.328(le. 
It).15 F .057 +(includes the source code for the Berk)323.2 352.8 R(ele)-.1 E 2.556(yD) +-.15 G 2.556(Bl)-2.556 G(ibrary)-2.556 E 2.556(,a)-.65 G(s)-2.556 E .453 +(well as documentation, test suites, and supporting utili-)323.2 364.8 R +(ties.)323.2 376.8 Q 2.613(The source code includes b)323.2 393 R 2.612 +(uild support for all sup-)-.2 F .254(ported platforms.)323.2 405 R .254 +(On UNIX systems Berk)5.254 F(ele)-.1 E 2.755(yD)-.15 G 2.755(Bu)-2.755 +G(ses)-2.755 E 1.28(the GNU autocon\214guration tool,)323.2 417 R/F3 10 +/Courier@0 SF(autoconf)3.78 E F0 3.78(,t)C 3.78(oi)-3.78 G(den-)-3.78 E +.992(tify the system and to b)323.2 429 R .992 +(uild the library and supporting)-.2 F 3.589(utilities. Berk)323.2 441 R +(ele)-.1 E 3.589(yD)-.15 G 3.588(Bi)-3.589 G 1.088(ncludes speci\214c b) +-3.588 F 1.088(uild en)-.2 F(viron-)-.4 E .515 +(ments for other platforms, such as VMS and W)323.2 453 R(indo)-.4 E +(ws.)-.25 E F1 3(5.1.1. Documentation)323.2 483 R F0 5.008(The distrib) +323.2 499.2 R 5.008(uted system includes documentation in)-.2 F 1.626 +(HTML format.)323.2 511.2 R 1.626(The documentation is in tw)6.626 F +4.127(op)-.1 G 1.627(arts: a)-4.127 F .725 +(UNIX-style reference manual for use by programmers,)323.2 523.2 R +(and a reference guide which is tutorial in nature.)323.2 535.2 Q F1 3 +(5.1.2. 
T)323.2 565.2 R(est suite)-1.104 E F0 1.107(The softw)323.2 +581.4 R 1.108(are also includes a complete test suite, writ-)-.1 F .155 +(ten in Tcl.)323.2 593.4 R 1.754 -.8(We b)5.154 H(elie).8 E .454 -.15 +(ve t)-.25 H .154(hat the test suite is a k).15 F .454 -.15(ey a)-.1 H +(dv).15 E(an-)-.25 E(tage of Berk)323.2 605.4 Q(ele)-.1 E 2.5(yD)-.15 G +2.5(Bo)-2.5 G -.15(ve)-2.65 G 2.5(rc).15 G(omparable systems.)-2.5 E +2.612(First, the test suite allo)323.2 621.6 R 2.613(ws users who do) +-.25 F 2.613(wnload and)-.25 F -.2(bu)323.2 633.6 S 1.731(ild the softw) +.2 F 1.731(are to be sure that it is operating cor)-.1 F(-)-.2 E(rectly) +323.2 645.6 Q(.)-.65 E .893(Second, the test suite allo)323.2 661.8 R +.894(ws us, lik)-.25 F 3.394(eo)-.1 G .894(ther commercial)-3.394 F(de) +323.2 673.8 Q -.15(ve)-.25 G .536(lopers of database softw).15 F .536 +(are, to e)-.1 F -.15(xe)-.15 G .535(rcise the system).15 F 2.256 +(thoroughly at e)323.2 685.8 R -.15(ve)-.25 G 2.256(ry release.).15 F +2.256(When we learn of ne)7.256 F(w)-.25 E -.2(bu)323.2 697.8 S 1.719 +(gs, we add them to the test suite.).2 F 3.319 -.8(We r)6.719 H 1.719 +(un the test).8 F 5.692(suite continually during de)323.2 709.8 R -.15 +(ve)-.25 G 5.692(lopment c).15 F 5.692(ycles, and)-.15 F EP +%%Page: 8 8 +%%BeginPageSetup +BP +%%EndPageSetup +/F0 10/Times-Roman@0 SF(al)79.2 84 Q -.1(wa)-.1 G .314 +(ys prior to release.).1 F .314(The result is a much more reli-)5.314 F +(able system by the time it reaches beta release.)79.2 96 Q/F1 12 +/Times-Bold@0 SF 3(5.2. Binary)79.2 126 R(distrib)3 E(ution)-.24 E F0 +(Sleep)79.2 142.2 Q .893(ycat mak)-.1 F .893 +(es compiled libraries and general binary)-.1 F(distrib)79.2 154.2 Q +(utions a)-.2 E -.25(va)-.2 G(ilable to customers for a fee.).25 E F1 3 +(5.3. 
Supported)79.2 184.2 R(platf)3 E(orms)-.3 E F0(Berk)79.2 200.4 Q +(ele)-.1 E 5.623(yD)-.15 G 5.623(Br)-5.623 G 3.123(uns on an)-5.623 F +5.622(yo)-.15 G 3.122(perating system with a)-5.622 F .816 +(POSIX 1003.1 interf)79.2 212.4 R .817(ace [IEEE96], which includes vir) +-.1 F(-)-.2 E 1.998(tually e)79.2 224.4 R -.15(ve)-.25 G 1.997 +(ry UNIX system.).15 F 1.997(In addition, the softw)6.997 F(are)-.1 E +2.85(runs on VMS, W)79.2 236.4 R(indo)-.4 E 2.85(ws/95, W)-.25 F(indo) +-.4 E 2.85(ws/98, and W)-.25 F(in-)-.4 E(do)79.2 248.4 Q(ws/NT)-.25 E +10.21(.S)-.74 G(leep)-10.21 E 5.21(ycat Softw)-.1 F 5.21 +(are no longer supports)-.1 F(deplo)79.2 260.4 Q(yment on sixteen-bit W) +-.1 E(indo)-.4 E(ws systems.)-.25 E F1 3(6. Berk)79.2 290.4 R +(eley DB 2.x Licensing)-.12 E F0(Berk)79.2 306.6 Q(ele)-.1 E 2.627(yD) +-.15 G 2.627(B2)-2.627 G .128(.x is distrib)-2.627 F .128 +(uted as an Open Source prod-)-.2 F 4.709(uct. The)79.2 318.6 R(softw) +4.709 E 2.209(are is freely a)-.1 F -.25(va)-.2 G 2.209 +(ilable from us at our).25 F -.8(We)79.2 330.6 S 3.372(bs).8 G .872 +(ite, and in other media.)-3.372 F .872(Users are free to do)5.872 F +(wn-)-.25 E(load the softw)79.2 342.6 Q(are and b)-.1 E +(uild applications with it.)-.2 E 1.023(The 1.x v)79.2 358.8 R 1.022 +(ersions of Berk)-.15 F(ele)-.1 E 3.522(yD)-.15 G 3.522(Bw)-3.522 G +1.022(ere co)-3.522 F -.15(ve)-.15 G 1.022(red by the).15 F 3.763 +(UC Berk)79.2 370.8 R(ele)-.1 E 6.263(yc)-.15 G(op)-6.263 E 3.763 +(yright that co)-.1 F -.15(ve)-.15 G 3.764(rs softw).15 F 3.764 +(are freely)-.1 F(redistrib)79.2 382.8 Q 1.742(utable in source form.) +-.2 F 1.741(When Sleep)6.742 F 1.741(ycat Soft-)-.1 F -.1(wa)79.2 394.8 +S .906(re w).1 F .907(as formed, we needed to draft a license consis-) +-.1 F 2.319(tent with the cop)79.2 406.8 R 2.319(yright go)-.1 F -.15 +(ve)-.15 G 2.318(rning the e).15 F 2.318(xisting, older)-.15 F(softw) +79.2 418.8 Q 5.328(are. 
Because)-.1 F 2.828(of important dif)5.328 F +2.828(ferences between)-.25 F .497(the UC Berk)79.2 430.8 R(ele)-.1 E +2.997(yc)-.15 G(op)-2.997 E .497(yright and the GPL, it w)-.1 F .496 +(as impos-)-.1 F .884(sible for us to use the GPL.)79.2 442.8 R 3.384 +(As)5.884 G .884(econd cop)-3.384 F .884(yright, with)-.1 F .87 +(terms contradictory to the \214rst, simply w)79.2 454.8 R .87 +(ould not ha)-.1 F -.15(ve)-.2 G -.1(wo)79.2 466.8 S(rk).1 E(ed.)-.1 E +(Sleep)79.2 483 Q 2.533(ycat w)-.1 F 2.533 +(anted to continue Open Source de)-.1 F -.15(ve)-.25 G(lop-).15 E 2.079 +(ment of Berk)79.2 495 R(ele)-.1 E 4.579(yD)-.15 G 4.579(Bf)-4.579 G +2.079(or se)-4.579 F -.15(ve)-.25 G 2.079(ral reasons.).15 F 3.678 -.8 +(We a)7.078 H(gree).8 E .853 +(with Raymond [Raym98] and others that Open Source)79.2 507 R(softw)79.2 +519 Q .763(are is typically of higher quality than proprietary)-.1 F(,) +-.65 E 2.616(binary-only products.)79.2 531 R 2.617 +(Our customers bene\214t from a)7.616 F .983(community of de)79.2 543 R +-.15(ve)-.25 G .983(lopers who kno).15 F 3.483(wa)-.25 G .983 +(nd use Berk)-3.483 F(ele)-.1 E(y)-.15 E 1.317 +(DB, and can help with application design, deb)79.2 555 R(ugging,)-.2 E +1.65(and performance tuning.)79.2 567 R -.4(Wi)6.65 G 1.65 +(despread distrib).4 F 1.65(ution and)-.2 F 1.017 +(use of the source code tends to isolate b)79.2 579 R 1.017(ugs early) +-.2 F 3.517(,a)-.65 G(nd)-3.517 E .032(to get \214x)79.2 591 R .031 +(es back into the distrib)-.15 F .031(uted system quickly)-.2 F 5.031 +(.A)-.65 G(s)-5.031 E 3.553(ar)79.2 603 S 1.053(esult, Berk)-3.553 F +(ele)-.1 E 3.553(yD)-.15 G 3.553(Bi)-3.553 G 3.553(sm)-3.553 G 1.053 +(ore reliable.)-3.553 F 1.054(Just as impor)6.054 F(-)-.2 E(tantly)79.2 +615 Q 3.695(,i)-.65 G(ndi)-3.695 E 1.195 +(vidual users are able to contrib)-.25 F 1.195(ute ne)-.2 F 3.695(wf) +-.25 G(ea-)-3.695 E 1.056 +(tures and performance enhancements, to the bene\214t of)79.2 627 R +-2.15 -.25(ev e)79.2 639 T .359(ryone who uses 
Berk).25 F(ele)-.1 E +2.859(yD)-.15 G 2.859(B. From)-2.859 F 2.858(ab)2.859 G .358 +(usiness per)-3.058 F(-)-.2 E(specti)79.2 651 Q -.15(ve)-.25 G 3.115(,O) +.15 G .615(pen Source and free distrib)-3.115 F .615(ution of the soft-) +-.2 F -.1(wa)79.2 663 S 1.605(re creates share for us, and gi).1 F -.15 +(ve)-.25 G 4.105(su).15 G 4.105(sam)-4.105 G(ark)-4.105 E 1.605(et into) +-.1 F .412(which we can sell products and services.)79.2 675 R(Finally) +5.413 E 2.913(,m)-.65 G(ak-)-2.913 E .148(ing the source code freely a) +79.2 687 R -.25(va)-.2 G .147(ilable reduces our support).25 F 2.436 +(load, since customers can \214nd and \214x b)79.2 699 R 2.437 +(ugs without)-.2 F(recourse to us, in man)79.2 711 Q 2.5(yc)-.15 G +(ases.)-2.5 E 4.727 -.8(To p)323.2 84 T(reserv).8 E 5.627(et)-.15 G +3.126(he Open Source heritage of the older)-5.627 F(Berk)323.2 96 Q(ele) +-.1 E 3.003(yD)-.15 G 3.003(Bc)-3.003 G .504(ode, we drafted a ne)-3.003 +F 3.004(wl)-.25 G .504(icense go)-3.004 F -.15(ve)-.15 G(rning).15 E +.417(the distrib)323.2 108 R .417(ution of Berk)-.2 F(ele)-.1 E 2.916 +(yD)-.15 G 2.916(B2)-2.916 G 2.916(.x. 
W)-2.916 F 2.916(ea)-.8 G .416 +(dopted terms)-2.916 F .411(from the GPL that mak)323.2 120 R 2.911(ei) +-.1 G 2.911(ti)-2.911 G .411(mpossible to turn our Open)-2.911 F 1.289 +(Source code into proprietary code o)323.2 132 R 1.288(wned by someone) +-.25 F(else.)323.2 144 Q(Brie\215y)323.2 160.2 Q 3.18(,t)-.65 G .68 +(he terms go)-3.18 F -.15(ve)-.15 G .68(rning the use and distrib).15 F +.68(ution of)-.2 F(Berk)323.2 172.2 Q(ele)-.1 E 2.5(yD)-.15 G 2.5(Ba) +-2.5 G(re:)-2.5 E/F2 8/Times-Roman@0 SF<83>328.2 188.4 Q F0 +(your application must be internal to your site, or)17.2 E F2<83>328.2 +204.6 Q F0 .612(your application must be freely redistrib)17.2 F .611 +(utable in)-.2 F(source form, or)348.2 216.6 Q F2<83>328.2 232.8 Q F0 +(you must get a license from us.)17.2 E -.15(Fo)323.2 249 S 2.631(rc).15 +G .131(ustomers who prefer not to distrib)-2.631 F .132(ute Open Source) +-.2 F 1.493(products, we sell licenses to use and e)323.2 261 R 1.492 +(xtend Berk)-.15 F(ele)-.1 E(y)-.15 E(DB at a reasonable cost.)323.2 273 +Q 2.675 -.8(We w)323.2 289.2 T 1.076 +(ork hard to accommodate the needs of the Open).7 F .606 +(Source community)323.2 301.2 R 5.606(.F)-.65 G .606(or e)-5.756 F .606 +(xample, we ha)-.15 F .905 -.15(ve c)-.2 H .605(rafted spe-).15 F 1.415 +(cial licensing arrangements with Gnome to encourage)323.2 313.2 R +(its use and distrib)323.2 325.2 Q(ution of Berk)-.2 E(ele)-.1 E 2.5(yD) +-.15 G(B.)-2.5 E(Berk)323.2 341.4 Q(ele)-.1 E 4.103(yD)-.15 G 4.103(Bc) +-4.103 G 1.603(onforms to the Open Source de\214nition)-4.103 F 4.867 +([Open99]. The)323.2 353.4 R 2.367 +(license has been carefully crafted to)4.867 F -.1(ke)323.2 365.4 S .643 +(ep the product a).1 F -.25(va)-.2 G .642(ilable as an Open Source of) +.25 F(fering,)-.25 E(while pro)323.2 377.4 Q +(viding enough of a return on our in)-.15 E -.15(ve)-.4 G(stment to).15 +E 1.546(fund continued de)323.2 389.4 R -.15(ve)-.25 G 1.546 +(lopment and support of the prod-).15 F 3.033(uct. 
The)323.2 401.4 R +.534(current license has created a b)3.033 F .534(usiness capable)-.2 F +.916(of funding three years of de)323.2 413.4 R -.15(ve)-.25 G .916 +(lopment on the softw).15 F(are)-.1 E(that simply w)323.2 425.4 Q +(ould not ha)-.1 E .3 -.15(ve h)-.2 H(appened otherwise.).15 E F1 3 +(7. Summary)323.2 455.4 R F0(Berk)323.2 471.6 Q(ele)-.1 E 2.991(yD)-.15 +G 2.991(Bo)-2.991 G -.25(ff)-2.991 G .491 +(ers a unique collection of features, tar).25 F(-)-.2 E .175 +(geted squarely at softw)323.2 483.6 R .174(are de)-.1 F -.15(ve)-.25 G +.174(lopers who need simple,).15 F .492 +(reliable database management services in their applica-)323.2 495.6 R +5.3(tions. Good)323.2 507.6 R 2.8(design and implementation and careful) +5.3 F 1.633(engineering throughout mak)323.2 519.6 R 4.133(et)-.1 G +1.633(he softw)-4.133 F 1.634(are better than)-.1 F(man)323.2 531.6 Q +2.5(yo)-.15 G(ther systems.)-2.5 E(Berk)323.2 547.8 Q(ele)-.1 E 4.1(yD) +-.15 G 4.1(Bi)-4.1 G 4.1(sa)-4.1 G 4.1(nO)-4.1 G 1.6 +(pen Source product, a)-4.1 F -.25(va)-.2 G 1.6(ilable at).25 F/F3 10 +/Times-Italic@0 SF(www)323.2 559.8 Q(.sleepycat.com)-.74 E F0 .654 +(for do)3.154 F 3.154(wnload. 
The)-.25 F(distrib)3.154 E .654(uted sys-) +-.2 F .383(tem includes e)323.2 571.8 R -.15(ve)-.25 G .383 +(rything needed to b).15 F .382(uild and deplo)-.2 F 2.882(yt)-.1 G(he) +-2.882 E(softw)323.2 583.8 Q(are or to port it to ne)-.1 E 2.5(ws)-.25 G +(ystems.)-2.5 E(Sleep)323.2 600 Q 2.633(ycat Softw)-.1 F 2.633 +(are distrib)-.1 F 2.633(utes Berk)-.2 F(ele)-.1 E 5.133(yD)-.15 G 5.134 +(Bu)-5.133 G 2.634(nder a)-5.134 F .764(license agreement that dra)323.2 +612 R .764(ws on both the UC Berk)-.15 F(ele)-.1 E(y)-.15 E(cop)323.2 +624 Q 2.377(yright and the GPL.)-.1 F 2.377(The license guarantees that) +7.377 F(Berk)323.2 636 Q(ele)-.1 E 3.384(yD)-.15 G 3.384(Bw)-3.384 G +.884(ill remain an Open Source product and)-3.384 F(pro)323.2 648 Q +1.493(vides Sleep)-.15 F 1.493(ycat with opportunities to mak)-.1 F +3.994(em)-.1 G(one)-3.994 E(y)-.15 E(to fund continued de)323.2 660 Q +-.15(ve)-.25 G(lopment on the softw).15 E(are.)-.1 E EP +%%Page: 9 9 +%%BeginPageSetup +BP +%%EndPageSetup +/F0 12/Times-Bold@0 SF 3(8. Refer)79.2 84 R(ences)-.216 E/F1 10 +/Times-Roman@0 SF([Come79])79.2 100.2 Q(Comer)104.2 112.2 Q 3.127(,D)-.4 +G .627(., \231The Ubiquitous B-tree,)-3.127 F<9a>-.7 E/F2 10 +/Times-Italic@0 SF -.3(AC)3.126 G 3.126(MC).3 G(om-)-3.126 E .404 +(puting Surve)104.2 124.2 R(ys)-.3 E F1 -1.29(Vo)2.904 G .404 +(lume 11, number 2, June 1979.)1.29 F([Gray93])79.2 140.4 Q(Gray)104.2 +152.4 Q 2.982(,J)-.65 G .482(., and Reuter)-2.982 F 2.982(,A)-.4 G(.,) +-2.982 E F2 -1.55 -.55(Tr a)2.981 H .481(nsaction Pr).55 F(ocessing:) +-.45 E 6.776(Concepts and T)104.2 164.4 R(ec)-.92 E(hniques)-.15 E F1 +9.277(,M)C(or)-9.277 E -.05(ga)-.18 G(n-Kaufman).05 E(Publishers, 1993.) 
+104.2 176.4 Q([IEEE96])79.2 192.6 Q .364 +(Institute for Electrical and Electronics Engineers,)104.2 204.6 R F2 +(IEEE/ANSI Std 1003.1)104.2 216.6 Q F1 2.5(,1)C(996 Edition.)-2.5 E +([Litw80])79.2 232.8 Q 2.365(Litwin, W)104.2 244.8 R 2.366 +(., \231Linear Hashing: A Ne)-.92 F 4.866(wT)-.25 G 2.366(ool for)-5.666 +F 1.784(File and T)104.2 256.8 R 1.783(able Addressing,)-.8 F<9a>-.7 E +F2(Pr)4.283 E 1.783(oceedings of the)-.45 F 4.804 +(6th International Confer)104.2 268.8 R 4.804(ence on V)-.37 F 4.804 +(ery Lar)-1.11 F -.1(ge)-.37 G 1.983(Databases \(VLDB\))104.2 280.8 R F1 +4.483(,M)C 1.982(ontreal, Quebec, Canada,)-4.483 F(October 1980.)104.2 +292.8 Q([Open94])79.2 309 Q 4.068(The Open Group,)104.2 321 R F2 +(Distrib)6.568 E 4.069(uted TP: The XA+)-.2 F .78(Speci\214cation, V) +104.2 333 R(er)-1.11 E .78(sion 2)-.1 F F1 3.28(,T)C .78 +(he Open Group, 1994.)-3.28 F([Open99])79.2 349.2 Q(Opensource.or)104.2 +361.2 Q 8.307(g, \231Open Source De\214nition,)-.18 F<9a>-.7 E F2(www) +104.2 373.2 Q(.opensour)-.74 E(ce)-.37 E(.or)-.15 E(g/osd.html)-.37 E F1 +3.13(,v)C .63(ersion 1.4, 1999.)-3.28 F([Raym98])79.2 389.4 Q .718 +(Raymond, E.S., \231The Cathedral and the Bazaar)104.2 401.4 R -.7<2c9a> +-.4 G F2(www)104.2 413.4 Q(.tuxedo.or)-.74 E(g/~esr/writings/cathedr) +-.37 E(al-)-.15 E(bazaar/cathedr)104.2 425.4 Q(al-bazaar)-.15 E(.html) +-1.11 E F1 2.5(,J)C(anuary 1998.)-2.5 E([Selt91])79.2 441.6 Q(Seltzer) +104.2 453.6 Q 2.578(,M)-.4 G .078(., and Y)-2.578 F .079(igit, O., \231) +-.55 F 2.579(AN)-.8 G .579 -.25(ew H)-2.579 H .079(ashing P).25 F(ack-) +-.15 E 6.704(age for UNIX,)104.2 465.6 R<9a>-.7 E F2(Pr)9.204 E 6.704 +(oceedings 1991 W)-.45 F(inter)-.55 E(USENIX Confer)104.2 477.6 Q(ence) +-.37 E F1 2.5(,D)C(allas, TX, January 1991.)-2.5 E([Selt92])79.2 493.8 Q +(Seltzer)104.2 505.8 Q 5.365(,M)-.4 G 2.865 +(., and Olson, M., \231LIBTP: Portable)-5.365 F 2.845(Modular T)104.2 +517.8 R 2.845(ransactions for UNIX,)-.35 F<9a>-.7 E F2(Pr)5.345 E +(oceedings)-.45 E 
1.49(1992 W)104.2 529.8 R 1.49(inter Usenix Confer) +-.55 F(ence)-.37 E F1 3.99(,S)C 1.49(an Francisco,)-3.99 F +(CA, January 1992.)104.2 541.8 Q([Ston82])79.2 558 Q(Stonebrak)104.2 570 +Q(er)-.1 E 10.04(,M)-.4 G 7.54(., Stettner)-10.04 F 10.04(,H)-.4 G 7.54 +(., Kalash, J.,)-10.04 F .763(Guttman, A., and L)104.2 582 R .764 +(ynn, N., \231Document Process-)-.55 F .557 +(ing in a Relational Database System,)104.2 594 R 3.056<9a4d>-.7 G +(emoran-)-3.056 E .825(dum No. UCB/ERL M82/32, Uni)104.2 606 R -.15(ve) +-.25 G .825(rsity of Cali-).15 F(fornia at Berk)104.2 618 Q(ele)-.1 E +1.3 -.65(y, B)-.15 H(erk).65 E(ele)-.1 E 1.3 -.65(y, C)-.15 H +(A, May 1982.).65 E EP +%%Trailer +end +%%EOF diff --git a/db/docs/ref/refs/embedded.html b/db/docs/ref/refs/embedded.html new file mode 100644 index 000000000..b7641d931 --- /dev/null +++ b/db/docs/ref/refs/embedded.html @@ -0,0 +1,672 @@ +<html> +<head> +<title>Challenges in Embedded Database System Administration</title> +</head> +<body bgcolor=white> +<center> +<h1>Challenges in Embedded Database System Administration</h1> +<h3>Margo Seltzer, Harvard University</h3> +<h3>Michael Olson, Sleepycat Software, Inc.</h3> +<em>{margo,mao}@sleepycat.com</em> +</center> +<p> +Database configuration and maintenance have historically been complex tasks, +often +requiring expert knowledge of database design and application +behavior. +In an embedded environment, it is not feasible to require such +expertise and ongoing database maintenance. +This paper discusses the database administration +challenges posed by embedded systems and describes how the +Berkeley DB architecture addresses these challenges. + +<h2>1. Introduction</h2> + +Embedded systems provide a combination of opportunities and challenges +in application and system configuration and management. 
+As an embedded system is most often dedicated to a single application or +small set of tasks, the operating conditions of the system are +typically better understood than those of general purpose computing +environments. +Similarly, as embedded systems are dedicated to a small set of tasks, +one would expect that the software to manage them should be small +and simple. +On the other hand, once an embedded system is deployed, it must +continue to function without interruption and without administrator +intervention. +<p> +Database administration consists of two components, +initial configuration and ongoing maintenance. +Initial configuration consists of database design, manifestation, +and tuning. +The instantiation of the design includes decomposing the design +into tables, relations, or objects and designating proper indices +and their implementations (e.g., Btrees, hash tables, etc.). +Tuning a design requires selecting a location for the log and +data files, selecting appropriate database page sizes, specifying +the size of in-memory caches, and specifying the limits of +multi-threading and concurrency. +As embedded systems define a specific environment and set of tasks, +requiring expertise during the initial system +configuration process is acceptable, and we focus our efforts on +the ongoing maintenance of the system. +In this way, our emphasis differs from other projects such as +Microsoft's AutoAdmin project <a href="#Chaud982">[3]</a>, and the "no-knobs" +administration that is identified as an area of important future +research by the Asilomar authors<a href="#Bern98">[1]</a>. +<p> +In this paper, we focus on what the authors +of the Asilomar report call "gizmo" databases <a href="#Bern98"> [1]</a>, +databases +that reside in devices such as smart cards, toasters, or telephones. 
+The key characteristics of such databases are that their +functionality is completely transparent to users, no one ever +performs explicit database operations or +database maintenance, the database may crash at any time and +must recover instantly, the device may undergo a hard reset at +any time, requiring that the database return to its initial +state, and the semantic integrity of the database must be maintained +at all times. +In Section 2, we provide more detail on the sorts of tasks +typically performed by database administrators (DBAs) that must +be automated in an embedded system. +<p> +The rest of this paper is structured as follows. +In Section 2, we outline the requirements for embedded database support. +In Section 3, we discuss how Berkeley DB +is conducive to the hands-off management +required in embedded systems. +In Section 4, we discuss novel features that +enhance Berkeley +DB's suitability for the embedded applications. +In Section 5, we discuss issues of footprint size. +In Section 6 we discuss related work, and we conclude +in Section 7. + +<h2>2. Embedded Database Requirements</h2> +Historically, much of the commercial database industry has been driven +by the requirements of high performance online transaction +processing (OLTP), complex query processing, and the industry +standard benchmarks that have emerged (e.g., TPC-C <a href="#TPCC">[9]</a>, +TPC-D <a href="#TPCD">[10]</a>) to +allow for system comparisons. +As embedded systems typically perform fairly simple queries, +such metrics are not nearly as relevant for embedded database +systems as are ease of maintenance, robustness, and small footprint. +Of these three requirements, robustness and ease of maintenance +are the key issues. +Users must trust the data stored in their devices and must not need +to manually perform anything resembling system administration in order +to get their unit to work properly. 
+Fortunately, ease of use and robustness are important side +effects of simplicity and good design. +These, in turn, lead to a small size, providing the third +requirement of an embedded system. +<h3>2.1 The User Perspective</h3> +<p> +In the embedded database arena, it is the ongoing maintenance tasks +that must be automated, not necessarily the initial system configuration. +There are five tasks +that are traditionally performed by DBAs, +but must be performed automatically +in embedded database systems. +These tasks are +log archival and reclamation, +backup, +data compaction/reorganization, +automatic and rapid recovery, and +reinitialization from scratch. +<P> +Log archival and backup are tightly coupled. +Database backups are part of any +large database installation, and log archival is analogous to incremental +backup. +It is not clear what the implications of backup and archival are in +an embedded system. +Consumers do not back up their VCRs or refrigerators, yet they do +(or should) back up their personal computers or personal digital +assistants. +For the remainder of this paper, we assume that backups, in some form, +are required for gizmo databases (imagine having to reprogram, manually, +the television viewing access pattern learned by some set-top television +systems today). +Furthermore, we require that those backups are nearly instantaneous or +completely transparent, +as users should not be aware that their gizmos are being backed up +and should not have to explicitly initiate such backups. +<p> +Data compaction or reorganization has traditionally required periodic +dumping and restoration of +database tables and the recreation of indices. +In an embedded system, such reorganization must happen automatically. +<p> +Recovery issues are similar in embedded and traditional environments +with a few exceptions. 
+While a recovery time of a few seconds or even a minute is acceptable +for a large server installation, no one is willing to wait +for their telephone or television to reboot. +As with archival, recovery must be nearly instantaneous in an embedded product. +Secondly, it is often the case that a system will be completely +reinitialized, rather than simply rebooted. +In this case, the embedded database must be restored to its initial +state, freeing all its resources. +This is not typically a requirement of large server systems. +<h3>2.2 The Developer Perspective</h3> +<p> +In addition to the maintenance-free operation required of the +embedded systems, there are a number of requirements that fall +out of the constrained resources typically found in the "gizmos" +using gizmo databases. These requirements are: +small footprint, +short code-path, +programmatic interface for tight application coupling and +to avoid the overhead (in both time and size) of +interfaces such as SQL and ODBC, +application configurability and flexibility, +support for complete memory-resident operation (e.g., these systems +must run on gizmos without file systems), and +support for multi-threading. +<p> +A small footprint and short code-path are self-explanatory; however, +what is not as obvious is that the programmatic interface requirement +is the logical result of them. +Traditional interfaces such as ODBC and SQL add significant +size overhead and frequently add multiple context/thread switches +per operation, not to mention several IPC calls. +An embedded product is less likely to require the complex +query processing that SQL enables. +Instead, in the embedded space, the ability for an application +to configure the database for the specific tasks in question +is more important than a general query interface. +<p> +As some systems do not provide storage other than RAM and ROM, +it is essential that an embedded database work seamlessly +in memory-only environments.
+Similarly, many of today's embedded operating systems provide a +single address space architecture, so a simple, multi-threaded +capability is essential for applications requiring any concurrency. +<p> +In general, embedded applications run on gizmos whose native +operating system support varies tremendously. +For example, the embedded OS may or may +not support user-level processing or multi-threading. +Even if it does, a particular embedded +application may or may not need it. +Not all applications need more than one thread of control. +An embedded database must provide mechanisms to developers +without deciding policy. +For example, the threading model in an application is a matter of policy, +and depends +not on the database software, but on the hardware, operating +system, and the application's feature set. +Therefore, the data manager must provide for the use of multi-threading, +but not require it. + +<h2>3. Berkeley DB: A Database for Embedded Systems</h2> +Berkeley DB is the result of implementing database functionality +using the UNIX tool-based philosophy. +The current Berkeley DB package, as distributed by Sleepycat +Software, is a descendant of the hash and btree access methods +distributed with 4.4BSD and its descendants. +The original package (referred to as DB-1.85), +while intended as a public domain replacement for dbm and +its followers (e.g., ndbm, gdbm, etc.), rapidly became widely +used as an efficient, easy-to-use data store. +It was incorporated into a number of Open Source packages including +Perl, Sendmail, Kerberos, and the GNU C-library. +<p> +Versions 2.X and higher are distributed by Sleepycat Software and +add functionality for concurrency, logging, transactions, and +recovery. +Each piece of additional functionality is implemented as an independent +module, which means that the subsystems can be used outside the +context of Berkeley DB.
For example, the locking subsystem can +easily be used to implement locking for a non-DB application and +the shared memory buffer pool can be used for any application +caching data in main memory. +This subsystem design allows a designer to pick and choose +the functionality necessary for the application, minimizing +memory footprint and maximizing performance. +This addresses the small footprint and short code-path criteria +mentioned in the previous section. +<p> +As Berkeley DB grew out of a replacement for dbm, its primary +implementation language has always been C and its interface has +been programmatic. The C interface is the native interface, +unlike many database systems where the programmatic API is simply +a layer on top of an already-costly query interface (e.g. embedded +SQL). +Berkeley DB's heritage is also apparent in its data model; it has +none. +The database stores unstructured key/data pairs, specified as +variable length byte strings. +This leaves schema design and representation issues the responsibility +of the application, which is ideal for an embedded environment. +Applications retain full control over specification of their data +types, representation, index values, and index relationships. +In other words, Berkeley DB provides a robust, high-performance, +keyed storage system, not a particular database management system. +We have designed for simplicity and performance, trading off +complex, general purpose support that is better encapsulated in +applications. +<p> +Another element of Berkeley DB's programmatic interface is its +customizability; applications can specify Btree comparison and +prefix compression functions, hash functions, error routines, +and recovery models. +This means that embedded applications can tailor the underlying +database to best suit their data demands. 
+Similarly, the utilities traditionally bundled with a database +manager (e.g., recovery, dump/restore, archive) are implemented +as tiny wrapper programs around library routines. This means +that it is not necessary to run separate applications for the +utilities. Instead, independent threads can act as utility +daemons, or regular query threads can perform utility functions. +Many of the current products built on Berkeley DB are bundled as +a single large server with independent threads that perform functions +such as checkpoint, deadlock detection, and performance monitoring. +<p> +As mentioned earlier, living in an embedded environment requires +flexible management of storage. +Berkeley DB does not require any preallocation of disk space +for log or data files. +While many commercial database systems take complete control +of a raw device, Berkeley DB uses a normal file system, and +can therefore, safely and easily share a data space with other +programs. +All databases and log files are native files of the host environment, +so whatever utilities are provided by the environment can be used +to manage database files as well. +<p> +Berkeley DB provides three different memory models for its +management of shared information. +Applications can use the IEEE Std 1003.1b-1993 (POSIX) <tt>mmap</tt> +interface to share +data, they can use system shared memory, as frequently provided +by the shmget family of interfaces, or they can use per-process +heap memory (e.g., malloc). +Applications that require no permanent storage and do not provide +shared memory facilities can still use Berkeley DB by requesting +strictly private memory and specifying that all databases be +memory-resident. +This provides pure-memory operation. +<p> +Lastly, Berkeley DB is designed for rapid startup -- recovery can +happen automatically as part of system initialization. +This means that Berkeley DB works correctly in environments where +gizmos are suddenly shut down and restarted. + +<h2>4. 
Extensions for Embedded Environments </h2> +While the Berkeley DB library has been designed for use in +embedded systems, all the features described above are useful +in more conventional systems as well. +In this section, we discuss a number of features and "automatic +knobs" that are specifically geared +toward the more constrained environments found in gizmo databases. + +<h3>4.1 Automatic compression</h3> +Following the programmatic interface design philosophy, we +support application-specific (or default) compression routines. +These can be geared toward the particular data types present +in the application's dataset, thus providing better compression +than a general purpose routine. +Note that the application could instead specify an encryption +function and create encrypted databases instead of compressed ones. +Alternatively, the application might specify a function that performs +both compression and encryption. +<p> +As applications are also permitted to specify comparison and hash +functions, the application can choose to organize its data based +either on uncompressed and clear-text data or compressed and encrypted +data. +If the application indicates that data should be compared in its +processed form (i.e., compressed and encrypted), then the compression +and encryption are performed on individual data items and the in-memory +representation retains these characteristics. +However, if the application indicates that data should be compared in +its original form, then entire pages are transformed upon being read +into or written out of the main memory buffer cache. +These two alternatives provide the flexibility to trade space +and security for performance. + +<h3>4.2 In-memory logging & transactions</h3> +One of the four key properties of transaction systems is durability. +This means that transaction systems are designed for permanent storage +(most commonly disk).
However, as mentioned above, embedded systems +do not necessarily contain any such storage. +Nevertheless, transactions can be useful in this environment to +preserve the semantic integrity of the underlying storage. +Berkeley DB optionally provides logging functionality and +transaction support regardless of whether the database and logs +are on disk or in memory. + +<h3>4.3 Remote Logs</h3> +While we do not expect users to back up their television sets and +toasters, it is conceivable that a set-top box provided by a +cable carrier should, in fact, be backed up by that cable carrier. +The ability to store logs remotely can provide "information appliance" +functionality, and can also be used in conjunction with local logs +to enhance reliability. +Furthermore, remote logs provide for catastrophic recovery, e.g., loss +of the gizmo, destruction of the gizmo, etc. + +<h3>4.4 Application References to Database Buffers</h3> + +Typically, when data is returned to the user, it must be copied +from the data manager's buffer cache (or data page) into the +application's memory. +However, in an embedded environment, the robustness of the +total software package is of paramount importance, not the +isolation between the application and the data manager. +As a result, it is possible for the data manager to avoid +copies by giving applications direct references to data items +in a shared memory cache. +This is a significant performance optimization that can be +allowed when the application and data manager are tightly +integrated. + +<h3>4.5 Recoverable database creation/deletion</h3> + +In a conventional database management system, the creation of +database tables (relations) and indices is a heavyweight operation +that is not recoverable. +This is not acceptable in a complex embedded environment where +instantaneous recovery and robust operation in the face of +all types of database operations are essential.
+While Berkeley DB files can be removed using normal file system +utilities, we provide transaction protected utilities that +allow us to recover both database creation and deletion. + +<h3>4.6 Adaptive concurrency control</h3> +The Berkeley DB package uses page-level locking by default. +This trades off fine grain concurrency control for simplicity +during recovery. (Finer grain concurrency control can be +obtained by reducing the page size in the database.) +However, when multiple threads/processes perform page-locking +in the presence of writing operations, there is the +potential for deadlock. +As some environments do not need or desire the overhead of +logging and transactions, it is important to provide the +ability for concurrent access without the potential for +deadlock. +<p> +Berkeley DB provides an option to perform coarser grain, +deadlock-free locking. +Rather than locking on pages, locking is performed at the +interface to the database. +Multiple readers or a single writer are allowed to be +active in the database at any instant in time, with +conflicting requests queued automatically. +The presence of cursors, through which applications can both +read and write data, complicates this design. +If a cursor is currently being used for reading, but will later +be used to write, the system will be deadlock prone if no +special precautions are taken. +To handle this situation, we require that, when a cursor is +created, the application specify any future intention to write. +If there is an intention to write, the cursor is granted an +intention-to-write lock which does not conflict with readers, +but does conflict with other intention-to-write locks and write +locks. +The end result is that the application is limited to a single +potentially writing cursor accessing the database at any point +in time. +<p> +Under periods of low contention (but potentially high throughput), +the normal page-level locking provides the best overall throughput. 
+However, as contention rises, so does the potential for deadlock. +At some cross-over point, switching to the less concurrent, but +deadlock-free locking protocol will result in higher throughput +as operations never need to be retried. +Given the operating conditions of an embedded database manager, +it is useful to make this change automatically as the system +itself detects high contention. + +<h3>4.7 Adaptive synchronization</h3> + +In addition to the logical locks that protect the integrity of the +database pages, Berkeley DB must synchronize access to shared memory +data structures, such as the lock table, in-memory buffer pool, and +in-memory log buffer. +Each independent module uses a single mutex to protect its shared +data structures, under the assumption that operations that require +the mutex are very short and the potential for conflict is +low. +Unfortunately, in highly concurrent environments with multiple processors +present, this assumption is not always true. +When this assumption becomes invalid (that is, we observe significant +contention for the subsystem mutexes), we can switch over to a finer-grained +concurrency model for the mutexes. +Once again, there is a performance trade-off. Fine-grain mutexes +impose a penalty of approximately 25% (due to the increased number +of mutexes required for each operation), but allow for higher throughput. +Using fine-grain mutexes under low contention would cause a decrease +in performance, so it is important to monitor the system carefully, +so that the change can be executed only when it will increase system +throughput without jeopardizing latency. + +<h2>5. Footprint of an Embedded System</h2> +While traditional systems compete on price-performance, the +embedded players will compete on price, features, and footprint. +The earlier sections have focused on features; in this section +we focus on footprint.
+<p> +Oracle reports that Oracle Lite 3.0 requires 350 KB to 750 KB +of memory and approximately 2.5 MB of hard disk space <a href="#Oracle">[7]</a>. +This includes drivers for interfaces such as ODBC and JDBC. +In contrast, Berkeley DB ranges in size from 75 KB to under 200 KB, +foregoing heavyweight interfaces such as ODBC and JDBC and +providing a variety of deployed sizes that can be used depending +on application needs. At the low end, applications requiring +a simple single-user access method can choose from either extended +linear hashing, B+ trees, or record-number based retrieval and +pay only the 75 KB space requirement. +Applications requiring all three access methods will observe the +110 KB footprint. +At the high end, a fully recoverable, high-performance system +occupies less than a quarter megabyte of memory. +This is a system you can easily incorporate in your toaster oven. +Table 1 shows the per-module break down of the entire Berkeley DB +library. Note that this does not include memory used to cache database +pages. 
+ +<table border> +<tr><th colspan=4>Object sizes in bytes</th></tr> +<tr><th align=left>Subsystem</th><th align=center>Text</th><th align=center>Data</th><th align=center>Bss</th></tr> +<tr><td>Btree-specific routines</td><td align=right>28812</td><td align=right>0</td><td align=right>0</td></tr> +<tr><td>Recno-specific routines</td><td align=right>7211</td><td align=right>0</td><td align=right>0</td></tr> +<tr><td>Hash-specific routines</td><td align=right>23742</td><td align=right>0</td><td align=right>0</td></tr> +<tr><td colspan=4></td></tr> +<tr><td>Memory Pool</td><td align=right>14535</td><td align=right>0</td><td align=right>0</td></tr> +<tr><td>Access method common code</td><td align=right>23252</td><td align=right>0</td><td align=right>0</td></tr> +<tr><td>OS compatibility library</td><td align=right>4980</td><td align=right>52</td><td align=right>0</td></tr> +<tr><td>Support utilities</td><td align=right>6165</td><td align=right>0</td><td align=right>0</td></tr> +<tr><td colspan=4></td></tr> +<tr><th>All modules for Btree access method only</th><td align=right>77744</td><td align=right>52</td><td align=right>0</td></tr> +<tr><th>All modules for Recno access method only</th><td align=right>84955</td><td align=right>52</td><td align=right>0</td></tr> +<tr><th>All modules for Hash access method only</th><td align=right>72674</td><td align=right>52</td><td align=right>0</td></tr> +<tr><td colspan=4></td></tr> +<tr><th align=left>All Access Methods</th><td align=right>108697</td><td align=right>52</td><td align=right>0</td></tr> +<tr><td colspan=4><br></td></tr> +<tr><td>Locking</td><td align=right>12533</td><td align=right>0</td><td align=right>0</td></tr> +<tr><td colspan=4></td></tr> +<tr><td>Recovery</td><td align=right>26948</td><td align=right>8</td><td align=right>4</td></tr> +<tr><td>Logging</td><td align=right>37367</td><td align=right>0</td><td align=right>0</td></tr> +<tr><td colspan=4></td></tr> +<tr><th align=left>Full Package</th><td 
align=right>185545</td><td align=right>60</td><td align=right>4</td></tr> +<tr><br></tr> +</table> + +<h2>6. Related Work</h2> + +Every three to five years, leading researchers in the database +community convene to identify future directions in database +research. +They produce a report of this meeting, named for the year and +location of the meeting. +The most recent of these reports, the 1998 Asilomar report, +identifies the embedded database market as one of the +high growth areas in database research <a href="#Bern98">[1]</a>. +Not surprisingly, market analysts identify the embedded database +market as a high-growth area in the commercial sector as well <a href="#Host98"> +[5]</a>. +<p> +The Asilomar report identifies a new class of database applications, which they +term "gizmo" databases, small databases embedded in tiny mobile +appliances, e.g., smart-cards, telephones, personal digital assistants. +Such databases must be self-managing, secure and reliable. +Thus, the idea is that gizmo databases require plug and play data +management with no database administrator (DBA), no human settable +parameters, and the ability to adapt to changing conditions. +More specifically, the Asilomar authors claim that the goal is +self-tuning, including defining the physical DB design, the +logical DB design, and automatic reports and utilities <a href="#Bern98">[1]</a>. +To date, +few researchers have accepted this challenge, and there is a dearth +of research literature on the subject. +<p> +Our approach to embedded database administration is fundamentally +different from that described by the Asilomar authors. +We adopt their terminology, but view the challenge in supporting +gizmo databases to be that of self-sustenance <em>after</em> initial +deployment. Therefore, we find it not only acceptable but +desirable to assume that application developers control initial +database design and configuration.
To the best of our knowledge, +none of the published work in this area addresses this approach. +<p> +As the research community has not provided guidance in this +arena, most work in embedded database administration has fallen +to the commercial vendors. +These vendors fall into two camps, companies selling databases +specifically designed for embedding or programmatic access +and the major database vendors (e.g., Oracle, Informix, Sybase). +<p> +The embedded vendors all acknowledge the need for automatic +administration, but fail to identify precisely how their +products actually accomplish this. +A notable exception is Interbase whose white paper +comparison with Sybase and Microsoft's SQL servers +explicitly addresses features of maintenance ease. +Interbase claims that as they use no log files, there is +no need for log reclamation, checkpoint tuning, or other +tasks associated with log management. However, Interbase +uses Transaction Information Pages, and it is unclear +how these are reused or reclaimed <a href="#Interbase">[6]</a>. +Additionally, with a log-free system, they must use +a FORCE policy (write all pages to disk at commit), +as defined by Haerder and Reuter <a href="#Haerder">[4]</a>. This has +serious performance consequences for disk-based systems. +The approach described in this paper does use logs and +therefore requires log reclamation, +but provides hooks so the application may reclaim logs +safely and programmatically. +While Berkeley DB does require checkpoints, the goal of +tuning the checkpoint interval is to bound recovery time. +Since the checkpoint interval in Berkeley DB can be expressed +by the amount of log data written, it requires no tuning. +The application designer sets a target recovery time, and +selects the amount of log data that can be read in that interval +and specifies the checkpoint interval appropriately. Even as +load changes, the time to recover does not.
+<p> +The backup approaches taken by Interbase and Berkeley DB +are similar in that they both allow online backup, but +rather different in their effect on transactions running +during backup. As Interbase performs backups as transactions +<a href="#Interbase">[6]</a>, concurrent queries can suffer potentially long +delays. Berkeley DB uses native operating system utilities +and recovery for backups, so there is no interference with +concurrent activity, other than potential contention on disk +arms. +<p> +There are a number of database vendors selling in +the embedded market (e.g., Raima, +Centura, Pervasive, Faircom), but none highlight +the special requirements of embedded database +applications. +On the other end of the spectrum, the major vendors, +Oracle, Sybase, Microsoft, are all becoming convinced +of the importance of the embedded market. +As mentioned earlier, Oracle has announced its +Oracle Lite server for embedded use. +Sybase has announced its UltraLite platform for "application-optimized, +high-performance, SQL database engine for professional +application developers building solutions for mobile and embedded platforms." +<a href="#Sybase">[8]</a>. +We believe that SQL is incompatible with the +gizmo database environment or truly embedded systems for which Berkeley +DB is most suitable. +Microsoft Research is taking a different approach, developing +technology to assist in automating initial database design and +index specification <a href="#Chaud98">[2]</a><a href="#Chaud982">[3]</a>. +As mentioned earlier, we believe that such configuration is not only +acceptable in the embedded market but desirable, so that applications +can tune their database management for the target environment. +<h2>7. Conclusions</h2> +The coming wave of embedded systems poses a new set of challenges +for data management. +The traditional server-based, big footprint systems designed for +high performance on big iron are not the right approach in this +environment.
+Instead, application developers need small, fast, versatile systems +that can be tailored to a specific environment. +In this paper, we have identified several of the key issues in +providing these systems and shown how Berkeley DB provides +many of the characteristics necessary for such applications. + +<h2>8. References</h2> +<p> +[1] <a name="Bern98"> Bernstein, P., Brodie, M., Ceri, S., DeWitt, D., Franklin, M., +Garcia-Molina, H., Gray, J., Held, J., Hellerstein, J., +Jagadish, H., Lesk, M., Maier, D., Naughton, J., +Pirahesh, H., Stonebraker, M., Ullman, J., +"The Asilomar Report on Database Research," +SIGMOD Record 27(4): 74-80, 1998. +</a> +<p> +[2] <a name="Chaud98"> Chaudhuri, S., Narasayya, V., +"AutoAdmin 'What-If' Index Analysis Utility," +<em>Proceedings of the ACM SIGMOD Conference</em>, Seattle, 1998. +</a> +<p> +[3] <a name="Chaud982"> Chaudhuri, S., Narasayya, V., +"An Efficient, Cost-Driven Index Selection Tool for Microsoft SQL Server," +<em>Proceedings of the 23rd VLDB Conference</em>, Athens, Greece, 1997. +</a> +<p> +[4] <a name="Haerder"> Haerder, T., Reuter, A., +"Principles of Transaction-Oriented Database Recovery," +<em>Computing Surveys 15</em>,4 (1983), 237-318. +</a> +<p> +[5] <a name="Host98"> Hostetler, M., "Cover Is Off A New Type of Database," +Embedded DB News, +http://www.theadvisors.com/embeddeddbnews.htm, +5/6/98. +</a> +<p> +[6] <a name="Interbase"> Interbase, "A Comparison of Borland InterBase 4.0 +Sybase SQL Server and Microsoft SQL Server," +http://web.interbase.com/products/doc_info_f.html. +</a> +<p> +[7] <a name="Oracle"> Oracle, "Oracle Delivers New Server, Application Suite +to Power the Web for Mission-Critical Business," +http://www.oracle.com.sg/partners/news/newserver.htm, +May 1998. +</a> +<p> +[8] <a name="Sybase"> Sybase, Sybase UltraLite, http://www.sybase.com/products/ultralite/beta.
+</a> +<p> +[9] <a name="TPCC"> Transaction Processing Council, "TPC-C Benchmark Specification, +Version 3.4," San Jose, CA, August 1998. +</a> +<p> +[10] <a name="TPCD"> Transaction Processing Council, "TPC-D Benchmark Specification, +Version 2.1," San Jose, CA, April 1999. +</a> +</body> +</html> + + diff --git a/db/docs/ref/refs/hash_usenix.ps b/db/docs/ref/refs/hash_usenix.ps new file mode 100644 index 000000000..c88477883 --- /dev/null +++ b/db/docs/ref/refs/hash_usenix.ps @@ -0,0 +1,12209 @@ +%!PS-Adobe-1.0 +%%Creator: utopia:margo (& Seltzer,608-13E,8072,) +%%Title: stdin (ditroff) +%%CreationDate: Tue Dec 11 15:06:45 1990 +%%EndComments +% @(#)psdit.pro 1.3 4/15/88 +% lib/psdit.pro -- prolog for psdit (ditroff) files +% Copyright (c) 1984, 1985 Adobe Systems Incorporated. All Rights Reserved. +% last edit: shore Sat Nov 23 20:28:03 1985 +% RCSID: $Header: psdit.pro,v 2.1 85/11/24 12:19:43 shore Rel $ + +% Changed by Edward Wang (edward@ucbarpa.berkeley.edu) to handle graphics, +% 17 Feb, 87. 
+ +/$DITroff 140 dict def $DITroff begin +/fontnum 1 def /fontsize 10 def /fontheight 10 def /fontslant 0 def +/xi{0 72 11 mul translate 72 resolution div dup neg scale 0 0 moveto + /fontnum 1 def /fontsize 10 def /fontheight 10 def /fontslant 0 def F + /pagesave save def}def +/PB{save /psv exch def currentpoint translate + resolution 72 div dup neg scale 0 0 moveto}def +/PE{psv restore}def +/arctoobig 90 def /arctoosmall .05 def +/m1 matrix def /m2 matrix def /m3 matrix def /oldmat matrix def +/tan{dup sin exch cos div}def +/point{resolution 72 div mul}def +/dround {transform round exch round exch itransform}def +/xT{/devname exch def}def +/xr{/mh exch def /my exch def /resolution exch def}def +/xp{}def +/xs{docsave restore end}def +/xt{}def +/xf{/fontname exch def /slotno exch def fontnames slotno get fontname eq not + {fonts slotno fontname findfont put fontnames slotno fontname put}if}def +/xH{/fontheight exch def F}def +/xS{/fontslant exch def F}def +/s{/fontsize exch def /fontheight fontsize def F}def +/f{/fontnum exch def F}def +/F{fontheight 0 le{/fontheight fontsize def}if + fonts fontnum get fontsize point 0 0 fontheight point neg 0 0 m1 astore + fontslant 0 ne{1 0 fontslant tan 1 0 0 m2 astore m3 concatmatrix}if + makefont setfont .04 fontsize point mul 0 dround pop setlinewidth}def +/X{exch currentpoint exch pop moveto show}def +/N{3 1 roll moveto show}def +/Y{exch currentpoint pop exch moveto show}def +/S{show}def +/ditpush{}def/ditpop{}def +/AX{3 -1 roll currentpoint exch pop moveto 0 exch ashow}def +/AN{4 2 roll moveto 0 exch ashow}def +/AY{3 -1 roll currentpoint pop exch moveto 0 exch ashow}def +/AS{0 exch ashow}def +/MX{currentpoint exch pop moveto}def +/MY{currentpoint pop exch moveto}def +/MXY{moveto}def +/cb{pop}def % action on unknown char -- nothing for now +/n{}def/w{}def +/p{pop showpage pagesave restore /pagesave save def}def +/Dt{/Dlinewidth exch def}def 1 Dt +/Ds{/Ddash exch def}def -1 Ds +/Di{/Dstipple exch def}def 1 Di +/Dsetlinewidth{2 
Dlinewidth mul setlinewidth}def +/Dsetdash{Ddash 4 eq{[8 12]}{Ddash 16 eq{[32 36]} + {Ddash 20 eq{[32 12 8 12]}{[]}ifelse}ifelse}ifelse 0 setdash}def +/Dstroke{gsave Dsetlinewidth Dsetdash 1 setlinecap stroke grestore + currentpoint newpath moveto}def +/Dl{rlineto Dstroke}def +/arcellipse{/diamv exch def /diamh exch def oldmat currentmatrix pop + currentpoint translate 1 diamv diamh div scale /rad diamh 2 div def + currentpoint exch rad add exch rad -180 180 arc oldmat setmatrix}def +/Dc{dup arcellipse Dstroke}def +/De{arcellipse Dstroke}def +/Da{/endv exch def /endh exch def /centerv exch def /centerh exch def + /cradius centerv centerv mul centerh centerh mul add sqrt def + /eradius endv endv mul endh endh mul add sqrt def + /endang endv endh atan def + /startang centerv neg centerh neg atan def + /sweep startang endang sub dup 0 lt{360 add}if def + sweep arctoobig gt + {/midang startang sweep 2 div sub def /midrad cradius eradius add 2 div def + /midh midang cos midrad mul def /midv midang sin midrad mul def + midh neg midv neg endh endv centerh centerv midh midv Da + Da} + {sweep arctoosmall ge + {/controldelt 1 sweep 2 div cos sub 3 sweep 2 div sin mul div 4 mul def + centerv neg controldelt mul centerh controldelt mul + endv neg controldelt mul centerh add endh add + endh controldelt mul centerv add endv add + centerh endh add centerv endv add rcurveto Dstroke} + {centerh endh add centerv endv add rlineto Dstroke} + ifelse} + ifelse}def +/Dpatterns[ +[%cf[widthbits] +[8<0000000000000010>] +[8<0411040040114000>] +[8<0204081020408001>] +[8<0000103810000000>] +[8<6699996666999966>] +[8<0000800100001008>] +[8<81c36666c3810000>] +[8<0f0e0c0800000000>] +[8<0000000000000010>] +[8<0411040040114000>] +[8<0204081020408001>] +[8<0000001038100000>] +[8<6699996666999966>] +[8<0000800100001008>] +[8<81c36666c3810000>] +[8<0f0e0c0800000000>] +[8<0042660000246600>] +[8<0000990000990000>] +[8<0804020180402010>] +[8<2418814242811824>] +[8<6699996666999966>] 
+[8<8000000008000000>] +[8<00001c3e363e1c00>] +[8<0000000000000000>] +[32<00000040000000c00000004000000040000000e0000000000000000000000000>] +[32<00000000000060000000900000002000000040000000f0000000000000000000>] +[32<000000000000000000e0000000100000006000000010000000e0000000000000>] +[32<00000000000000002000000060000000a0000000f00000002000000000000000>] +[32<0000000e0000000000000000000000000000000f000000080000000e00000001>] +[32<0000090000000600000000000000000000000000000007000000080000000e00>] +[32<00010000000200000004000000040000000000000000000000000000000f0000>] +[32<0900000006000000090000000600000000000000000000000000000006000000>]] +[%ug +[8<0000020000000000>] +[8<0000020000002000>] +[8<0004020000002000>] +[8<0004020000402000>] +[8<0004060000402000>] +[8<0004060000406000>] +[8<0006060000406000>] +[8<0006060000606000>] +[8<00060e0000606000>] +[8<00060e000060e000>] +[8<00070e000060e000>] +[8<00070e000070e000>] +[8<00070e020070e000>] +[8<00070e020070e020>] +[8<04070e020070e020>] +[8<04070e024070e020>] +[8<04070e064070e020>] +[8<04070e064070e060>] +[8<06070e064070e060>] +[8<06070e066070e060>] +[8<06070f066070e060>] +[8<06070f066070f060>] +[8<060f0f066070f060>] +[8<060f0f0660f0f060>] +[8<060f0f0760f0f060>] +[8<060f0f0760f0f070>] +[8<0e0f0f0760f0f070>] +[8<0e0f0f07e0f0f070>] +[8<0e0f0f0fe0f0f070>] +[8<0e0f0f0fe0f0f0f0>] +[8<0f0f0f0fe0f0f0f0>] +[8<0f0f0f0ff0f0f0f0>] +[8<1f0f0f0ff0f0f0f0>] +[8<1f0f0f0ff1f0f0f0>] +[8<1f0f0f8ff1f0f0f0>] +[8<1f0f0f8ff1f0f0f8>] +[8<9f0f0f8ff1f0f0f8>] +[8<9f0f0f8ff9f0f0f8>] +[8<9f0f0f9ff9f0f0f8>] +[8<9f0f0f9ff9f0f0f9>] +[8<9f8f0f9ff9f0f0f9>] +[8<9f8f0f9ff9f8f0f9>] +[8<9f8f1f9ff9f8f0f9>] +[8<9f8f1f9ff9f8f1f9>] +[8<bf8f1f9ff9f8f1f9>] +[8<bf8f1f9ffbf8f1f9>] +[8<bf8f1fdffbf8f1f9>] +[8<bf8f1fdffbf8f1fd>] +[8<ff8f1fdffbf8f1fd>] +[8<ff8f1fdffff8f1fd>] +[8<ff8f1ffffff8f1fd>] +[8<ff8f1ffffff8f1ff>] +[8<ff9f1ffffff8f1ff>] +[8<ff9f1ffffff9f1ff>] +[8<ff9f9ffffff9f1ff>] +[8<ff9f9ffffff9f9ff>] +[8<ffbf9ffffff9f9ff>] +[8<ffbf9ffffffbf9ff>] 
+[8<ffbfdffffffbf9ff>] +[8<ffbfdffffffbfdff>] +[8<ffffdffffffbfdff>] +[8<ffffdffffffffdff>] +[8<fffffffffffffdff>] +[8<ffffffffffffffff>]] +[%mg +[8<8000000000000000>] +[8<0822080080228000>] +[8<0204081020408001>] +[8<40e0400000000000>] +[8<66999966>] +[8<8001000010080000>] +[8<81c36666c3810000>] +[8<f0e0c08000000000>] +[16<07c00f801f003e007c00f800f001e003c007800f001f003e007c00f801f003e0>] +[16<1f000f8007c003e001f000f8007c003e001f800fc007e003f001f8007c003e00>] +[8<c3c300000000c3c3>] +[16<0040008001000200040008001000200040008000000100020004000800100020>] +[16<0040002000100008000400020001800040002000100008000400020001000080>] +[16<1fc03fe07df0f8f8f07de03fc01f800fc01fe03ff07df8f87df03fe01fc00f80>] +[8<80>] +[8<8040201000000000>] +[8<84cc000048cc0000>] +[8<9900009900000000>] +[8<08040201804020100800020180002010>] +[8<2418814242811824>] +[8<66999966>] +[8<8000000008000000>] +[8<70f8d8f870000000>] +[8<0814224180402010>] +[8<aa00440a11a04400>] +[8<018245aa45820100>] +[8<221c224180808041>] +[8<88000000>] +[8<0855800080550800>] +[8<2844004482440044>] +[8<0810204080412214>] +[8<00>]]]def +/Dfill{ + transform /maxy exch def /maxx exch def + transform /miny exch def /minx exch def + minx maxx gt{/minx maxx /maxx minx def def}if + miny maxy gt{/miny maxy /maxy miny def def}if + Dpatterns Dstipple 1 sub get exch 1 sub get + aload pop /stip exch def /stipw exch def /stiph 128 def + /imatrix[stipw 0 0 stiph 0 0]def + /tmatrix[stipw 0 0 stiph 0 0]def + /minx minx cvi stiph idiv stiph mul def + /miny miny cvi stipw idiv stipw mul def + gsave eoclip 0 setgray + miny stiph maxy{ + tmatrix exch 5 exch put + minx stipw maxx{ + tmatrix exch 4 exch put tmatrix setmatrix + stipw stiph true imatrix {stip} imagemask + }for + }for + grestore +}def +/Dp{Dfill Dstroke}def +/DP{Dfill currentpoint newpath moveto}def +end + +/ditstart{$DITroff begin + /nfonts 60 def % NFONTS makedev/ditroff dependent! 
+ /fonts[nfonts{0}repeat]def + /fontnames[nfonts{()}repeat]def +/docsave save def +}def + +% character outcalls +/oc{ + /pswid exch def /cc exch def /name exch def + /ditwid pswid fontsize mul resolution mul 72000 div def + /ditsiz fontsize resolution mul 72 div def + ocprocs name known{ocprocs name get exec}{name cb}ifelse +}def +/fractm [.65 0 0 .6 0 0] def +/fraction{ + /fden exch def /fnum exch def gsave /cf currentfont def + cf fractm makefont setfont 0 .3 dm 2 copy neg rmoveto + fnum show rmoveto currentfont cf setfont(\244)show setfont fden show + grestore ditwid 0 rmoveto +}def +/oce{grestore ditwid 0 rmoveto}def +/dm{ditsiz mul}def +/ocprocs 50 dict def ocprocs begin +(14){(1)(4)fraction}def +(12){(1)(2)fraction}def +(34){(3)(4)fraction}def +(13){(1)(3)fraction}def +(23){(2)(3)fraction}def +(18){(1)(8)fraction}def +(38){(3)(8)fraction}def +(58){(5)(8)fraction}def +(78){(7)(8)fraction}def +(sr){gsave 0 .06 dm rmoveto(\326)show oce}def +(is){gsave 0 .15 dm rmoveto(\362)show oce}def +(->){gsave 0 .02 dm rmoveto(\256)show oce}def +(<-){gsave 0 .02 dm rmoveto(\254)show oce}def +(==){gsave 0 .05 dm rmoveto(\272)show oce}def +(uc){gsave currentpoint 400 .009 dm mul add translate + 8 -8 scale ucseal oce}def +end + +% an attempt at a PostScript FONT to implement ditroff special chars +% this will enable us to +% cache the little buggers +% generate faster, more compact PS out of psdit +% confuse everyone (including myself)! +50 dict dup begin +/FontType 3 def +/FontName /DIThacks def +/FontMatrix [.001 0 0 .001 0 0] def +/FontBBox [-260 -260 900 900] def% a lie but ... 
+/Encoding 256 array def +0 1 255{Encoding exch /.notdef put}for +Encoding + dup 8#040/space put %space + dup 8#110/rc put %right ceil + dup 8#111/lt put %left top curl + dup 8#112/bv put %bold vert + dup 8#113/lk put %left mid curl + dup 8#114/lb put %left bot curl + dup 8#115/rt put %right top curl + dup 8#116/rk put %right mid curl + dup 8#117/rb put %right bot curl + dup 8#120/rf put %right floor + dup 8#121/lf put %left floor + dup 8#122/lc put %left ceil + dup 8#140/sq put %square + dup 8#141/bx put %box + dup 8#142/ci put %circle + dup 8#143/br put %box rule + dup 8#144/rn put %root extender + dup 8#145/vr put %vertical rule + dup 8#146/ob put %outline bullet + dup 8#147/bu put %bullet + dup 8#150/ru put %rule + dup 8#151/ul put %underline + pop +/DITfd 100 dict def +/BuildChar{0 begin + /cc exch def /fd exch def + /charname fd /Encoding get cc get def + /charwid fd /Metrics get charname get def + /charproc fd /CharProcs get charname get def + charwid 0 fd /FontBBox get aload pop setcachedevice + 2 setlinejoin 40 setlinewidth + newpath 0 0 moveto gsave charproc grestore + end}def +/BuildChar load 0 DITfd put +/CharProcs 50 dict def +CharProcs begin +/space{}def +/.notdef{}def +/ru{500 0 rls}def +/rn{0 840 moveto 500 0 rls}def +/vr{0 800 moveto 0 -770 rls}def +/bv{0 800 moveto 0 -1000 rls}def +/br{0 840 moveto 0 -1000 rls}def +/ul{0 -140 moveto 500 0 rls}def +/ob{200 250 rmoveto currentpoint newpath 200 0 360 arc closepath stroke}def +/bu{200 250 rmoveto currentpoint newpath 200 0 360 arc closepath fill}def +/sq{80 0 rmoveto currentpoint dround newpath moveto + 640 0 rlineto 0 640 rlineto -640 0 rlineto closepath stroke}def +/bx{80 0 rmoveto currentpoint dround newpath moveto + 640 0 rlineto 0 640 rlineto -640 0 rlineto closepath fill}def +/ci{500 360 rmoveto currentpoint newpath 333 0 360 arc + 50 setlinewidth stroke}def + +/lt{0 -200 moveto 0 550 rlineto currx 800 2cx s4 add exch s4 a4p stroke}def +/lb{0 800 moveto 0 -550 rlineto currx -200 2cx s4 add exch 
s4 a4p stroke}def +/rt{0 -200 moveto 0 550 rlineto currx 800 2cx s4 sub exch s4 a4p stroke}def +/rb{0 800 moveto 0 -500 rlineto currx -200 2cx s4 sub exch s4 a4p stroke}def +/lk{0 800 moveto 0 300 -300 300 s4 arcto pop pop 1000 sub + 0 300 4 2 roll s4 a4p 0 -200 lineto stroke}def +/rk{0 800 moveto 0 300 s2 300 s4 arcto pop pop 1000 sub + 0 300 4 2 roll s4 a4p 0 -200 lineto stroke}def +/lf{0 800 moveto 0 -1000 rlineto s4 0 rls}def +/rf{0 800 moveto 0 -1000 rlineto s4 neg 0 rls}def +/lc{0 -200 moveto 0 1000 rlineto s4 0 rls}def +/rc{0 -200 moveto 0 1000 rlineto s4 neg 0 rls}def +end + +/Metrics 50 dict def Metrics begin +/.notdef 0 def +/space 500 def +/ru 500 def +/br 0 def +/lt 416 def +/lb 416 def +/rt 416 def +/rb 416 def +/lk 416 def +/rk 416 def +/rc 416 def +/lc 416 def +/rf 416 def +/lf 416 def +/bv 416 def +/ob 350 def +/bu 350 def +/ci 750 def +/bx 750 def +/sq 750 def +/rn 500 def +/ul 500 def +/vr 0 def +end + +DITfd begin +/s2 500 def /s4 250 def /s3 333 def +/a4p{arcto pop pop pop pop}def +/2cx{2 copy exch}def +/rls{rlineto stroke}def +/currx{currentpoint pop}def +/dround{transform round exch round exch itransform} def +end +end +/DIThacks exch definefont pop +ditstart +(psc)xT +576 1 1 xr +1(Times-Roman)xf 1 f +2(Times-Italic)xf 2 f +3(Times-Bold)xf 3 f +4(Times-BoldItalic)xf 4 f +5(Helvetica)xf 5 f +6(Helvetica-Bold)xf 6 f +7(Courier)xf 7 f +8(Courier-Bold)xf 8 f +9(Symbol)xf 9 f +10(DIThacks)xf 10 f +10 s +1 f +xi +%%EndProlog + +%%Page: 1 1 +10 s 10 xH 0 xS 1 f +3 f +22 s +1249 626(A)N +1420(N)X +1547(ew)X +1796(H)X +1933(ashing)X +2467(P)X +2574(ackage)X +3136(for)X +3405(U)X +3532(N)X +3659(IX)X +2 f +20 s +3855 562(1)N +1 f +12 s +1607 779(Margo)N +1887(Seltzer)X +9 f +2179(-)X +1 f +2256(University)X +2686(of)X +2790(California,)X +3229(Berkeley)X +2015 875(Ozan)N +2242(Yigit)X +9 f +2464(-)X +1 f +2541(York)X +2762(University)X +3 f +2331 1086(ABSTRACT)N +1 f +10 s +1152 1222(UNIX)N +1385(support)X +1657(of)X +1756(disk)X +1921(oriented)X 
+2216(hashing)X +2497(was)X +2654(originally)X +2997(provided)X +3314(by)X +2 f +3426(dbm)X +1 f +3595([ATT79])X +3916(and)X +1152 1310(subsequently)N +1595(improved)X +1927(upon)X +2112(in)X +2 f +2199(ndbm)X +1 f +2402([BSD86].)X +2735(In)X +2826(AT&T)X +3068(System)X +3327(V,)X +3429(in-memory)X +3809(hashed)X +1152 1398(storage)N +1420(and)X +1572(access)X +1814(support)X +2090(was)X +2251(added)X +2479(in)X +2577(the)X +2 f +2711(hsearch)X +1 f +3000(library)X +3249(routines)X +3542([ATT85].)X +3907(The)X +1152 1486(result)N +1367(is)X +1457(a)X +1530(system)X +1789(with)X +1968(two)X +2125(incompatible)X +2580(hashing)X +2865(schemes,)X +3193(each)X +3377(with)X +3555(its)X +3666(own)X +3840(set)X +3965(of)X +1152 1574(shortcomings.)N +1152 1688(This)N +1316(paper)X +1517(presents)X +1802(the)X +1922(design)X +2152(and)X +2289(performance)X +2717(characteristics)X +3198(of)X +3286(a)X +3343(new)X +3498(hashing)X +3768(package)X +1152 1776(providing)N +1483(a)X +1539(superset)X +1822(of)X +1909(the)X +2027(functionality)X +2456(provided)X +2761(by)X +2 f +2861(dbm)X +1 f +3019(and)X +2 f +3155(hsearch)X +1 f +3409(.)X +3469(The)X +3614(new)X +3768(package)X +1152 1864(uses)N +1322(linear)X +1537(hashing)X +1818(to)X +1912(provide)X +2189(ef\256cient)X +2484(support)X +2755(of)X +2853(both)X +3026(memory)X +3324(based)X +3538(and)X +3685(disk)X +3849(based)X +1152 1952(hash)N +1319(tables)X +1526(with)X +1688(performance)X +2115(superior)X +2398(to)X +2480(both)X +2 f +2642(dbm)X +1 f +2800(and)X +2 f +2936(hsearch)X +1 f +3210(under)X +3413(most)X +3588(conditions.)X +3 f +1380 2128(Introduction)N +1 f +892 2260(Current)N +1196(UNIX)X +1456(systems)X +1768(offer)X +1984(two)X +2163(forms)X +2409(of)X +720 2348(hashed)N +973(data)X +1137(access.)X +2 f +1413(Dbm)X +1 f +1599(and)X +1745(its)X +1850(derivatives)X +2231(provide)X +720 2436(keyed)N +939(access)X +1171(to)X +1259(disk)X +1418(resident)X +1698(data)X +1858(while)X +2 f +2062(hsearch)X +1 f 
+2342(pro-)X +720 2524(vides)N +929(access)X +1175(for)X +1309(memory)X +1616(resident)X +1910(data.)X +2124(These)X +2356(two)X +720 2612(access)N +979(methods)X +1302(are)X +1453(incompatible)X +1923(in)X +2037(that)X +2209(memory)X +720 2700(resident)N +1011(hash)X +1195(tables)X +1419(may)X +1593(not)X +1731(be)X +1843(stored)X +2075(on)X +2191(disk)X +2360(and)X +720 2788(disk)N +884(resident)X +1169(tables)X +1387(cannot)X +1632(be)X +1739(read)X +1909(into)X +2063(memory)X +2360(and)X +720 2876(accessed)N +1022(using)X +1215(the)X +1333(in-memory)X +1709(routines.)X +2 f +892 2990(Dbm)N +1 f +1091(has)X +1241(several)X +1512(shortcomings.)X +2026(Since)X +2247(data)X +2423(is)X +720 3078(assumed)N +1032(to)X +1130(be)X +1242(disk)X +1411(resident,)X +1721(each)X +1905(access)X +2146(requires)X +2440(a)X +720 3166(system)N +963(call,)X +1120(and)X +1257(almost)X +1491(certainly,)X +1813(a)X +1869(disk)X +2022(operation.)X +2365(For)X +720 3254(extremely)N +1072(large)X +1264(databases,)X +1623(where)X +1851(caching)X +2131(is)X +2214(unlikely)X +720 3342(to)N +810(be)X +914(effective,)X +1244(this)X +1386(is)X +1466(acceptable,)X +1853(however,)X +2177(when)X +2378(the)X +720 3430(database)N +1022(is)X +1100(small)X +1298(\(i.e.)X +1447(the)X +1569(password)X +1896(\256le\),)X +2069(performance)X +720 3518(improvements)N +1204(can)X +1342(be)X +1443(obtained)X +1744(through)X +2018(caching)X +2293(pages)X +720 3606(of)N +818(the)X +947(database)X +1255(in)X +1348(memory.)X +1685(In)X +1782(addition,)X +2 f +2094(dbm)X +1 f +2262(cannot)X +720 3694(store)N +902(data)X +1062(items)X +1261(whose)X +1492(total)X +1660(key)X +1802(and)X +1943(data)X +2102(size)X +2252(exceed)X +720 3782(the)N +850(page)X +1034(size)X +1191(of)X +1290(the)X +1420(hash)X +1599(table.)X +1827(Similarly,)X +2176(if)X +2257(two)X +2409(or)X +720 3870(more)N +907(keys)X +1076(produce)X +1357(the)X +1477(same)X +1664(hash)X +1833(value)X +2029(and)X +2166(their)X +2334(total)X +720 
3958(size)N +876(exceeds)X +1162(the)X +1291(page)X +1474(size,)X +1650(the)X +1779(table)X +1966(cannot)X +2210(store)X +2396(all)X +720 4046(the)N +838(colliding)X +1142(keys.)X +892 4160(The)N +1050(in-memory)X +2 f +1439(hsearch)X +1 f +1725(routines)X +2015(have)X +2199(different)X +720 4248(shortcomings.)N +1219(First,)X +1413(the)X +1539(notion)X +1771(of)X +1865(a)X +1928(single)X +2146(hash)X +2320(table)X +720 4336(is)N +807(embedded)X +1171(in)X +1266(the)X +1397(interface,)X +1732(preventing)X +2108(an)X +2217(applica-)X +720 4424(tion)N +902(from)X +1116(accessing)X +1482(multiple)X +1806(tables)X +2050(concurrently.)X +720 4512(Secondly,)N +1063(the)X +1186(routine)X +1438(to)X +1525(create)X +1743(a)X +1804(hash)X +1976(table)X +2157(requires)X +2440(a)X +720 4600(parameter)N +1066(which)X +1286(declares)X +1573(the)X +1694(size)X +1842(of)X +1932(the)X +2053(hash)X +2223(table.)X +2422(If)X +720 4688(this)N +856(size)X +1001(is)X +1074(set)X +1183(too)X +1305(low,)X +1465(performance)X +1892(degradation)X +2291(or)X +2378(the)X +720 4776(inability)N +1008(to)X +1092(add)X +1230(items)X +1425(to)X +1509(the)X +1628(table)X +1805(may)X +1964(result.)X +2223(In)X +2311(addi-)X +720 4864(tion,)N +2 f +910(hsearch)X +1 f +1210(requires)X +1515(that)X +1681(the)X +1825(application)X +2226(allocate)X +720 4952(memory)N +1037(for)X +1181(the)X +1329(key)X +1495(and)X +1661(data)X +1845(items.)X +2108(Lastly,)X +2378(the)X +2 f +720 5040(hsearch)N +1 f +1013(routines)X +1310(provide)X +1594(no)X +1713(interface)X +2034(to)X +2135(store)X +2329(hash)X +720 5128(tables)N +927(on)X +1027(disk.)X +16 s +720 5593 MXY +864 0 Dl +2 f +8 s +760 5648(1)N +1 f +9 s +5673(UNIX)Y +990(is)X +1056(a)X +1106(registered)X +1408(trademark)X +1718(of)X +1796(AT&T.)X +10 s +2878 2128(The)N +3032(goal)X +3199(of)X +3295(our)X +3431(work)X +3625(was)X +3779(to)X +3870(design)X +4108(and)X +4253(imple-)X +2706 2216(ment)N +2900(a)X +2970(new)X +3138(package)X +3436(that)X 
+3590(provides)X +3899(a)X +3968(superset)X +4264(of)X +4364(the)X +2706 2304(functionality)N +3144(of)X +3240(both)X +2 f +3411(dbm)X +1 f +3578(and)X +2 f +3723(hsearch)X +1 f +3977(.)X +4045(The)X +4198(package)X +2706 2392(had)N +2871(to)X +2982(overcome)X +3348(the)X +3495(interface)X +3826(shortcomings)X +4306(cited)X +2706 2480(above)N +2930(and)X +3078(its)X +3185(implementation)X +3719(had)X +3867(to)X +3961(provide)X +4238(perfor-)X +2706 2568(mance)N +2942(equal)X +3142(or)X +3235(superior)X +3524(to)X +3612(that)X +3758(of)X +3851(the)X +3975(existing)X +4253(imple-)X +2706 2656(mentations.)N +3152(In)X +3274(order)X +3498(to)X +3614(provide)X +3913(a)X +4003(compact)X +4329(disk)X +2706 2744(representation,)N +3224(graceful)X +3531(table)X +3729(growth,)X +4018(and)X +4176(expected)X +2706 2832(constant)N +3033(time)X +3234(performance,)X +3720(we)X +3873(selected)X +4191(Litwin's)X +2706 2920(linear)N +2923(hashing)X +3206(algorithm)X +3551([LAR88,)X +3872(LIT80].)X +4178(We)X +4324(then)X +2706 3008(enhanced)N +3037(the)X +3161(algorithm)X +3498(to)X +3586(handle)X +3826(page)X +4004(over\257ows)X +4346(and)X +2706 3096(large)N +2900(key)X +3049(handling)X +3362(with)X +3537(a)X +3606(single)X +3830(mechanism,)X +4248(named)X +2706 3184(buddy-in-waiting.)N +3 f +2975 3338(Existing)N +3274(UNIX)X +3499(Hashing)X +3802(Techniques)X +1 f +2878 3470(Over)N +3076(the)X +3210(last)X +3357(decade,)X +3637(several)X +3901(dynamic)X +4213(hashing)X +2706 3558(schemes)N +3000(have)X +3174(been)X +3348(developed)X +3700(for)X +3816(the)X +3936(UNIX)X +4159(timeshar-)X +2706 3646(ing)N +2856(system,)X +3146(starting)X +3433(with)X +3622(the)X +3767(inclusion)X +4107(of)X +2 f +4221(dbm)X +1 f +4359(,)X +4426(a)X +2706 3734(minimal)N +3008(database)X +3321(library)X +3571(written)X +3834(by)X +3950(Ken)X +4120(Thompson)X +2706 3822([THOM90],)N +3141(in)X +3248(the)X +3391(Seventh)X +3694(Edition)X +3974(UNIX)X +4220(system.)X +2706 3910(Since)N +2916(then,)X 
+3106(an)X +3214(extended)X +3536(version)X +3804(of)X +3903(the)X +4032(same)X +4228(library,)X +2 f +2706 3998(ndbm)N +1 f +2884(,)X +2933(and)X +3078(a)X +3142(public-domain)X +3637(clone)X +3839(of)X +3934(the)X +4060(latter,)X +2 f +4273(sdbm)X +1 f +4442(,)X +2706 4086(have)N +2902(been)X +3098(developed.)X +3491(Another)X +3797 0.1645(interface-compatible)AX +2706 4174(library)N +2 f +2950(gdbm)X +1 f +3128(,)X +3178(was)X +3333(recently)X +3622(made)X +3826(available)X +4145(as)X +4241(part)X +4395(of)X +2706 4262(the)N +2829(Free)X +2997(Software)X +3312(Foundation's)X +3759(\(FSF\))X +3970(software)X +4271(distri-)X +2706 4350(bution.)N +2878 4464(All)N +3017(of)X +3121(these)X +3323(implementations)X +3893(are)X +4029(based)X +4248(on)X +4364(the)X +2706 4552(idea)N +2871(of)X +2969(revealing)X +3299(just)X +3445(enough)X +3711(bits)X +3856(of)X +3953(a)X +4019(hash)X +4196(value)X +4400(to)X +2706 4640(locate)N +2920(a)X +2978(page)X +3151(in)X +3234(a)X +3291(single)X +3503(access.)X +3770(While)X +2 f +3987(dbm/ndbm)X +1 f +4346(and)X +2 f +2706 4728(sdbm)N +1 f +2908(map)X +3079(the)X +3210(hash)X +3390(value)X +3597(directly)X +3874(to)X +3968(a)X +4036(disk)X +4201(address,)X +2 f +2706 4816(gdbm)N +1 f +2921(uses)X +3096(the)X +3231(hash)X +3414(value)X +3624(to)X +3722(index)X +3936(into)X +4096(a)X +2 f +4168(directory)X +1 f +2706 4904([ENB88])N +3020(containing)X +3378(disk)X +3531(addresses.)X +2878 5018(The)N +2 f +3033(hsearch)X +1 f +3317(routines)X +3605(in)X +3697(System)X +3962(V)X +4049(are)X +4177(designed)X +2706 5106(to)N +2804(provide)X +3085(memory-resident)X +3669(hash)X +3852(tables.)X +4115(Since)X +4328(data)X +2706 5194(access)N +2948(does)X +3131(not)X +3269(require)X +3533(disk)X +3702(access,)X +3964(simple)X +4213(hashing)X +2706 5282(schemes)N +3010(which)X +3238(may)X +3408(require)X +3667(multiple)X +3964(probes)X +4209(into)X +4364(the)X +2706 5370(table)N +2889(are)X +3015(used.)X +3209(A)X +3294(more)X 
+3486(interesting)X +3851(version)X +4114(of)X +2 f +4208(hsearch)X +1 f +2706 5458(is)N +2784(a)X +2845(public)X +3070(domain)X +3335(library,)X +2 f +3594(dynahash)X +1 f +3901(,)X +3945(that)X +4089(implements)X +2706 5546(Larson's)N +3036(in-memory)X +3440(adaptation)X +3822([LAR88])X +4164(of)X +4279(linear)X +2706 5634(hashing)N +2975([LIT80].)X +3 f +720 5960(USENIX)N +9 f +1042(-)X +3 f +1106(Winter)X +1371('91)X +9 f +1498(-)X +3 f +1562(Dallas,)X +1815(TX)X +1 f +4424(1)X + +2 p +%%Page: 2 2 +10 s 10 xH 0 xS 1 f +3 f +432 258(A)N +510(New)X +682(Hashing)X +985(Package)X +1290(for)X +1413(UNIX)X +3663(Seltzer)X +3920(&)X +4007(Yigit)X +2 f +1074 538(dbm)N +1 f +1232(and)X +2 f +1368(ndbm)X +1 f +604 670(The)N +2 f +760(dbm)X +1 f +928(and)X +2 f +1074(ndbm)X +1 f +1282(library)X +1526(implementations)X +2089(are)X +432 758(based)N +667(on)X +799(the)X +949(same)X +1166(algorithm)X +1529(by)X +1661(Ken)X +1846(Thompson)X +432 846([THOM90,)N +824(TOR88,)X +1113(WAL84],)X +1452(but)X +1582(differ)X +1789(in)X +1879(their)X +2054(pro-)X +432 934(grammatic)N +801(interfaces.)X +1160(The)X +1311(latter)X +1502(is)X +1581(a)X +1643(modi\256ed)X +1952(version)X +432 1022(of)N +533(the)X +665(former)X +918(which)X +1148(adds)X +1328(support)X +1601(for)X +1728(multiple)X +2027(data-)X +432 1110(bases)N +634(to)X +724(be)X +828(open)X +1011(concurrently.)X +1484(The)X +1636(discussion)X +1996(of)X +2090(the)X +432 1198(algorithm)N +774(that)X +925(follows)X +1196(is)X +1280(applicable)X +1640(to)X +1732(both)X +2 f +1904(dbm)X +1 f +2072(and)X +2 f +432 1286(ndbm)N +1 f +610(.)X +604 1400(The)N +760(basic)X +956(structure)X +1268(of)X +2 f +1366(dbm)X +1 f +1535(calls)X +1712(for)X +1836(\256xed-sized)X +432 1488(disk)N +612(blocks)X +868(\(buckets\))X +1214(and)X +1377(an)X +2 f +1499(access)X +1 f +1755(function)X +2068(that)X +432 1576(maps)N +623(a)X +681(key)X +819(to)X +902(a)X +959(bucket.)X +1234(The)X +1380(interface)X +1683(routines)X +1962(use)X 
+2090(the)X +2 f +432 1664(access)N +1 f +673(function)X +970(to)X +1062(obtain)X +1292(the)X +1420(appropriate)X +1816(bucket)X +2060(in)X +2152(a)X +432 1752(single)N +643(disk)X +796(access.)X +604 1866(Within)N +869(the)X +2 f +1010(access)X +1 f +1263(function,)X +1593(a)X +1672(bit-randomizing)X +432 1954(hash)N +610(function)X +2 f +8 s +877 1929(2)N +1 f +10 s +940 1954(is)N +1024(used)X +1202(to)X +1294(convert)X +1565(a)X +1631(key)X +1777(into)X +1931(a)X +1997(32-bit)X +432 2042(hash)N +605(value.)X +825(Out)X +971(of)X +1064(these)X +1254(32)X +1359(bits,)X +1519(only)X +1686(as)X +1778(many)X +1981(bits)X +2121(as)X +432 2130(necessary)N +773(are)X +900(used)X +1075(to)X +1165(determine)X +1514(the)X +1639(particular)X +1974(bucket)X +432 2218(on)N +533(which)X +750(a)X +807(key)X +944(resides.)X +1228(An)X +1347(in-memory)X +1724(bitmap)X +1967(is)X +2041(used)X +432 2306(to)N +533(determine)X +893(how)X +1070(many)X +1287(bits)X +1441(are)X +1579(required.)X +1905(Each)X +2104(bit)X +432 2394(indicates)N +746(whether)X +1033(its)X +1136(associated)X +1494(bucket)X +1736(has)X +1871(been)X +2051(split)X +432 2482(yet)N +562(\(a)X +657(0)X +728(indicating)X +1079(that)X +1230(the)X +1359(bucket)X +1604(has)X +1742(not)X +1875(yet)X +2004(split\).)X +432 2570(The)N +590(use)X +730(of)X +830(the)X +961(hash)X +1141(function)X +1441(and)X +1590(the)X +1720(bitmap)X +1974(is)X +2059(best)X +432 2658(described)N +769(by)X +878(stepping)X +1177(through)X +1454(database)X +1759(creation)X +2046(with)X +432 2746(multiple)N +718(invocations)X +1107(of)X +1194(a)X +2 f +1250(store)X +1 f +1430(operation.)X +604 2860(Initially,)N +906(the)X +1033(hash)X +1209(table)X +1394(contains)X +1690(a)X +1755(single)X +1974(bucket)X +432 2948(\(bucket)N +711(0\),)X +836(the)X +972(bit)X +1094(map)X +1270(contains)X +1575(a)X +1649(single)X +1878(bit)X +2000(\(bit)X +2148(0)X +432 3036(corresponding)N +913(to)X +997(bucket)X +1233(0\),)X +1342(and)X +1480(0)X +1542(bits)X 
+1699(of)X +1788(a)X +1846(hash)X +2014(value)X +432 3124(are)N +560(examined)X +901(to)X +992(determine)X +1342(where)X +1568(a)X +1633(key)X +1778(is)X +1860(placed)X +2099(\(in)X +432 3212(bucket)N +670(0\).)X +801(When)X +1017(bucket)X +1255(0)X +1319(is)X +1396(full,)X +1551(its)X +1650(bit)X +1758(in)X +1844(the)X +1966(bitmap)X +432 3300(\(bit)N +564(0\))X +652(is)X +726(set,)X +856(and)X +993(its)X +1089(contents)X +1377(are)X +1497(split)X +1655(between)X +1943(buckets)X +432 3388(0)N +499(and)X +641(1,)X +727(by)X +833(considering)X +1233(the)X +1357(0)X +2 f +7 s +3356(th)Y +10 s +1 f +1480 3388(bit)N +1590(\(the)X +1741(lowest)X +1976(bit)X +2086(not)X +432 3476(previously)N +800(examined\))X +1169(of)X +1266(the)X +1393(hash)X +1569(value)X +1772(for)X +1895(each)X +2072(key)X +432 3564(within)N +668(the)X +798(bucket.)X +1064(Given)X +1292(a)X +1359(well-designed)X +1840(hash)X +2018(func-)X +432 3652(tion,)N +613(approximately)X +1112(half)X +1273(of)X +1376(the)X +1510(keys)X +1693(will)X +1853(have)X +2041(hash)X +432 3740(values)N +666(with)X +837(the)X +964(0)X +2 f +7 s +3708(th)Y +10 s +1 f +1090 3740(bit)N +1203(set.)X +1341(All)X +1471(such)X +1646(keys)X +1821(and)X +1965(associ-)X +432 3828(ated)N +586(data)X +740(are)X +859(moved)X +1097(to)X +1179(bucket)X +1413(1,)X +1493(and)X +1629(the)X +1747(rest)X +1883(remain)X +2126(in)X +432 3916(bucket)N +666(0.)X +604 4030(After)N +804(this)X +949(split,)X +1135(the)X +1262(\256le)X +1393(now)X +1560(contains)X +1856(two)X +2005(buck-)X +432 4118(ets,)N +562(and)X +699(the)X +818(bitmap)X +1061(contains)X +1349(three)X +1530(bits:)X +1687(the)X +1805(0)X +2 f +7 s +4086(th)Y +10 s +1 f +1922 4118(bit)N +2026(is)X +2099(set)X +432 4206(to)N +525(indicate)X +810(a)X +876(bucket)X +1120(0)X +1190(split)X +1357(when)X +1561(no)X +1671(bits)X +1816(of)X +1913(the)X +2041(hash)X +432 4294(value)N +648(are)X +789(considered,)X +1199(and)X +1357(two)X +1519(more)X +1726(unset)X +1937(bits)X +2094(for)X 
+432 4382(buckets)N +706(0)X +775(and)X +920(1.)X +1029(The)X +1183(placement)X +1542(of)X +1638(an)X +1742(incoming)X +2072(key)X +432 4470(now)N +604(requires)X +897(examination)X +1327(of)X +1428(the)X +1560(0)X +2 f +7 s +4438(th)Y +10 s +1 f +1691 4470(bit)N +1809(of)X +1910(the)X +2041(hash)X +432 4558(value,)N +667(and)X +824(the)X +963(key)X +1119(is)X +1212(placed)X +1462(either)X +1685(in)X +1787(bucket)X +2041(0)X +2121(or)X +432 4646(bucket)N +674(1.)X +782(If)X +864(either)X +1075(bucket)X +1317(0)X +1385(or)X +1480(bucket)X +1722(1)X +1790(\256lls)X +1937(up,)X +2064(it)X +2135(is)X +432 4734(split)N +598(as)X +693(before,)X +947(its)X +1050(bit)X +1162(is)X +1243(set)X +1360(in)X +1450(the)X +1576(bitmap,)X +1846(and)X +1990(a)X +2054(new)X +432 4822(set)N +541(of)X +628(unset)X +817(bits)X +952(are)X +1071(added)X +1283(to)X +1365(the)X +1483(bitmap.)X +604 4936(Each)N +791(time)X +959(we)X +1079(consider)X +1376(a)X +1437(new)X +1596(bit)X +1705(\(bit)X +1841(n\),)X +1953(we)X +2072(add)X +432 5024(2)N +2 f +7 s +4992(n)Y +9 f +509(+)X +1 f +540(1)X +10 s +595 5024(bits)N +737(to)X +826(the)X +951(bitmap)X +1199(and)X +1341(obtain)X +1567(2)X +2 f +7 s +4992(n)Y +9 f +1644(+)X +1 f +1675(1)X +10 s +1729 5024(more)N +1920(address-)X +432 5112(able)N +595(buckets)X +869(in)X +960(the)X +1087(\256le.)X +1258(As)X +1376(a)X +1441(result,)X +1668(the)X +1795(bitmap)X +2045(con-)X +432 5200(tains)N +618(the)X +751(previous)X +1062(2)X +2 f +7 s +5168(n)Y +9 f +1139(+)X +1 f +1170(1)X +2 f +10 s +9 f +5200(-)Y +1 f +1242(1)X +1317(bits)X +1467(\(1)X +2 f +9 f +1534(+)X +1 f +1578(2)X +2 f +9 f +(+)S +1 f +1662(4)X +2 f +9 f +(+)S +1 f +1746(...)X +2 f +9 f +(+)S +1 f +1850(2)X +2 f +7 s +5168(n)Y +10 s +1 f +1931 5200(\))N +1992(which)X +432 5288(trace)N +649(the)X +807(entire)X +2 f +1050(split)X +1247(history)X +1 f +1529(of)X +1656(the)X +1813(addressable)X +16 s +432 5433 MXY +864 0 Dl +2 f +8 s +472 5488(2)N +1 f +9 s +523 5513(This)N 
+670(bit-randomizing)X +1153(property)X +1416(is)X +1482(important)X +1780(to)X +1854(obtain)X +2052(radi-)X +432 5593(cally)N +599(different)X +874(hash)X +1033(values)X +1244(for)X +1355(nearly)X +1562(identical)X +1836(keys,)X +2012(which)X +432 5673(in)N +506(turn)X +640(avoids)X +846(clustering)X +1148(of)X +1226(such)X +1376(keys)X +1526(in)X +1600(a)X +1650(single)X +1840(bucket.)X +10 s +2418 538(buckets.)N +2590 652(Given)N +2809(a)X +2868(key)X +3007(and)X +3146(the)X +3267(bitmap)X +3512(created)X +3768(by)X +3871(this)X +4009(algo-)X +2418 740(rithm,)N +2638(we)X +2759(\256rst)X +2910(examine)X +3209(bit)X +3320(0)X +3386(of)X +3479(the)X +3603(bitmap)X +3851(\(the)X +4002(bit)X +4112(to)X +2418 828(consult)N +2673(when)X +2871(0)X +2934(bits)X +3072(of)X +3162(the)X +3283(hash)X +3453(value)X +3650(are)X +3772(being)X +3973(exam-)X +2418 916(ined\).)N +2631(If)X +2713(it)X +2785(is)X +2866(set)X +2982(\(indicating)X +3356(that)X +3503(the)X +3628(bucket)X +3869(split\),)X +4080(we)X +2418 1004(begin)N +2617(considering)X +3012(the)X +3131(bits)X +3267(of)X +3355(the)X +3473(32-bit)X +3684(hash)X +3851(value.)X +4085(As)X +2418 1092(bit)N +2525(n)X +2587(is)X +2662(revealed,)X +2977(a)X +3035(mask)X +3226(equal)X +3422(to)X +3506(2)X +2 f +7 s +1060(n)Y +9 f +3583(+)X +1 f +3614(1)X +2 f +10 s +9 f +1092(-)Y +1 f +3686(1)X +3748(will)X +3894(yield)X +4076(the)X +2418 1180(current)N +2675(bucket)X +2918(address.)X +3228(Adding)X +3496(2)X +2 f +7 s +1148(n)Y +9 f +3573(+)X +1 f +3604(1)X +2 f +10 s +9 f +1180(-)Y +1 f +3676(1)X +3744(to)X +3834(the)X +3960(bucket)X +2418 1268(address)N +2701(identi\256es)X +3035(which)X +3272(bit)X +3397(in)X +3500(the)X +3639(bitmap)X +3902(must)X +4098(be)X +2418 1356(checked.)N +2743(We)X +2876(continue)X +3173(revealing)X +3493(bits)X +3628(of)X +3715(the)X +3833(hash)X +4000(value)X +2418 1444(until)N +2591(all)X +2698(set)X +2814(bits)X +2955(in)X +3043(the)X +3167(bitmap)X +3415(are)X +3540(exhausted.)X 
+3907(The)X +4058(fol-)X +2418 1532(lowing)N +2682(algorithm,)X +3055(a)X +3133(simpli\256cation)X +3614(of)X +3723(the)X +3863(algorithm)X +2418 1620(due)N +2565(to)X +2658(Ken)X +2823(Thompson)X +3196([THOM90,)X +3590(TOR88],)X +3908(uses)X +4076(the)X +2418 1708(hash)N +2625(value)X +2839(and)X +2995(the)X +3133(bitmap)X +3395(to)X +3497(calculate)X +3823(the)X +3960(bucket)X +2418 1796(address)N +2679(as)X +2766(discussed)X +3093(above.)X +0(Courier)xf 0 f +1 f +0 f +8 s +2418 2095(hash)N +2608(=)X +2684 -0.4038(calchash\(key\);)AX +2418 2183(mask)N +2608(=)X +2684(0;)X +2418 2271(while)N +2646 -0.4018(\(isbitset\(\(hash)AX +3254(&)X +3330(mask\))X +3558(+)X +3634(mask\)\))X +2706 2359(mask)N +2896(=)X +2972(\(mask)X +3200(<<)X +3314(1\))X +3428(+)X +3504(1;)X +2418 2447(bucket)N +2684(=)X +2760(hash)X +2950(&)X +3026(mask;)X +2 f +10 s +3211 2812(sdbm)N +1 f +2590 2944(The)N +2 f +2738(sdbm)X +1 f +2930(library)X +3167(is)X +3243(a)X +3302(public-domain)X +3791(clone)X +3987(of)X +4076(the)X +2 f +2418 3032(ndbm)N +1 f +2638(library,)X +2914(developed)X +3286(by)X +3408(Ozan)X +3620(Yigit)X +3826(to)X +3929(provide)X +2 f +2418 3120(ndbm)N +1 f +2596('s)X +2692(functionality)X +3139(under)X +3359(some)X +3565(versions)X +3869(of)X +3973(UNIX)X +2418 3208(that)N +2559(exclude)X +2830(it)X +2894(for)X +3008(licensing)X +3317(reasons)X +3578([YIG89].)X +3895(The)X +4040(pro-)X +2418 3296(grammer)N +2735(interface,)X +3064(and)X +3207(the)X +3332(basic)X +3524(structure)X +3832(of)X +2 f +3926(sdbm)X +1 f +4121(is)X +2418 3384(identical)N +2733(to)X +2 f +2834(ndbm)X +1 f +3051(but)X +3192(internal)X +3476(details)X +3723(of)X +3828(the)X +2 f +3964(access)X +1 f +2418 3472(function,)N +2726(such)X +2894(as)X +2982(the)X +3101(calculation)X +3474(of)X +3561(the)X +3679(bucket)X +3913(address,)X +2418 3560(and)N +2563(the)X +2690(use)X +2825(of)X +2920(different)X +3225(hash)X +3400(functions)X +3726(make)X +3928(the)X +4054(two)X +2418 3648(incompatible)N 
+2856(at)X +2934(the)X +3052(database)X +3349(level.)X +2590 3762(The)N +2 f +2740(sdbm)X +1 f +2934(library)X +3173(is)X +3251(based)X +3458(on)X +3562(a)X +3622(simpli\256ed)X +3965(imple-)X +2418 3850(mentation)N +2778(of)X +2885(Larson's)X +3206(1978)X +2 f +3406(dynamic)X +3717(hashing)X +1 f +4009(algo-)X +2418 3938(rithm)N +2616(including)X +2943(the)X +2 f +3066(re\256nements)X +3461(and)X +3605(variations)X +1 f +3953(of)X +4044(sec-)X +2418 4026(tion)N +2562(5)X +2622([LAR78].)X +2956(Larson's)X +3257(original)X +3526(algorithm)X +3857(calls)X +4024(for)X +4138(a)X +2418 4114(forest)N +2635(of)X +2736(binary)X +2975(hash)X +3156(trees)X +3341(that)X +3494(are)X +3626(accessed)X +3941(by)X +4054(two)X +2418 4202(hash)N +2586(functions.)X +2925(The)X +3071(\256rst)X +3216(hash)X +3384(function)X +3672(selects)X +3907(a)X +3964(partic-)X +2418 4290(ular)N +2571(tree)X +2720(within)X +2952(the)X +3078(forest.)X +3309(The)X +3462(second)X +3713(hash)X +3887(function,)X +2418 4378(which)N +2659(is)X +2757(required)X +3070(to)X +3177(be)X +3297(a)X +3377(boolean)X +3675(pseudo-random)X +2418 4466(number)N +2687(generator)X +3015(that)X +3159(is)X +3236(seeded)X +3479(by)X +3583(the)X +3705(key,)X +3865(is)X +3942(used)X +4112(to)X +2418 4554(traverse)N +2733(the)X +2890(tree)X +3070(until)X +3275(internal)X +3579(\(split\))X +3829(nodes)X +4075(are)X +2418 4642(exhausted)N +2763(and)X +2903(an)X +3003(external)X +3286(\(non-split\))X +3648(node)X +3827(is)X +3903(reached.)X +2418 4730(The)N +2571(bucket)X +2813(addresses)X +3149(are)X +3276(stored)X +3500(directly)X +3772(in)X +3861(the)X +3986(exter-)X +2418 4818(nal)N +2536(nodes.)X +2590 4932(Larson's)N +2903(re\256nements)X +3309(are)X +3440(based)X +3655(on)X +3767(the)X +3897(observa-)X +2418 5020(tion)N +2570(that)X +2718(the)X +2844(nodes)X +3059(can)X +3199(be)X +3303(represented)X +3702(by)X +3809(a)X +3872(single)X +4090(bit)X +2418 5108(that)N +2569(is)X +2653(set)X +2773(for)X +2898(internal)X 
+3174(nodes)X +3392(and)X +3539(not)X +3672(set)X +3791(for)X +3915(external)X +2418 5196(nodes,)N +2652(resulting)X +2959(in)X +3048(a)X +3111(radix)X +3303(search)X +3536(trie.)X +3709(Figure)X +3944(1)X +4010(illus-)X +2418 5284(trates)N +2621(this.)X +2804(Nodes)X +3037(A)X +3123(and)X +3267(B)X +3348(are)X +3475(internal)X +3748(\(split\))X +3967(nodes,)X +2418 5372(thus)N +2573(having)X +2813(no)X +2915(bucket)X +3151(addresses)X +3480(associated)X +3831(with)X +3994(them.)X +2418 5460(Instead,)N +2693(the)X +2814(external)X +3096(nodes)X +3306(\(C,)X +3429(D,)X +3530(and)X +3669(E\))X +3768(each)X +3938(need)X +4112(to)X +2418 5548(refer)N +2594(to)X +2679(a)X +2738(bucket)X +2975(address.)X +3279(These)X +3494(bucket)X +3731(addresses)X +4062(can)X +2418 5636(be)N +2529(stored)X +2760(in)X +2857(the)X +2990(trie)X +3132(itself)X +3327(where)X +3559(the)X +3691(subtries)X +3974(would)X +3 f +432 5960(2)N +2970(USENIX)X +9 f +3292(-)X +3 f +3356(Winter)X +3621('91)X +9 f +3748(-)X +3 f +3812(Dallas,)X +4065(TX)X + +3 p +%%Page: 3 3 +0(Courier)xf 0 f +10 s 10 xH 0 xS 0 f +3 f +720 258(Seltzer)N +977(&)X +1064(Yigit)X +3278(A)X +3356(New)X +3528(Hashing)X +3831(Package)X +4136(for)X +4259(UNIX)X +1 f +720 538(live)N +862(if)X +933(they)X +1092(existed)X +1340([KNU68].)X +1709(For)X +1841(example,)X +2154(if)X +2224(nodes)X +2432(F)X +720 626(and)N +858(G)X +938(were)X +1117(the)X +1237(children)X +1522(of)X +1610(node)X +1787(C,)X +1881(the)X +2000(bucket)X +2235(address)X +720 714(L00)N +886(could)X +1101(reside)X +1330(in)X +1429(the)X +1563(bits)X +1714(that)X +1870(will)X +2030(eventually)X +2400(be)X +720 802(used)N +887(to)X +969(store)X +1145(nodes)X +1352(F)X +1416(and)X +1552(G)X +1630(and)X +1766(all)X +1866(their)X +2033(children.)X +10 f +720 890 -0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +3 f +1894 2247(L1)N +784 1925(A)N +1431(E)X +1106 2247(D)N +1428 1281(C)N +1109 1603(B)N +1884 1930(L01)N +1879 1286(L00)N +1221 1814(1)N +903 
2131(1)N +1221 1402(0)N +903 1714(0)N +1 Dt +1397 1821 MXY +-8 -32 Dl +-5 19 Dl +-20 6 Dl +33 7 Dl +-187 -182 Dl +1397 1322 MXY +-33 7 Dl +20 6 Dl +5 19 Dl +8 -32 Dl +-187 182 Dl +1069 1639 MXY +-32 7 Dl +20 6 Dl +5 19 Dl +7 -32 Dl +-186 182 Dl +1374 1891 MXY +185 Dc +1779 2133 MXY +0 161 Dl +322 0 Dl +0 -161 Dl +-322 0 Dl +1811 MY +0 161 Dl +322 0 Dl +0 -161 Dl +-322 0 Dl +1166 MY +0 161 Dl +322 0 Dl +0 -161 Dl +-322 0 Dl +1052 2213 MXY +185 Dc +1569 MY +185 Dc +720 1881 MXY +185 Dc +1779 2213 MXY +-28 -17 Dl +10 17 Dl +-10 18 Dl +28 -18 Dl +-543 0 Dl +1769 1891 MXY +-28 -18 Dl +10 18 Dl +-10 18 Dl +28 -18 Dl +-201 0 Dl +1364 1247 MXY +185 Dc +1769 MX +-28 -18 Dl +10 18 Dl +-10 18 Dl +28 -18 Dl +-201 0 Dl +1064 2143 MXY +-7 -32 Dl +-5 19 Dl +-20 6 Dl +32 7 Dl +-181 -181 Dl +3 Dt +-1 Ds +8 s +720 2482(Figure)N +925(1:)X +1 f +1002(Radix)X +1179(search)X +1365(trie)X +1474(with)X +1612(internal)X +1831(nodes)X +2004(A)X +2074(and)X +2189(B,)X +2271(external)X +720 2570(nodes)N +891(C,)X +972(D,)X +1056(and)X +1170(E,)X +1247(and)X +1361(bucket)X +1553(addresses)X +1819(stored)X +1997(in)X +2069(the)X +2168(unused)X +2370(por-)X +720 2658(tion)N +836(of)X +905(the)X +999(trie.)X +10 s +10 f +720 2922 -0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +1 f +892 3124(Further)N +1153(simpli\256cations)X +1647(of)X +1738(the)X +1860(above)X +2076([YIG89])X +2377(are)X +720 3212(possible.)N +1038(Using)X +1265(a)X +1337(single)X +1564(radix)X +1765(trie)X +1908(to)X +2006(avoid)X +2219(the)X +2352(\256rst)X +720 3300(hash)N +904(function,)X +1227(replacing)X +1562(the)X +1696(pseudo-random)X +2231(number)X +720 3388(generator)N +1052(with)X +1222(a)X +1286(well)X +1452(designed,)X +1785(bit-randomizing)X +2329(hash)X +720 3476(function,)N +1053(and)X +1215(using)X +1434(the)X +1578(portion)X +1855(of)X +1967(the)X +2110(hash)X +2302(value)X +720 3564(exposed)N +1021(during)X +1268(the)X +1404(trie)X +1549(traversal)X +1864(as)X +1969(a)X +2042(direct)X +2262(bucket)X 
+720 3652(address)N +990(results)X +1228(in)X +1319(an)X +2 f +1424(access)X +1 f +1663(function)X +1959(that)X +2108(works)X +2333(very)X +720 3740(similar)N +974(to)X +1068(Thompson's)X +1499(algorithm)X +1841(above.)X +2084(The)X +2240(follow-)X +720 3828(ing)N +847(algorithm)X +1183(uses)X +1346(the)X +1469(hash)X +1641(value)X +1840(to)X +1927(traverse)X +2206(a)X +2266(linear-)X +720 3916(ized)N +874(radix)X +1059(trie)X +2 f +8 s +1166 3891(3)N +1 f +10 s +1218 3916(starting)N +1478(at)X +1556(the)X +1674(0)X +2 f +7 s +3884(th)Y +10 s +1 f +1791 3916(bit.)N +0 f +8 s +720 4215(tbit)N +910(=)X +986(0;)X +1296(/*)X +1410(radix)X +1638(trie)X +1828(index)X +2056(*/)X +720 4303(hbit)N +910(=)X +986(0;)X +1296(/*)X +1410(hash)X +1600(bit)X +1752(index)X +2056(*/)X +720 4391(mask)N +910(=)X +986(0;)X +720 4479(hash)N +910(=)X +986 -0.4038(calchash\(key\);)AX +720 4655(for)N +872(\(mask)X +1100(=)X +1176(0;)X +910 4743 -0.4018(isbitset\(tbit\);)AN +910 4831(mask)N +1100(=)X +1176(\(mask)X +1404(<<)X +1518(1\))X +1632(+)X +1708(1\))X +1008 4919(if)N +1122(\(hash)X +1350(&)X +1426(\(1)X +1540(<<)X +1654 -0.4219(hbit++\)\)\))AX +1160 5007(/*)N +1274(right)X +1502(son)X +1692(*/)X +1160 5095(tbit)N +1350(=)X +1426(2)X +1502(*)X +1578(tbit)X +1768(+)X +1844(2;)X +1008 5183(else)N +1 f +16 s +720 5353 MXY +864 0 Dl +2 f +8 s +760 5408(3)N +1 f +9 s +818 5433(A)N +896(linearized)X +1206(radix)X +1380(trie)X +1502(is)X +1576(merely)X +1802(an)X +1895(array)X +2068(representation)X +720 5513(of)N +800(the)X +908(radix)X +1076(search)X +1280(trie)X +1396(described)X +1692(above.)X +1920(The)X +2052(children)X +2308(of)X +2388(the)X +720 5593(node)N +885(with)X +1038(index)X +1223(i)X +1267(can)X +1391(be)X +1483(found)X +1675(at)X +1751(the)X +1863(nodes)X +2055(indexed)X +2307(2*i+1)X +720 5673(and)N +842(2*i+2.)X +0 f +8 s +3146 538(/*)N +3260(left)X +3450(son)X +3678(*/)X +3146 626(tbit)N +3336(=)X +3412(2)X +3488(*)X +3564(tbit)X +3754(+)X +3830(1;)X +2706 802(bucket)N 
+2972(=)X +3048(hash)X +3238(&)X +3314(mask;)X +2 f +10 s +3495 1167(gdbm)N +1 f +2878 1299(The)N +3027(gdbm)X +3233(\(GNU)X +3458(data)X +3616(base)X +3783(manager\))X +4111(library)X +4349(is)X +4426(a)X +2706 1387(UNIX)N +2933(database)X +3236(manager)X +3539(written)X +3792(by)X +3897(Philip)X +4112(A.)X +4215(Nelson,)X +2706 1475(and)N +2848(made)X +3048(available)X +3364(as)X +3457(a)X +3518(part)X +3668(of)X +3760(the)X +3883(FSF)X +4040(software)X +4342(dis-)X +2706 1563(tribution.)N +3052(The)X +3207(gdbm)X +3419(library)X +3663(provides)X +3969(the)X +4097(same)X +4292(func-)X +2706 1651(tionality)N +3028(of)X +3151(the)X +2 f +3304(dbm)X +1 f +3442(/)X +2 f +3464(ndbm)X +1 f +3697(libraries)X +4015([NEL90])X +4360(but)X +2706 1739(attempts)N +3018(to)X +3121(avoid)X +3340(some)X +3550(of)X +3658(their)X +3846(shortcomings.)X +4337(The)X +2706 1827(gdbm)N +2918(library)X +3162(allows)X +3401(for)X +3525(arbitrary-length)X +4059(data,)X +4242(and)X +4387(its)X +2706 1915(database)N +3027(is)X +3124(a)X +3203(singular,)X +3524(non-sparse)X +2 f +8 s +3872 1890(4)N +1 f +10 s +3947 1915(\256le.)N +4112(The)X +4280(gdbm)X +2706 2003(library)N +2947(also)X +3103(includes)X +2 f +3396(dbm)X +1 f +3560(and)X +2 f +3702(ndbm)X +1 f +3906(compatible)X +4288(inter-)X +2706 2091(faces.)N +2878 2205(The)N +3025(gdbm)X +3229(library)X +3465(is)X +3540(based)X +3745(on)X +2 f +3847(extensible)X +4189(hashing)X +1 f +4442(,)X +2706 2293(a)N +2766(dynamic)X +3066(hashing)X +3339(algorithm)X +3674(by)X +3778(Fagin)X +3984(et)X +4066(al)X +4148([FAG79].)X +2706 2381(This)N +2881(algorithm)X +3225(differs)X +3467(from)X +3655(the)X +3785(previously)X +4155(discussed)X +2706 2469(algorithms)N +3069(in)X +3152(that)X +3293(it)X +3358(uses)X +3517(a)X +2 f +3574(directory)X +1 f +3889(that)X +4030(is)X +4103(a)X +4159(collapsed)X +2706 2557(representation)N +3192([ENB88])X +3517(of)X +3615(the)X +3744(radix)X +3940(search)X +4177(trie)X +4315(used)X +2706 2645(by)N +2 f 
+2806(sdbm)X +1 f +2975(.)X +10 f +2706 2733 -0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +3 f +7 s +3572 3761(L1)N +1 Dt +3485 3738 MXY +-20 -13 Dl +7 13 Dl +-7 13 Dl +20 -13 Dl +-400 0 Dl +3180 3027 MXY +136 Dc +2706 3494 MXY +136 Dc +2950 3264 MXY +136 Dc +3738 MY +136 Dc +3485 2968 MXY +0 118 Dl +238 0 Dl +0 -118 Dl +-238 0 Dl +3442 MY +0 119 Dl +238 0 Dl +0 -119 Dl +-238 0 Dl +3679 MY +0 119 Dl +238 0 Dl +0 -119 Dl +-238 0 Dl +3187 3501 MXY +136 Dc +2963 3316 MXY +-24 5 Dl +15 4 Dl +4 15 Dl +5 -24 Dl +-137 134 Dl +3204 3083 MXY +-24 5 Dl +15 4 Dl +3 14 Dl +6 -23 Dl +-137 133 Dl +3204 3450 MXY +-6 -24 Dl +-3 14 Dl +-15 5 Dl +24 5 Dl +-137 -134 Dl +2842 3369(0)N +3075 3139(0)N +2842 3676(1)N +3075 3443(1)N +3562 3054(L00)N +3565 3528(L01)N +4197 2968 MXY +0 118 Dl +237 0 Dl +0 -118 Dl +-237 0 Dl +3205 MY +0 119 Dl +237 0 Dl +0 -119 Dl +-237 0 Dl +3561 MY +0 118 Dl +237 0 Dl +0 -118 Dl +-237 0 Dl +3960 2909 MXY +0 237 Dl +118 0 Dl +0 -237 Dl +-118 0 Dl +3146 MY +0 237 Dl +118 0 Dl +0 -237 Dl +-118 0 Dl +3383 MY +0 237 Dl +118 0 Dl +0 -237 Dl +-118 0 Dl +3620 MY +0 237 Dl +118 0 Dl +0 -237 Dl +-118 0 Dl +4197 3027 MXY +-21 -13 Dl +8 13 Dl +-8 13 Dl +21 -13 Dl +-119 0 Dl +4197 3264 MXY +-21 -13 Dl +8 13 Dl +-8 13 Dl +21 -13 Dl +-119 0 Dl +3501 MY +59 0 Dl +0 89 Dl +4078 3738 MXY +59 0 Dl +0 -88 Dl +4197 3590 MXY +-21 -13 Dl +8 13 Dl +-8 13 Dl +21 -13 Dl +-60 0 Dl +4197 3650 MXY +-21 -13 Dl +8 13 Dl +-8 13 Dl +21 -13 Dl +-60 0 Dl +3991 3050(00)N +3991 3287(01)N +3991 3524(10)N +3991 3761(11)N +4269 3050(L00)N +4269 3287(L01)N +4283 3643(L1)N +3485 3501 MXY +-20 -13 Dl +7 13 Dl +-7 13 Dl +20 -13 Dl +-155 0 Dl +3485 3027 MXY +-20 -13 Dl +7 13 Dl +-7 13 Dl +20 -13 Dl +-163 0 Dl +2967 3687 MXY +-5 -24 Dl +-4 14 Dl +-15 4 Dl +24 6 Dl +-141 -141 Dl +3 Dt +-1 Ds +8 s +2706 4033(Figure)N +2903(2:)X +1 f +2972(A)X +3034(radix)X +3181(search)X +3359(trie)X +3460(and)X +3568(a)X +3612(directory)X +3858(representing)X +4189(the)X +4283(trie.)X +10 s +10 f +2706 4209 
-0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +1 f +2878 4411(In)N +2968(this)X +3106(algorithm,)X +3460(a)X +3519(directory)X +3832(consists)X +4108(of)X +4198(a)X +4256(search)X +2706 4499(trie)N +2847(of)X +2947(depth)X +2 f +3158(n)X +1 f +3211(,)X +3264(containing)X +3635(2)X +2 f +7 s +4467(n)Y +10 s +1 f +3749 4499(bucket)N +3996(addresses)X +4337(\(i.e.)X +2706 4587(each)N +2897(element)X +3194(of)X +3304(the)X +3445(trie)X +3594(is)X +3689(a)X +3767(bucket)X +4023(address\).)X +4373(To)X +2706 4675(access)N +2935(the)X +3056(hash)X +3226(table,)X +3425(a)X +3483(32-bit)X +3696(hash)X +3865(value)X +4061(is)X +4136(calculated)X +2706 4763(and)N +2 f +2861(n)X +1 f +2953(bits)X +3107(of)X +3213(the)X +3350(value)X +3563(are)X +3701(used)X +3886(to)X +3986(index)X +4202(into)X +4364(the)X +2706 4851(directory)N +3018(to)X +3102(obtain)X +3324(a)X +3382(bucket)X +3618(address.)X +3921(It)X +3992(is)X +4067(important)X +4400(to)X +2706 4939(note)N +2866(that)X +3008(multiple)X +3296(entries)X +3532(of)X +3620(this)X +3756(directory)X +4067(may)X +4226(contain)X +2706 5027(the)N +2833(same)X +3026(bucket)X +3268(address)X +3537(as)X +3632(a)X +3696(result)X +3902(of)X +3997(directory)X +4315(dou-)X +2706 5115(bling)N +2903(during)X +3145(bucket)X +3392(splitting.)X +3706(Figure)X +3948(2)X +4021(illustrates)X +4364(the)X +2706 5203(relationship)N +3126(between)X +3436(a)X +3513(typical)X +3772(\(skewed\))X +4108(search)X +4355(trie)X +2706 5291(and)N +2850(its)X +2953(directory)X +3271(representation.)X +3774(The)X +3927(formation)X +4270(of)X +4364(the)X +2706 5379(directory)N +3016(shown)X +3245(in)X +3327(the)X +3445(\256gure)X +3652(is)X +3725(as)X +3812(follows.)X +16 s +2706 5593 MXY +864 0 Dl +2 f +8 s +2746 5648(4)N +1 f +9 s +2796 5673(It)N +2858(does)X +3008(not)X +3118(contain)X +3348(holes.)X +3 f +10 s +720 5960(USENIX)N +9 f +1042(-)X +3 f +1106(Winter)X +1371('91)X +9 f +1498(-)X +3 f +1562(Dallas,)X +1815(TX)X +4424(3)X + +4 p +%%Page: 
4 4 +0(Courier)xf 0 f +10 s 10 xH 0 xS 0 f +3 f +432 258(A)N +510(New)X +682(Hashing)X +985(Package)X +1290(for)X +1413(UNIX)X +3663(Seltzer)X +3920(&)X +4007(Yigit)X +1 f +604 538(Initially,)N +937(there)X +1158(is)X +1271(one)X +1446(slot)X +1620(in)X +1741(the)X +1898(directory)X +432 626(addressing)N +802(a)X +865(single)X +1083(bucket.)X +1364(The)X +1515(depth)X +1719(of)X +1812(the)X +1936(trie)X +2069(is)X +2148(0)X +432 714(and)N +577(0)X +646(bits)X +790(of)X +886(each)X +1063(hash)X +1239(value)X +1442(are)X +1570(examined)X +1910(to)X +2000(deter-)X +432 802(mine)N +624(in)X +718(which)X +946(bucket)X +1192(to)X +1286(place)X +1488(a)X +1556(key;)X +1726(all)X +1837(keys)X +2015(go)X +2126(in)X +432 890(bucket)N +682(0.)X +797(When)X +1024(this)X +1174(bucket)X +1423(is)X +1511(full,)X +1677(its)X +1787(contents)X +2089(are)X +432 978(divided)N +698(between)X +992(L0)X +1107(and)X +1249(L1)X +1363(as)X +1455(was)X +1605(done)X +1786(in)X +1873(the)X +1996(previ-)X +432 1066(ously)N +664(discussed)X +1030(algorithms.)X +1471(After)X +1700(this)X +1874(split,)X +2090(the)X +432 1154(address)N +710(of)X +814(the)X +948(second)X +1207(bucket)X +1457(must)X +1648(be)X +1760(stored)X +1992(in)X +2090(the)X +432 1242(directory.)N +796(To)X +939(accommodate)X +1438(the)X +1589(new)X +1776(address,)X +2090(the)X +432 1330(directory)N +752(is)X +835(split)X +2 f +8 s +972 1305(5)N +1 f +10 s +1330(,)Y +1054(by)X +1163(doubling)X +1476(it,)X +1569(thus)X +1731(increasing)X +2090(the)X +432 1418(depth)N +630(of)X +717(the)X +835(directory)X +1145(by)X +1245(one.)X +604 1532(After)N +813(this)X +967(split,)X +1163(a)X +1237(single)X +1466(bit)X +1588(of)X +1693(the)X +1829(hash)X +2014(value)X +432 1620(needs)N +663(to)X +773(be)X +896(examined)X +1255(to)X +1364(decide)X +1621(whether)X +1927(the)X +2072(key)X +432 1708(belongs)N +711(to)X +803(L0)X +922(or)X +1019(L1.)X +1158(Once)X +1358(one)X +1504(of)X +1601(these)X +1795(buckets)X +2069(\256lls)X +432 
1796(\(L0)N +578(for)X +702(example\),)X +1051(it)X +1125(is)X +1208(split)X +1375(as)X +1472(before,)X +1728(and)X +1873(the)X +2000(direc-)X +432 1884(tory)N +585(is)X +662(split)X +823(again)X +1021(to)X +1107(make)X +1305(room)X +1498(for)X +1615(the)X +1736(address)X +2000(of)X +2090(the)X +432 1972(third)N +618(bucket.)X +927(This)X +1104(splitting)X +1400(causes)X +1645(the)X +1778(addresses)X +2121(of)X +432 2060(the)N +567(non-splitting)X +1012(bucket)X +1263(\(L1\))X +1443(to)X +1541(be)X +1653(duplicated.)X +2063(The)X +432 2148(directory)N +766(now)X +948(has)X +1099(four)X +1277(entries,)X +1555(a)X +1635(depth)X +1857(of)X +1968(2,)X +2072(and)X +432 2236(indexes)N +700(the)X +821(buckets)X +1089(L00,)X +1261(L01)X +1413(and)X +1552(L1,)X +1684(as)X +1774(shown)X +2006(in)X +2090(the)X +432 2324(Figure)N +661(2.)X +604 2438(The)N +756(crucial)X +1002(part)X +1154(of)X +1247(the)X +1371(algorithm)X +1708(is)X +1787(the)X +1911(observa-)X +432 2526(tion)N +580(that)X +724(L1)X +837(is)X +914(addressed)X +1255(twice)X +1453(in)X +1539(the)X +1661(directory.)X +1995(If)X +2073(this)X +432 2614(bucket)N +679(were)X +869(to)X +964(split)X +1134(now,)X +1324(the)X +1454(directory)X +1776(already)X +2045(con-)X +432 2702(tains)N +611(room)X +808(to)X +898(hold)X +1067(the)X +1192(address)X +1460(of)X +1554(the)X +1679(new)X +1840(bucket.)X +2121(In)X +432 2790(general,)N +711(the)X +831(relationship)X +1231(between)X +1521(the)X +1641(directory)X +1953(and)X +2090(the)X +432 2878(number)N +704(of)X +798(bucket)X +1039(addresses)X +1374(contained)X +1713(therein)X +1962(is)X +2041(used)X +432 2966(to)N +517(decide)X +750(when)X +947(to)X +1031(split)X +1190(the)X +1310(directory.)X +1662(Each)X +1845(bucket)X +2081(has)X +432 3054(a)N +505(depth,)X +740(\()X +2 f +767(n)X +7 s +3070(b)Y +10 s +1 f +848 3054(\),)N +932(associated)X +1299(with)X +1478(it)X +1558(and)X +1710(appears)X +1992(in)X +2090(the)X +432 3142(directory)N +744(exactly)X +998(2)X +2 f +7 s 
+3106(n)Y +9 f +1075(-)X +2 f +1106(n)X +4 s +3110(b)Y +7 s +1 f +10 s +1181 3142(times.)N +1396(When)X +1610(a)X +1668(bucket)X +1904(splits,)X +2113(its)X +432 3230(depth)N +638(increases)X +961(by)X +1069(one.)X +1253(The)X +1406(directory)X +1724(must)X +1907(split)X +2072(any)X +432 3318(time)N +602(a)X +665(bucket's)X +964(depth)X +1169(exceeds)X +1451(the)X +1576(depth)X +1781(of)X +1875(the)X +2000(direc-)X +432 3406(tory.)N +630(The)X +784(following)X +1123(code)X +1303(fragment)X +1621(helps)X +1818(to)X +1908(illustrate)X +432 3494(the)N +554(extendible)X +912(hashing)X +1185(algorithm)X +1520([FAG79])X +1838(for)X +1955(access-)X +432 3582(ing)N +554(individual)X +898(buckets)X +1163(and)X +1299(maintaining)X +1701(the)X +1819(directory.)X +0 f +8 s +432 3881(hash)N +622(=)X +698 -0.4038(calchash\(key\);)AX +432 3969(mask)N +622(=)X +698 -0.4018(maskvec[depth];)AX +432 4145(bucket)N +698(=)X +774 -0.4038(directory[hash)AX +1344(&)X +1420(mask];)X +432 4321(/*)N +546(Key)X +698 -0.4219(Insertion)AX +1078(*/)X +432 4409(if)N +546 -0.4038(\(store\(bucket,)AX +1116(key,)X +1306(data\))X +1534(==)X +1648(FAIL\))X +1876({)X +720 4497(newbl)N +948(=)X +1024 -0.4167(getpage\(\);)AX +720 4585 -0.4000(bucket->depth++;)AN +720 4673 -0.4091(newbl->depth)AN +1214(=)X +1290 -0.4038(bucket->depth;)AX +720 4761(if)N +834 -0.4038(\(bucket->depth)AX +1404(>)X +1480(depth\))X +1746({)X +1008 4849(/*)N +1122(double)X +1388 -0.4219(directory)AX +1768(*/)X +1008 4937(depth++;)N +1 f +16 s +432 5033 MXY +864 0 Dl +2 f +8 s +472 5088(5)N +1 f +9 s +534 5113(This)N +692(decision)X +962(to)X +1048(split)X +1202(the)X +1319(directory)X +1608(is)X +1685(based)X +1878(on)X +1979(a)X +2040(com-)X +432 5193(parison)N +666(of)X +748(the)X +858(depth)X +1040(of)X +1121(the)X +1230(page)X +1387(being)X +1568(split)X +1713(and)X +1838(the)X +1947(depth)X +2128(of)X +432 5273(the)N +543(trie.)X +698(In)X +781(Figure)X +992(2,)X +1069(the)X +1180(depths)X +1390(of)X +1472(both)X 
+1622(L00)X +1760(and)X +1886(L01)X +2024(are)X +2134(2,)X +432 5353(whereas)N +689(the)X +798(depth)X +979(of)X +1060(L1)X +1161(is)X +1230(1.)X +1323(Therefore,)X +1646(if)X +1710(L1)X +1810(were)X +1970(to)X +2046(split,)X +432 5433(the)N +543(directory)X +826(would)X +1029(not)X +1144(need)X +1303(to)X +1382(split.)X +1565(In)X +1648(reality,)X +1872(a)X +1926(bucket)X +2140(is)X +432 5513(allocated)N +727(for)X +846(the)X +969(directory)X +1264(at)X +1351(the)X +1474(time)X +1637(of)X +1732(\256le)X +1858(creation)X +2124(so)X +432 5593(although)N +707(the)X +818(directory)X +1100(splits)X +1274(logically,)X +1566(physical)X +1828(splits)X +2002(do)X +2096(not)X +432 5673(occur)N +610(until)X +760(the)X +866(\256le)X +976(becomes)X +1246(quite)X +1408(large.)X +0 f +8 s +2994 538 -0.4219(directory)AN +3374(=)X +3450 -0.3971(double\(directory\);)AX +2706 626(})N +2706 714 -0.3958(splitbucket\(bucket,)AN +3466(newbl\))X +2706 802(...)N +2418 890(})N +2 f +10 s +3169 1255(hsearch)N +1 f +2590 1387(Since)N +2 f +2807(hsearch)X +1 f +3100(does)X +3286(not)X +3427(have)X +3617(to)X +3717(translate)X +4027(hash)X +2418 1475(values)N +2659(into)X +2819(disk)X +2988(addresses,)X +3352(it)X +3432(can)X +3579(use)X +3721(much)X +3934(simpler)X +2418 1563(algorithms)N +2808(than)X +2994(those)X +3211(de\256ned)X +3495(above.)X +3775(System)X +4058(V's)X +2 f +2418 1651(hsearch)N +1 f +2708(constructs)X +3069(a)X +3141(\256xed-size)X +3489(hash)X +3671(table)X +3862(\(speci\256ed)X +2418 1739(by)N +2519(the)X +2637(user)X +2791(at)X +2869(table)X +3045(creation\).)X +3391(By)X +3504(default,)X +3767(a)X +3823(multiplica-)X +2418 1827(tive)N +2570(hash)X +2748(function)X +3046(based)X +3260(on)X +3371(that)X +3522(described)X +3861(in)X +3954(Knuth,)X +2418 1915(Volume)N +2710(3,)X +2804(section)X +3065(6.4)X +3199([KNU68])X +3541(is)X +3628(used)X +3809(to)X +3905(obtain)X +4138(a)X +2418 2003(primary)N +2694(bucket)X +2930(address.)X +3233(If)X +3309(this)X +3446(bucket)X 
+3681(is)X +3755(full,)X +3907(a)X +3964(secon-)X +2418 2091(dary)N +2593(multiplicative)X +3069(hash)X +3248(value)X +3454(is)X +3538(computed)X +3885(to)X +3978(de\256ne)X +2418 2179(the)N +2542(probe)X +2751(interval.)X +3062(The)X +3213(probe)X +3422(interval)X +3693(is)X +3772(added)X +3989(to)X +4076(the)X +2418 2267(original)N +2712(bucket)X +2971(address)X +3257(\(modulo)X +3573(the)X +3716(table)X +3916(size\))X +4112(to)X +2418 2355(obtain)N +2658(a)X +2734(new)X +2908(bucket)X +3162(address.)X +3483(This)X +3665(process)X +3946(repeats)X +2418 2443(until)N +2588(an)X +2688(empty)X +2911(bucket)X +3148(is)X +3224(found.)X +3474(If)X +3551(no)X +3654(bucket)X +3891(is)X +3967(found,)X +2418 2531(an)N +2514(insertion)X +2814(fails)X +2972(with)X +3134(a)X +3190(``table)X +3420(full'')X +3605(condition.)X +2590 2645(The)N +2768(basic)X +2986(algorithm)X +3350(may)X +3541(be)X +3670(modi\256ed)X +4006(by)X +4138(a)X +2418 2733(number)N +2705(of)X +2813(compile)X +3112(time)X +3295(options)X +3571(available)X +3902(to)X +4005(those)X +2418 2821(users)N +2604(with)X +2767(AT&T)X +3006(source)X +3237(code.)X +3450(First,)X +3637(the)X +3756(package)X +4040(pro-)X +2418 2909(vides)N +2638(two)X +2809(options)X +3094(for)X +3238(hash)X +3435(functions.)X +3803(Users)X +4036(may)X +2418 2997(specify)N +2690(their)X +2877(own)X +3055(hash)X +3242(function)X +3549(by)X +3669(compiling)X +4032(with)X +2418 3085(``USCR'')N +2757(de\256ned)X +3016(and)X +3155(declaring)X +3477(and)X +3616(de\256ning)X +3901(the)X +4022(vari-)X +2418 3173(able)N +2 f +2578(hcompar)X +1 f +2863(,)X +2909(a)X +2971(function)X +3263(taking)X +3488(two)X +3633(string)X +3840(arguments)X +2418 3261(and)N +2560(returning)X +2880(an)X +2982(integer.)X +3271(Users)X +3480(may)X +3643(also)X +3797(request)X +4054(that)X +2418 3349(hash)N +2587(values)X +2814(be)X +2912(computed)X +3250(simply)X +3489(by)X +3590(taking)X +3811(the)X +3930(modulo)X +2418 3437(of)N +2521(key)X +2673(\(using)X 
+2909(division)X +3201(rather)X +3424(than)X +3597(multiplication)X +4080(for)X +2418 3525(hash)N +2589(value)X +2787(calculation\).)X +3230(If)X +3308(this)X +3447(technique)X +3783(is)X +3859(used,)X +4049(col-)X +2418 3613(lisions)N +2651(are)X +2775(resolved)X +3072(by)X +3176(scanning)X +3485(sequentially)X +3896(from)X +4076(the)X +2418 3701(selected)N +2702(bucket)X +2941(\(linear)X +3176(probing\).)X +3517(This)X +3684(option)X +3913(is)X +3991(avail-)X +2418 3789(able)N +2572(by)X +2672(de\256ning)X +2954(the)X +3072(variable)X +3351(``DIV'')X +3622(at)X +3700(compile)X +3978(time.)X +2590 3903(A)N +2720(second)X +3015(option,)X +3311(based)X +3565(on)X +3716(an)X +3863(algorithm)X +2418 3991(discovered)N +2787(by)X +2888(Richard)X +3163(P.)X +3248(Brent,)X +3466(rearranges)X +3822(the)X +3940(table)X +4116(at)X +2418 4079(the)N +2549(time)X +2724(of)X +2824(insertion)X +3137(in)X +3232(order)X +3434(to)X +3528(speed)X +3743(up)X +3855(retrievals.)X +2418 4167(The)N +2571(basic)X +2764(idea)X +2926(is)X +3007(to)X +3097(shorten)X +3361(long)X +3531(probe)X +3741(sequences)X +4094(by)X +2418 4255(lengthening)N +2833(short)X +3030(probe)X +3249(sequences.)X +3651(Once)X +3857(the)X +3991(probe)X +2418 4343(chain)N +2613(has)X +2741(exceeded)X +3062(some)X +3252(threshold)X +3571(\(Brent)X +3796(suggests)X +4087(2\),)X +2418 4431(we)N +2541(attempt)X +2809(to)X +2899(shuf\257e)X +3145(any)X +3289(colliding)X +3601(keys)X +3776(\(keys)X +3978(which)X +2418 4519(appeared)N +2734(in)X +2821(the)X +2944(probe)X +3152(sequence)X +3471(of)X +3562(the)X +3684(new)X +3842(key\).)X +4049(The)X +2418 4607(details)N +2652(of)X +2744(this)X +2884(key)X +3025(shuf\257ing)X +3333(can)X +3469(be)X +3569(found)X +3780(in)X +3866([KNU68])X +2418 4695(and)N +2576([BRE73].)X +2946(This)X +3129(algorithm)X +3481(may)X +3660(be)X +3777(obtained)X +4094(by)X +2418 4783(de\256ning)N +2700(the)X +2818(variable)X +3097(``BRENT'')X +3487(at)X +3565(compile)X +3843(time.)X +2590 
4897(A)N +2698(third)X +2899(set)X +3038(of)X +3154(options,)X +3458(obtained)X +3783(by)X +3912(de\256ning)X +2418 4985(``CHAINED'',)N +2943(use)X +3086(linked)X +3321(lists)X +3484(to)X +3581(resolve)X +3848(collisions.)X +2418 5073(Either)N +2647(of)X +2747(the)X +2878(primary)X +3164(hash)X +3343(function)X +3642(described)X +3982(above)X +2418 5161(may)N +2584(be)X +2688(used,)X +2882(but)X +3011(all)X +3118(collisions)X +3451(are)X +3577(resolved)X +3876(by)X +3983(build-)X +2418 5249(ing)N +2554(a)X +2623(linked)X +2856(list)X +2986(of)X +3086(entries)X +3333(from)X +3522(the)X +3653(primary)X +3940(bucket.)X +2418 5337(By)N +2542(default,)X +2816(new)X +2981(entries)X +3226(will)X +3381(be)X +3488(added)X +3711(to)X +3804(a)X +3871(bucket)X +4116(at)X +2418 5425(the)N +2541(beginning)X +2886(of)X +2978(the)X +3101(bucket)X +3339(chain.)X +3577(However,)X +3916(compile)X +2418 5513(options)N +2706(``SORTUP'')X +3173(or)X +3293(``SORTDOWN'')X +3908(may)X +4098(be)X +2418 5601(speci\256ed)N +2723(to)X +2805(order)X +2995(the)X +3113(hash)X +3280(chains)X +3505(within)X +3729(each)X +3897(bucket.)X +3 f +432 5960(4)N +2970(USENIX)X +9 f +3292(-)X +3 f +3356(Winter)X +3621('91)X +9 f +3748(-)X +3 f +3812(Dallas,)X +4065(TX)X + +5 p +%%Page: 5 5 +0(Courier)xf 0 f +10 s 10 xH 0 xS 0 f +3 f +720 258(Seltzer)N +977(&)X +1064(Yigit)X +3278(A)X +3356(New)X +3528(Hashing)X +3831(Package)X +4136(for)X +4259(UNIX)X +2 f +1444 538(dynahash)N +1 f +892 670(The)N +2 f +1054(dynahash)X +1 f +1398(library,)X +1669(written)X +1932(by)X +2048(Esmond)X +2346(Pitt,)X +720 758(implements)N +1183(Larson's)X +1554(linear)X +1827(hashing)X +2165(algorithm)X +720 846([LAR88])N +1097(with)X +1302(an)X +2 f +1440(hsearch)X +1 f +1756(compatible)X +2174(interface.)X +720 934(Intuitively,)N +1099(a)X +1161(hash)X +1334(table)X +1516(begins)X +1751(as)X +1844(a)X +1905(single)X +2121(bucket)X +2360(and)X +720 1022(grows)N +941(in)X +1028(generations,)X +1443(where)X +1665(a)X 
+1725(generation)X +2088(corresponds)X +720 1110(to)N +815(a)X +884(doubling)X +1201(in)X +1296(the)X +1427(size)X +1585(of)X +1685(the)X +1815(hash)X +1994(table.)X +2222(The)X +2379(0)X +2 f +7 s +1078(th)Y +10 s +1 f +720 1198(generation)N +1085(occurs)X +1321(as)X +1414(the)X +1538(table)X +1719(grows)X +1940(from)X +2121(one)X +2262(bucket)X +720 1286(to)N +814(two.)X +1006(In)X +1105(the)X +1235(next)X +1405(generation)X +1776(the)X +1906(table)X +2093(grows)X +2320(from)X +720 1374(two)N +862(to)X +946(four.)X +1122(During)X +1371(each)X +1541(generation,)X +1921(every)X +2121(bucket)X +2356(that)X +720 1462(existed)N +967(at)X +1045(the)X +1163(beginning)X +1503(of)X +1590(the)X +1708(generation)X +2067(is)X +2140(split.)X +892 1576(The)N +1041(table)X +1221(starts)X +1414(as)X +1505(a)X +1565(single)X +1780(bucket)X +2018(\(numbered)X +2389(0\),)X +720 1664(the)N +839(current)X +1088(split)X +1245(bucket)X +1479(is)X +1552(set)X +1661(to)X +1743(bucket)X +1977(0,)X +2057(and)X +2193(the)X +2311(max-)X +720 1752(imum)N +933(split)X +1097(point)X +1288(is)X +1368(set)X +1483(to)X +1571(twice)X +1771(the)X +1895(current)X +2149(split)X +2312(point)X +720 1840(\(0\).)N +863(When)X +1084(it)X +1157(is)X +1239(time)X +1410(for)X +1532(a)X +1596(bucket)X +1838(to)X +1928(split,)X +2113(the)X +2239(keys)X +2414(in)X +720 1928(the)N +872(current)X +1154(split)X +1345(bucket)X +1612(are)X +1764(divided)X +2057(between)X +2378(the)X +720 2016(current)N +981(split)X +1151(bucket)X +1397(and)X +1545(a)X +1613(new)X +1779(bucket)X +2025(whose)X +2262(bucket)X +720 2104(number)N +1000(is)X +1088(equal)X +1297(to)X +1394(1)X +1469(+)X +1549(current)X +1812(split)X +1984(bucket)X +2232(+)X +2311(max-)X +720 2192(imum)N +927(split)X +1085(point.)X +1310(We)X +1442(can)X +1574(determine)X +1915(which)X +2131(keys)X +2298(move)X +720 2280(to)N +807(the)X +929(new)X +1087(bucket)X +1325(by)X +1429(examining)X +1791(the)X +2 f +1913(n)X +7 s +1962 2248(th)N +10 s +1 f +2043 
2280(bit)N +2151(of)X +2242(a)X +2302(key's)X +720 2368(hash)N +899(value)X +1105(where)X +1334(n)X +1406(is)X +1491(the)X +1620(generation)X +1990(number.)X +2306(After)X +720 2456(the)N +846(bucket)X +1088(at)X +1174(the)X +1300(maximum)X +1651(split)X +1815(point)X +2006(has)X +2140(been)X +2319(split,)X +720 2544(the)N +839(generation)X +1198(number)X +1463(is)X +1536(incremented,)X +1973(the)X +2091(current)X +2339(split)X +720 2632(point)N +908(is)X +985(set)X +1098(back)X +1274(to)X +1360(zero,)X +1543(and)X +1683(the)X +1805(maximum)X +2152(split)X +2312(point)X +720 2720(is)N +815(set)X +946(to)X +1050(the)X +1190(number)X +1477(of)X +1586(the)X +1725(last)X +1877(bucket)X +2132(in)X +2235(the)X +2374(\256le)X +720 2808(\(which)N +971(is)X +1052(equal)X +1253(to)X +1342(twice)X +1543(the)X +1668(old)X +1797(maximum)X +2148(split)X +2312(point)X +720 2896(plus)N +873(1\).)X +892 3010(To)N +1031(facilitate)X +1361(locating)X +1668(keys,)X +1884(we)X +2027(maintain)X +2356(two)X +720 3098(masks.)N +989(The)X +1143(low)X +1291(mask)X +1488(is)X +1569(equal)X +1771(to)X +1861(the)X +1987(maximum)X +2339(split)X +720 3186(bucket)N +967(and)X +1116(the)X +1247(high)X +1422(mask)X +1624(is)X +1710(equal)X +1917(to)X +2011(the)X +2141(next)X +2311(max-)X +720 3274(imum)N +931(split)X +1093(bucket.)X +1372(To)X +1486(locate)X +1703(a)X +1764(speci\256c)X +2033(key,)X +2193(we)X +2311(com-)X +720 3362(pute)N +881(a)X +940(32-bit)X +1154(hash)X +1324(value)X +1520(using)X +1715(a)X +1773(bit-randomizing)X +2311(algo-)X +720 3450(rithm)N +932(such)X +1118(as)X +1224(the)X +1361(one)X +1516(described)X +1862(in)X +1962([LAR88].)X +2334(This)X +720 3538(hash)N +893(value)X +1093(is)X +1172(then)X +1336(masked)X +1607(with)X +1775(the)X +1898(high)X +2065(mask.)X +2299(If)X +2378(the)X +720 3626(resulting)N +1026(number)X +1297(is)X +1376(greater)X +1626(than)X +1790(the)X +1913(maximum)X +2262(bucket)X +720 3714(in)N +823(the)X +962(table)X +1159(\(current)X 
+1455(split)X +1633(bucket)X +1888(+)X +1974(maximum)X +2339(split)X +720 3802(point\),)N +962(the)X +1091(hash)X +1269(value)X +1474(is)X +1558(masked)X +1834(with)X +2007(the)X +2136(low)X +2287(mask.)X +720 3890(In)N +825(either)X +1046(case,)X +1242(the)X +1377(result)X +1592(of)X +1696(the)X +1831(mask)X +2037(is)X +2127(the)X +2262(bucket)X +720 3978(number)N +989(for)X +1107(the)X +1229(given)X +1431(key.)X +1611(The)X +1759(algorithm)X +2093(below)X +2312(illus-)X +720 4066(trates)N +914(this)X +1049(process.)X +0 f +8 s +720 4365(h)N +796(=)X +872 -0.4038(calchash\(key\);)AX +720 4453(bucket)N +986(=)X +1062(h)X +1138(&)X +1214 -0.4167(high_mask;)AX +720 4541(if)N +834(\()X +910(bucket)X +1176(>)X +1252 -0.4167(max_bucket)AX +1670(\))X +1008 4629(bucket)N +1274(=)X +1350(h)X +1426(&)X +1502 -0.4219(low_mask;)AX +720 4717 -0.4018(return\(bucket\);)AN +1 f +10 s +892 5042(In)N +1013(order)X +1237(to)X +1353(decide)X +1617(when)X +1845(to)X +1961(split)X +2152(a)X +2242(bucket,)X +2 f +720 5130(dynahash)N +1 f +1050(uses)X +2 f +1210(controlled)X +1561(splitting)X +1 f +1822(.)X +1884(A)X +1964(hash)X +2133(table)X +2311(has)X +2440(a)X +720 5218(\256ll)N +837(factor)X +1054(which)X +1279(is)X +1361(expressed)X +1707(in)X +1798(terms)X +2004(of)X +2099(the)X +2225(average)X +720 5306(number)N +990(of)X +1082(keys)X +1253(in)X +1339(each)X +1511(bucket.)X +1789(Each)X +1974(time)X +2140(the)X +2262(table's)X +720 5394(total)N +885(number)X +1153(of)X +1243(keys)X +1413(divided)X +1676(by)X +1778(its)X +1875(number)X +2142(of)X +2231(buckets)X +720 5482(exceeds)N +995(this)X +1130(\256ll)X +1238(factor,)X +1466(a)X +1522(bucket)X +1756(is)X +1829(split.)X +2878 538(Since)N +3079(the)X +2 f +3200(hsearch)X +1 f +3477(create)X +3693(interface)X +3998(\()X +2 f +4025(hcreate)X +1 f +4266(\))X +4315(calls)X +2706 626(for)N +2842(an)X +2960(estimate)X +3269(of)X +3378(the)X +3518(\256nal)X +3702(size)X +3869(of)X +3978(the)X +4118(hash)X +4306(table)X +2706 714(\()N 
+2 f +2733(nelem)X +1 f +2925(\),)X +2 f +3007(dynahash)X +1 f +3349(uses)X +3522(this)X +3672(information)X +4085(to)X +4182(initialize)X +2706 802(the)N +2848(table.)X +3088(The)X +3257(initial)X +3486(number)X +3774(of)X +3884(buckets)X +4172(is)X +4268(set)X +4400(to)X +2 f +2706 890(nelem)N +1 f +2926(rounded)X +3217(to)X +3306(the)X +3431(next)X +3596(higher)X +3828(power)X +4056(of)X +4150(two.)X +4337(The)X +2706 978(current)N +2958(split)X +3118(point)X +3305(is)X +3381(set)X +3493(to)X +3578(0)X +3641(and)X +3780(the)X +3901(maximum)X +4248(bucket)X +2706 1066(and)N +2842(maximum)X +3186(split)X +3343(point)X +3527(are)X +3646(set)X +3755(to)X +3837(this)X +3972(rounded)X +4255(value.)X +3 f +3148 1220(The)N +3301(New)X +3473(Implementation)X +1 f +2878 1352(Our)N +3042(implementation)X +3583(is)X +3675(also)X +3842(based)X +4063(on)X +4181(Larson's)X +2706 1440(linear)N +2939(hashing)X +3238([LAR88])X +3582(algorithm)X +3943(as)X +4060(well)X +4248(as)X +4364(the)X +2 f +2706 1528(dynahash)N +1 f +3047(implementation.)X +3623(The)X +2 f +3782(dbm)X +1 f +3954(family)X +4197(of)X +4297(algo-)X +2706 1616(rithms)N +2942(decide)X +3184(dynamically)X +3612(which)X +3840(bucket)X +4085(to)X +4178(split)X +4346(and)X +2706 1704(when)N +2914(to)X +3010(split)X +3180(it)X +3257(\(when)X +3491(it)X +3568(over\257ows\))X +3944(while)X +2 f +4155(dynahash)X +1 f +2706 1792(splits)N +2933(in)X +3054(a)X +3149(prede\256ned)X +3547(order)X +3776(\(linearly\))X +4134(and)X +4309(at)X +4426(a)X +2706 1880(prede\256ned)N +3116(time)X +3328(\(when)X +3599(the)X +3767(table)X +3993(\256ll)X +4151(factor)X +4409(is)X +2706 1968(exceeded\).)N +3121(We)X +3280(use)X +3434(a)X +3517(hybrid)X +3773(of)X +3887(these)X +4099(techniques.)X +2706 2056(Splits)N +2913(occur)X +3118(in)X +3206(the)X +3330(prede\256ned)X +3695(order)X +3891(of)X +3984(linear)X +4193(hashing,)X +2706 2144(but)N +2845(the)X +2980(time)X +3159(at)X +3253(which)X +3485(pages)X +3704(are)X +3839(split)X 
+4012(is)X +4101(determined)X +2706 2232(both)N +2869(by)X +2970(page)X +3143(over\257ows)X +3480(\()X +2 f +3507(uncontrolled)X +3937(splitting)X +1 f +4198(\))X +4246(and)X +4382(by)X +2706 2320(exceeding)N +3052(the)X +3170(\256ll)X +3278(factor)X +3486(\()X +2 f +3513(controlled)X +3862(splitting)X +1 f +4123(\))X +2878 2434(A)N +2962(hash)X +3135(table)X +3317(is)X +3395(parameterized)X +3876(by)X +3981(both)X +4148(its)X +4248(bucket)X +2706 2522(size)N +2904(\()X +2 f +2931(bsize)X +1 f +(\))S +3191(and)X +3380(\256ll)X +3541(factor)X +3801(\()X +2 f +3828(ffactor)X +1 f +4041(\).)X +4180(Whereas)X +2 f +2706 2610(dynahash's)N +1 f +3095(buckets)X +3364(can)X +3500(be)X +3599(represented)X +3993(as)X +4083(a)X +4142(linked)X +4365(list)X +2706 2698(of)N +2798(elements)X +3108(in)X +3195(memory,)X +3507(our)X +3639(package)X +3928(needs)X +4136(to)X +4222(support)X +2706 2786(disk)N +2874(access,)X +3135(and)X +3286(must)X +3476(represent)X +3806(buckets)X +4086(in)X +4183(terms)X +4395(of)X +2706 2874(pages.)N +2955(The)X +2 f +3106(bsize)X +1 f +3291(is)X +3369(the)X +3492(size)X +3642(\(in)X +3756(bytes\))X +3977(of)X +4069(these)X +4259(pages.)X +2706 2962(As)N +2833(in)X +2933(linear)X +3154(hashing,)X +3461(the)X +3597(number)X +3879(of)X +3983(buckets)X +4265(in)X +4364(the)X +2706 3050(table)N +2906(is)X +3003(equal)X +3221(to)X +3327(the)X +3469(number)X +3758(of)X +3869(keys)X +4060(in)X +4165(the)X +4306(table)X +2706 3138(divided)N +2988(by)X +2 f +3110(ffactor)X +1 f +3323(.)X +2 f +8 s +3113(6)Y +1 f +10 s +3417 3138(The)N +3584(controlled)X +3950(splitting)X +4252(occurs)X +2706 3226(each)N +2878(time)X +3044(the)X +3166(number)X +3435(of)X +3526(keys)X +3697(in)X +3783(the)X +3905(table)X +4085(exceeds)X +4364(the)X +2706 3314(\256ll)N +2814(factor)X +3022(multiplied)X +3370(by)X +3470(the)X +3588(number)X +3853(of)X +3940(buckets.)X +2878 3428(Inserting)N +3187(keys)X +3358(and)X +3498(splitting)X +3783(buckets)X +4051(is)X +4127(performed)X 
+2706 3516(precisely)N +3018(as)X +3107(described)X +3437(previously)X +3796(for)X +2 f +3911(dynahash)X +1 f +4218(.)X +4279(How-)X +2706 3604(ever,)N +2897(since)X +3094(buckets)X +3371(are)X +3502(now)X +3671(comprised)X +4036(of)X +4134(pages,)X +4368(we)X +2706 3692(must)N +2883(be)X +2981(prepared)X +3284(to)X +3367(handle)X +3602(cases)X +3793(where)X +4011(the)X +4130(size)X +4276(of)X +4364(the)X +2706 3780(keys)N +2873(and)X +3009(data)X +3163(in)X +3245(a)X +3301(bucket)X +3535(exceed)X +3779(the)X +3897(bucket)X +4131(size.)X +3 f +3318 3934(Over\257ow)N +3654(Pages)X +1 f +2878 4066(There)N +3095(are)X +3223(two)X +3372(cases)X +3571(where)X +3797(a)X +3862(key)X +4007(may)X +4174(not)X +4305(\256t)X +4400(in)X +2706 4154(its)N +2802(designated)X +3166(bucket.)X +3441(In)X +3529(the)X +3647(\256rst)X +3791(case,)X +3970(the)X +4088(total)X +4250(size)X +4395(of)X +2706 4242(the)N +2833(key)X +2978(and)X +3123(data)X +3286(may)X +3453(exceed)X +3706(the)X +3833(bucket)X +4076(size.)X +4269(In)X +4364(the)X +2706 4330(second,)N +3008(addition)X +3328(of)X +3453(a)X +3547(new)X +3739(key)X +3913(could)X +4149(cause)X +4386(an)X +2706 4418(over\257ow,)N +3068(but)X +3227(the)X +3382(bucket)X +3652(in)X +3770(question)X +4097(is)X +4206(not)X +4364(yet)X +2706 4506(scheduled)N +3049(to)X +3133(be)X +3230(split.)X +3428(In)X +3516(existing)X +3790(implementations,)X +4364(the)X +2706 4594(second)N +2953(case)X +3115(never)X +3317(arises)X +3523(\(since)X +3738(buckets)X +4006(are)X +4128(split)X +4288(when)X +2706 4682(they)N +2871(over\257ow\))X +3210(and)X +3352(the)X +3476(\256rst)X +3626(case)X +3791(is)X +3870(not)X +3998(handled)X +4278(at)X +4362(all.)X +2706 4770(Although)N +3036(large)X +3225(key/data)X +3525(pair)X +3678(handling)X +3986(is)X +4066(dif\256cult)X +4346(and)X +2706 4858(expensive,)N +3083(it)X +3163(is)X +3252(essential.)X +3604(In)X +3706(a)X +3777(linear)X +3995(hashed)X +4253(imple-)X +2706 4946(mentation,)N +3087(over\257ow)X 
+3413(pages)X +3636(are)X +3775(required)X +4083(for)X +4217(buckets)X +2706 5034(which)N +2935(over\257ow)X +3253(before)X +3492(they)X +3662(are)X +3793(split,)X +3982(so)X +4085(we)X +4211(can)X +4355(use)X +2706 5122(the)N +2833(same)X +3027(mechanism)X +3421(for)X +3544(large)X +3734(key/data)X +4035(pairs)X +4220(that)X +4368(we)X +2706 5210(use)N +2837(for)X +2955(over\257ow)X +3264(pages.)X +3511(Logically,)X +3862(we)X +3980(chain)X +4177(over\257ow)X +16 s +2706 5353 MXY +864 0 Dl +2 f +8 s +2746 5408(6)N +1 f +9 s +2801 5433(This)N +2952(is)X +3023(not)X +3138(strictly)X +3361(true.)X +3532(The)X +3667(\256le)X +3782(does)X +3937(not)X +4052(contract)X +4306(when)X +2706 5513(keys)N +2861(are)X +2972(deleted,)X +3221(so)X +3308(the)X +3419(number)X +3662(of)X +3744(buckets)X +3986(is)X +4056(actually)X +4306(equal)X +2706 5593(to)N +2782(the)X +2890(maximum)X +3202(number)X +3441(of)X +3520(keys)X +3671(ever)X +3814(present)X +4041(in)X +4116(the)X +4223(table)X +4382(di-)X +2706 5673(vided)N +2884(by)X +2974(the)X +3080(\256ll)X +3178(factor.)X +3 f +10 s +720 5960(USENIX)N +9 f +1042(-)X +3 f +1106(Winter)X +1371('91)X +9 f +1498(-)X +3 f +1562(Dallas,)X +1815(TX)X +4424(5)X + +6 p +%%Page: 6 6 +0(Courier)xf 0 f +10 s 10 xH 0 xS 0 f +3 f +432 258(A)N +510(New)X +682(Hashing)X +985(Package)X +1290(for)X +1413(UNIX)X +3663(Seltzer)X +3920(&)X +4007(Yigit)X +1 f +432 538(pages)N +639(to)X +725(the)X +847(buckets)X +1116(\(also)X +1296(called)X +1512(primary)X +1789(pages\).)X +2062(In)X +2152(a)X +432 626(memory)N +730(based)X +943(representation,)X +1448(over\257ow)X +1763(pages)X +1976(do)X +2086(not)X +432 714(pose)N +628(any)X +792(special)X +1063(problems)X +1409(because)X +1712(we)X +1854(can)X +2014(chain)X +432 802(over\257ow)N +776(pages)X +1017(to)X +1137(primary)X +1449(pages)X +1690(using)X +1921(memory)X +432 890(pointers.)N +776(However,)X +1137(mapping)X +1463(these)X +1674(over\257ow)X +2005(pages)X +432 978(into)N +584(a)X +648(disk)X 
+809(\256le)X +939(is)X +1019(more)X +1211(of)X +1305(a)X +1368(challenge,)X +1723(since)X +1915(we)X +2036(need)X +432 1066(to)N +547(be)X +675(able)X +861(to)X +975(address)X +1268(both)X +1462(bucket)X +1728(pages,)X +1983(whose)X +432 1154(numbers)N +729(are)X +849(growing)X +1137(linearly,)X +1422(and)X +1558(some)X +1747(indeterminate)X +432 1242(number)N +715(of)X +820(over\257ow)X +1143(pages)X +1364(without)X +1646(reorganizing)X +2090(the)X +432 1330(\256le.)N +604 1444(One)N +789(simple)X +1053(solution)X +1361(would)X +1612(be)X +1739(to)X +1852(allocate)X +2152(a)X +432 1532(separate)N +737(\256le)X +880(for)X +1015(over\257ow)X +1341(pages.)X +1604(The)X +1769(disadvantage)X +432 1620(with)N +605(such)X +783(a)X +850(technique)X +1193(is)X +1276(that)X +1426(it)X +1500(requires)X +1789(an)X +1895(extra)X +2086(\256le)X +432 1708(descriptor,)N +794(an)X +891(extra)X +1073(system)X +1316(call)X +1453(on)X +1554(open)X +1731(and)X +1867(close,)X +2072(and)X +432 1796(logically)N +739(associating)X +1122(two)X +1269(independent)X +1687(\256les.)X +1886(For)X +2023(these)X +432 1884(reasons,)N +728(we)X +857(wanted)X +1123(to)X +1219(map)X +1391(both)X +1567(primary)X +1855(pages)X +2072(and)X +432 1972(over\257ow)N +737(pages)X +940(into)X +1084(the)X +1202(same)X +1387(\256le)X +1509(space.)X +604 2086(The)N +799(buddy-in-waiting)X +1425(algorithm)X +1806(provides)X +2152(a)X +432 2174(mechanism)N +851(to)X +966(support)X +1259(multiple)X +1578(pages)X +1814(per)X +1970(logical)X +432 2262(bucket)N +685(while)X +902(retaining)X +1226(the)X +1362(simple)X +1613(split)X +1788(sequence)X +2121(of)X +432 2350(linear)N +681(hashing.)X +1015(Over\257ow)X +1383(pages)X +1631(are)X +1795(preallocated)X +432 2438(between)N +781(generations)X +1232(of)X +1379(primary)X +1713(pages.)X +1996(These)X +432 2526(over\257ow)N +759(pages)X +984(are)X +1125(used)X +1314(by)X +1436(any)X +1594(bucket)X +1850(containing)X +432 2614(more)N +646(keys)X +842(than)X 
+1029(\256t)X +1144(on)X +1273(the)X +1420(primary)X +1723(page)X +1924(and)X +2089(are)X +432 2702(reclaimed,)N +808(if)X +896(possible,)X +1217(when)X +1430(the)X +1567(bucket)X +1819(later)X +2000(splits.)X +432 2790(Figure)N +687(3)X +773(depicts)X +1045(the)X +1188(layout)X +1433(of)X +1545(primary)X +1844(pages)X +2072(and)X +432 2878(over\257ow)N +752(pages)X +970(within)X +1209(the)X +1342(same)X +1542(\256le.)X +1699(Over\257ow)X +2036(page)X +432 2966(use)N +586(information)X +1011(is)X +1111(recorded)X +1440(in)X +1548(bitmaps)X +1847(which)X +2089(are)X +432 3054(themselves)N +819(stored)X +1046(on)X +1157(over\257ow)X +1472(pages.)X +1725(The)X +1880(addresses)X +432 3142(of)N +520(the)X +639(bitmap)X +882(pages)X +1086(and)X +1223(the)X +1342(number)X +1608(of)X +1695(pages)X +1898(allocated)X +432 3230(at)N +515(each)X +688(split)X +850(point)X +1039(are)X +1163(stored)X +1384(in)X +1470(the)X +1592(\256le)X +1718(header.)X +1997(Using)X +432 3318(this)N +577(information,)X +1005(both)X +1177(over\257ow)X +1492(addresses)X +1829(and)X +1974(bucket)X +432 3406(addresses)N +764(can)X +900(be)X +999(mapped)X +1276(to)X +1361(disk)X +1517(addresses)X +1848(by)X +1951(the)X +2072(fol-)X +432 3494(lowing)N +674(calculation:)X +0 f +8 s +432 3793(int)N +736(bucket;)X +1192(/*)X +1306(bucket)X +1572(address)X +1876(*/)X +432 3881(u_short)N +736(oaddr;)X +1192(/*)X +1306(OVERFLOW)X +1648(address)X +1952(*/)X +432 3969(int)N +736 -0.4125(nhdr_pages;)AX +1192(/*)X +1306(npages)X +1572(in)X +1686 -112.4062(\256le)AX +1838(header)X +2104(*/)X +432 4057(int)N +736 -0.4125(spares[32];)AX +1192(/*)X +1306(npages)X +1572(at)X +1686(each)X +1876(split)X +2104(*/)X +432 4145(int)N +736(log2\(\);)X +1198(/*)X +1312(ceil\(log)X +1654(base)X +1844(2\))X +1958(*/)X +432 4321(#DEFINE)N +736 -0.3929(BUCKET_TO_PAGE\(bucket\))AX +1610(\\)X +584 4409(bucket)N +850(+)X +926 -0.4167(nhdr_pages)AX +1344(+)X +1420(\\)X +584 4497 -0.3894(\(bucket?spares[logs2\(bucket)AN +1648(+)X 
+1724(1\)-1]:0\))X +432 4673(#DEFINE)N +736 -0.3947(OADDR_TO_PAGE\(oaddr\))AX +1534(\\)X +584 4761 -0.3984(BUCKET_TO_PAGE\(\(1)AN +1268(<<)X +1382 -0.4091(\(oaddr>>11\)\))AX +1876(-)X +1952(1\))X +2066(+)X +2142(\\)X +584 4849(oaddr)N +812(&)X +888(0x7ff;)X +1 f +10 s +604 5262(An)N +728(over\257ow)X +1039(page)X +1217(is)X +1295(addressed)X +1637(by)X +1742(its)X +1842(split)X +2004(point,)X +432 5350(identifying)N +858(the)X +1031(generations)X +1476(between)X +1819(which)X +2090(the)X +432 5438(over\257ow)N +740(page)X +915(is)X +991(allocated,)X +1324(and)X +1463(its)X +1561(page)X +1736(number,)X +2023(iden-)X +432 5526(tifying)N +665(the)X +783(particular)X +1111(page)X +1283(within)X +1507(the)X +1625(split)X +1782(point.)X +1986(In)X +2073(this)X +432 5614(implementation,)N +983(offsets)X +1225(within)X +1457(pages)X +1668(are)X +1795(16)X +1903(bits)X +2046(long)X +432 5702(\(limiting)N +732(the)X +851(maximum)X +1196(page)X +1368(size)X +1513(to)X +1595(32K\),)X +1800(so)X +1891(we)X +2005(select)X +2418 538(an)N +2535(over\257ow)X +2860(page)X +3052(addressing)X +3435(algorithm)X +3786(that)X +3946(can)X +4098(be)X +2418 626(expressed)N +2760(in)X +2847(16)X +2952(bits)X +3091(and)X +3231(which)X +3451(allows)X +3684(quick)X +3886(retrieval.)X +2418 714(The)N +2568(top)X +2695(\256ve)X +2840(bits)X +2980(indicate)X +3258(the)X +3380(split)X +3541(point)X +3729(and)X +3869(the)X +3991(lower)X +2418 802(eleven)N +2650(indicate)X +2926(the)X +3046(page)X +3220(number)X +3487(within)X +3713(the)X +3832(split)X +3990(point.)X +2418 890(Since)N +2633(\256ve)X +2789(bits)X +2940(are)X +3075(reserved)X +3384(for)X +3514(the)X +3648(split)X +3821(point,)X +4041(\256les)X +2418 978(may)N +2578(split)X +2737(32)X +2839(times)X +3034(yielding)X +3318(a)X +3376(maximum)X +3721(\256le)X +3844(size)X +3990(of)X +4078(2)X +7 s +946(32)Y +10 s +2418 1066(buckets)N +2698(and)X +2849(32)X +2 f +(*)S +1 f +2982(2)X +7 s +1034(11)Y +10 s +3113 1066(over\257ow)N 
+3433(pages.)X +3691(The)X +3850(maximum)X +2418 1154(page)N +2597(size)X +2749(is)X +2829(2)X +7 s +1122(15)Y +10 s +1154(,)Y +2971(yielding)X +3259(a)X +3321(maximum)X +3671(\256le)X +3799(size)X +3950(greater)X +2418 1242(than)N +2601(131,000)X +2906(GB)X +3061(\(on)X +3212(\256le)X +3358(systems)X +3655(supporting)X +4041(\256les)X +2418 1330(larger)N +2626(than)X +2784(4GB\).)X +10 f +2418 1418 -0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +1 Dt +4014 2275 MXY +0 133 Dl +3881 2275 MXY +0 133 Dl +3748 2275 MXY +0 133 Dl +3083 2275 MXY +0 133 Dl +5 s +1 f +3523 2475(2/3)N +3390(2/2)X +3257(2/1)X +2859(1/2)X +2726(1/1)X +5 Dt +3814 1743 MXY +0 133 Dl +3282 1743 MXY +0 133 Dl +3017 1743 MXY +0 133 Dl +2884 1743 MXY +0 133 Dl +1 Dt +3681 1743 MXY +0 133 Dl +133 0 Dl +0 -133 Dl +-133 0 Dl +3548 MX +0 133 Dl +133 0 Dl +0 -133 Dl +-133 0 Dl +3415 MX +0 133 Dl +133 0 Dl +0 -133 Dl +-133 0 Dl +3282 MX +0 133 Dl +133 0 Dl +0 -133 Dl +-133 0 Dl +3150 MX +0 133 Dl +132 0 Dl +0 -133 Dl +-132 0 Dl +3017 MX +0 133 Dl +133 0 Dl +0 -133 Dl +-133 0 Dl +2884 MX +0 133 Dl +133 0 Dl +0 -133 Dl +-133 0 Dl +3 f +8 s +3017 2601(Over\257ow)N +3285(Addresses)X +3515 2833(Over\257ow)N +3783(Pages)X +2850(Buckets)X +1 Di +3349 2740 MXY + 3349 2740 lineto + 3482 2740 lineto + 3482 2873 lineto + 3349 2873 lineto + 3349 2740 lineto +closepath 3 3349 2740 3482 2873 Dp +2684 MX +0 133 Dl +133 0 Dl +0 -133 Dl +-133 0 Dl +5 Dt +4146 2275 MXY +0 133 Dl +3216 2275 MXY +0 133 Dl +2684 2275 MXY +0 133 Dl +2551 2275 MXY +0 133 Dl +1 f +3798 1963(3)N +3266 1980(2)N +3001(1)X +2868(0)X +1 Dt +2751 1743 MXY +0 133 Dl +133 0 Dl +0 -133 Dl +-133 0 Dl +3548 2275 MXY +-15 -22 Dl +2 16 Dl +-13 11 Dl +26 -5 Dl +-282 -117 Dl +3432 2275 MXY +-10 -25 Dl +-2 16 Dl +-15 8 Dl +27 1 Dl +-166 -117 Dl +3282 2275 MXY +12 -25 Dl +-14 10 Dl +-15 -6 Dl +17 21 Dl +-16 -117 Dl +2884 2275 MXY +26 7 Dl +-12 -12 Dl +3 -16 Dl +-17 21 Dl +382 -117 Dl +2751 2275 MXY +25 9 Dl +-11 -12 Dl +5 -17 Dl +-19 20 Dl +515 -117 Dl 
+3 f +3070 2152(Over\257ow)N +3338(Pages)X +3482 2275 MXY + 3482 2275 lineto + 3615 2275 lineto + 3615 2408 lineto + 3482 2408 lineto + 3482 2275 lineto +closepath 3 3482 2275 3615 2408 Dp +3349 MX + 3349 2275 lineto + 3482 2275 lineto + 3482 2408 lineto + 3349 2408 lineto + 3349 2275 lineto +closepath 3 3349 2275 3482 2408 Dp +3216 MX + 3216 2275 lineto + 3349 2275 lineto + 3349 2408 lineto + 3216 2408 lineto + 3216 2275 lineto +closepath 3 3216 2275 3349 2408 Dp +2817 MX + 2817 2275 lineto + 2950 2275 lineto + 2950 2408 lineto + 2817 2408 lineto + 2817 2275 lineto +closepath 3 2817 2275 2950 2408 Dp +2684 MX + 2684 2275 lineto + 2817 2275 lineto + 2817 2408 lineto + 2684 2408 lineto + 2684 2275 lineto +closepath 3 2684 2275 2817 2408 Dp +3615 MX +0 133 Dl +531 0 Dl +0 -133 Dl +-531 0 Dl +2950 MX +0 133 Dl +266 0 Dl +0 -133 Dl +-266 0 Dl +2551 MX +0 133 Dl +133 0 Dl +0 -133 Dl +-133 0 Dl +3798 1726 MXY +-21 -18 Dl +6 16 Dl +-10 13 Dl +25 -11 Dl +-599 -99 Dl +3266 1726 MXY +-1 -27 Dl +-7 15 Dl +-17 1 Dl +25 11 Dl +-67 -99 Dl +3033 1726 MXY +27 1 Dl +-14 -8 Dl +-1 -17 Dl +-12 24 Dl +166 -99 Dl +2900 1726 MXY +27 7 Dl +-13 -11 Dl +3 -17 Dl +-17 21 Dl +299 -99 Dl +3058 1621(Split)N +3203(Points)X +2418 2275 MXY +0 133 Dl +133 0 Dl +0 -133 Dl +-133 0 Dl +3 Dt +-1 Ds +3137(Figure)Y +2619(3:)X +1 f +2691(Split)X +2832(points)X +3008(occur)X +3168(between)X +3399(generations)X +3712(and)X +3823(are)X +3919(numbered)X +2418 3225(from)N +2560(0.)X +2642(In)X +2713(this)X +2824(\256gure)X +2991(there)X +3136(are)X +3231(two)X +3345(over\257ow)X +3590(pages)X +3753(allocated)X +4000(at)X +4063(split)X +2418 3313(point)N +2566(1)X +2614(and)X +2722(three)X +2865(allocated)X +3111(at)X +3173(split)X +3300(point)X +3448(2.)X +10 s +10 f +2418 3489 -0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +3 f +2949 3731(Buffer)N +3192(Management)X +1 f +2590 3863(The)N +2744(hash)X +2920(table)X +3105(is)X +3187(stored)X +3412(in)X +3502(memory)X +3797(as)X +3892(a)X 
+3956(logical)X +2418 3951(array)N +2633(of)X +2749(bucket)X +3012(pointers.)X +3359(Physically,)X +3761(the)X +3907(array)X +4121(is)X +2418 4039(arranged)N +2728(in)X +2818(segments)X +3144(of)X +3239(256)X +3387(pointers.)X +3713(Initially,)X +4013(there)X +2418 4127(is)N +2530(space)X +2767(to)X +2887(allocate)X +3195(256)X +3373(segments.)X +3769(Reallocation)X +2418 4215(occurs)N +2651(when)X +2847(the)X +2967(number)X +3234(of)X +3323(buckets)X +3590(exceeds)X +3867(32K)X +4027(\(256)X +2418 4303(*)N +2508(256\).)X +2745(Primary)X +3053(pages)X +3286(may)X +3473(be)X +3598(accessed)X +3929(directly)X +2418 4391(through)N +2711(the)X +2853(array)X +3062(by)X +3185(bucket)X +3442(number)X +3730(and)X +3889(over\257ow)X +2418 4479(pages)N +2628(are)X +2754 0.4028(referenced)AX +3122(logically)X +3429(by)X +3536(their)X +3710(over\257ow)X +4022(page)X +2418 4567(address.)N +2726(For)X +2864(small)X +3063(hash)X +3236(tables,)X +3469(it)X +3539(is)X +3618(desirable)X +3934(to)X +4022(keep)X +2418 4655(all)N +2525(pages)X +2735(in)X +2823(main)X +3009(memory)X +3302(while)X +3506(on)X +3612(larger)X +3826(tables,)X +4059(this)X +2418 4743(is)N +2523(probably)X +2860(impossible.)X +3298(To)X +3438(satisfy)X +3698(both)X +3891(of)X +4009(these)X +2418 4831(requirements,)N +2900(the)X +3041(package)X +3348(includes)X +3658(buffer)X +3897(manage-)X +2418 4919(ment)N +2598(with)X +2760(LRU)X +2940(\(least)X +3134(recently)X +3413(used\))X +3607(replacement.)X +2590 5033(By)N +2730(default,)X +3020(the)X +3165(package)X +3475(allocates)X +3802(up)X +3928(to)X +4036(64K)X +2418 5121(bytes)N +2616(of)X +2712(buffered)X +3014(pages.)X +3246(All)X +3377(pages)X +3589(in)X +3680(the)X +3807(buffer)X +4032(pool)X +2418 5209(are)N +2542(linked)X +2766(in)X +2852(LRU)X +3036(order)X +3230(to)X +3316(facilitate)X +3621(fast)X +3761(replacement.)X +2418 5297(Whereas)N +2724(ef\256cient)X +3011(access)X +3241(to)X +3327(primary)X +3605(pages)X +3812(is)X +3889(provided)X +2418 
5385(by)N +2521(the)X +2642(bucket)X +2879(array,)X +3087(ef\256cient)X +3372(access)X +3600(to)X +3684(over\257ow)X +3991(pages)X +2418 5473(is)N +2501(provided)X +2816(by)X +2926(linking)X +3182(over\257ow)X +3497(page)X +3679(buffers)X +3936(to)X +4027(their)X +2418 5561(predecessor)N +2827(page)X +3008(\(either)X +3247(the)X +3374(primary)X +3657(page)X +3838(or)X +3933(another)X +2418 5649(over\257ow)N +2742(page\).)X +3000(This)X +3181(means)X +3425(that)X +3584(an)X +3699(over\257ow)X +4022(page)X +3 f +432 5960(6)N +2970(USENIX)X +9 f +3292(-)X +3 f +3356(Winter)X +3621('91)X +9 f +3748(-)X +3 f +3812(Dallas,)X +4065(TX)X + +7 p +%%Page: 7 7 +0(Courier)xf 0 f +10 s 10 xH 0 xS 0 f +3 f +720 258(Seltzer)N +977(&)X +1064(Yigit)X +3278(A)X +3356(New)X +3528(Hashing)X +3831(Package)X +4136(for)X +4259(UNIX)X +1 f +720 538(cannot)N +955(be)X +1052(present)X +1305(in)X +1388(the)X +1507(buffer)X +1724(pool)X +1886(if)X +1955(its)X +2050(primary)X +2324(page)X +720 626(is)N +804(not)X +937(present.)X +1240(This)X +1413(does)X +1591(not)X +1724(impact)X +1972(performance)X +2409(or)X +720 714(functionality,)N +1209(because)X +1524(an)X +1660(over\257ow)X +2005(page)X +2217(will)X +2400(be)X +720 802(accessed)N +1048(only)X +1236(after)X +1430(its)X +1550(predecessor)X +1975(page)X +2172(has)X +2324(been)X +720 890(accessed.)N +1068(Figure)X +1303(4)X +1369(depicts)X +1622(the)X +1746(data)X +1905(structures)X +2242(used)X +2414(to)X +720 978(manage)N +990(the)X +1108(buffer)X +1325(pool.)X +892 1092(The)N +1040(in-memory)X +1419(bucket)X +1656(array)X +1845(contains)X +2134(pointers)X +2414(to)X +720 1180(buffer)N +975(header)X +1248(structures)X +1617(which)X +1870(represent)X +2222(primary)X +720 1268(pages.)N +968(Buffer)X +1203(headers)X +1474(contain)X +1735(modi\256ed)X +2043(bits,)X +2202(the)X +2324(page)X +720 1356(address)N +995(of)X +1096(the)X +1228(buffer,)X +1479(a)X +1548(pointer)X +1808(to)X +1903(the)X +2034(actual)X +2259(buffer,)X +720 1444(and)N 
+875(a)X +950(pointer)X +1216(to)X +1317(the)X +1454(buffer)X +1690(header)X +1944(for)X +2077(an)X +2191(over\257ow)X +720 1532(page)N +901(if)X +979(it)X +1052(exists,)X +1283(in)X +1374(addition)X +1665(to)X +1756(the)X +1883(LRU)X +2072(links.)X +2296(If)X +2378(the)X +720 1620(buffer)N +950(corresponding)X +1442(to)X +1537(a)X +1606(particular)X +1947(bucket)X +2194(is)X +2280(not)X +2414(in)X +720 1708(memory,)N +1048(its)X +1164(pointer)X +1432(is)X +1526(NULL.)X +1801(In)X +1909(effect,)X +2154(pages)X +2377(are)X +720 1796(linked)N +950(in)X +1042(three)X +1233(ways.)X +1468(Using)X +1689(the)X +1817(buffer)X +2043(headers,)X +2338(they)X +720 1884(are)N +851(linked)X +1083(physically)X +1444(through)X +1725(the)X +1854(LRU)X +2045(links)X +2231(and)X +2378(the)X +720 1972(over\257ow)N +1036(links.)X +1241(Using)X +1462(the)X +1590(pages)X +1803(themselves,)X +2209(they)X +2377(are)X +720 2060(linked)N +943(logically)X +1246(through)X +1518(the)X +1639(over\257ow)X +1946(addresses)X +2276(on)X +2378(the)X +720 2148(page.)N +948(Since)X +1162(over\257ow)X +1482(pages)X +1700(are)X +1834(accessed)X +2151(only)X +2328(after)X +720 2236(their)N +904(predecessor)X +1321(pages,)X +1560(they)X +1734(are)X +1869(removed)X +2186(from)X +2378(the)X +720 2324(buffer)N +937(pool)X +1099(when)X +1293(their)X +1460(primary)X +1734(is)X +1807(removed.)X +10 f +720 2412 -0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +1 Dt +2309 3177 MXY +24 15 Dl +-8 -15 Dl +8 -15 Dl +-24 15 Dl +52 0 Dl +789 3160 MXY +-35 0 Dl +0 -156 Dl +1607 0 Dl +0 173 Dl +789 3091 MXY +-24 -15 Dl +9 15 Dl +-9 15 Dl +24 -15 Dl +-69 0 Dl +2309 3125 MXY +104 0 Dl +0 -155 Dl +-1693 0 Dl +0 121 Dl +927 3160 MXY +24 15 Dl +-9 -15 Dl +9 -15 Dl +-24 15 Dl +553 0 Dl +1618 3177 MXY +8 27 Dl +4 -17 Dl +16 -6 Dl +-28 -4 Dl +138 121 Dl +1895 3315 MXY +28 3 Dl +-15 -9 Dl +1 -18 Dl +-14 24 Dl +276 -138 Dl +3108 MY +-28 -3 Dl +15 10 Dl +-1 17 Dl +14 -24 Dl +-276 138 Dl +1756 3229 MXY +-8 -27 Dl +-3 17 Dl 
+-16 6 Dl +27 4 Dl +-138 -121 Dl +1480 MX +-24 -15 Dl +9 15 Dl +-9 15 Dl +24 -15 Dl +-553 0 Dl +3 f +5 s +1083 3073(LRU)N +1178(chain)X +4 Ds +1402 3851 MXY + 1402 3851 lineto + 1471 3851 lineto + 1471 3920 lineto + 1402 3920 lineto + 1402 3851 lineto +closepath 19 1402 3851 1471 3920 Dp +1445 3747(Over\257ow)N +1613(Address)X +1549 3609 MXY +0 69 Dl +1756 MX +-23 -15 Dl +8 15 Dl +-8 15 Dl +23 -15 Dl +-207 0 Dl +-1 Ds +3 Dt +1756 3419 MXY +-6 -28 Dl +-4 17 Dl +-17 5 Dl +27 6 Dl +-138 -138 Dl +2240 3471 MXY +15 -24 Dl +-15 9 Dl +-15 -9 Dl +15 24 Dl +0 -138 Dl +1826 3609 MXY +15 -24 Dl +-15 9 Dl +-16 -9 Dl +16 24 Dl +0 -138 Dl +1549 MX +15 -24 Dl +-15 9 Dl +-15 -9 Dl +15 24 Dl +0 -138 Dl +858 3471 MXY +15 -24 Dl +-15 9 Dl +-15 -9 Dl +15 24 Dl +0 -138 Dl +2240 3056 MXY +15 -24 Dl +-15 9 Dl +-15 -9 Dl +15 24 Dl +0 -138 Dl +1549 3056 MXY +15 -24 Dl +-15 9 Dl +-15 -9 Dl +15 24 Dl +0 -138 Dl +858 3056 MXY +15 -24 Dl +-15 9 Dl +-15 -9 Dl +15 24 Dl +0 -138 Dl +1 Dt +2171 3471 MXY + 2171 3471 lineto + 2448 3471 lineto + 2448 3609 lineto + 2171 3609 lineto + 2171 3471 lineto +closepath 19 2171 3471 2448 3609 Dp +1756 3609 MXY + 1756 3609 lineto + 2033 3609 lineto + 2033 3747 lineto + 1756 3747 lineto + 1756 3609 lineto +closepath 3 1756 3609 2033 3747 Dp +1480 3471 MXY + 1480 3471 lineto + 1756 3471 lineto + 1756 3609 lineto + 1480 3609 lineto + 1480 3471 lineto +closepath 19 1480 3471 1756 3609 Dp +789 MX + 789 3471 lineto + 1065 3471 lineto + 1065 3609 lineto + 789 3609 lineto + 789 3471 lineto +closepath 19 789 3471 1065 3609 Dp +962 3903(Buffer)N +1083(Header)X +849 3851 MXY + 849 3851 lineto + 918 3851 lineto + 918 3920 lineto + 849 3920 lineto + 849 3851 lineto +closepath 14 849 3851 918 3920 Dp +1756 3194 MXY + 1756 3194 lineto + 1895 3194 lineto + 1895 3471 lineto + 1756 3471 lineto + 1756 3194 lineto +closepath 14 1756 3194 1895 3471 Dp +2171 3056 MXY + 2171 3056 lineto + 2309 3056 lineto + 2309 3333 lineto + 2171 3333 lineto + 2171 3056 lineto +closepath 14 2171 
3056 2309 3333 Dp +1480 MX + 1480 3056 lineto + 1618 3056 lineto + 1618 3333 lineto + 1480 3333 lineto + 1480 3056 lineto +closepath 14 1480 3056 1618 3333 Dp +789 MX + 789 3056 lineto + 927 3056 lineto + 927 3333 lineto + 789 3333 lineto + 789 3056 lineto +closepath 14 789 3056 927 3333 Dp +2780 MY +0 138 Dl +138 0 Dl +0 -138 Dl +-138 0 Dl +927 MX +0 138 Dl +138 0 Dl +0 -138 Dl +-138 0 Dl +1065 MX +0 138 Dl +138 0 Dl +0 -138 Dl +-138 0 Dl +1203 MX +0 138 Dl +139 0 Dl +0 -138 Dl +-139 0 Dl +1342 MX +0 138 Dl +138 0 Dl +0 -138 Dl +-138 0 Dl +1480 MX +0 138 Dl +138 0 Dl +0 -138 Dl +-138 0 Dl +1618 MX +0 138 Dl +138 0 Dl +0 -138 Dl +-138 0 Dl +1756 MX +0 138 Dl +139 0 Dl +0 -138 Dl +-139 0 Dl +1895 MX +0 138 Dl +138 0 Dl +0 -138 Dl +-138 0 Dl +2033 MX +0 138 Dl +138 0 Dl +0 -138 Dl +-138 0 Dl +2171 MX +0 138 Dl +138 0 Dl +0 -138 Dl +-138 0 Dl +2309 MX +0 138 Dl +139 0 Dl +0 -138 Dl +-139 0 Dl +13 s +1048 2720(In)N +1173(Memory)X +1580(Bucket)X +1918(Array)X +867 3584(B0)N +1558(B5)X +2223(B10)X +1788 3722(O1/1)N +5 s +1515 3903(Primay)N +1651(Buffer)X +4 Ds +1990 3851 MXY + 1990 3851 lineto + 2059 3851 lineto + 2059 3920 lineto + 1990 3920 lineto + 1990 3851 lineto +closepath 3 1990 3851 2059 3920 Dp +2102 3903(Over\257ow)N +2270(Buffer)X +3 Dt +-1 Ds +8 s +720 4184(Figure)N +922(4:)X +1 f +996(Three)X +1164(primary)X +1386(pages)X +1551(\(B0,)X +1683(B5,)X +1794(B10\))X +1942(are)X +2039(accessed)X +2281(directly)X +720 4272(from)N +862(the)X +958(bucket)X +1146(array.)X +1326(The)X +1443(one)X +1553(over\257ow)X +1798(page)X +1935(\(O1/1\))X +2122(is)X +2182(linked)X +2359(phy-)X +720 4360(sically)N +915(from)X +1067(its)X +1155(primary)X +1384(page's)X +1577(buffer)X +1759(header)X +1955(as)X +2035(well)X +2172(as)X +2252(logically)X +720 4448(from)N +860(its)X +937(predecessor)X +1253(page)X +1389(buffer)X +1560(\(B5\).)X +10 s +10 f +720 4624 -0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +3 f +1191 4954(Table)N +1406(Parameterization)X +1 f +892 
5086(When)N +1107(a)X +1166(hash)X +1336(table)X +1515(is)X +1590(created,)X +1865(the)X +1985(bucket)X +2221(size,)X +2388(\256ll)X +720 5174(factor,)N +953(initial)X +1164(number)X +1434(of)X +1526(elements,)X +1856(number)X +2125(of)X +2216(bytes)X +2409(of)X +720 5262(main)N +919(memory)X +1225(used)X +1411(for)X +1543(caching,)X +1851(and)X +2005(a)X +2079(user-de\256ned)X +720 5350(hash)N +892(function)X +1184(may)X +1347(be)X +1448(speci\256ed.)X +1797(The)X +1946(bucket)X +2184(size)X +2333(\(and)X +720 5438(page)N +906(size)X +1064(for)X +1191(over\257ow)X +1509(pages\))X +1752(defaults)X +2039(to)X +2134(256)X +2287(bytes.)X +720 5526(For)N +858(tables)X +1072(with)X +1241(large)X +1429(data)X +1590(items,)X +1810(it)X +1881(may)X +2046(be)X +2149(preferable)X +720 5614(to)N +803(increase)X +1088(the)X +1207(page)X +1380(size,)X +1545(and,)X +1701(conversely,)X +2089(applications)X +720 5702(storing)N +1002(small)X +1235(items)X +1467(exclusively)X +1891(in)X +2012(memory)X +2338(may)X +2706 538(bene\256t)N +2966(from)X +3164(a)X +3242(smaller)X +3520(bucket)X +3776(size.)X +3983(A)X +4082(bucket)X +4337(size)X +2706 626(smaller)N +2962(than)X +3120(64)X +3220(bytes)X +3409(is)X +3482(not)X +3604(recommended.)X +2878 740(The)N +3031(\256ll)X +3147(factor)X +3363(indicates)X +3676(a)X +3740(desired)X +4000(density)X +4258(within)X +2706 828(the)N +2833(hash)X +3009(table.)X +3234(It)X +3312(is)X +3394(an)X +3499(approximation)X +3995(of)X +4091(the)X +4217(number)X +2706 916(of)N +2815(keys)X +3004(allowed)X +3300(to)X +3404(accumulate)X +3811(in)X +3914(any)X +4071(one)X +4228(bucket,)X +2706 1004(determining)N +3119(when)X +3319(the)X +3442(hash)X +3614(table)X +3795(grows.)X +4056(Its)X +4161(default)X +4409(is)X +2706 1092(eight.)N +2953(If)X +3054(the)X +3199(user)X +3380(knows)X +3636(the)X +3781(average)X +4079(size)X +4251(of)X +4364(the)X +2706 1180(key/data)N +3008(pairs)X +3194(being)X +3402(stored)X +3627(in)X +3718(the)X +3845(table,)X 
+4050(near)X +4218(optimal)X +2706 1268(bucket)N +2943(sizes)X +3122(and)X +3261(\256ll)X +3372(factors)X +3614(may)X +3775(be)X +3874(selected)X +4155(by)X +4257(apply-)X +2706 1356(ing)N +2828(the)X +2946(equation:)X +0 f +8 s +2706 1655(\(1\))N +2994 -0.3938(\(\(average_pair_length)AX +3830(+)X +3906(4\))X +4020(*)X +3032 1743(ffactor\))N +3374(>=)X +3488(bsize)X +1 f +10 s +2706 2042(For)N +2859(highly)X +3104(time)X +3287(critical)X +3551(applications,)X +3999(experimenting)X +2706 2130(with)N +2919(different)X +3266(bucket)X +3550(sizes)X +3776(and)X +3962(\256ll)X +4120(factors)X +4409(is)X +2706 2218(encouraged.)N +2878 2332(Figures)N +3144(5a,b,)X +3326(and)X +3468(c)X +3530(illustrate)X +3836(the)X +3960(effects)X +4200(of)X +4292(vary-)X +2706 2420(ing)N +2841(page)X +3026(sizes)X +3215(and)X +3363(\256ll)X +3483(factors)X +3734(for)X +3860(the)X +3990(same)X +4187(data)X +4353(set.)X +2706 2508(The)N +2864(data)X +3031(set)X +3152(consisted)X +3482(of)X +3581(24474)X +3813(keys)X +3992(taken)X +4198(from)X +4386(an)X +2706 2596(online)N +2931(dictionary.)X +3301(The)X +3451(data)X +3609(value)X +3807(for)X +3925(each)X +4097(key)X +4237(was)X +4386(an)X +2706 2684(ASCII)N +2938(string)X +3143(for)X +3260(an)X +3359(integer)X +3605(from)X +3784(1)X +3847(to)X +3931(24474)X +4153(inclusive.)X +2706 2772(The)N +2867(test)X +3013(run)X +3155(consisted)X +3488(of)X +3590(creating)X +3884(a)X +3955(new)X +4124(hash)X +4306(table)X +2706 2860(\(where)N +2966(the)X +3100(ultimate)X +3398(size)X +3559(of)X +3662(the)X +3796(table)X +3987(was)X +4147(known)X +4400(in)X +2706 2948(advance\),)N +3054(entering)X +3354(each)X +3539(key/data)X +3848(pair)X +4010(into)X +4171(the)X +4306(table)X +2706 3036(and)N +2849(then)X +3014(retrieving)X +3353(each)X +3528(key/data)X +3827(pair)X +3979(from)X +4162(the)X +4286(table.)X +2706 3124(Each)N +2898(of)X +2996(the)X +3125(graphs)X +3369(shows)X +3599(the)X +3727(timings)X +3996(resulting)X +4306(from)X +2706 
3212(varying)N +2973(the)X +3093(pagesize)X +3392(from)X +3570(128)X +3712(bytes)X +3903(to)X +3986(1M)X +4118(and)X +4255(the)X +4374(\256ll)X +2706 3300(factor)N +2929(from)X +3120(1)X +3195(to)X +3292(128.)X +3486(For)X +3631(each)X +3813(run,)X +3974(the)X +4106(buffer)X +4337(size)X +2706 3388(was)N +2874(set)X +3006(at)X +3106(1M.)X +3299(The)X +3466(tests)X +3650(were)X +3849(all)X +3971(run)X +4120(on)X +4242(an)X +4360(HP)X +2706 3476(9000/370)N +3077(\(33.3)X +3312(Mhz)X +3527(MC68030\),)X +3966(with)X +4176(16M)X +4395(of)X +2706 3564(memory,)N +3042(64K)X +3228(physically)X +3605(addressed)X +3970(cache,)X +4222(and)X +4386(an)X +2706 3652(HP7959S)N +3055(disk)X +3231(drive,)X +3459(running)X +3751(4.3BSD-Reno)X +4244(single-)X +2706 3740(user.)N +2878 3854(Both)N +3066(system)X +3321(time)X +3496(\(Figure)X +3764(5a\))X +3899(and)X +4047(elapsed)X +4320(time)X +2706 3942(\(Figure)N +2966(5b\))X +3097(show)X +3290(that)X +3434(for)X +3552(all)X +3655(bucket)X +3892(sizes,)X +4091(the)X +4212(greatest)X +2706 4030(performance)N +3137(gains)X +3329(are)X +3451(made)X +3648(by)X +3751(increasing)X +4104(the)X +4225(\256ll)X +4336(fac-)X +2706 4118(tor)N +2822(until)X +2995(equation)X +3298(1)X +3365(is)X +3445(satis\256ed.)X +3774(The)X +3925(user)X +4085(time)X +4253(shown)X +2706 4206(in)N +2791(Figure)X +3023(5c)X +3122(gives)X +3314(a)X +3373(more)X +3561(detailed)X +3838(picture)X +4083(of)X +4172(how)X +4332(per-)X +2706 4294(formance)N +3054(varies.)X +3330(The)X +3499(smaller)X +3778(bucket)X +4035(sizes)X +4234(require)X +2706 4382(fewer)N +2921(keys)X +3099(per)X +3233(page)X +3416(to)X +3509(satisfy)X +3749(equation)X +4056(1)X +4127(and)X +4274(there-)X +2706 4470(fore)N +2860(incur)X +3049(fewer)X +3257(collisions.)X +3607(However,)X +3946(when)X +4144(the)X +4265(buffer)X +2706 4558(pool)N +2884(size)X +3045(is)X +3134(\256xed,)X +3349(smaller)X +3620(pages)X +3838(imply)X +4059(more)X +4259(pages.)X +2706 4646(An)N +2830(increased)X 
+3160(number)X +3430(of)X +3522(pages)X +3730(means)X +3960(more)X +2 f +4150(malloc\(3\))X +1 f +2706 4734(calls)N +2879(and)X +3021(more)X +3212(overhead)X +3533(in)X +3621(the)X +3745(hash)X +3918(package's)X +4265(buffer)X +2706 4822(manager)N +3003(to)X +3085(manage)X +3355(the)X +3473(additional)X +3813(pages.)X +2878 4936(The)N +3028(tradeoff)X +3308(works)X +3529(out)X +3655(most)X +3834(favorably)X +4166(when)X +4364(the)X +2706 5024(page)N +2886(size)X +3039(is)X +3120(256)X +3268(and)X +3412(the)X +3538(\256ll)X +3654(factor)X +3870(is)X +3950(8.)X +4057(Similar)X +4319(con-)X +2706 5112(clusions)N +3009(were)X +3207(obtained)X +3524(if)X +3614(the)X +3753(test)X +3905(was)X +4071(run)X +4218(without)X +2706 5200(knowing)N +3007(the)X +3126(\256nal)X +3289(table)X +3466(size)X +3612(in)X +3695(advance.)X +4020(If)X +4095(the)X +4214(\256le)X +4337(was)X +2706 5288(closed)N +2942(and)X +3088(written)X +3345(to)X +3437(disk,)X +3620(the)X +3748(conclusions)X +4156(were)X +4343(still)X +2706 5376(the)N +2832(same.)X +3065(However,)X +3408(rereading)X +3740(the)X +3865(\256le)X +3994(from)X +4177(disk)X +4337(was)X +2706 5464(slightly)N +2983(faster)X +3199(if)X +3285(a)X +3358(larger)X +3583(bucket)X +3834(size)X +3996(and)X +4149(\256ll)X +4274(factor)X +2706 5552(were)N +2898(used)X +3079(\(1K)X +3238(bucket)X +3486(size)X +3645(and)X +3795(32)X +3909(\256ll)X +4031(factor\).)X +4320(This)X +2706 5640(follows)N +2987(intuitively)X +3356(from)X +3553(the)X +3691(improved)X +4038(ef\256ciency)X +4395(of)X +3 f +720 5960(USENIX)N +9 f +1042(-)X +3 f +1106(Winter)X +1371('91)X +9 f +1498(-)X +3 f +1562(Dallas,)X +1815(TX)X +4424(7)X + +8 p +%%Page: 8 8 +0(Courier)xf 0 f +10 s 10 xH 0 xS 0 f +3 f +432 258(A)N +510(New)X +682(Hashing)X +985(Package)X +1290(for)X +1413(UNIX)X +3663(Seltzer)X +3920(&)X +4007(Yigit)X +1 f +432 538(performing)N +830(1K)X +965(reads)X +1172(from)X +1365(the)X +1500(disk)X +1670(rather)X +1894(than)X +2068(256)X +432 626(byte)N 
+609(reads.)X +857(In)X +962(general,)X +1257(performance)X +1702(for)X +1834(disk)X +2005(based)X +432 714(tables)N +639(is)X +712(best)X +861(when)X +1055(the)X +1173(page)X +1345(size)X +1490(is)X +1563(approximately)X +2046(1K.)X +10 f +432 802 -0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +619 2380 MXY +-12 24 Dl +24 0 Dl +-12 -24 Dl +629 2437 MXY +-12 24 Dl +24 0 Dl +-12 -24 Dl +648 2504 MXY +-12 25 Dl +24 0 Dl +-12 -25 Dl +686 2515 MXY +-12 24 Dl +24 0 Dl +-12 -24 Dl +762 2516 MXY +-12 24 Dl +25 0 Dl +-13 -24 Dl +916 2515 MXY +-13 24 Dl +25 0 Dl +-12 -24 Dl +1222 2516 MXY +-12 24 Dl +24 0 Dl +-12 -24 Dl +1834 2515 MXY +-12 24 Dl +24 0 Dl +-12 -24 Dl +1 Dt +619 2392 MXY +10 57 Dl +19 67 Dl +38 11 Dl +76 1 Dl +154 -1 Dl +306 1 Dl +612 -1 Dl +8 s +1 f +1628 2522(128)N +3 Dt +607 2245 MXY +24 Dc +617 2375 MXY +23 Dc +635 2442 MXY +24 Dc +674 2525 MXY +23 Dc +750 2529 MXY +24 Dc +904 2527 MXY +23 Dc +1210 MX +23 Dc +1822 2528 MXY +23 Dc +20 Ds +1 Dt +619 2245 MXY +10 130 Dl +19 67 Dl +38 83 Dl +76 4 Dl +154 -2 Dl +306 0 Dl +612 1 Dl +678 2482(256)N +-1 Ds +3 Dt +619 2127 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +629 2191 MXY +0 25 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +648 2334 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +686 2409 MXY +0 25 Dl +0 -13 Dl +12 0 Dl +-24 0 Dl +762 2516 MXY +0 25 Dl +0 -12 Dl +13 0 Dl +-25 0 Dl +916 2516 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-25 0 Dl +1222 2515 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +1834 2515 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +5 Dt +619 2139 MXY +10 65 Dl +19 142 Dl +38 75 Dl +76 108 Dl +154 -1 Dl +306 -1 Dl +612 0 Dl +694 2401(512)N +3 Dt +631 2064 MXY +-24 24 Dl +12 -12 Dl +-12 -12 Dl +24 24 Dl +641 2077 MXY +-24 25 Dl +12 -12 Dl +-12 -13 Dl +24 25 Dl +660 2132 MXY +-24 24 Dl +12 -12 Dl +-12 -12 Dl +24 24 Dl +698 2292 MXY +-24 24 Dl +12 -12 Dl +-12 -12 Dl +24 24 Dl +775 2382 MXY +-25 24 Dl +12 -12 Dl +-12 -12 Dl +25 24 Dl +928 2516 MXY +-25 24 Dl +13 -12 Dl +-13 -12 Dl +25 24 Dl +1234 2516 MXY +-24 25 Dl 
+12 -12 Dl +-12 -13 Dl +24 25 Dl +1846 2516 MXY +-24 24 Dl +12 -12 Dl +-12 -12 Dl +24 24 Dl +16 Ds +1 Dt +619 2076 MXY +10 14 Dl +19 54 Dl +38 160 Dl +76 90 Dl +154 134 Dl +306 1 Dl +612 -1 Dl +694 2257(1024)N +-1 Ds +3 Dt +619 1877 MXY +12 -24 Dl +-24 0 Dl +12 24 Dl +629 1855 MXY +12 -24 Dl +-24 0 Dl +12 24 Dl +648 1838 MXY +12 -24 Dl +-24 0 Dl +12 24 Dl +686 1860 MXY +12 -25 Dl +-24 0 Dl +12 25 Dl +762 1923 MXY +13 -24 Dl +-25 0 Dl +12 24 Dl +916 2087 MXY +12 -24 Dl +-25 0 Dl +13 24 Dl +1222 2256 MXY +12 -24 Dl +-24 0 Dl +12 24 Dl +1834 2541 MXY +12 -25 Dl +-24 0 Dl +12 25 Dl +619 1865 MXY +10 -22 Dl +19 -17 Dl +38 21 Dl +76 64 Dl +154 164 Dl +306 169 Dl +612 285 Dl +1645 2427(4096)N +619 1243 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +629 1196 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +648 1146 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +686 1174 MXY +0 25 Dl +0 -13 Dl +12 0 Dl +-24 0 Dl +762 1249 MXY +0 24 Dl +0 -12 Dl +13 0 Dl +-25 0 Dl +916 1371 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-25 0 Dl +1222 1680 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +1834 1999 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +619 1255 MXY +10 -47 Dl +19 -50 Dl +38 28 Dl +76 75 Dl +154 122 Dl +306 309 Dl +612 319 Dl +1741 1934(8192)N +5 Dt +609 2531 MXY +1225 0 Dl +609 MX +0 -1553 Dl +2531 MY +0 16 Dl +4 Ds +1 Dt +2531 MY +0 -1553 Dl +593 2625(0)N +-1 Ds +5 Dt +916 2531 MXY +0 16 Dl +4 Ds +1 Dt +2531 MY +0 -1553 Dl +884 2625(32)N +-1 Ds +5 Dt +1222 2531 MXY +0 16 Dl +4 Ds +1 Dt +2531 MY +0 -1553 Dl +1190 2625(64)N +-1 Ds +5 Dt +1528 2531 MXY +0 16 Dl +4 Ds +1 Dt +2531 MY +0 -1553 Dl +1496 2625(96)N +-1 Ds +5 Dt +1834 2531 MXY +0 16 Dl +4 Ds +1 Dt +2531 MY +0 -1553 Dl +1786 2625(128)N +-1 Ds +5 Dt +609 2531 MXY +-16 0 Dl +4 Ds +1 Dt +609 MX +1225 0 Dl +545 2558(0)N +-1 Ds +5 Dt +609 2013 MXY +-16 0 Dl +4 Ds +1 Dt +609 MX +1225 0 Dl +481 2040(100)N +-1 Ds +5 Dt +609 1496 MXY +-16 0 Dl +4 Ds +1 Dt +609 MX +1225 0 Dl +481 1523(200)N +-1 Ds +5 Dt +609 978 MXY +-16 0 Dl +4 Ds +1 Dt +609 MX +1225 0 Dl 
+481 1005(300)N +1088 2724(Fill)N +1194(Factor)X +422 1611(S)N +426 1667(e)N +426 1724(c)N +424 1780(o)N +424 1837(n)N +424 1893(d)N +428 1949(s)N +3 Dt +-1 Ds +3 f +432 2882(Figure)N +636(5a:)X +1 f +744(System)X +956(Time)X +1113(for)X +1209(dictionary)X +1490(data)X +1618(set)X +1711(with)X +1847(1M)X +1958(of)X +2033(buffer)X +432 2970(space)N +594(and)X +707(varying)X +923(bucket)X +1114(sizes)X +1259(and)X +1372(\256ll)X +1465(factors.)X +1675(Each)X +1823(line)X +1940(is)X +2004(labeled)X +432 3058(with)N +562(its)X +639(bucket)X +825(size.)X +10 s +10 f +432 3234 -0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +8 s +1 f +428 4381(s)N +424 4325(d)N +424 4269(n)N +424 4212(o)N +426 4156(c)N +426 4099(e)N +422 4043(S)N +1116 5156(Fill)N +1222(Factor)X +506 3437(3200)N +4 Ds +1 Dt +666 3410 MXY +1168 0 Dl +-1 Ds +5 Dt +666 MX +-16 0 Dl +506 3825(2400)N +4 Ds +1 Dt +666 3799 MXY +1168 0 Dl +-1 Ds +5 Dt +666 MX +-16 0 Dl +506 4214(1600)N +4 Ds +1 Dt +666 4186 MXY +1168 0 Dl +-1 Ds +5 Dt +666 MX +-16 0 Dl +538 4602(800)N +4 Ds +1 Dt +666 4575 MXY +1168 0 Dl +-1 Ds +5 Dt +666 MX +-16 0 Dl +602 4990(0)N +4 Ds +1 Dt +666 4963 MXY +1168 0 Dl +-1 Ds +5 Dt +666 MX +-16 0 Dl +1786 5057(128)N +4 Ds +1 Dt +1834 4963 MXY +0 -1553 Dl +-1 Ds +5 Dt +4963 MY +0 16 Dl +1510 5057(96)N +4 Ds +1 Dt +1542 4963 MXY +0 -1553 Dl +-1 Ds +5 Dt +4963 MY +0 16 Dl +1218 5057(64)N +4 Ds +1 Dt +1250 4963 MXY +0 -1553 Dl +-1 Ds +5 Dt +4963 MY +0 16 Dl +926 5057(32)N +4 Ds +1 Dt +958 4963 MXY +0 -1553 Dl +-1 Ds +5 Dt +4963 MY +0 16 Dl +650 5057(0)N +4 Ds +1 Dt +666 4963 MXY +0 -1553 Dl +-1 Ds +5 Dt +4963 MY +0 16 Dl +4963 MY +0 -1553 Dl +4963 MY +1168 0 Dl +1741 4752(8192)N +3 Dt +675 3732 MXY +9 -172 Dl +18 -118 Dl +37 128 Dl +73 -121 Dl +146 623 Dl +292 497 Dl +584 245 Dl +4802 MY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +1250 4557 MXY +0 25 Dl +0 -13 Dl +12 0 Dl +-24 0 Dl +958 4060 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +812 3437 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +739 3558 
MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +702 3430 MXY +0 25 Dl +0 -13 Dl +13 0 Dl +-25 0 Dl +684 3548 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +675 3720 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +1637 4912(4096)N +675 4307 MXY +9 -58 Dl +18 30 Dl +37 89 Dl +73 144 Dl +146 235 Dl +292 122 Dl +584 89 Dl +4970 MY +12 -24 Dl +-24 0 Dl +12 24 Dl +1250 4881 MXY +12 -24 Dl +-24 0 Dl +12 24 Dl +958 4759 MXY +12 -24 Dl +-24 0 Dl +12 24 Dl +812 4524 MXY +12 -24 Dl +-24 0 Dl +12 24 Dl +739 4380 MXY +12 -24 Dl +-24 0 Dl +12 24 Dl +702 4291 MXY +13 -24 Dl +-25 0 Dl +12 24 Dl +684 4261 MXY +12 -24 Dl +-24 0 Dl +12 24 Dl +675 4319 MXY +12 -24 Dl +-24 0 Dl +12 24 Dl +734 4662(1024)N +16 Ds +1 Dt +675 4352 MXY +9 60 Dl +18 134 Dl +37 266 Dl +73 117 Dl +146 30 Dl +292 0 Dl +584 -1 Dl +-1 Ds +3 Dt +1846 4946 MXY +-24 24 Dl +12 -12 Dl +-12 -12 Dl +24 24 Dl +1262 4946 MXY +-24 25 Dl +12 -12 Dl +-12 -13 Dl +24 25 Dl +970 4947 MXY +-24 24 Dl +12 -12 Dl +-12 -12 Dl +24 24 Dl +824 4917 MXY +-24 24 Dl +12 -12 Dl +-12 -12 Dl +24 24 Dl +751 4800 MXY +-24 24 Dl +12 -12 Dl +-12 -12 Dl +24 24 Dl +715 4534 MXY +-25 25 Dl +12 -13 Dl +-12 -12 Dl +25 25 Dl +696 4400 MXY +-24 24 Dl +12 -12 Dl +-12 -12 Dl +24 24 Dl +687 4339 MXY +-24 25 Dl +12 -12 Dl +-12 -13 Dl +24 25 Dl +718 4792(512)N +5 Dt +675 4422 MXY +9 137 Dl +18 278 Dl +37 105 Dl +73 18 Dl +146 -1 Dl +292 0 Dl +584 -1 Dl +3 Dt +4946 MY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +1250 4946 MXY +0 25 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +958 4947 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +812 4948 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +739 4930 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +702 4824 MXY +0 25 Dl +0 -12 Dl +13 0 Dl +-25 0 Dl +684 4547 MXY +0 24 Dl +0 -12 Dl +12 0 Dl +-24 0 Dl +675 4410 MXY +0 25 Dl +0 -13 Dl +12 0 Dl +-24 0 Dl +750 4921(256)N +20 Ds +1 Dt +675 4597 MXY +9 246 Dl +18 106 Dl +37 10 Dl +73 0 Dl +146 0 Dl +292 0 Dl +584 -1 Dl +-1 Ds +3 Dt +1822 MX +23 Dc +1238 4959 MXY +23 Dc +946 MX +23 Dc +800 MX +23 Dc +727 MX +23 Dc 
+691 4949 MXY +23 Dc +672 4843 MXY +24 Dc +663 4597 MXY +24 Dc +1395 4961(128)N +1 Dt +675 4855 MXY +9 93 Dl +18 10 Dl +37 1 Dl +73 0 Dl +146 -1 Dl +292 0 Dl +584 0 Dl +3 Dt +4946 MY +-12 24 Dl +24 0 Dl +-12 -24 Dl +1250 MX +-12 24 Dl +24 0 Dl +-12 -24 Dl +958 MX +-12 24 Dl +24 0 Dl +-12 -24 Dl +812 MX +-12 25 Dl +24 0 Dl +-12 -25 Dl +739 4947 MXY +-12 24 Dl +24 0 Dl +-12 -24 Dl +702 4946 MXY +-12 24 Dl +25 0 Dl +-13 -24 Dl +684 4936 MXY +-12 24 Dl +24 0 Dl +-12 -24 Dl +675 4843 MXY +-12 24 Dl +24 0 Dl +-12 -24 Dl +3 Dt +-1 Ds +3 f +432 5314(Figure)N +634(5b:)X +1 f +744(Elapsed)X +967(Time)X +1123(for)X +1218(dictionary)X +1498(data)X +1625(set)X +1717(with)X +1851(1M)X +1960(of)X +2033(buffer)X +432 5402(space)N +593(and)X +705(varying)X +920(bucket)X +1110(sizes)X +1254(and)X +1366(\256ll)X +1457(factors.)X +1681(Each)X +1827(line)X +1942(is)X +2004(labeled)X +432 5490(with)N +562(its)X +639(bucket)X +825(size.)X +10 s +2590 538(If)N +2677(an)X +2785(approximation)X +3284(of)X +3383(the)X +3513(number)X +3790(of)X +3889(elements)X +2418 626(ultimately)N +2773(to)X +2866(be)X +2973(stored)X +3200(in)X +3293(the)X +3422(hash)X +3599(table)X +3785(is)X +3868(known)X +4116(at)X +2418 714(the)N +2564(time)X +2754(of)X +2869(creation,)X +3196(the)X +3342(hash)X +3536(package)X +3847(takes)X +4059(this)X +2418 802(number)N +2688(as)X +2779(a)X +2839(parameter)X +3185(and)X +3325(uses)X +3487(it)X +3555(to)X +3641(hash)X +3812(entries)X +4050(into)X +2418 890(the)N +2541(full)X +2677(sized)X +2867(table)X +3048(rather)X +3261(than)X +3424(growing)X +3716(the)X +3838(table)X +4018(from)X +2418 978(a)N +2477(single)X +2691(bucket.)X +2968(If)X +3044(this)X +3181(number)X +3448(is)X +3523(not)X +3647(known,)X +3907(the)X +4027(hash)X +2418 1066(table)N +2632(starts)X +2859(with)X +3059(a)X +3153(single)X +3402(bucket)X +3674(and)X +3848(gracefully)X +2418 1154(expands)N +2707(as)X +2800(elements)X +3111(are)X +3236(added,)X +3474(although)X +3780(a)X +3842(slight)X 
+4044(per-)X +2418 1242(formance)N +2747(degradation)X +3151(may)X +3313(be)X +3413(noticed.)X +3713(Figure)X +3946(6)X +4010(illus-)X +2418 1330(trates)N +2625(the)X +2756(difference)X +3116(in)X +3211(performance)X +3651(between)X +3952(storing)X +2418 1418(keys)N +2588(in)X +2673(a)X +2732(\256le)X +2857(when)X +3054(the)X +3174(ultimate)X +3458(size)X +3605(is)X +3680(known)X +3920(\(the)X +4067(left)X +2418 1506(bars)N +2581(in)X +2672(each)X +2849(set\),)X +3014(compared)X +3360(to)X +3450(building)X +3744(the)X +3870(\256le)X +4000(when)X +2418 1594(the)N +2550(ultimate)X +2846(size)X +3005(is)X +3091(unknown)X +3422(\(the)X +3580(right)X +3764(bars)X +3931(in)X +4026(each)X +2418 1682(set\).)N +2609(Once)X +2814(the)X +2947(\256ll)X +3069(factor)X +3291(is)X +3378(suf\256ciently)X +3772(high)X +3948(for)X +4076(the)X +2418 1770(page)N +2596(size)X +2747(\(8\),)X +2887(growing)X +3180(the)X +3304(table)X +3486(dynamically)X +3908(does)X +4081(lit-)X +2418 1858(tle)N +2518(to)X +2600(degrade)X +2875(performance.)X +10 f +2418 1946 -0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +9 s +1 f +2413 3238(s)N +2409 3173(d)N +2409 3108(n)N +2409 3043(o)N +2411 2979(c)N +2411 2914(e)N +2407 2849(S)N +3143 4129(Fill)N +3261(Factor)X +2448 2152(15)N +4 Ds +1 Dt +2557 2122 MXY +1473 0 Dl +-1 Ds +5 Dt +2557 MX +-19 0 Dl +2448 2747(10)N +4 Ds +1 Dt +2557 2717 MXY +1473 0 Dl +-1 Ds +5 Dt +2557 MX +-19 0 Dl +2484 3343(5)N +4 Ds +1 Dt +2557 3313 MXY +1473 0 Dl +-1 Ds +5 Dt +2557 MX +-19 0 Dl +2484 3938(0)N +4 Ds +1 Dt +2557 3908 MXY +1473 0 Dl +-1 Ds +5 Dt +2557 MX +-19 0 Dl +3976 4015(128)N +4 Ds +1 Dt +4030 3908 MXY +0 -1786 Dl +-1 Ds +5 Dt +3908 MY +0 19 Dl +3626 4015(96)N +4 Ds +1 Dt +3662 3908 MXY +0 -1786 Dl +-1 Ds +5 Dt +3908 MY +0 19 Dl +3258 4015(64)N +4 Ds +1 Dt +3294 3908 MXY +0 -1786 Dl +-1 Ds +5 Dt +3908 MY +0 19 Dl +2889 4015(32)N +4 Ds +1 Dt +2925 3908 MXY +0 -1786 Dl +-1 Ds +5 Dt +3908 MY +0 19 Dl +2539 4015(0)N +4 Ds +1 Dt +2557 3908 MXY +0 -1786 Dl 
+-1 Ds +5 Dt +3908 MY +0 19 Dl +3908 MY +0 -1786 Dl +3908 MY +1473 0 Dl +4053 2378(8192)N +3 Dt +2569 2277 MXY +11 0 Dl +23 48 Dl +46 -167 Dl +92 35 Dl +184 12 Dl +369 143 Dl +736 0 Dl +2334 MY +0 28 Dl +0 -14 Dl +14 0 Dl +-28 0 Dl +3294 2334 MXY +0 28 Dl +0 -14 Dl +13 0 Dl +-27 0 Dl +2925 2192 MXY +0 27 Dl +0 -14 Dl +14 0 Dl +-28 0 Dl +2741 2180 MXY +0 27 Dl +0 -14 Dl +14 0 Dl +-28 0 Dl +2649 2144 MXY +0 28 Dl +0 -14 Dl +14 0 Dl +-28 0 Dl +2603 2311 MXY +0 27 Dl +0 -13 Dl +14 0 Dl +-28 0 Dl +2580 2263 MXY +0 28 Dl +0 -14 Dl +14 0 Dl +-28 0 Dl +2569 2263 MXY +0 28 Dl +0 -14 Dl +13 0 Dl +-27 0 Dl +4053 2591(4096)N +2569 2348 MXY +11 -11 Dl +23 -96 Dl +46 71 Dl +92 72 Dl +184 226 Dl +369 48 Dl +736 -60 Dl +2612 MY +14 -28 Dl +-28 0 Dl +14 28 Dl +3294 2672 MXY +13 -28 Dl +-27 0 Dl +14 28 Dl +2925 2624 MXY +14 -28 Dl +-28 0 Dl +14 28 Dl +2741 2398 MXY +14 -28 Dl +-28 0 Dl +14 28 Dl +2649 2326 MXY +14 -27 Dl +-28 0 Dl +14 27 Dl +2603 2255 MXY +14 -28 Dl +-28 0 Dl +14 28 Dl +2580 2350 MXY +14 -27 Dl +-28 0 Dl +14 27 Dl +2569 2362 MXY +13 -28 Dl +-27 0 Dl +14 28 Dl +4053 2681(1024)N +16 Ds +1 Dt +2569 2300 MXY +11 48 Dl +23 96 Dl +46 95 Dl +92 274 Dl +184 202 Dl +369 -155 Dl +736 -190 Dl +-1 Ds +3 Dt +4044 2656 MXY +-28 28 Dl +14 -14 Dl +-14 -14 Dl +28 28 Dl +3307 2846 MXY +-27 28 Dl +14 -14 Dl +-14 -14 Dl +27 28 Dl +2939 3001 MXY +-28 28 Dl +14 -14 Dl +-14 -14 Dl +28 28 Dl +2755 2799 MXY +-28 28 Dl +14 -14 Dl +-14 -14 Dl +28 28 Dl +2663 2525 MXY +-28 28 Dl +14 -14 Dl +-14 -14 Dl +28 28 Dl +2617 2430 MXY +-28 28 Dl +14 -14 Dl +-14 -14 Dl +28 28 Dl +2594 2334 MXY +-28 28 Dl +14 -14 Dl +-14 -14 Dl +28 28 Dl +2582 2287 MXY +-27 27 Dl +14 -14 Dl +-14 -13 Dl +27 27 Dl +4053 2851(512)N +5 Dt +2569 2372 MXY +11 -24 Dl +23 405 Dl +46 83 Dl +92 227 Dl +184 -72 Dl +369 -119 Dl +736 -107 Dl +3 Dt +2751 MY +0 28 Dl +0 -14 Dl +14 0 Dl +-28 0 Dl +3294 2858 MXY +0 28 Dl +0 -14 Dl +13 0 Dl +-27 0 Dl +2925 2977 MXY +0 28 Dl +0 -14 Dl +14 0 Dl +-28 0 Dl +2741 3049 MXY +0 27 Dl +0 -13 Dl 
+14 0 Dl +-28 0 Dl +2649 2823 MXY +0 27 Dl +0 -14 Dl +14 0 Dl +-28 0 Dl +2603 2739 MXY +0 28 Dl +0 -14 Dl +14 0 Dl +-28 0 Dl +2580 2334 MXY +0 28 Dl +0 -14 Dl +14 0 Dl +-28 0 Dl +2569 2358 MXY +0 28 Dl +0 -14 Dl +13 0 Dl +-27 0 Dl +4053 2795(256)N +20 Ds +1 Dt +2569 2456 MXY +11 285 Dl +23 95 Dl +46 251 Dl +92 -60 Dl +184 -84 Dl +369 -107 Dl +736 -71 Dl +-1 Ds +3 Dt +4016 MX +27 Dc +3280 2836 MXY +27 Dc +2912 2943 MXY +27 Dc +2728 3027 MXY +27 Dc +2635 3087 MXY +28 Dc +2589 2836 MXY +28 Dc +2566 2741 MXY +27 Dc +2554 2456 MXY +28 Dc +4053 2741(128)N +1 Dt +2569 2729 MXY +11 203 Dl +23 131 Dl +46 -60 Dl +92 -119 Dl +184 -60 Dl +369 -83 Dl +736 -12 Dl +3 Dt +2716 MY +-14 27 Dl +28 0 Dl +-14 -27 Dl +3294 2727 MXY +-14 28 Dl +27 0 Dl +-13 -28 Dl +2925 2811 MXY +-14 27 Dl +28 0 Dl +-14 -27 Dl +2741 2870 MXY +-14 28 Dl +28 0 Dl +-14 -28 Dl +2649 2989 MXY +-14 28 Dl +28 0 Dl +-14 -28 Dl +2603 3049 MXY +-14 27 Dl +28 0 Dl +-14 -27 Dl +2580 2918 MXY +-14 28 Dl +28 0 Dl +-14 -28 Dl +2569 2716 MXY +-14 27 Dl +27 0 Dl +-13 -27 Dl +3 Dt +-1 Ds +3 f +8 s +2418 4286(Figure)N +2628(5c:)X +1 f +2738(User)X +2887(Time)X +3051(for)X +3154(dictionary)X +3442(data)X +3577(set)X +3677(with)X +3820(1M)X +3938(of)X +4019(buffer)X +2418 4374(space)N +2579(and)X +2691(varying)X +2906(bucket)X +3096(sizes)X +3240(and)X +3352(\256ll)X +3443(factors.)X +3667(Each)X +3813(line)X +3928(is)X +3990(labeled)X +2418 4462(with)N +2548(its)X +2625(bucket)X +2811(size.)X +10 s +10 f +2418 4638 -0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +1 f +2590 4840(Since)N +2796(no)X +2904(known)X +3150(hash)X +3325(function)X +3620(performs)X +3938(equally)X +2418 4928(well)N +2589(on)X +2702(all)X +2815(possible)X +3110(data,)X +3297(the)X +3428(user)X +3595(may)X +3766(\256nd)X +3923(that)X +4076(the)X +2418 5016(built-in)N +2678(hash)X +2849(function)X +3140(does)X +3311(poorly)X +3544(on)X +3648(a)X +3708(particular)X +4040(data)X +2418 5104(set.)N +2548(In)X +2636(this)X +2771(case,)X +2950(a)X 
+3006(hash)X +3173(function,)X +3480(taking)X +3700(two)X +3840(arguments)X +2418 5192(\(a)N +2507(pointer)X +2760(to)X +2848(a)X +2910(byte)X +3074(string)X +3282(and)X +3424(a)X +3486(length\))X +3739(and)X +3880(returning)X +2418 5280(an)N +2517(unsigned)X +2829(long)X +2993(to)X +3077(be)X +3175(used)X +3344(as)X +3433(the)X +3553(hash)X +3722(value,)X +3938(may)X +4098(be)X +2418 5368(speci\256ed)N +2731(at)X +2817(hash)X +2992(table)X +3176(creation)X +3463(time.)X +3673(When)X +3893(an)X +3996(exist-)X +2418 5456(ing)N +2570(hash)X +2767(table)X +2973(is)X +3076(opened)X +3358(and)X +3524(a)X +3609(hash)X +3805(function)X +4121(is)X +2418 5544(speci\256ed,)N +2752(the)X +2879(hash)X +3054(package)X +3346(will)X +3498(try)X +3615(to)X +3705(determine)X +4054(that)X +2418 5632(the)N +2546(hash)X +2723(function)X +3020(supplied)X +3321(is)X +3404(the)X +3532(one)X +3678(with)X +3850(which)X +4076(the)X +2418 5720(table)N +2630(was)X +2811(created.)X +3139(There)X +3382(are)X +3536(a)X +3627(variety)X +3905(of)X +4027(hash)X +3 f +432 5960(8)N +2970(USENIX)X +9 f +3292(-)X +3 f +3356(Winter)X +3621('91)X +9 f +3748(-)X +3 f +3812(Dallas,)X +4065(TX)X + +9 p +%%Page: 9 9 +0(Courier)xf 0 f +10 s 10 xH 0 xS 0 f +3 f +720 258(Seltzer)N +977(&)X +1064(Yigit)X +3278(A)X +3356(New)X +3528(Hashing)X +3831(Package)X +4136(for)X +4259(UNIX)X +1 f +720 538(functions)N +1065(provided)X +1397(with)X +1586(the)X +1731(package.)X +2082(The)X +2253(default)X +720 626(function)N +1014(for)X +1135(the)X +1260(package)X +1551(is)X +1631(the)X +1755(one)X +1897(which)X +2119(offered)X +2378(the)X +720 714(best)N +875(performance)X +1308(in)X +1396(terms)X +1600(of)X +1693(cycles)X +1920(executed)X +2232(per)X +2360(call)X +720 802(\(it)N +827(did)X +965(not)X +1103(produce)X +1398(the)X +1531(fewest)X +1776(collisions)X +2117(although)X +2432(it)X +720 890(was)N +866(within)X +1091(a)X +1148(small)X +1341(percentage)X +1710(of)X +1797(the)X +1915(function)X +2202(that)X 
+2342(pro-)X +720 978(duced)N +947(the)X +1080(fewest)X +1324(collisions\).)X +1731(Again,)X +1981(in)X +2077(time)X +2253(critical)X +720 1066(applications,)N +1152(users)X +1342(are)X +1466(encouraged)X +1862(to)X +1949(experiment)X +2334(with)X +720 1154(a)N +783(variety)X +1032(of)X +1125(hash)X +1298(functions)X +1622(to)X +1710(achieve)X +1982(optimal)X +2252(perfor-)X +720 1242(mance.)N +10 f +720 1330 -0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +3 f +7 s +1038 2925(Full)N +1149(size)X +1251(table)X +1384(\(left\))X +1547 2718(Fill)N +1643(Factor)X +2268 2662(64)N +1964(32)X +1674(16)X +1384(8)X +1093(4)X +4 Ds +1 Dt +900 2280 MXY +1548 0 Dl +900 1879 MXY +1548 0 Dl +900 1506 MXY +1548 0 Dl +1563 2902 MXY +111 0 Dl +-1 Ds +900 MX +110 0 Dl +1425 2828(System)N +983(User)X +1895 2778 MXY + 1895 2778 lineto + 1950 2778 lineto + 1950 2833 lineto + 1895 2833 lineto + 1895 2778 lineto +closepath 21 1895 2778 1950 2833 Dp +1342 MX + 1342 2778 lineto + 1397 2778 lineto + 1397 2833 lineto + 1342 2833 lineto + 1342 2778 lineto +closepath 14 1342 2778 1397 2833 Dp +900 MX + 900 2778 lineto + 955 2778 lineto + 955 2833 lineto + 900 2833 lineto + 900 2778 lineto +closepath 3 900 2778 955 2833 Dp +5 Dt +2283 2211 MXY +96 0 Dl +1992 MX +97 0 Dl +1702 MX +97 0 Dl +1411 2252 MXY +97 0 Dl +4 Ds +1 Dt +2283 2211 MXY + 2283 2211 lineto + 2379 2211 lineto + 2379 2252 lineto + 2283 2252 lineto + 2283 2211 lineto +closepath 14 2283 2211 2379 2252 Dp +1992 MX + 1992 2211 lineto + 2089 2211 lineto + 2089 2252 lineto + 1992 2252 lineto + 1992 2211 lineto +closepath 14 1992 2211 2089 2252 Dp +1702 MX + 1702 2211 lineto + 1799 2211 lineto + 1799 2252 lineto + 1702 2252 lineto + 1702 2211 lineto +closepath 14 1702 2211 1799 2252 Dp +1411 2252 MXY + 1411 2252 lineto + 1508 2252 lineto + 1508 2294 lineto + 1411 2294 lineto + 1411 2252 lineto +closepath 14 1411 2252 1508 2294 Dp +2283 MX + 2283 2252 lineto + 2379 2252 lineto + 2379 2612 lineto + 2283 2612 lineto + 2283 2252 
lineto +closepath 3 2283 2252 2379 2612 Dp +1992 MX + 1992 2252 lineto + 2089 2252 lineto + 2089 2612 lineto + 1992 2612 lineto + 1992 2252 lineto +closepath 3 1992 2252 2089 2612 Dp +1702 MX + 1702 2252 lineto + 1799 2252 lineto + 1799 2612 lineto + 1702 2612 lineto + 1702 2252 lineto +closepath 3 1702 2252 1799 2612 Dp +1411 2294 MXY + 1411 2294 lineto + 1508 2294 lineto + 1508 2612 lineto + 1411 2612 lineto + 1411 2294 lineto +closepath 3 1411 2294 1508 2612 Dp +-1 Ds +2158 2238 MXY + 2158 2238 lineto + 2255 2238 lineto + 2255 2252 lineto + 2158 2252 lineto + 2158 2238 lineto +closepath 21 2158 2238 2255 2252 Dp +1868 MX + 1868 2238 lineto + 1965 2238 lineto + 1965 2280 lineto + 1868 2280 lineto + 1868 2238 lineto +closepath 21 1868 2238 1965 2280 Dp +1577 MX + 1577 2238 lineto + 1674 2238 lineto + 1674 2308 lineto + 1577 2308 lineto + 1577 2238 lineto +closepath 21 1577 2238 1674 2308 Dp +1287 2308 MXY + 1287 2308 lineto + 1287 2280 lineto + 1384 2280 lineto + 1384 2308 lineto + 1287 2308 lineto +closepath 21 1287 2280 1384 2308 Dp +2158 2280 MXY + 2158 2280 lineto + 2158 2252 lineto + 2255 2252 lineto + 2255 2280 lineto + 2158 2280 lineto +closepath 14 2158 2252 2255 2280 Dp +1868 2308 MXY + 1868 2308 lineto + 1868 2280 lineto + 1965 2280 lineto + 1965 2308 lineto + 1868 2308 lineto +closepath 14 1868 2280 1965 2308 Dp +1577 2335 MXY + 1577 2335 lineto + 1577 2308 lineto + 1674 2308 lineto + 1674 2335 lineto + 1577 2335 lineto +closepath 14 1577 2308 1674 2335 Dp +1287 2363 MXY + 1287 2363 lineto + 1287 2308 lineto + 1384 2308 lineto + 1384 2363 lineto + 1287 2363 lineto +closepath 14 1287 2308 1384 2363 Dp +2158 2280 MXY + 2158 2280 lineto + 2255 2280 lineto + 2255 2612 lineto + 2158 2612 lineto + 2158 2280 lineto +closepath 3 2158 2280 2255 2612 Dp +1868 2308 MXY + 1868 2308 lineto + 1965 2308 lineto + 1965 2612 lineto + 1868 2612 lineto + 1868 2308 lineto +closepath 3 1868 2308 1965 2612 Dp +1577 2335 MXY + 1577 2335 lineto + 1674 2335 lineto + 1674 2612 
lineto + 1577 2612 lineto + 1577 2335 lineto +closepath 3 1577 2335 1674 2612 Dp +1287 2363 MXY + 1287 2363 lineto + 1384 2363 lineto + 1384 2612 lineto + 1287 2612 lineto + 1287 2363 lineto +closepath 3 1287 2363 1384 2612 Dp +4 Ds +1121 2066 MXY + 1121 2066 lineto + 1218 2066 lineto + 1224 2080 lineto + 1127 2080 lineto + 1121 2066 lineto +closepath 21 1121 2066 1224 2080 Dp +2080 MY + 1121 2080 lineto + 1218 2080 lineto + 1218 2273 lineto + 1121 2273 lineto + 1121 2080 lineto +closepath 14 1121 2080 1218 2273 Dp +2273 MY + 1121 2273 lineto + 1218 2273 lineto + 1218 2612 lineto + 1121 2612 lineto + 1121 2273 lineto +closepath 3 1121 2273 1218 2612 Dp +-1 Ds +997 1589 MXY + 997 1589 lineto + 1093 1589 lineto + 1093 1644 lineto + 997 1644 lineto + 997 1589 lineto +closepath 21 997 1589 1093 1644 Dp +1644 MY + 997 1644 lineto + 1093 1644 lineto + 1093 2280 lineto + 997 2280 lineto + 997 1644 lineto +closepath 14 997 1644 1093 2280 Dp +2280 MY + 997 2280 lineto + 1093 2280 lineto + 1093 2612 lineto + 997 2612 lineto + 997 2280 lineto +closepath 3 997 2280 1093 2612 Dp +10 s +719 2093(s)N +712 2037(d)N +712 1982(n)N +714 1927(o)N +716 1872(c)N +716 1816(e)N +712 1761(S)N +804 2286(10)N +804 1899(20)N +804 1540(30)N +3 Dt +900 1506 MXY +0 1106 Dl +1548 0 Dl +7 s +1978 2828(Elapsed)N +1701 2925(Dynamically)N +2018(grown)X +2184(table)X +2317(\(right\))X +3 Dt +-1 Ds +8 s +720 3180(Figure)N +934(6:)X +1 f +1020(The)X +1152(total)X +1299(regions)X +1520(indicate)X +1755(the)X +1865(difference)X +2154(between)X +2398(the)X +720 3268(elapsed)N +931(time)X +1065(and)X +1177(the)X +1275(sum)X +1402(of)X +1475(the)X +1573(system)X +1771(and)X +1883(user)X +2008(time.)X +2173(The)X +2291(left)X +2395(bar)X +720 3356(of)N +798(each)X +939(set)X +1035(depicts)X +1241(the)X +1344(timing)X +1537(of)X +1615(the)X +1718(test)X +1831(run)X +1940(when)X +2102(the)X +2204(number)X +2423(of)X +720 3444(entries)N +910(is)X +973(known)X +1167(in)X +1237(advance.)X +1496(The)X +1614(right)X 
+1754(bars)X +1879(depict)X +2054(the)X +2151(timing)X +2338(when)X +720 3532(the)N +814(\256le)X +912(is)X +971(grown)X +1150(from)X +1290(a)X +1334(single)X +1503(bucket.)X +10 s +10 f +720 3708 -0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +1 f +892 3910(Since)N +1131(this)X +1307(hashing)X +1617(package)X +1942(provides)X +2279(buffer)X +720 3998(management,)N +1188(the)X +1323(amount)X +1600(of)X +1704(space)X +1920(allocated)X +2247(for)X +2378(the)X +720 4086(buffer)N +948(pool)X +1121(may)X +1290(be)X +1397(speci\256ed)X +1713(by)X +1824(the)X +1953(user.)X +2157(Using)X +2378(the)X +720 4174(same)N +910(data)X +1069(set)X +1183(and)X +1324(test)X +1459(procedure)X +1805(as)X +1896(used)X +2067(to)X +2153(derive)X +2378(the)X +720 4262(graphs)N +962(in)X +1052(Figures)X +1320(5a-c,)X +1507(Figure)X +1744(7)X +1812(shows)X +2039(the)X +2164(impact)X +2409(of)X +720 4350(varying)N +997(the)X +1126(size)X +1282(of)X +1380(the)X +1509(buffer)X +1737(pool.)X +1950(The)X +2106(bucket)X +2351(size)X +720 4438(was)N +873(set)X +989(to)X +1078(256)X +1225(bytes)X +1421(and)X +1564(the)X +1689(\256ll)X +1804(factor)X +2019(was)X +2171(set)X +2287(to)X +2376(16.)X +720 4526(The)N +869(buffer)X +1090(pool)X +1256(size)X +1404(was)X +1552(varied)X +1776(from)X +1955(0)X +2018(\(the)X +2166(minimum)X +720 4614(number)N +986(of)X +1074(pages)X +1277(required)X +1565(to)X +1647(be)X +1743(buffered\))X +2063(to)X +2145(1M.)X +2316(With)X +720 4702(1M)N +854(of)X +944(buffer)X +1164(space,)X +1386(the)X +1507(package)X +1794(performed)X +2151(no)X +2253(I/O)X +2382(for)X +720 4790(this)N +871(data)X +1040(set.)X +1204(As)X +1328(Figure)X +1572(7)X +1647(illustrates,)X +2013(increasing)X +2378(the)X +720 4878(buffer)N +944(pool)X +1113(size)X +1265(can)X +1404(have)X +1583(a)X +1646(dramatic)X +1954(affect)X +2165(on)X +2271(result-)X +720 4966(ing)N +842(performance.)X +2 f +8 s +1269 4941(7)N +1 f +16 s +720 5353 MXY +864 0 Dl +2 f +8 s +760 5408(7)N +1 f +9 s 
+826 5433(Some)N +1024(allocators)X +1338(are)X +1460(extremely)X +1782(inef\256cient)X +2107(at)X +2192(allocating)X +720 5513(memory.)N +1029(If)X +1110(you)X +1251(\256nd)X +1396(that)X +1536(applications)X +1916(are)X +2036(running)X +2292(out)X +2416(of)X +720 5593(memory)N +1005(before)X +1234(you)X +1386(think)X +1578(they)X +1746(should,)X +2000(try)X +2124(varying)X +2388(the)X +720 5673(pagesize)N +986(to)X +1060(get)X +1166(better)X +1348(utilization)X +1658(from)X +1816(the)X +1922(memory)X +2180(allocator.)X +10 s +2830 1975 MXY +0 -28 Dl +28 0 Dl +0 28 Dl +-28 0 Dl +2853 2004 MXY +0 -27 Dl +28 0 Dl +0 27 Dl +-28 0 Dl +2876 2016 MXY +0 -27 Dl +27 0 Dl +0 27 Dl +-27 0 Dl +2922 1998 MXY +0 -27 Dl +27 0 Dl +0 27 Dl +-27 0 Dl +2967 2025 MXY +0 -28 Dl +28 0 Dl +0 28 Dl +-28 0 Dl +3013 2031 MXY +0 -28 Dl +28 0 Dl +0 28 Dl +-28 0 Dl +3059 MX +0 -28 Dl +27 0 Dl +0 28 Dl +-27 0 Dl +3196 2052 MXY +0 -28 Dl +27 0 Dl +0 28 Dl +-27 0 Dl +3561 2102 MXY +0 -28 Dl +28 0 Dl +0 28 Dl +-28 0 Dl +4292 2105 MXY +0 -28 Dl +27 0 Dl +0 28 Dl +-27 0 Dl +4 Ds +1 Dt +2844 1961 MXY +23 30 Dl +23 12 Dl +45 -18 Dl +46 26 Dl +46 6 Dl +45 0 Dl +137 21 Dl +366 50 Dl +730 3 Dl +9 s +4227 2158(User)N +-1 Ds +3 Dt +2830 1211 MXY +27 Dc +2853 1261 MXY +27 Dc +2876 1267 MXY +27 Dc +2921 1341 MXY +27 Dc +2967 1385 MXY +27 Dc +3013 1450 MXY +27 Dc +3059 1497 MXY +27 Dc +3196 1686 MXY +27 Dc +3561 2109 MXY +27 Dc +4292 2295 MXY +27 Dc +20 Ds +1 Dt +2844 1211 MXY +23 50 Dl +23 6 Dl +45 74 Dl +46 44 Dl +46 65 Dl +45 47 Dl +137 189 Dl +366 423 Dl +730 186 Dl +4181 2270(System)N +-1 Ds +3 Dt +2844 583 MXY +0 28 Dl +0 -14 Dl +14 0 Dl +-28 0 Dl +2867 672 MXY +0 27 Dl +0 -14 Dl +14 0 Dl +-28 0 Dl +2890 701 MXY +0 28 Dl +0 -14 Dl +13 0 Dl +-27 0 Dl +2935 819 MXY +0 28 Dl +0 -14 Dl +14 0 Dl +-27 0 Dl +2981 849 MXY +0 28 Dl +0 -14 Dl +14 0 Dl +-28 0 Dl +3027 908 MXY +0 27 Dl +0 -13 Dl +14 0 Dl +-28 0 Dl +3072 1026 MXY +0 27 Dl +0 -13 Dl +14 0 Dl +-27 0 Dl +3209 1292 MXY +0 27 Dl +0 -14 Dl +14 0 Dl +-27 
0 Dl +3575 1823 MXY +0 28 Dl +0 -14 Dl +14 0 Dl +-28 0 Dl +4305 2059 MXY +0 28 Dl +0 -14 Dl +14 0 Dl +-27 0 Dl +5 Dt +2844 597 MXY +23 88 Dl +23 30 Dl +45 118 Dl +46 30 Dl +46 59 Dl +45 118 Dl +137 265 Dl +366 532 Dl +730 236 Dl +4328 2103(Total)N +2844 2310 MXY +1461 0 Dl +2844 MX +0 -1772 Dl +2310 MY +0 18 Dl +4 Ds +1 Dt +2310 MY +0 -1772 Dl +2826 2416(0)N +-1 Ds +5 Dt +3209 2310 MXY +0 18 Dl +4 Ds +1 Dt +2310 MY +0 -1772 Dl +3155 2416(256)N +-1 Ds +5 Dt +3575 2310 MXY +0 18 Dl +4 Ds +1 Dt +2310 MY +0 -1772 Dl +3521 2416(512)N +-1 Ds +5 Dt +3940 2310 MXY +0 18 Dl +4 Ds +1 Dt +2310 MY +0 -1772 Dl +3886 2416(768)N +-1 Ds +5 Dt +4305 2310 MXY +0 18 Dl +4 Ds +1 Dt +2310 MY +0 -1772 Dl +4233 2416(1024)N +-1 Ds +5 Dt +2844 2310 MXY +-18 0 Dl +4 Ds +1 Dt +2844 MX +1461 0 Dl +2771 2340(0)N +-1 Ds +5 Dt +2844 2014 MXY +-18 0 Dl +2844 1719 MXY +-18 0 Dl +4 Ds +1 Dt +2844 MX +1461 0 Dl +2735 1749(20)N +-1 Ds +5 Dt +2844 1423 MXY +-18 0 Dl +2844 1128 MXY +-18 0 Dl +4 Ds +1 Dt +2844 MX +1461 0 Dl +2735 1158(40)N +-1 Ds +5 Dt +2844 833 MXY +-18 0 Dl +2844 538 MXY +-18 0 Dl +4 Ds +1 Dt +2844 MX +1461 0 Dl +2735 568(60)N +3239 2529(Buffer)N +3445(Pool)X +3595(Size)X +3737(\(in)X +3835(K\))X +2695 1259(S)N +2699 1324(e)N +2699 1388(c)N +2697 1452(o)N +2697 1517(n)N +2697 1581(d)N +2701 1645(s)N +3 Dt +-1 Ds +3 f +8 s +2706 2773(Figure)N +2908(7:)X +1 f +2982(User)X +3123(time)X +3258(is)X +3322(virtually)X +3560(insensitive)X +3854(to)X +3924(the)X +4022(amount)X +4234(of)X +4307(buffer)X +2706 2861(pool)N +2852(available,)X +3130(however,)X +3396(both)X +3541(system)X +3750(time)X +3895(and)X +4018(elapsed)X +4240(time)X +4385(are)X +2706 2949(inversely)N +2960(proportional)X +3296(to)X +3366(the)X +3464(size)X +3583(of)X +3656(the)X +3753(buffer)X +3927(pool.)X +4092(Even)X +4242(for)X +4335(large)X +2706 3037(data)N +2831(sets)X +2946(where)X +3120(one)X +3230(expects)X +3439(few)X +3552(collisions,)X +3832(specifying)X +4116(a)X +4162(large)X +4307(buffer)X +2706 3125(pool)N 
+2836(dramatically)X +3171(improves)X +3425(performance.)X +10 s +10 f +2706 3301 -0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +3 f +3175 3543(Enhanced)N +3536(Functionality)X +1 f +2878 3675(This)N +3046(hashing)X +3320(package)X +3609(provides)X +3910(a)X +3971(set)X +4085(of)X +4177(compati-)X +2706 3763(bility)N +2895(routines)X +3174(to)X +3257(implement)X +3620(the)X +2 f +3739(ndbm)X +1 f +3937(interface.)X +4279(How-)X +2706 3851(ever,)N +2893(when)X +3095(the)X +3220(native)X +3443(interface)X +3752(is)X +3832(used,)X +4026(the)X +4151(following)X +2706 3939(additional)N +3046(functionality)X +3475(is)X +3548(provided:)X +10 f +2798 4071(g)N +1 f +2946(Inserts)X +3197(never)X +3413(fail)X +3556(because)X +3847(too)X +3985(many)X +4199(keys)X +2946 4159(hash)N +3113(to)X +3195(the)X +3313(same)X +3498(value.)X +10 f +2798 4247(g)N +1 f +2946(Inserts)X +3187(never)X +3393(fail)X +3527(because)X +3808(key)X +3950(and/or)X +4181(asso-)X +2946 4335(ciated)N +3158(data)X +3312(is)X +3385(too)X +3507(large)X +10 f +2798 4423(g)N +1 f +2946(Hash)X +3131(functions)X +3449(may)X +3607(be)X +3703(user-speci\256ed.)X +10 f +2798 4511(g)N +1 f +2946(Multiple)X +3268(pages)X +3498(may)X +3683(be)X +3806(cached)X +4077(in)X +4186(main)X +2946 4599(memory.)N +2706 4731(It)N +2801(also)X +2976(provides)X +3298(a)X +3380(set)X +3514(of)X +3626(compatibility)X +4097(routines)X +4400(to)X +2706 4819(implement)N +3087(the)X +2 f +3224(hsearch)X +1 f +3516(interface.)X +3876(Again,)X +4130(the)X +4266(native)X +2706 4907(interface)N +3008(offers)X +3216(enhanced)X +3540(functionality:)X +10 f +2798 5039(g)N +1 f +2946(Files)X +3121(may)X +3279(grow)X +3464(beyond)X +2 f +3720(nelem)X +1 f +3932(elements.)X +10 f +2798 5127(g)N +1 f +2946(Multiple)X +3247(hash)X +3420(tables)X +3632(may)X +3795(be)X +3896(accessed)X +4203(con-)X +2946 5215(currently.)N +10 f +2798 5303(g)N +1 f +2946(Hash)X +3134(tables)X +3344(may)X +3505(be)X +3604(stored)X +3823(and)X 
+3962(accessed)X +4266(on)X +2946 5391(disk.)N +10 f +2798 5479(g)N +1 f +2946(Hash)X +3155(functions)X +3497(may)X +3679(be)X +3799(user-speci\256ed)X +4288(at)X +2946 5567(runtime.)N +3 f +720 5960(USENIX)N +9 f +1042(-)X +3 f +1106(Winter)X +1371('91)X +9 f +1498(-)X +3 f +1562(Dallas,)X +1815(TX)X +4424(9)X + +10 p +%%Page: 10 10 +0(Courier)xf 0 f +10 s 10 xH 0 xS 0 f +3 f +432 258(A)N +510(New)X +682(Hashing)X +985(Package)X +1290(for)X +1413(UNIX)X +3663(Seltzer)X +3920(&)X +4007(Yigit)X +459 538(Relative)N +760(Performance)X +1227(of)X +1314(the)X +1441(New)X +1613(Implementation)X +1 f +604 670(The)N +761(performance)X +1200(testing)X +1445(of)X +1544(the)X +1674(new)X +1840(package)X +2135(is)X +432 758(divided)N +711(into)X +874(two)X +1033(test)X +1183(suites.)X +1424(The)X +1588(\256rst)X +1751(suite)X +1941(of)X +2046(tests)X +432 846(requires)N +727(that)X +882(the)X +1015(tables)X +1237(be)X +1348(read)X +1522(from)X +1713(and)X +1864(written)X +2126(to)X +432 934(disk.)N +640(In)X +742(these)X +942(tests,)X +1139(the)X +1272(basis)X +1467(for)X +1595(comparison)X +2003(is)X +2090(the)X +432 1022(4.3BSD-Reno)N +908(version)X +1169(of)X +2 f +1260(ndbm)X +1 f +1438(.)X +1502(Based)X +1722(on)X +1826(the)X +1948(designs)X +432 1110(of)N +2 f +521(sdbm)X +1 f +712(and)X +2 f +850(gdbm)X +1 f +1028(,)X +1070(they)X +1230(are)X +1351(expected)X +1659(to)X +1743(perform)X +2024(simi-)X +432 1198(larly)N +605(to)X +2 f +693(ndbm)X +1 f +871(,)X +917(and)X +1059(we)X +1179(do)X +1285(not)X +1413(show)X +1608(their)X +1781(performance)X +432 1286(numbers.)N +800(The)X +977(second)X +1252(suite)X +1454(contains)X +1772(the)X +1921(memory)X +432 1374(resident)N +712(test)X +849(which)X +1071(does)X +1243(not)X +1370(require)X +1623(that)X +1768(the)X +1891(\256les)X +2049(ever)X +432 1462(be)N +533(written)X +784(to)X +870(disk,)X +1047(only)X +1213(that)X +1357(hash)X +1528(tables)X +1739(may)X +1901(be)X +2001(mani-)X +432 1550(pulated)N +692(in)X +778(main)X 
+961(memory.)X +1291(In)X +1381(this)X +1519(test,)X +1673(we)X +1790(compare)X +2090(the)X +432 1638(performance)N +859(to)X +941(that)X +1081(of)X +1168(the)X +2 f +1286(hsearch)X +1 f +1560(routines.)X +604 1752(For)N +760(both)X +947(suites,)X +1194(two)X +1358(different)X +1679(databases)X +2031(were)X +432 1840(used.)N +656(The)X +818(\256rst)X +979(is)X +1069(the)X +1204(dictionary)X +1566(database)X +1880(described)X +432 1928(previously.)N +836(The)X +987(second)X +1236(was)X +1386(constructed)X +1781(from)X +1962(a)X +2023(pass-)X +432 2016(word)N +647(\256le)X +799(with)X +990(approximately)X +1502(300)X +1671(accounts.)X +2041(Two)X +432 2104(records)N +700(were)X +887(constructed)X +1287(for)X +1411(each)X +1589(account.)X +1909(The)X +2064(\256rst)X +432 2192(used)N +604(the)X +727(logname)X +1028(as)X +1120(the)X +1243(key)X +1384(and)X +1525(the)X +1648(remainder)X +1999(of)X +2090(the)X +432 2280(password)N +768(entry)X +965(for)X +1091(the)X +1221(data.)X +1427(The)X +1584(second)X +1839(was)X +1996(keyed)X +432 2368(by)N +541(uid)X +672(and)X +817(contained)X +1157(the)X +1283(entire)X +1494(password)X +1825(entry)X +2018(as)X +2113(its)X +432 2456(data)N +589(\256eld.)X +794(The)X +942(tests)X +1107(were)X +1287(all)X +1389(run)X +1518(on)X +1620(the)X +1740(HP)X +1864(9000)X +2046(with)X +432 2544(the)N +574(same)X +783(con\256guration)X +1254(previously)X +1636(described.)X +2027(Each)X +432 2632(test)N +576(was)X +734(run)X +874(\256ve)X +1027(times)X +1232(and)X +1380(the)X +1510(timing)X +1750(results)X +1991(of)X +2090(the)X +432 2720(runs)N +602(were)X +791(averaged.)X +1154(The)X +1311(variance)X +1616(across)X +1849(the)X +1979(5)X +2050(runs)X +432 2808(was)N +591(approximately)X +1088(1%)X +1229(of)X +1330(the)X +1462(average)X +1746(yielding)X +2041(95%)X +432 2896(con\256dence)N +800(intervals)X +1096(of)X +1183(approximately)X +1666(2%.)X +3 f +1021 3050(Disk)N +1196(Based)X +1420(Tests)X +1 f +604 3182(In)N +693(these)X 
+880(tests,)X +1064(we)X +1180(use)X +1308(a)X +1365(bucket)X +1600(size)X +1746(of)X +1834(1024)X +2015(and)X +2152(a)X +432 3270(\256ll)N +540(factor)X +748(of)X +835(32.)X +3 f +432 3384(create)N +663(test)X +1 f +547 3498(The)N +703(keys)X +881(are)X +1011(entered)X +1279(into)X +1433(the)X +1561(hash)X +1738(table,)X +1944(and)X +2090(the)X +547 3586(\256le)N +669(is)X +742(\257ushed)X +993(to)X +1075(disk.)X +3 f +432 3700(read)N +608(test)X +1 f +547 3814(A)N +640(lookup)X +897(is)X +984(performed)X +1353(for)X +1481(each)X +1663(key)X +1813(in)X +1909(the)X +2041(hash)X +547 3902(table.)N +3 f +432 4016(verify)N +653(test)X +1 f +547 4130(A)N +640(lookup)X +897(is)X +984(performed)X +1353(for)X +1481(each)X +1663(key)X +1813(in)X +1909(the)X +2041(hash)X +547 4218(table,)N +759(and)X +911(the)X +1045(data)X +1215(returned)X +1519(is)X +1608(compared)X +1961(against)X +547 4306(that)N +687(originally)X +1018(stored)X +1234(in)X +1316(the)X +1434(hash)X +1601(table.)X +3 f +432 4420(sequential)N +798(retrieve)X +1 f +547 4534(All)N +674(keys)X +846(are)X +970(retrieved)X +1281(in)X +1367(sequential)X +1716(order)X +1910(from)X +2090(the)X +547 4622(hash)N +724(table.)X +950(The)X +2 f +1105(ndbm)X +1 f +1313(interface)X +1625(allows)X +1863(sequential)X +547 4710(retrieval)N +848(of)X +948(the)X +1079(keys)X +1259(from)X +1448(the)X +1578(database,)X +1907(but)X +2041(does)X +547 4798(not)N +701(return)X +945(the)X +1094(data)X +1279(associated)X +1660(with)X +1853(each)X +2052(key.)X +547 4886(Therefore,)N +929(we)X +1067(compare)X +1388(the)X +1530(performance)X +1980(of)X +2090(the)X +547 4974(new)N +703(package)X +989(to)X +1073(two)X +1215(different)X +1514(runs)X +1674(of)X +2 f +1763(ndbm)X +1 f +1941(.)X +2002(In)X +2090(the)X +547 5062(\256rst)N +697(case,)X +2 f +882(ndbm)X +1 f +1086(returns)X +1335(only)X +1503(the)X +1627(keys)X +1800(while)X +2003(in)X +2090(the)X +547 5150(second,)N +2 f +823(ndbm)X +1 f +1034(returns)X +1290(both)X +1465(the)X 
+1596(keys)X +1776(and)X +1924(the)X +2054(data)X +547 5238(\(requiring)N +894(a)X +956(second)X +1204(call)X +1345(to)X +1432(the)X +1555(library\).)X +1861(There)X +2074(is)X +2152(a)X +547 5326(single)N +764(run)X +897(for)X +1017(the)X +1141(new)X +1300(library)X +1539(since)X +1729(it)X +1798(returns)X +2046(both)X +547 5414(the)N +665(key)X +801(and)X +937(the)X +1055(data.)X +3 f +3014 538(In-Memory)N +3431(Test)X +1 f +2590 670(This)N +2757(test)X +2892(uses)X +3054(a)X +3114(bucket)X +3352(size)X +3501(of)X +3592(256)X +3736(and)X +3876(a)X +3936(\256ll)X +4048(fac-)X +2418 758(tor)N +2527(of)X +2614(8.)X +3 f +2418 872(create/read)N +2827(test)X +1 f +2533 986(In)N +2627(this)X +2769(test,)X +2927(a)X +2989(hash)X +3162(table)X +3344(is)X +3423(created)X +3682(by)X +3788(inserting)X +4094(all)X +2533 1074(the)N +2660(key/data)X +2961(pairs.)X +3186(Then)X +3380(a)X +3445(keyed)X +3666(retrieval)X +3963(is)X +4044(per-)X +2533 1162(formed)N +2801(for)X +2931(each)X +3115(pair,)X +3295(and)X +3446(the)X +3579(hash)X +3761(table)X +3952(is)X +4040(des-)X +2533 1250(troyed.)N +3 f +2938 1404(Performance)N +3405(Results)X +1 f +2590 1536(Figures)N +2866(8a)X +2978(and)X +3130(8b)X +3246(show)X +3451(the)X +3585(user)X +3755(time,)X +3952(system)X +2418 1624(time,)N +2608(and)X +2752(elapsed)X +3021(time)X +3191(for)X +3312(each)X +3487(test)X +3625(for)X +3746(both)X +3915(the)X +4040(new)X +2418 1712(implementation)N +2951(and)X +3098(the)X +3227(old)X +3360(implementation)X +3893(\()X +2 f +3920(hsearch)X +1 f +2418 1800(or)N +2 f +2528(ndbm)X +1 f +2706(,)X +2769(whichever)X +3147(is)X +3243(appropriate\))X +3678(as)X +3787(well)X +3967(as)X +4076(the)X +2418 1888(improvement.)N +2929(The)X +3098(improvement)X +3569(is)X +3666(expressed)X +4027(as)X +4138(a)X +2418 1976(percentage)N +2787(of)X +2874(the)X +2992(old)X +3114(running)X +3383(time:)X +0 f +8 s +2418 2275(%)N +2494(=)X +2570(100)X +2722(*)X +2798 -0.4219(\(old_time)AX +3178(-)X +3254 
-0.4219(new_time\))AX +3634(/)X +3710(old_time)X +1 f +10 s +2590 2600(In)N +2700(nearly)X +2944(all)X +3067(cases,)X +3299(the)X +3439(new)X +3615(routines)X +3915(perform)X +2418 2688(better)N +2628(than)X +2793(the)X +2918(old)X +3047(routines)X +3332(\(both)X +2 f +3527(hsearch)X +1 f +3807(and)X +2 f +3949(ndbm)X +1 f +4127(\).)X +2418 2776(Although)N +2755(the)X +3 f +2888(create)X +1 f +3134(tests)X +3311(exhibit)X +3567(superior)X +3864(user)X +4032(time)X +2418 2864(performance,)N +2869(the)X +2991(test)X +3126(time)X +3292(is)X +3369(dominated)X +3731(by)X +3834(the)X +3955(cost)X +4107(of)X +2418 2952(writing)N +2677(the)X +2803(actual)X +3023(\256le)X +3153(to)X +3243(disk.)X +3444(For)X +3583(the)X +3709(large)X +3897(database)X +2418 3040(\(the)N +2564(dictionary\),)X +2957(this)X +3093(completely)X +3470(overwhelmed)X +3927(the)X +4045(sys-)X +2418 3128(tem)N +2570(time.)X +2783(However,)X +3129(for)X +3254(the)X +3383(small)X +3587(data)X +3752(base,)X +3946(we)X +4071(see)X +2418 3216(that)N +2569(differences)X +2958(in)X +3051(both)X +3224(user)X +3389(and)X +3536(system)X +3788(time)X +3960(contri-)X +2418 3304(bute)N +2576(to)X +2658(the)X +2776(superior)X +3059(performance)X +3486(of)X +3573(the)X +3691(new)X +3845(package.)X +2590 3418(The)N +3 f +2764(read)X +1 f +2920(,)X +3 f +2989(verify)X +1 f +3190(,)X +3259(and)X +3 f +3424(sequential)X +1 f +3818(results)X +4075(are)X +2418 3506(deceptive)N +2758(for)X +2883(the)X +3012(small)X +3216(database)X +3524(since)X +3720(the)X +3849(entire)X +4063(test)X +2418 3594(ran)N +2551(in)X +2643(under)X +2856(a)X +2922(second.)X +3215(However,)X +3560(on)X +3669(the)X +3796(larger)X +4013(data-)X +2418 3682(base)N +2590(the)X +3 f +2716(read)X +1 f +2900(and)X +3 f +3044(verify)X +1 f +3273(tests)X +3443(bene\256t)X +3689(from)X +3873(the)X +3999(cach-)X +2418 3770(ing)N +2546(of)X +2639(buckets)X +2910(in)X +2998(the)X +3122(new)X +3282(package)X +3571(to)X +3658(improve)X +3950(perfor-)X +2418 
3858(mance)N +2666(by)X +2784(over)X +2965(80%.)X +3169(Since)X +3384(the)X +3519(\256rst)X +3 f +3680(sequential)X +1 f +4063(test)X +2418 3946(does)N +2598(not)X +2733(require)X +2 f +2994(ndbm)X +1 f +3205(to)X +3299(return)X +3523(the)X +3653(data)X +3819(values,)X +4076(the)X +2418 4034(user)N +2573(time)X +2735(is)X +2808(lower)X +3011(than)X +3169(for)X +3283(the)X +3401(new)X +3555(package.)X +3879(However)X +2418 4122(when)N +2613(we)X +2728(require)X +2977(both)X +3139(packages)X +3454(to)X +3536(return)X +3748(data,)X +3922(the)X +4040(new)X +2418 4210(package)N +2702(excels)X +2923(in)X +3005(all)X +3105(three)X +3286(timings.)X +2590 4324(The)N +2773(small)X +3003(database)X +3337(runs)X +3532(so)X +3660(quickly)X +3957(in)X +4076(the)X +2418 4412(memory-resident)N +3000(case)X +3173(that)X +3326(the)X +3457(results)X +3699(are)X +3831(uninterest-)X +2418 4500(ing.)N +2589(However,)X +2933(for)X +3056(the)X +3183(larger)X +3400(database)X +3706(the)X +3833(new)X +3995(pack-)X +2418 4588(age)N +2567(pays)X +2751(a)X +2824(small)X +3033(penalty)X +3305(in)X +3403(system)X +3661(time)X +3839(because)X +4130(it)X +2418 4676(limits)N +2636(its)X +2748(main)X +2944(memory)X +3247(utilization)X +3607(and)X +3759(swaps)X +3991(pages)X +2418 4764(out)N +2550(to)X +2642(temporary)X +3002(storage)X +3264(in)X +3356(the)X +3484(\256le)X +3616(system)X +3868(while)X +4076(the)X +2 f +2418 4852(hsearch)N +1 f +2698(package)X +2988(requires)X +3273(that)X +3419(the)X +3543(application)X +3924(allocate)X +2418 4940(enough)N +2692(space)X +2909(for)X +3041(all)X +3159(key/data)X +3468(pair.)X +3670(However,)X +4022(even)X +2418 5028(with)N +2600(the)X +2738(system)X +3000(time)X +3182(penalty,)X +3477(the)X +3614(resulting)X +3933(elapsed)X +2418 5116(time)N +2580(improves)X +2898(by)X +2998(over)X +3161(50%.)X +3 f +432 5960(10)N +2970(USENIX)X +9 f +3292(-)X +3 f +3356(Winter)X +3621('91)X +9 f +3748(-)X +3 f +3812(Dallas,)X +4065(TX)X + +11 p +%%Page: 11 11 
+0(Courier)xf 0 f +10 s 10 xH 0 xS 0 f +3 f +720 258(Seltzer)N +977(&)X +1064(Yigit)X +3278(A)X +3356(New)X +3528(Hashing)X +3831(Package)X +4136(for)X +4259(UNIX)X +1 f +10 f +908 454(i)N +927(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +2 f +1379 546(hash)N +1652(ndbm)X +1950(%change)X +1 f +10 f +908 550(i)N +927(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +948 642(CREATE)N +10 f +908 646(i)N +927(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +1125 738(user)N +1424(6.4)X +1671(12.2)X +2073(48)X +1157 826(sys)N +1384(32.5)X +1671(34.7)X +2113(6)X +3 f +1006 914(elapsed)N +10 f +1310 922(c)N +890(c)Y +810(c)Y +730(c)Y +3 f +1384 914(90.4)N +10 f +1581 922(c)N +890(c)Y +810(c)Y +730(c)Y +3 f +1671 914(99.6)N +10 f +1883 922(c)N +890(c)Y +810(c)Y +730(c)Y +3 f +2113 914(9)N +1 f +10 f +908 910(i)N +927(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +908 926(i)N +927(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +948 1010(READ)N +10 f +908 1014(i)N +927(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +1125 1106(user)N +1424(3.4)X +1711(6.1)X +2073(44)X +1157 1194(sys)N +1424(1.2)X +1671(15.3)X +2073(92)X +3 f +1006 1282(elapsed)N +10 f +1310 1290(c)N +1258(c)Y +1178(c)Y +1098(c)Y +3 f +1424 1282(4.0)N +10 f +1581 1290(c)N +1258(c)Y +1178(c)Y +1098(c)Y +3 f +1671 1282(21.2)N +10 f +1883 1290(c)N +1258(c)Y +1178(c)Y +1098(c)Y +3 f +2073 1282(81)N +1 f +10 f +908 1278(i)N +927(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +908 1294(i)N +927(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +948 1378(VERIFY)N +10 f +908 1382(i)N +927(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +1125 1474(user)N +1424(3.5)X +1711(6.3)X +2073(44)X +1157 1562(sys)N +1424(1.2)X +1671(15.3)X +2073(92)X +3 f +1006 1650(elapsed)N +10 f +1310 1658(c)N +1626(c)Y +1546(c)Y +1466(c)Y +3 f +1424 1650(4.0)N +10 f +1581 1658(c)N +1626(c)Y +1546(c)Y +1466(c)Y +3 f +1671 1650(21.2)N +10 f +1883 1658(c)N +1626(c)Y +1546(c)Y +1466(c)Y +3 f +2073 1650(81)N +1 f +10 f +908 1646(i)N +927(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +908 1662(i)N 
+927(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +948 1746(SEQUENTIAL)N +10 f +908 1750(i)N +927(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +1125 1842(user)N +1424(2.7)X +1711(1.9)X +2046(-42)X +1157 1930(sys)N +1424(0.7)X +1711(3.9)X +2073(82)X +3 f +1006 2018(elapsed)N +10 f +1310 2026(c)N +1994(c)Y +1914(c)Y +1834(c)Y +3 f +1424 2018(3.0)N +10 f +1581 2026(c)N +1994(c)Y +1914(c)Y +1834(c)Y +3 f +1711 2018(5.0)N +10 f +1883 2026(c)N +1994(c)Y +1914(c)Y +1834(c)Y +3 f +2073 2018(40)N +1 f +10 f +908 2014(i)N +927(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +908 2030(i)N +927(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +948 2114(SEQUENTIAL)N +1467(\(with)X +1656(data)X +1810(retrieval\))X +10 f +908 2118(i)N +927(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +1125 2210(user)N +1424(2.7)X +1711(8.2)X +2073(67)X +1157 2298(sys)N +1424(0.7)X +1711(4.3)X +2073(84)X +3 f +1006 2386(elapsed)N +1424(3.0)X +1671(12.0)X +2073(75)X +1 f +10 f +908 2390(i)N +927(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +899 2394(c)N +2378(c)Y +2298(c)Y +2218(c)Y +2138(c)Y +2058(c)Y +1978(c)Y +1898(c)Y +1818(c)Y +1738(c)Y +1658(c)Y +1578(c)Y +1498(c)Y +1418(c)Y +1338(c)Y +1258(c)Y +1178(c)Y +1098(c)Y +1018(c)Y +938(c)Y +858(c)Y +778(c)Y +698(c)Y +618(c)Y +538(c)Y +1310 2394(c)N +2362(c)Y +2282(c)Y +2202(c)Y +1581 2394(c)N +2362(c)Y +2282(c)Y +2202(c)Y +1883 2394(c)N +2362(c)Y +2282(c)Y +2202(c)Y +2278 2394(c)N +2378(c)Y +2298(c)Y +2218(c)Y +2138(c)Y +2058(c)Y +1978(c)Y +1898(c)Y +1818(c)Y +1738(c)Y +1658(c)Y +1578(c)Y +1498(c)Y +1418(c)Y +1338(c)Y +1258(c)Y +1178(c)Y +1098(c)Y +1018(c)Y +938(c)Y +858(c)Y +778(c)Y +698(c)Y +618(c)Y +538(c)Y +905 2574(i)N +930(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +2 f +1318 2666(hash)N +1585(hsearch)X +1953(%change)X +1 f +10 f +905 2670(i)N +930(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +945 2762(CREATE/READ)N +10 f +905 2766(i)N +930(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +1064 2858(user)N +1343(6.6)X +1642(17.2)X +2096(62)X +1096 2946(sys)N +1343(1.1)X +1682(0.3)X 
+2029(-266)X +3 f +945 3034(elapsed)N +1343(7.8)X +1642(17.0)X +2096(54)X +1 f +10 f +905 3038(i)N +930(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +896 3050(c)N +2978(c)Y +2898(c)Y +2818(c)Y +2738(c)Y +2658(c)Y +1249 3034(c)N +3010(c)Y +2930(c)Y +2850(c)Y +1520 3034(c)N +3010(c)Y +2930(c)Y +2850(c)Y +1886 3034(c)N +3010(c)Y +2930(c)Y +2850(c)Y +2281 3050(c)N +2978(c)Y +2898(c)Y +2818(c)Y +2738(c)Y +2658(c)Y +3 f +720 3174(Figure)N +967(8a:)X +1 f +1094(Timing)X +1349(results)X +1578(for)X +1692(the)X +1810(dictionary)X +2155(database.)X +10 f +720 3262 -0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +3 f +1407 3504(Conclusion)N +1 f +892 3636(This)N +1063(paper)X +1271(has)X +1407(presented)X +1744(the)X +1871(design,)X +2129(implemen-)X +720 3724(tation)N +928(and)X +1070(performance)X +1503(of)X +1596(a)X +1658(new)X +1818(hashing)X +2093(package)X +2382(for)X +720 3812(UNIX.)N +993(The)X +1150(new)X +1316(package)X +1612(provides)X +1919(a)X +1986(superset)X +2280(of)X +2378(the)X +720 3900(functionality)N +1159(of)X +1255(existing)X +1537(hashing)X +1815(packages)X +2139(and)X +2284(incor-)X +720 3988(porates)N +975(additional)X +1318(features)X +1596(such)X +1766(as)X +1855(large)X +2038(key)X +2176(handling,)X +720 4076(user)N +876(de\256ned)X +1134(hash)X +1302(functions,)X +1641(multiple)X +1928(hash)X +2096(tables,)X +2324(vari-)X +720 4164(able)N +894(sized)X +1099(pages,)X +1342(and)X +1498(linear)X +1721(hashing.)X +2050(In)X +2156(nearly)X +2396(all)X +720 4252(cases,)N +954(the)X +1096(new)X +1274(package)X +1582(provides)X +1902(improved)X +2252(perfor-)X +720 4340(mance)N +974(on)X +1098(the)X +1240(order)X +1454(of)X +1565(50-80%)X +1863(for)X +2001(the)X +2142(workloads)X +720 4428(shown.)N +990(Applications)X +1420(such)X +1588(as)X +1676(the)X +1794(loader,)X +2035(compiler,)X +2360(and)X +720 4516(mail,)N +921(which)X +1156(currently)X +1485(implement)X +1866(their)X +2051(own)X +2227(hashing)X +720 4604(routines,)N +1032(should)X +1279(be)X 
+1389(modi\256ed)X +1706(to)X +1801(use)X +1941(the)X +2072(generic)X +2342(rou-)X +720 4692(tines.)N +892 4806(This)N +1087(hashing)X +1389(package)X +1705(is)X +1810(one)X +1978(access)X +2236(method)X +720 4894(which)N +953(is)X +1043(part)X +1205(of)X +1309(a)X +1382(generic)X +1656(database)X +1970(access)X +2212(package)X +720 4982(being)N +955(developed)X +1342(at)X +1457(the)X +1612(University)X +2007(of)X +2131(California,)X +720 5070(Berkeley.)N +1089(It)X +1177(will)X +1340(include)X +1614(a)X +1688(btree)X +1887(access)X +2131(method)X +2409(as)X +720 5158(well)N +916(as)X +1041(\256xed)X +1259(and)X +1433(variable)X +1750(length)X +2007(record)X +2270(access)X +720 5246(methods)N +1024(in)X +1119(addition)X +1414(to)X +1509(the)X +1640(hashed)X +1896(support)X +2168(presented)X +720 5334(here.)N +948(All)X +1099(of)X +1215(the)X +1361(access)X +1615(methods)X +1934(are)X +2081(based)X +2312(on)X +2440(a)X +720 5422(key/data)N +1037(pair)X +1207(interface)X +1533(and)X +1693(appear)X +1952(identical)X +2272(to)X +2378(the)X +720 5510(application)N +1121(layer,)X +1347(allowing)X +1671(application)X +2071(implementa-)X +720 5598(tions)N +906(to)X +999(be)X +1106(largely)X +1360(independent)X +1783(of)X +1881(the)X +2010(database)X +2318(type.)X +720 5686(The)N +873(package)X +1165(is)X +1246(expected)X +1560(to)X +1650(be)X +1754(an)X +1858(integral)X +2131(part)X +2284(of)X +2378(the)X +2706 538(4.4BSD)N +3006(system,)X +3293(with)X +3479(various)X +3759(standard)X +4075(applications)X +2706 626(such)N +2879(as)X +2972(more\(1\),)X +3277(sort\(1\))X +3517(and)X +3659(vi\(1\))X +3841(based)X +4050(on)X +4156(it.)X +4266(While)X +2706 714(the)N +2833(current)X +3089(design)X +3326(does)X +3501(not)X +3631(support)X +3899(multi-user)X +4256(access)X +2706 802(or)N +2804(transactions,)X +3238(they)X +3407(could)X +3616(be)X +3723(incorporated)X +4159(relatively)X +2706 890(easily.)N +10 f +2894 938(i)N +2913(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +2 f +3365 
1030(hash)N +3638(ndbm)X +3936(%change)X +1 f +10 f +2894 1034(i)N +2913(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +2934 1126(CREATE)N +10 f +2894 1130(i)N +2913(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +3111 1222(user)N +3390(0.2)X +3677(0.4)X +4079(50)X +3143 1310(sys)N +3390(0.1)X +3677(1.0)X +4079(90)X +3 f +2992 1398(elapsed)N +10 f +3296 1406(c)N +1374(c)Y +1294(c)Y +1214(c)Y +3 f +3390 1398(0)N +10 f +3567 1406(c)N +1374(c)Y +1294(c)Y +1214(c)Y +3 f +3677 1398(3.2)N +10 f +3869 1406(c)N +1374(c)Y +1294(c)Y +1214(c)Y +3 f +4039 1398(100)N +1 f +10 f +2894 1394(i)N +2913(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +2894 1410(i)N +2913(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +2934 1494(READ)N +10 f +2894 1498(i)N +2913(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +3111 1590(user)N +3390(0.1)X +3677(0.1)X +4119(0)X +3143 1678(sys)N +3390(0.1)X +3677(0.4)X +4079(75)X +3 f +2992 1766(elapsed)N +10 f +3296 1774(c)N +1742(c)Y +1662(c)Y +1582(c)Y +3 f +3390 1766(0.0)N +10 f +3567 1774(c)N +1742(c)Y +1662(c)Y +1582(c)Y +3 f +3677 1766(0.0)N +10 f +3869 1774(c)N +1742(c)Y +1662(c)Y +1582(c)Y +3 f +4119 1766(0)N +1 f +10 f +2894 1762(i)N +2913(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +2894 1778(i)N +2913(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +2934 1862(VERIFY)N +10 f +2894 1866(i)N +2913(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +3111 1958(user)N +3390(0.1)X +3677(0.2)X +4079(50)X +3143 2046(sys)N +3390(0.1)X +3677(0.3)X +4079(67)X +3 f +2992 2134(elapsed)N +10 f +3296 2142(c)N +2110(c)Y +2030(c)Y +1950(c)Y +3 f +3390 2134(0.0)N +10 f +3567 2142(c)N +2110(c)Y +2030(c)Y +1950(c)Y +3 f +3677 2134(0.0)N +10 f +3869 2142(c)N +2110(c)Y +2030(c)Y +1950(c)Y +3 f +4119 2134(0)N +1 f +10 f +2894 2130(i)N +2913(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +2894 2146(i)N +2913(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +2934 2230(SEQUENTIAL)N +10 f +2894 2234(i)N +2913(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +3111 2326(user)N +3390(0.1)X +3677(0.0)X +4012(-100)X +3143 2414(sys)N +3390(0.1)X 
+3677(0.1)X +4119(0)X +3 f +2992 2502(elapsed)N +10 f +3296 2510(c)N +2478(c)Y +2398(c)Y +2318(c)Y +3 f +3390 2502(0.0)N +10 f +3567 2510(c)N +2478(c)Y +2398(c)Y +2318(c)Y +3 f +3677 2502(0.0)N +10 f +3869 2510(c)N +2478(c)Y +2398(c)Y +2318(c)Y +3 f +4119 2502(0)N +1 f +10 f +2894 2498(i)N +2913(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +2894 2514(i)N +2913(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +2934 2598(SEQUENTIAL)N +3453(\(with)X +3642(data)X +3796(retrieval\))X +10 f +2894 2602(i)N +2913(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +3111 2694(user)N +3390(0.1)X +3677(0.1)X +4119(0)X +3143 2782(sys)N +3390(0.1)X +3677(0.1)X +4119(0)X +3 f +2992 2870(elapsed)N +3390(0.0)X +3677(0.0)X +4119(0)X +1 f +10 f +2894 2874(i)N +2913(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +2885 2878(c)N +2862(c)Y +2782(c)Y +2702(c)Y +2622(c)Y +2542(c)Y +2462(c)Y +2382(c)Y +2302(c)Y +2222(c)Y +2142(c)Y +2062(c)Y +1982(c)Y +1902(c)Y +1822(c)Y +1742(c)Y +1662(c)Y +1582(c)Y +1502(c)Y +1422(c)Y +1342(c)Y +1262(c)Y +1182(c)Y +1102(c)Y +1022(c)Y +3296 2878(c)N +2846(c)Y +2766(c)Y +2686(c)Y +3567 2878(c)N +2846(c)Y +2766(c)Y +2686(c)Y +3869 2878(c)N +2846(c)Y +2766(c)Y +2686(c)Y +4264 2878(c)N +2862(c)Y +2782(c)Y +2702(c)Y +2622(c)Y +2542(c)Y +2462(c)Y +2382(c)Y +2302(c)Y +2222(c)Y +2142(c)Y +2062(c)Y +1982(c)Y +1902(c)Y +1822(c)Y +1742(c)Y +1662(c)Y +1582(c)Y +1502(c)Y +1422(c)Y +1342(c)Y +1262(c)Y +1182(c)Y +1102(c)Y +1022(c)Y +2891 3058(i)N +2916(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +2 f +3304 3150(hash)N +3571(hsearch)X +3939(%change)X +1 f +10 f +2891 3154(i)N +2916(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +2931 3246(CREATE/READ)N +10 f +2891 3250(i)N +2916(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +3050 3342(user)N +3329(0.3)X +3648(0.4)X +4048(25)X +3082 3430(sys)N +3329(0.0)X +3648(0.0)X +4088(0)X +3 f +2931 3518(elapsed)N +3329(0.0)X +3648(0.0)X +4088(0)X +1 f +10 f +2891 3522(i)N +2916(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +2882 3534(c)N +3462(c)Y +3382(c)Y +3302(c)Y +3222(c)Y +3142(c)Y +3235 
3518(c)N +3494(c)Y +3414(c)Y +3334(c)Y +3506 3518(c)N +3494(c)Y +3414(c)Y +3334(c)Y +3872 3518(c)N +3494(c)Y +3414(c)Y +3334(c)Y +4267 3534(c)N +3462(c)Y +3382(c)Y +3302(c)Y +3222(c)Y +3142(c)Y +3 f +2706 3658(Figure)N +2953(8b:)X +1 f +3084(Timing)X +3339(results)X +3568(for)X +3682(the)X +3800(password)X +4123(database.)X +10 f +2706 3746 -0.0930(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)AN +3 f +3396 3988(References)N +1 f +2706 4120([ATT79])N +3058(AT&T,)X +3358(DBM\(3X\),)X +2 f +3773(Unix)X +3990(Programmer's)X +2878 4208(Manual,)N +3194(Seventh)X +3491(Edition,)X +3793(Volume)X +4085(1)X +1 f +(,)S +4192(January,)X +2878 4296(1979.)N +2706 4472([ATT85])N +3027(AT&T,)X +3296(HSEARCH\(BA_LIB\),)X +2 f +4053(Unix)X +4239(System)X +2878 4560(User's)N +3112(Manual,)X +3401(System)X +3644(V.3)X +1 f +3753(,)X +3793(pp.)X +3913(506-508,)X +4220(1985.)X +2706 4736([BRE73])N +3025(Brent,)X +3253(Richard)X +3537(P.,)X +3651(``Reducing)X +4041(the)X +4168(Retrieval)X +2878 4824(Time)N +3071(of)X +3162(Scatter)X +3409(Storage)X +3678(Techniques'',)X +2 f +4146(Commun-)X +2878 4912(ications)N +3175(of)X +3281(the)X +3422(ACM)X +1 f +3591(,)X +3654(Volume)X +3955(16,)X +4098(No.)X +4259(2,)X +4362(pp.)X +2878 5000(105-109,)N +3185(February,)X +3515(1973.)X +2706 5176([BSD86])N +3055(NDBM\(3\),)X +2 f +3469(4.3BSD)X +3775(Unix)X +3990(Programmer's)X +2878 5264(Manual)N +3155(Reference)X +3505(Guide)X +1 f +3701(,)X +3749(University)X +4114(of)X +4208(Califor-)X +2878 5352(nia,)N +3016(Berkeley,)X +3346(1986.)X +2706 5528([ENB88])N +3025(Enbody,)X +3319(R.)X +3417(J.,)X +3533(Du,)X +3676(H.)X +3779(C.,)X +3897(``Dynamic)X +4270(Hash-)X +2878 5616(ing)N +3034(Schemes'',)X +2 f +3427(ACM)X +3630(Computing)X +4019(Surveys)X +1 f +4269(,)X +4322(Vol.)X +2878 5704(20,)N +2998(No.)X +3136(2,)X +3216(pp.)X +3336(85-113,)X +3603(June)X +3770(1988.)X +3 f +720 5960(USENIX)N +9 f +1042(-)X +3 f +1106(Winter)X +1371('91)X +9 f +1498(-)X +3 f +1562(Dallas,)X +1815(TX)X +4384(11)X + 
+12 p +%%Page: 12 12 +0(Courier)xf 0 f +10 s 10 xH 0 xS 0 f +3 f +432 258(A)N +510(New)X +682(Hashing)X +985(Package)X +1290(for)X +1413(UNIX)X +3663(Seltzer)X +3920(&)X +4007(Yigit)X +1 f +432 538([FAG79])N +776(Ronald)X +1057(Fagin,)X +1308(Jurg)X +1495(Nievergelt,)X +1903(Nicholas)X +604 626(Pippenger,)N +1003(H.)X +1135(Raymond)X +1500(Strong,)X +1787(``Extendible)X +604 714(Hashing)N +901(--)X +985(A)X +1073(Fast)X +1236(Access)X +1493(Method)X +1771(for)X +1894(Dynamic)X +604 802(Files'',)N +2 f +855(ACM)X +1046(Transactions)X +1485(on)X +1586(Database)X +1914(Systems)X +1 f +2168(,)X +604 890(Volume)N +882(4,)X +962(No.)X +1100(3.,)X +1200(September)X +1563(1979,)X +1763(pp)X +1863(315-34)X +432 1066([KNU68],)N +802(Knuth,)X +1064(D.E.,)X +2 f +1273(The)X +1434(Art)X +1577(of)X +1680(Computer)X +2041(Pro-)X +604 1154(gramming)N +971(Vol.)X +1140(3:)X +1245(Sorting)X +1518(and)X +1676(Searching)X +1 f +2001(,)X +2058(sec-)X +604 1242(tions)N +779(6.3-6.4,)X +1046(pp)X +1146(481-550.)X +432 1418([LAR78])N +747(Larson,)X +1011(Per-Ake,)X +1319(``Dynamic)X +1687(Hashing'',)X +2 f +2048(BIT)X +1 f +(,)S +604 1506(Vol.)N +764(18,)X +884(1978,)X +1084(pp.)X +1204(184-201.)X +432 1682([LAR88])N +752(Larson,)X +1021(Per-Ake,)X +1335(``Dynamic)X +1709(Hash)X +1900(Tables'',)X +2 f +604 1770(Communications)N +1183(of)X +1281(the)X +1415(ACM)X +1 f +1584(,)X +1640(Volume)X +1934(31,)X +2070(No.)X +604 1858(4.,)N +704(April)X +893(1988,)X +1093(pp)X +1193(446-457.)X +432 2034([LIT80])N +731(Witold,)X +1013(Litwin,)X +1286(``Linear)X +1590(Hashing:)X +1939(A)X +2036(New)X +604 2122(Tool)N +786(for)X +911(File)X +1065(and)X +1211(Table)X +1424(Addressing'',)X +2 f +1893(Proceed-)X +604 2210(ings)N +761(of)X +847(the)X +969(6th)X +1095(International)X +1540(Conference)X +1933(on)X +2036(Very)X +604 2298(Large)N +815(Databases)X +1 f +1153(,)X +1193(1980.)X +432 2474([NEL90])N +743(Nelson,)X +1011(Philip)X +1222(A.,)X +2 f +1341(Gdbm)X +1558(1.4)X +1679(source)X 
+1913(distribu-)X +604 2562(tion)N +748(and)X +888(README)X +1 f +1209(,)X +1249(August)X +1500(1990.)X +432 2738([THOM90])N +840(Ken)X +1011(Thompson,)X +1410(private)X +1670(communication,)X +604 2826(Nov.)N +782(1990.)X +432 3002([TOR87])N +790(Torek,)X +1066(C.,)X +1222(``Re:)X +1470(dbm.a)X +1751(and)X +1950(ndbm.a)X +604 3090(archives'',)N +2 f +966(USENET)X +1279(newsgroup)X +1650(comp.unix)X +1 f +2002(1987.)X +432 3266([TOR88])N +760(Torek,)X +1006(C.,)X +1133(``Re:)X +1351(questions)X +1686(regarding)X +2027(data-)X +604 3354(bases)N +826(created)X +1106(with)X +1295(dbm)X +1484(and)X +1647(ndbm)X +1876(routines'')X +2 f +604 3442(USENET)N +937(newsgroup)X +1328(comp.unix.questions)X +1 f +1982(,)X +2041(June)X +604 3530(1988.)N +432 3706([WAL84])N +773(Wales,)X +1018(R.,)X +1135(``Discussion)X +1564(of)X +1655("dbm")X +1887(data)X +2045(base)X +604 3794(system'',)N +2 f +973(USENET)X +1339(newsgroup)X +1762(unix.wizards)X +1 f +2168(,)X +604 3882(January,)N +894(1984.)X +432 4058([YIG89])N +751(Ozan)X +963(S.)X +1069(Yigit,)X +1294(``How)X +1545(to)X +1648(Roll)X +1826(Your)X +2032(Own)X +604 4146(Dbm/Ndbm'',)N +2 f +1087(unpublished)X +1504(manuscript)X +1 f +(,)S +1910(Toronto,)X +604 4234(July,)N +777(1989)X +3 f +432 5960(12)N +2970(USENIX)X +9 f +3292(-)X +3 f +3356(Winter)X +3621('91)X +9 f +3748(-)X +3 f +3812(Dallas,)X +4065(TX)X + +13 p +%%Page: 13 13 +0(Courier)xf 0 f +10 s 10 xH 0 xS 0 f +3 f +720 258(Seltzer)N +977(&)X +1064(Yigit)X +3278(A)X +3356(New)X +3528(Hashing)X +3831(Package)X +4136(for)X +4259(UNIX)X +1 f +720 538(Margo)N +960(I.)X +1033(Seltzer)X +1282(is)X +1361(a)X +1423(Ph.D.)X +1631(student)X +1887(in)X +1974(the)X +2097(Department)X +720 626(of)N +823(Electrical)X +1167(Engineering)X +1595(and)X +1747(Computer)X +2102(Sciences)X +2418(at)X +720 714(the)N +850(University)X +1220(of)X +1318(California,)X +1694(Berkeley.)X +2055(Her)X +2207(research)X +720 802(interests)N +1017(include)X +1283(\256le)X +1415(systems,)X 
+1718(databases,)X +2076(and)X +2221(transac-)X +720 890(tion)N +896(processing)X +1291(systems.)X +1636(She)X +1807(spent)X +2027(several)X +2306(years)X +720 978(working)N +1026(at)X +1123(startup)X +1380(companies)X +1762(designing)X +2112(and)X +2267(imple-)X +720 1066(menting)N +1048(\256le)X +1216(systems)X +1535(and)X +1716(transaction)X +2133(processing)X +720 1154(software)N +1026(and)X +1170(designing)X +1509(microprocessors.)X +2103(Ms.)X +2253(Seltzer)X +720 1242(received)N +1057(her)X +1223(AB)X +1397(in)X +1522(Applied)X +1843(Mathematics)X +2320(from)X +720 1330 0.1953(Harvard/Radcliffe)AN +1325(College)X +1594(in)X +1676(1983.)X +720 1444(In)N +810(her)X +936(spare)X +1129(time,)X +1313(Margo)X +1549(can)X +1683(usually)X +1936(be)X +2034(found)X +2243(prepar-)X +720 1532(ing)N +868(massive)X +1171(quantities)X +1527(of)X +1639(food)X +1831(for)X +1970(hungry)X +2242(hoards,)X +720 1620(studying)N +1022(Japanese,)X +1355(or)X +1449(playing)X +1716(soccer)X +1948(with)X +2116(an)X +2218(exciting)X +720 1708(Bay)N +912(Area)X +1132(Women's)X +1507(Soccer)X +1788(team,)X +2026(the)X +2186(Berkeley)X +720 1796(Bruisers.)N +720 1910(Ozan)N +915(\()X +3 f +942(Oz)X +1 f +1040(\))X +1092(Yigit)X +1281(is)X +1358(currently)X +1672(a)X +1732(software)X +2033(engineer)X +2334(with)X +720 1998(the)N +886(Communications)X +1499(Research)X +1861(and)X +2044(Development)X +720 2086(group,)N +948(Computing)X +1328(Services,)X +1641(York)X +1826(University.)X +2224(His)X +2355(for-)X +720 2174(mative)N +967(years)X +1166(were)X +1352(also)X +1510(spent)X +1708(at)X +1795(York,)X +2009(where)X +2234(he)X +2338(held)X +720 2262(system)N +985(programmer)X +1425(and)X +1583(administrator)X +2052(positions)X +2382(for)X +720 2350(various)N +995(mixtures)X +1314(of)X +1420(of)X +1526(UNIX)X +1765(systems)X +2056(starting)X +2334(with)X +720 2438(Berkeley)N +1031(4.1)X +1151(in)X +1233(1982,)X +1433(while)X +1631(at)X +1709(the)X +1827(same)X +2012(time)X 
+2174(obtaining)X +720 2526(a)N +776(degree)X +1011(in)X +1093(Computer)X +1433(Science.)X +720 2640(In)N +813(his)X +931(copious)X +1205(free)X +1356(time,)X +1543(Oz)X +1662(enjoys)X +1896(working)X +2188(on)X +2293(what-)X +720 2728(ever)N +890(software)X +1197(looks)X +1400(interesting,)X +1788(which)X +2014(often)X +2209(includes)X +720 2816(language)N +1044(interpreters,)X +1464(preprocessors,)X +1960(and)X +2110(lately,)X +2342(pro-)X +720 2904(gram)N +905(generators)X +1260(and)X +1396(expert)X +1617(systems.)X +720 3018(Oz)N +836(has)X +964(authored)X +1266(several)X +1515(public-domain)X +2003(software)X +2301(tools,)X +720 3106(including)N +1069(an)X +1191(nroff-like)X +1545(text)X +1711(formatter)X +2 f +2056(proff)X +1 f +2257(that)X +2423(is)X +720 3194(apparently)N +1083(still)X +1226(used)X +1397(in)X +1483(some)X +1676(basement)X +2002(PCs.)X +2173(His)X +2307(latest)X +720 3282(obsessions)N +1143(include)X +1460(the)X +1639(incredible)X +2040(programming)X +720 3370(language)N +1030(Scheme,)X +1324(and)X +1460(Chinese)X +1738(Brush)X +1949(painting.)X +3 f +720 5960(USENIX)N +9 f +1042(-)X +3 f +1106(Winter)X +1371('91)X +9 f +1498(-)X +3 f +1562(Dallas,)X +1815(TX)X +4384(13)X + +14 p +%%Page: 14 14 +0(Courier)xf 0 f +10 s 10 xH 0 xS 0 f +3 f +432 5960(14)N +2970(USENIX)X +9 f +3292(-)X +3 f +3356(Winter)X +3621('91)X +9 f +3748(-)X +3 f +3812(Dallas,)X +4065(TX)X + +14 p +%%Trailer +xt + +xs diff --git a/db/docs/ref/refs/libtp_usenix.ps b/db/docs/ref/refs/libtp_usenix.ps new file mode 100644 index 000000000..ea821a914 --- /dev/null +++ b/db/docs/ref/refs/libtp_usenix.ps @@ -0,0 +1,12340 @@ +%!PS-Adobe-1.0 +%%Creator: utopia:margo (& Seltzer,608-13E,8072,) +%%Title: stdin (ditroff) +%%CreationDate: Thu Dec 12 15:32:11 1991 +%%EndComments +% @(#)psdit.pro 1.3 4/15/88 +% lib/psdit.pro -- prolog for psdit (ditroff) files +% Copyright (c) 1984, 1985 Adobe Systems Incorporated. All Rights Reserved. 
+% last edit: shore Sat Nov 23 20:28:03 1985 +% RCSID: $Header: psdit.pro,v 2.1 85/11/24 12:19:43 shore Rel $ + +% Changed by Edward Wang (edward@ucbarpa.berkeley.edu) to handle graphics, +% 17 Feb, 87. + +/$DITroff 140 dict def $DITroff begin +/fontnum 1 def /fontsize 10 def /fontheight 10 def /fontslant 0 def +/xi{0 72 11 mul translate 72 resolution div dup neg scale 0 0 moveto + /fontnum 1 def /fontsize 10 def /fontheight 10 def /fontslant 0 def F + /pagesave save def}def +/PB{save /psv exch def currentpoint translate + resolution 72 div dup neg scale 0 0 moveto}def +/PE{psv restore}def +/arctoobig 90 def /arctoosmall .05 def +/m1 matrix def /m2 matrix def /m3 matrix def /oldmat matrix def +/tan{dup sin exch cos div}def +/point{resolution 72 div mul}def +/dround {transform round exch round exch itransform}def +/xT{/devname exch def}def +/xr{/mh exch def /my exch def /resolution exch def}def +/xp{}def +/xs{docsave restore end}def +/xt{}def +/xf{/fontname exch def /slotno exch def fontnames slotno get fontname eq not + {fonts slotno fontname findfont put fontnames slotno fontname put}if}def +/xH{/fontheight exch def F}def +/xS{/fontslant exch def F}def +/s{/fontsize exch def /fontheight fontsize def F}def +/f{/fontnum exch def F}def +/F{fontheight 0 le{/fontheight fontsize def}if + fonts fontnum get fontsize point 0 0 fontheight point neg 0 0 m1 astore + fontslant 0 ne{1 0 fontslant tan 1 0 0 m2 astore m3 concatmatrix}if + makefont setfont .04 fontsize point mul 0 dround pop setlinewidth}def +/X{exch currentpoint exch pop moveto show}def +/N{3 1 roll moveto show}def +/Y{exch currentpoint pop exch moveto show}def +/S{show}def +/ditpush{}def/ditpop{}def +/AX{3 -1 roll currentpoint exch pop moveto 0 exch ashow}def +/AN{4 2 roll moveto 0 exch ashow}def +/AY{3 -1 roll currentpoint pop exch moveto 0 exch ashow}def +/AS{0 exch ashow}def +/MX{currentpoint exch pop moveto}def +/MY{currentpoint pop exch moveto}def +/MXY{moveto}def +/cb{pop}def % action on unknown char -- 
nothing for now +/n{}def/w{}def +/p{pop showpage pagesave restore /pagesave save def}def +/Dt{/Dlinewidth exch def}def 1 Dt +/Ds{/Ddash exch def}def -1 Ds +/Di{/Dstipple exch def}def 1 Di +/Dsetlinewidth{2 Dlinewidth mul setlinewidth}def +/Dsetdash{Ddash 4 eq{[8 12]}{Ddash 16 eq{[32 36]} + {Ddash 20 eq{[32 12 8 12]}{[]}ifelse}ifelse}ifelse 0 setdash}def +/Dstroke{gsave Dsetlinewidth Dsetdash 1 setlinecap stroke grestore + currentpoint newpath moveto}def +/Dl{rlineto Dstroke}def +/arcellipse{/diamv exch def /diamh exch def oldmat currentmatrix pop + currentpoint translate 1 diamv diamh div scale /rad diamh 2 div def + currentpoint exch rad add exch rad -180 180 arc oldmat setmatrix}def +/Dc{dup arcellipse Dstroke}def +/De{arcellipse Dstroke}def +/Da{/endv exch def /endh exch def /centerv exch def /centerh exch def + /cradius centerv centerv mul centerh centerh mul add sqrt def + /eradius endv endv mul endh endh mul add sqrt def + /endang endv endh atan def + /startang centerv neg centerh neg atan def + /sweep startang endang sub dup 0 lt{360 add}if def + sweep arctoobig gt + {/midang startang sweep 2 div sub def /midrad cradius eradius add 2 div def + /midh midang cos midrad mul def /midv midang sin midrad mul def + midh neg midv neg endh endv centerh centerv midh midv Da + Da} + {sweep arctoosmall ge + {/controldelt 1 sweep 2 div cos sub 3 sweep 2 div sin mul div 4 mul def + centerv neg controldelt mul centerh controldelt mul + endv neg controldelt mul centerh add endh add + endh controldelt mul centerv add endv add + centerh endh add centerv endv add rcurveto Dstroke} + {centerh endh add centerv endv add rlineto Dstroke} + ifelse} + ifelse}def +/Dpatterns[ +[%cf[widthbits] +[8<0000000000000010>] +[8<0411040040114000>] +[8<0204081020408001>] +[8<0000103810000000>] +[8<6699996666999966>] +[8<0000800100001008>] +[8<81c36666c3810000>] +[8<0f0e0c0800000000>] +[8<0000000000000010>] +[8<0411040040114000>] +[8<0204081020408001>] +[8<0000001038100000>] 
+[8<6699996666999966>] +[8<0000800100001008>] +[8<81c36666c3810000>] +[8<0f0e0c0800000000>] +[8<0042660000246600>] +[8<0000990000990000>] +[8<0804020180402010>] +[8<2418814242811824>] +[8<6699996666999966>] +[8<8000000008000000>] +[8<00001c3e363e1c00>] +[8<0000000000000000>] +[32<00000040000000c00000004000000040000000e0000000000000000000000000>] +[32<00000000000060000000900000002000000040000000f0000000000000000000>] +[32<000000000000000000e0000000100000006000000010000000e0000000000000>] +[32<00000000000000002000000060000000a0000000f00000002000000000000000>] +[32<0000000e0000000000000000000000000000000f000000080000000e00000001>] +[32<0000090000000600000000000000000000000000000007000000080000000e00>] +[32<00010000000200000004000000040000000000000000000000000000000f0000>] +[32<0900000006000000090000000600000000000000000000000000000006000000>]] +[%ug +[8<0000020000000000>] +[8<0000020000002000>] +[8<0004020000002000>] +[8<0004020000402000>] +[8<0004060000402000>] +[8<0004060000406000>] +[8<0006060000406000>] +[8<0006060000606000>] +[8<00060e0000606000>] +[8<00060e000060e000>] +[8<00070e000060e000>] +[8<00070e000070e000>] +[8<00070e020070e000>] +[8<00070e020070e020>] +[8<04070e020070e020>] +[8<04070e024070e020>] +[8<04070e064070e020>] +[8<04070e064070e060>] +[8<06070e064070e060>] +[8<06070e066070e060>] +[8<06070f066070e060>] +[8<06070f066070f060>] +[8<060f0f066070f060>] +[8<060f0f0660f0f060>] +[8<060f0f0760f0f060>] +[8<060f0f0760f0f070>] +[8<0e0f0f0760f0f070>] +[8<0e0f0f07e0f0f070>] +[8<0e0f0f0fe0f0f070>] +[8<0e0f0f0fe0f0f0f0>] +[8<0f0f0f0fe0f0f0f0>] +[8<0f0f0f0ff0f0f0f0>] +[8<1f0f0f0ff0f0f0f0>] +[8<1f0f0f0ff1f0f0f0>] +[8<1f0f0f8ff1f0f0f0>] +[8<1f0f0f8ff1f0f0f8>] +[8<9f0f0f8ff1f0f0f8>] +[8<9f0f0f8ff9f0f0f8>] +[8<9f0f0f9ff9f0f0f8>] +[8<9f0f0f9ff9f0f0f9>] +[8<9f8f0f9ff9f0f0f9>] +[8<9f8f0f9ff9f8f0f9>] +[8<9f8f1f9ff9f8f0f9>] +[8<9f8f1f9ff9f8f1f9>] +[8<bf8f1f9ff9f8f1f9>] +[8<bf8f1f9ffbf8f1f9>] +[8<bf8f1fdffbf8f1f9>] +[8<bf8f1fdffbf8f1fd>] +[8<ff8f1fdffbf8f1fd>] 
+[8<ff8f1fdffff8f1fd>] +[8<ff8f1ffffff8f1fd>] +[8<ff8f1ffffff8f1ff>] +[8<ff9f1ffffff8f1ff>] +[8<ff9f1ffffff9f1ff>] +[8<ff9f9ffffff9f1ff>] +[8<ff9f9ffffff9f9ff>] +[8<ffbf9ffffff9f9ff>] +[8<ffbf9ffffffbf9ff>] +[8<ffbfdffffffbf9ff>] +[8<ffbfdffffffbfdff>] +[8<ffffdffffffbfdff>] +[8<ffffdffffffffdff>] +[8<fffffffffffffdff>] +[8<ffffffffffffffff>]] +[%mg +[8<8000000000000000>] +[8<0822080080228000>] +[8<0204081020408001>] +[8<40e0400000000000>] +[8<66999966>] +[8<8001000010080000>] +[8<81c36666c3810000>] +[8<f0e0c08000000000>] +[16<07c00f801f003e007c00f800f001e003c007800f001f003e007c00f801f003e0>] +[16<1f000f8007c003e001f000f8007c003e001f800fc007e003f001f8007c003e00>] +[8<c3c300000000c3c3>] +[16<0040008001000200040008001000200040008000000100020004000800100020>] +[16<0040002000100008000400020001800040002000100008000400020001000080>] +[16<1fc03fe07df0f8f8f07de03fc01f800fc01fe03ff07df8f87df03fe01fc00f80>] +[8<80>] +[8<8040201000000000>] +[8<84cc000048cc0000>] +[8<9900009900000000>] +[8<08040201804020100800020180002010>] +[8<2418814242811824>] +[8<66999966>] +[8<8000000008000000>] +[8<70f8d8f870000000>] +[8<0814224180402010>] +[8<aa00440a11a04400>] +[8<018245aa45820100>] +[8<221c224180808041>] +[8<88000000>] +[8<0855800080550800>] +[8<2844004482440044>] +[8<0810204080412214>] +[8<00>]]]def +/Dfill{ + transform /maxy exch def /maxx exch def + transform /miny exch def /minx exch def + minx maxx gt{/minx maxx /maxx minx def def}if + miny maxy gt{/miny maxy /maxy miny def def}if + Dpatterns Dstipple 1 sub get exch 1 sub get + aload pop /stip exch def /stipw exch def /stiph 128 def + /imatrix[stipw 0 0 stiph 0 0]def + /tmatrix[stipw 0 0 stiph 0 0]def + /minx minx cvi stiph idiv stiph mul def + /miny miny cvi stipw idiv stipw mul def + gsave eoclip 0 setgray + miny stiph maxy{ + tmatrix exch 5 exch put + minx stipw maxx{ + tmatrix exch 4 exch put tmatrix setmatrix + stipw stiph true imatrix {stip} imagemask + }for + }for + grestore +}def +/Dp{Dfill Dstroke}def +/DP{Dfill 
currentpoint newpath moveto}def +end + +/ditstart{$DITroff begin + /nfonts 60 def % NFONTS makedev/ditroff dependent! + /fonts[nfonts{0}repeat]def + /fontnames[nfonts{()}repeat]def +/docsave save def +}def + +% character outcalls +/oc{ + /pswid exch def /cc exch def /name exch def + /ditwid pswid fontsize mul resolution mul 72000 div def + /ditsiz fontsize resolution mul 72 div def + ocprocs name known{ocprocs name get exec}{name cb}ifelse +}def +/fractm [.65 0 0 .6 0 0] def +/fraction{ + /fden exch def /fnum exch def gsave /cf currentfont def + cf fractm makefont setfont 0 .3 dm 2 copy neg rmoveto + fnum show rmoveto currentfont cf setfont(\244)show setfont fden show + grestore ditwid 0 rmoveto +}def +/oce{grestore ditwid 0 rmoveto}def +/dm{ditsiz mul}def +/ocprocs 50 dict def ocprocs begin +(14){(1)(4)fraction}def +(12){(1)(2)fraction}def +(34){(3)(4)fraction}def +(13){(1)(3)fraction}def +(23){(2)(3)fraction}def +(18){(1)(8)fraction}def +(38){(3)(8)fraction}def +(58){(5)(8)fraction}def +(78){(7)(8)fraction}def +(sr){gsave 0 .06 dm rmoveto(\326)show oce}def +(is){gsave 0 .15 dm rmoveto(\362)show oce}def +(->){gsave 0 .02 dm rmoveto(\256)show oce}def +(<-){gsave 0 .02 dm rmoveto(\254)show oce}def +(==){gsave 0 .05 dm rmoveto(\272)show oce}def +(uc){gsave currentpoint 400 .009 dm mul add translate + 8 -8 scale ucseal oce}def +end + +% an attempt at a PostScript FONT to implement ditroff special chars +% this will enable us to +% cache the little buggers +% generate faster, more compact PS out of psdit +% confuse everyone (including myself)! +50 dict dup begin +/FontType 3 def +/FontName /DIThacks def +/FontMatrix [.001 0 0 .001 0 0] def +/FontBBox [-260 -260 900 900] def% a lie but ... 
+/Encoding 256 array def +0 1 255{Encoding exch /.notdef put}for +Encoding + dup 8#040/space put %space + dup 8#110/rc put %right ceil + dup 8#111/lt put %left top curl + dup 8#112/bv put %bold vert + dup 8#113/lk put %left mid curl + dup 8#114/lb put %left bot curl + dup 8#115/rt put %right top curl + dup 8#116/rk put %right mid curl + dup 8#117/rb put %right bot curl + dup 8#120/rf put %right floor + dup 8#121/lf put %left floor + dup 8#122/lc put %left ceil + dup 8#140/sq put %square + dup 8#141/bx put %box + dup 8#142/ci put %circle + dup 8#143/br put %box rule + dup 8#144/rn put %root extender + dup 8#145/vr put %vertical rule + dup 8#146/ob put %outline bullet + dup 8#147/bu put %bullet + dup 8#150/ru put %rule + dup 8#151/ul put %underline + pop +/DITfd 100 dict def +/BuildChar{0 begin + /cc exch def /fd exch def + /charname fd /Encoding get cc get def + /charwid fd /Metrics get charname get def + /charproc fd /CharProcs get charname get def + charwid 0 fd /FontBBox get aload pop setcachedevice + 2 setlinejoin 40 setlinewidth + newpath 0 0 moveto gsave charproc grestore + end}def +/BuildChar load 0 DITfd put +/CharProcs 50 dict def +CharProcs begin +/space{}def +/.notdef{}def +/ru{500 0 rls}def +/rn{0 840 moveto 500 0 rls}def +/vr{0 800 moveto 0 -770 rls}def +/bv{0 800 moveto 0 -1000 rls}def +/br{0 840 moveto 0 -1000 rls}def +/ul{0 -140 moveto 500 0 rls}def +/ob{200 250 rmoveto currentpoint newpath 200 0 360 arc closepath stroke}def +/bu{200 250 rmoveto currentpoint newpath 200 0 360 arc closepath fill}def +/sq{80 0 rmoveto currentpoint dround newpath moveto + 640 0 rlineto 0 640 rlineto -640 0 rlineto closepath stroke}def +/bx{80 0 rmoveto currentpoint dround newpath moveto + 640 0 rlineto 0 640 rlineto -640 0 rlineto closepath fill}def +/ci{500 360 rmoveto currentpoint newpath 333 0 360 arc + 50 setlinewidth stroke}def + +/lt{0 -200 moveto 0 550 rlineto currx 800 2cx s4 add exch s4 a4p stroke}def +/lb{0 800 moveto 0 -550 rlineto currx -200 2cx s4 add exch 
s4 a4p stroke}def +/rt{0 -200 moveto 0 550 rlineto currx 800 2cx s4 sub exch s4 a4p stroke}def +/rb{0 800 moveto 0 -500 rlineto currx -200 2cx s4 sub exch s4 a4p stroke}def +/lk{0 800 moveto 0 300 -300 300 s4 arcto pop pop 1000 sub + 0 300 4 2 roll s4 a4p 0 -200 lineto stroke}def +/rk{0 800 moveto 0 300 s2 300 s4 arcto pop pop 1000 sub + 0 300 4 2 roll s4 a4p 0 -200 lineto stroke}def +/lf{0 800 moveto 0 -1000 rlineto s4 0 rls}def +/rf{0 800 moveto 0 -1000 rlineto s4 neg 0 rls}def +/lc{0 -200 moveto 0 1000 rlineto s4 0 rls}def +/rc{0 -200 moveto 0 1000 rlineto s4 neg 0 rls}def +end + +/Metrics 50 dict def Metrics begin +/.notdef 0 def +/space 500 def +/ru 500 def +/br 0 def +/lt 416 def +/lb 416 def +/rt 416 def +/rb 416 def +/lk 416 def +/rk 416 def +/rc 416 def +/lc 416 def +/rf 416 def +/lf 416 def +/bv 416 def +/ob 350 def +/bu 350 def +/ci 750 def +/bx 750 def +/sq 750 def +/rn 500 def +/ul 500 def +/vr 0 def +end + +DITfd begin +/s2 500 def /s4 250 def /s3 333 def +/a4p{arcto pop pop pop pop}def +/2cx{2 copy exch}def +/rls{rlineto stroke}def +/currx{currentpoint pop}def +/dround{transform round exch round exch itransform} def +end +end +/DIThacks exch definefont pop +ditstart +(psc)xT +576 1 1 xr +1(Times-Roman)xf 1 f +2(Times-Italic)xf 2 f +3(Times-Bold)xf 3 f +4(Times-BoldItalic)xf 4 f +5(Helvetica)xf 5 f +6(Helvetica-Bold)xf 6 f +7(Courier)xf 7 f +8(Courier-Bold)xf 8 f +9(Symbol)xf 9 f +10(DIThacks)xf 10 f +10 s +1 f +xi +%%EndProlog + +%%Page: 1 1 +10 s 10 xH 0 xS 1 f +3 f +14 s +1205 1206(LIBTP:)N +1633(Portable,)X +2100(M)X +2206(odular)X +2551(Transactions)X +3202(for)X +3374(UNIX)X +1 f +11 s +3661 1162(1)N +2 f +12 s +2182 1398(Margo)N +2467(Seltzer)X +2171 1494(Michael)N +2511(Olson)X +1800 1590(University)N +2225(of)X +2324(California,)X +2773(Berkeley)X +3 f +2277 1878(Abstract)N +1 f +10 s +755 2001(Transactions)N +1198(provide)X +1475(a)X +1543(useful)X +1771(programming)X +2239(paradigm)X +2574(for)X +2700(maintaining)X +3114(logical)X 
+3364(consistency,)X +3790(arbitrating)X +4156(con-)X +555 2091(current)N +808(access,)X +1059(and)X +1200(managing)X +1540(recovery.)X +1886(In)X +1977(traditional)X +2330(UNIX)X +2555(systems,)X +2852(the)X +2974(only)X +3140(easy)X +3307(way)X +3465(of)X +3556(using)X +3753(transactions)X +4160(is)X +4237(to)X +555 2181(purchase)N +876(a)X +947(database)X +1258(system.)X +1554(Such)X +1748(systems)X +2035(are)X +2168(often)X +2367(slow,)X +2572(costly,)X +2817(and)X +2967(may)X +3139(not)X +3275(provide)X +3554(the)X +3686(exact)X +3890(functionality)X +555 2271(desired.)N +848(This)X +1011(paper)X +1210(presents)X +1493(the)X +1611(design,)X +1860(implementation,)X +2402(and)X +2538(performance)X +2965(of)X +3052(LIBTP,)X +3314(a)X +3370(simple,)X +3623(non-proprietary)X +4147(tran-)X +555 2361(saction)N +809(library)X +1050(using)X +1249(the)X +1373(4.4BSD)X +1654(database)X +1957(access)X +2189(routines)X +2473(\()X +3 f +2500(db)X +1 f +2588(\(3\)\).)X +2775(On)X +2899(a)X +2961(conventional)X +3401(transaction)X +3779(processing)X +4148(style)X +555 2451(benchmark,)N +959(its)X +1061(performance)X +1495(is)X +1575(approximately)X +2065(85%)X +2239(that)X +2386(of)X +2480(the)X +2604(database)X +2907(access)X +3139(routines)X +3423(without)X +3693(transaction)X +4071(protec-)X +555 2541(tion,)N +725(200%)X +938(that)X +1084(of)X +1177(using)X +3 f +1376(fsync)X +1 f +1554(\(2\))X +1674(to)X +1761(commit)X +2030(modi\256cations)X +2490(to)X +2577(disk,)X +2755(and)X +2896(125%)X +3108(that)X +3253(of)X +3345(a)X +3406(commercial)X +3810(relational)X +4138(data-)X +555 2631(base)N +718(system.)X +3 f +555 2817(1.)N +655(Introduction)X +1 f +755 2940(Transactions)N +1186(are)X +1306(used)X +1474(in)X +1557(database)X +1855(systems)X +2129(to)X +2212(enable)X +2443(concurrent)X +2807(users)X +2992(to)X +3074(apply)X +3272(multi-operation)X +3790(updates)X +4055(without)X +555 3030(violating)N +863(the)X +985(integrity)X +1280(of)X +1371(the)X +1493(database.)X 
+1814(They)X +2003(provide)X +2271(the)X +2392(properties)X +2736(of)X +2826(atomicity,)X +3171(consistency,)X +3588(isolation,)X +3906(and)X +4045(durabil-)X +555 3120(ity.)N +701(By)X +816(atomicity,)X +1160(we)X +1276(mean)X +1472(that)X +1614(the)X +1734(set)X +1845(of)X +1934(updates)X +2200(comprising)X +2581(a)X +2638(transaction)X +3011(must)X +3187(be)X +3284(applied)X +3541(as)X +3629(a)X +3686(single)X +3898(unit;)X +4085(that)X +4226(is,)X +555 3210(they)N +714(must)X +890(either)X +1094(all)X +1195(be)X +1292(applied)X +1549(to)X +1632(the)X +1751(database)X +2049(or)X +2137(all)X +2238(be)X +2335(absent.)X +2601(Consistency)X +3013(requires)X +3293(that)X +3434(a)X +3491(transaction)X +3864(take)X +4019(the)X +4138(data-)X +555 3300(base)N +725(from)X +908(one)X +1051(logically)X +1358(consistent)X +1704(state)X +1877(to)X +1965(another.)X +2272(The)X +2423(property)X +2721(of)X +2814(isolation)X +3115(requires)X +3400(that)X +3546(concurrent)X +3916(transactions)X +555 3390(yield)N +750(results)X +994(which)X +1225(are)X +1358(indistinguishable)X +1938(from)X +2128(the)X +2260(results)X +2503(which)X +2733(would)X +2967(be)X +3077(obtained)X +3387(by)X +3501(running)X +3784(the)X +3916(transactions)X +555 3480(sequentially.)N +1002(Finally,)X +1268(durability)X +1599(requires)X +1878(that)X +2018(once)X +2190(transactions)X +2593(have)X +2765(been)X +2937(committed,)X +3319(their)X +3486(results)X +3715(must)X +3890(be)X +3986(preserved)X +555 3570(across)N +776(system)X +1018(failures)X +1279([TPCB90].)X +755 3693(Although)N +1080(these)X +1268(properties)X +1612(are)X +1734(most)X +1912(frequently)X +2265(discussed)X +2595(in)X +2680(the)X +2801(context)X +3060(of)X +3150(databases,)X +3501(they)X +3661(are)X +3782(useful)X +4000(program-)X +555 3783(ming)N +750(paradigms)X +1114(for)X +1238(more)X +1433(general)X +1700(purpose)X +1984(applications.)X +2441(There)X +2659(are)X +2788(several)X +3046(different)X +3353(situations)X +3689(where)X 
+3916(transactions)X +555 3873(can)N +687(be)X +783(used)X +950(to)X +1032(replace)X +1285(current)X +1533(ad-hoc)X +1772(mechanisms.)X +755 3996(One)N +910(situation)X +1206(is)X +1280(when)X +1475(multiple)X +1762(\256les)X +1916(or)X +2004(parts)X +2181(of)X +2269(\256les)X +2422(need)X +2594(to)X +2676(be)X +2772(updated)X +3046(in)X +3128(an)X +3224(atomic)X +3462(fashion.)X +3758(For)X +3889(example,)X +4201(the)X +555 4086(traditional)N +907(UNIX)X +1131(\256le)X +1256(system)X +1501(uses)X +1661(ordering)X +1955(constraints)X +2324(to)X +2408(achieve)X +2676(recoverability)X +3144(in)X +3228(the)X +3348(face)X +3505(of)X +3594(crashes.)X +3893(When)X +4107(a)X +4165(new)X +555 4176(\256le)N +678(is)X +752(created,)X +1026(its)X +1122(inode)X +1321(is)X +1395(written)X +1642(to)X +1724(disk)X +1877(before)X +2103(the)X +2221(new)X +2375(\256le)X +2497(is)X +2570(added)X +2782(to)X +2864(the)X +2982(directory)X +3292(structure.)X +3633(This)X +3795(guarantees)X +4159(that,)X +555 4266(if)N +627(the)X +748(system)X +993(crashes)X +1253(between)X +1544(the)X +1665(two)X +1808(I/O's,)X +2016(the)X +2137(directory)X +2450(does)X +2620(not)X +2744(contain)X +3002(a)X +3060 0.4531(reference)AX +3383(to)X +3467(an)X +3565(invalid)X +3809(inode.)X +4049(In)X +4138(actu-)X +555 4356(ality,)N +741(the)X +863(desired)X +1119(effect)X +1326(is)X +1402(that)X +1545(these)X +1733(two)X +1876(updates)X +2144(have)X +2319(the)X +2440(transactional)X +2873(property)X +3168(of)X +3258(atomicity)X +3583(\(either)X +3816(both)X +3981(writes)X +4200(are)X +555 4446(visible)N +790(or)X +879(neither)X +1124(is\).)X +1266(Rather)X +1501(than)X +1660(building)X +1947(special)X +2191(purpose)X +2466(recovery)X +2769(mechanisms)X +3186(into)X +3331(the)X +3450(\256le)X +3573(system)X +3816(or)X +3904(related)X +4144(tools)X +555 4536(\()N +2 f +582(e.g.)X +3 f +726(fsck)X +1 f +864(\(8\)\),)X +1033(one)X +1177(could)X +1383(use)X +1518(general)X +1783(purpose)X +2064(transaction)X 
+2443(recovery)X +2752(protocols)X +3077(after)X +3252(system)X +3501(failure.)X +3778(Any)X +3943(application)X +555 4626(that)N +705(needs)X +918(to)X +1010(keep)X +1192(multiple,)X +1508(related)X +1757(\256les)X +1920(\(or)X +2044(directories\))X +2440(consistent)X +2790(should)X +3032(do)X +3141(so)X +3241(using)X +3443(transactions.)X +3895(Source)X +4147(code)X +555 4716(control)N +805(systems,)X +1101(such)X +1271(as)X +1361(RCS)X +1534(and)X +1673(SCCS,)X +1910(should)X +2146(use)X +2276(transaction)X +2651(semantics)X +2990(to)X +3075(allow)X +3276(the)X +3397(``checking)X +3764(in'')X +3903(of)X +3992(groups)X +4232(of)X +555 4806(related)N +801(\256les.)X +1001(In)X +1095(this)X +1237(way,)X +1418(if)X +1493(the)X +1617 0.2841(``check-in'')AX +2028(fails,)X +2212(the)X +2336(transaction)X +2714(may)X +2878(be)X +2980(aborted,)X +3267(backing)X +3547(out)X +3675(the)X +3799(partial)X +4030(``check-)X +555 4896(in'')N +691(leaving)X +947(the)X +1065(source)X +1295(repository)X +1640(in)X +1722(a)X +1778(consistent)X +2118(state.)X +755 5019(A)N +842(second)X +1094(situation)X +1398(where)X +1624(transactions)X +2036(can)X +2177(be)X +2282(used)X +2458(to)X +2549(replace)X +2811(current)X +3068(ad-hoc)X +3316(mechanisms)X +3741(is)X +3822(in)X +3912(applications)X +555 5109(where)N +776(concurrent)X +1144(updates)X +1413(to)X +1499(a)X +1559(shared)X +1793(\256le)X +1919(are)X +2042(desired,)X +2318(but)X +2444(there)X +2629(is)X +2706(logical)X +2948(consistency)X +3345(of)X +3435(the)X +3556(data)X +3713(which)X +3932(needs)X +4138(to)X +4223(be)X +555 5199(preserved.)N +928(For)X +1059(example,)X +1371(when)X +1565(the)X +1683(password)X +2006(\256le)X +2128(is)X +2201(updated,)X +2495(\256le)X +2617(locking)X +2877(is)X +2950(used)X +3117(to)X +3199(disallow)X +3490(concurrent)X +3854(access.)X +4120(Tran-)X +555 5289(saction)N +804(semantics)X +1142(on)X +1244(the)X +1364(password)X +1689(\256les)X +1844(would)X +2066(allow)X +2266(concurrent)X 
+2632(updates,)X +2919(while)X +3119(preserving)X +3479(the)X +3598(logical)X +3837(consistency)X +4232(of)X +555 5379(the)N +681(password)X +1012(database.)X +1357(Similarly,)X +1702(UNIX)X +1930(utilities)X +2196(which)X +2419(rewrite)X +2674(\256les)X +2834(face)X +2996(a)X +3059(potential)X +3366(race)X +3528(condition)X +3857(between)X +4152(their)X +555 5469(rewriting)N +871(a)X +929(\256le)X +1053(and)X +1191(another)X +1453(process)X +1715(reading)X +1977(the)X +2096(\256le.)X +2259(For)X +2391(example,)X +2704(the)X +2823(compiler)X +3129(\(more)X +3342(precisely,)X +3673(the)X +3792(assembler\))X +4161(may)X +8 s +10 f +555 5541(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)N +5 s +1 f +727 5619(1)N +8 s +763 5644(To)N +850(appear)X +1035(in)X +1101(the)X +2 f +1195(Proceedings)X +1530(of)X +1596(the)X +1690(1992)X +1834(Winter)X +2024(Usenix)X +1 f +2201(,)X +2233(San)X +2345(Francisco,)X +2625(CA,)X +2746(January)X +2960(1992.)X + +2 p +%%Page: 2 2 +8 s 8 xH 0 xS 1 f +10 s +3 f +1 f +555 630(have)N +737(to)X +829(rewrite)X +1087(a)X +1152(\256le)X +1283(to)X +1374(which)X +1599(it)X +1672(has)X +1808(write)X +2002(permission)X +2382(in)X +2473(a)X +2538(directory)X +2857(to)X +2948(which)X +3173(it)X +3246(does)X +3422(not)X +3553(have)X +3734(write)X +3928(permission.)X +555 720(While)N +779(the)X +904(``.o'')X +1099(\256le)X +1228(is)X +1308(being)X +1513(written,)X +1787(another)X +2055(utility)X +2272(such)X +2446(as)X +3 f +2540(nm)X +1 f +2651(\(1\))X +2772(or)X +3 f +2866(ar)X +1 f +2942(\(1\))X +3063(may)X +3228(read)X +3394(the)X +3519(\256le)X +3648(and)X +3791(produce)X +4077(invalid)X +555 810(results)N +790(since)X +981(the)X +1105(\256le)X +1233(has)X +1366(not)X +1494(been)X +1672(completely)X +2054(written.)X +2347(Currently,)X +2700(some)X +2895(utilities)X +3160(use)X +3293(special)X +3542(purpose)X +3821(code)X +3998(to)X +4085(handle)X +555 900(such)N +722(cases)X +912(while)X +1110(others)X +1326(ignore)X +1551(the)X +1669(problem)X 
+1956(and)X +2092(force)X +2278(users)X +2463(to)X +2545(live)X +2685(with)X +2847(the)X +2965(consequences.)X +755 1023(In)N +845(this)X +983(paper,)X +1205(we)X +1322(present)X +1577(a)X +1635(simple)X +1870(library)X +2106(which)X +2324(provides)X +2622(transaction)X +2996(semantics)X +3334(\(atomicity,)X +3705(consistency,)X +4121(isola-)X +555 1113(tion,)N +720(and)X +857(durability\).)X +1236(The)X +1382(4.4BSD)X +1658(database)X +1956(access)X +2182(methods)X +2473(have)X +2645(been)X +2817(modi\256ed)X +3121(to)X +3203(use)X +3330(this)X +3465(library,)X +3719(optionally)X +4063(provid-)X +555 1203(ing)N +682(shared)X +917(buffer)X +1139(management)X +1574(between)X +1867(applications,)X +2298(locking,)X +2582(and)X +2722(transaction)X +3098(semantics.)X +3478(Any)X +3640(UNIX)X +3865(program)X +4161(may)X +555 1293(transaction)N +930(protect)X +1176(its)X +1274(data)X +1430(by)X +1532(requesting)X +1888(transaction)X +2262(protection)X +2609(with)X +2773(the)X +3 f +2893(db)X +1 f +2981(\(3\))X +3097(library)X +3333(or)X +3422(by)X +3524(adding)X +3764(appropriate)X +4152(calls)X +555 1383(to)N +646(the)X +773(transaction)X +1154(manager,)X +1480(buffer)X +1706(manager,)X +2032(lock)X +2199(manager,)X +2525(and)X +2670(log)X +2801(manager.)X +3147(The)X +3301(library)X +3543(routines)X +3829(may)X +3995(be)X +4099(linked)X +555 1473(into)N +708(the)X +834(host)X +995(application)X +1379(and)X +1523(called)X +1743(by)X +1851(subroutine)X +2217(interface,)X +2547(or)X +2642(they)X +2808(may)X +2974(reside)X +3194(in)X +3284(a)X +3348(separate)X +3640(server)X +3865(process.)X +4174(The)X +555 1563(server)N +772(architecture)X +1172(provides)X +1468(for)X +1582(network)X +1865(access)X +2091(and)X +2227(better)X +2430(protection)X +2775(mechanisms.)X +3 f +555 1749(2.)N +655(Related)X +938(Work)X +1 f +755 1872(There)N +1000(has)X +1164(been)X +1373(much)X +1608(discussion)X +1998(in)X +2117(recent)X +2371(years)X +2597(about)X +2831(new)X +3021(transaction)X 
+3429(models)X +3716(and)X +3888(architectures)X +555 1962 0.1172([SPEC88][NODI90][CHEN91][MOHA91].)AN +2009(Much)X +2220(of)X +2310(this)X +2448(work)X +2636(focuses)X +2900(on)X +3003(new)X +3160(ways)X +3348(to)X +3433(model)X +3656(transactions)X +4062(and)X +4201(the)X +555 2052(interactions)N +953(between)X +1245(them,)X +1449(while)X +1651(the)X +1772(work)X +1960(presented)X +2291(here)X +2453(focuses)X +2717(on)X +2820(the)X +2941(implementation)X +3466(and)X +3605(performance)X +4035(of)X +4125(tradi-)X +555 2142(tional)N +757(transaction)X +1129(techniques)X +1492(\(write-ahead)X +1919(logging)X +2183(and)X +2319(two-phase)X +2669(locking\))X +2956(on)X +3056(a)X +3112(standard)X +3404(operating)X +3727(system)X +3969(\(UNIX\).)X +755 2265(Such)N +947(traditional)X +1308(operating)X +1643(systems)X +1928(are)X +2059(often)X +2256(criticized)X +2587(for)X +2713(their)X +2892(inability)X +3190(to)X +3283(perform)X +3573(transaction)X +3956(processing)X +555 2355(adequately.)N +971([STON81])X +1342(cites)X +1517(three)X +1706(main)X +1894(areas)X +2088(of)X +2183(inadequate)X +2559(support:)X +2849(buffer)X +3074(management,)X +3532(the)X +3658(\256le)X +3788(system,)X +4058(and)X +4201(the)X +555 2445(process)N +823(structure.)X +1191(These)X +1410(arguments)X +1771(are)X +1897(summarized)X +2316(in)X +2405(table)X +2587(one.)X +2769(Fortunately,)X +3184(much)X +3388(has)X +3521(changed)X +3815(since)X +4006(1981.)X +4232(In)X +555 2535(the)N +683(area)X +848(of)X +945(buffer)X +1172(management,)X +1632(most)X +1817(UNIX)X +2048(systems)X +2331(provide)X +2606(the)X +2734(ability)X +2968(to)X +3060(memory)X +3357(map)X +3525(\256les,)X +3708(thus)X +3870(obviating)X +4201(the)X +555 2625(need)N +734(for)X +855(a)X +918(copy)X +1101(between)X +1396(kernel)X +1624(and)X +1766(user)X +1926(space.)X +2171(If)X +2251(a)X +2313(database)X +2616(system)X +2864(is)X +2943(going)X +3151(to)X +3239(use)X +3372(the)X +3496(\256le)X +3624(system)X +3872(buffer)X 
+4095(cache,)X +555 2715(then)N +719(a)X +781(system)X +1029(call)X +1171(is)X +1250(required.)X +1584(However,)X +1924(if)X +1998(buffering)X +2322(is)X +2400(provided)X +2710(at)X +2793(user)X +2952(level)X +3133(using)X +3331(shared)X +3566(memory,)X +3878(as)X +3970(in)X +4057(LIBTP,)X +555 2805(buffer)N +776(management)X +1210(is)X +1287(only)X +1452(as)X +1542(slow)X +1716(as)X +1806(access)X +2035(to)X +2120(shared)X +2353(memory)X +2643(and)X +2782(any)X +2921(replacement)X +3337(algorithm)X +3671(may)X +3832(be)X +3931(used.)X +4121(Since)X +555 2895(multiple)N +849(processes)X +1185(can)X +1325(access)X +1559(the)X +1685(shared)X +1923(data,)X +2105(prefetching)X +2499(may)X +2665(be)X +2769(accomplished)X +3238(by)X +3346(separate)X +3638(processes)X +3973(or)X +4067(threads)X +555 2985(whose)N +782(sole)X +932(purpose)X +1207(is)X +1281(to)X +1364(prefetch)X +1649(pages)X +1853(and)X +1990(wait)X +2149(on)X +2250(them.)X +2471(There)X +2680(is)X +2754(still)X +2894(no)X +2995(way)X +3150(to)X +3233(enforce)X +3496(write)X +3682(ordering)X +3975(other)X +4161(than)X +555 3075(keeping)N +829(pages)X +1032(in)X +1114(user)X +1268(memory)X +1555(and)X +1691(using)X +1884(the)X +3 f +2002(fsync)X +1 f +2180(\(3\))X +2294(system)X +2536(call)X +2672(to)X +2754(perform)X +3033(synchronous)X +3458(writes.)X +755 3198(In)N +845(the)X +966(area)X +1124(of)X +1214(\256le)X +1339(systems,)X +1635(the)X +1756(fast)X +1895(\256le)X +2020(system)X +2265(\(FFS\))X +2474([MCKU84])X +2871(allows)X +3103(allocation)X +3442(in)X +3527(units)X +3704(up)X +3806(to)X +3890(64KBytes)X +4232(as)X +555 3288(opposed)N +846(to)X +932(the)X +1054(4KByte)X +1327(and)X +1466(8KByte)X +1738(\256gures)X +1979(quoted)X +2220(in)X +2305([STON81].)X +2711(The)X +2859(measurements)X +3341(in)X +3426(this)X +3564(paper)X +3766(were)X +3946(taken)X +4143(from)X +555 3378(an)N +655(8KByte)X +928(FFS,)X +1104(but)X +1230(as)X +1320(LIBTP)X +1565(runs)X +1726(exclusively)X +2114(in)X 
+2199(user)X +2356(space,)X +2578(there)X +2762(is)X +2838(nothing)X +3105(to)X +3190(prevent)X +3454(it)X +3521(from)X +3700(being)X +3901(run)X +4031(on)X +4134(other)X +555 3468(UNIX)N +776(compatible)X +1152(\256le)X +1274(systems)X +1547(\(e.g.)X +1710(log-structured)X +2180([ROSE91],)X +2558(extent-based,)X +3004(or)X +3091(multi-block)X +3484([SELT91]\).)X +755 3591(Finally,)N +1029(with)X +1199(regard)X +1433(to)X +1523(the)X +1648(process)X +1916(structure,)X +2244(neither)X +2494(context)X +2757(switch)X +2993(time)X +3162(nor)X +3296(scheduling)X +3670(around)X +3920(semaphores)X +555 3681(seems)N +785(to)X +881(affect)X +1099(the)X +1231(system)X +1487(performance.)X +1968(However,)X +2317(the)X +2449(implementation)X +2984(of)X +3084(semaphores)X +3496(can)X +3641(impact)X +3892(performance)X +555 3771(tremendously.)N +1051(This)X +1213(is)X +1286(discussed)X +1613(in)X +1695(more)X +1880(detail)X +2078(in)X +2160(section)X +2407(4.3.)X +755 3894(The)N +908(Tuxedo)X +1181(system)X +1431(from)X +1615(AT&T)X +1861(is)X +1941(a)X +2004(transaction)X +2383(manager)X +2687(which)X +2910(coordinates)X +3307(distributed)X +3676(transaction)X +4055(commit)X +555 3984(from)N +738(a)X +801(variety)X +1051(of)X +1145(different)X +1449(local)X +1632(transaction)X +2011(managers.)X +2386(At)X +2493(this)X +2634(time,)X +2822(LIBTP)X +3070(does)X +3243(not)X +3371(have)X +3549(its)X +3650(own)X +3814(mechanism)X +4205(for)X +555 4074(distributed)N +942(commit)X +1231(processing,)X +1639(but)X +1786(could)X +2009(be)X +2130(used)X +2322(as)X +2434(a)X +2515(local)X +2716(transaction)X +3113(agent)X +3331(by)X +3455(systems)X +3752(such)X +3943(as)X +4054(Tuxedo)X +555 4164([ANDR89].)N +10 f +863 4393(i)N +870(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +903 4483(Buffer)N +1133(Management)X +10 f +1672(g)X +1 f +1720(Data)X +1892(must)X +2067(be)X +2163(copied)X +2397(between)X +2685(kernel)X +2906(space)X +3105(and)X 
+3241(user)X +3395(space.)X +10 f +1672 4573(g)N +1 f +1720(Buffer)X +1950(pool)X +2112(access)X +2338(is)X +2411(too)X +2533(slow.)X +10 f +1672 4663(g)N +1 f +1720(There)X +1928(is)X +2001(no)X +2101(way)X +2255(to)X +2337(request)X +2589(prefetch.)X +10 f +1672 4753(g)N +1 f +1720(Replacement)X +2159(is)X +2232(usually)X +2483(LRU)X +2663(which)X +2879(may)X +3037(be)X +3133(suboptimal)X +3508(for)X +3622(databases.)X +10 f +1672 4843(g)N +1 f +1720(There)X +1928(is)X +2001(no)X +2101(way)X +2255(to)X +2337(guarantee)X +2670(write)X +2855(ordering.)X +10 f +863 4853(i)N +870(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +903 4943(File)N +1047(System)X +10 f +1672(g)X +1 f +1720(Allocation)X +2078(is)X +2151(done)X +2327(in)X +2409(small)X +2602(blocks)X +2831(\(usually)X +3109(4K)X +3227(or)X +3314(8K\).)X +10 f +1672 5033(g)N +1 f +1720(Logical)X +1985(organization)X +2406(of)X +2493(\256les)X +2646(is)X +2719(redundantly)X +3122(expressed.)X +10 f +863 5043(i)N +870(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +903 5133(Process)N +1168(Structure)X +10 f +1672(g)X +1 f +1720(Context)X +1993(switching)X +2324(and)X +2460(message)X +2752(passing)X +3012(are)X +3131(too)X +3253(slow.)X +10 f +1672 5223(g)N +1 f +1720(A)X +1798(process)X +2059(may)X +2217(be)X +2313(descheduled)X +2730(while)X +2928(holding)X +3192(a)X +3248(semaphore.)X +10 f +863 5233(i)N +870(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +863(c)X +5193(c)Y +5113(c)Y +5033(c)Y +4953(c)Y +4873(c)Y +4793(c)Y +4713(c)Y +4633(c)Y +4553(c)Y +4473(c)Y +3990 5233(c)N +5193(c)Y +5113(c)Y +5033(c)Y +4953(c)Y +4873(c)Y +4793(c)Y +4713(c)Y +4633(c)Y +4553(c)Y +4473(c)Y +3 f +1156 5446(Table)N +1371(One:)X +1560(Shortcomings)X +2051(of)X +2138(UNIX)X +2363(transaction)X +2770(support)X +3056(cited)X +3241(in)X +3327([STON81].)X + +3 p +%%Page: 3 3 +10 s 10 xH 0 xS 3 f +1 f +755 630(The)N 
+901(transaction)X +1274(architecture)X +1675(presented)X +2004(in)X +2087([YOUN91])X +2474(is)X +2548(very)X +2712(similar)X +2955(to)X +3038(that)X +3179(implemented)X +3618(in)X +3701(the)X +3820(LIBTP.)X +4103(While)X +555 720([YOUN91])N +947(presents)X +1236(a)X +1298(model)X +1524(for)X +1644(providing)X +1981(transaction)X +2359(services,)X +2663(this)X +2803(paper)X +3007(focuses)X +3273(on)X +3378(the)X +3501(implementation)X +4028(and)X +4169(per-)X +555 810(formance)N +881(of)X +970(a)X +1028(particular)X +1358(system.)X +1642(In)X +1731(addition,)X +2034(we)X +2149(provide)X +2415(detailed)X +2690(comparisons)X +3116(with)X +3279(alternative)X +3639(solutions:)X +3970(traditional)X +555 900(UNIX)N +776(services)X +1055(and)X +1191(commercial)X +1590(database)X +1887(management)X +2317(systems.)X +3 f +555 1086(3.)N +655(Architecture)X +1 f +755 1209(The)N +906(library)X +1146(is)X +1224(designed)X +1534(to)X +1621(provide)X +1891(well)X +2054(de\256ned)X +2315(interfaces)X +2653(to)X +2740(the)X +2863(services)X +3147(required)X +3440(for)X +3559(transaction)X +3936(processing.)X +555 1299(These)N +777(services)X +1066(are)X +1195(recovery,)X +1527(concurrency)X +1955(control,)X +2232(and)X +2378(the)X +2506(management)X +2946(of)X +3043(shared)X +3283(data.)X +3487(First)X +3663(we)X +3787(will)X +3941(discuss)X +4201(the)X +555 1389(design)N +795(tradeoffs)X +1112(in)X +1205(the)X +1334(selection)X +1650(of)X +1748(recovery,)X +2081(concurrency)X +2510(control,)X +2787(and)X +2933(buffer)X +3160(management)X +3600(implementations,)X +4183(and)X +555 1479(then)N +713(we)X +827(will)X +971(present)X +1223(the)X +1341(overall)X +1584(library)X +1818(architecture)X +2218(and)X +2354(module)X +2614(descriptions.)X +3 f +555 1665(3.1.)N +715(Design)X +966(Tradeoffs)X +1 f +3 f +555 1851(3.1.1.)N +775(Crash)X +1004(Recovery)X +1 f +755 1974(The)N +909(recovery)X +1220(protocol)X +1516(is)X +1598(responsible)X +1992(for)X +2115(providing)X +2455(the)X 
+2582(transaction)X +2963(semantics)X +3308(discussed)X +3644(earlier.)X +3919(There)X +4136(are)X +4263(a)X +555 2064(wide)N +739(range)X +946(of)X +1041(recovery)X +1351(protocols)X +1677(available)X +1995([HAER83],)X +2395(but)X +2525(we)X +2647(can)X +2786(crudely)X +3054(divide)X +3281(them)X +3468(into)X +3619(two)X +3766(main)X +3953(categories.)X +555 2154(The)N +706(\256rst)X +856(category)X +1159(records)X +1422(all)X +1528(modi\256cations)X +1989(to)X +2077(the)X +2201(database)X +2504(in)X +2592(a)X +2653(separate)X +2942(\256le,)X +3089(and)X +3230(uses)X +3393(this)X +3533(\256le)X +3660(\(log\))X +3841(to)X +3928(back)X +4105(out)X +4232(or)X +555 2244(reapply)N +825(these)X +1019(modi\256cations)X +1483(if)X +1561(a)X +1626(transaction)X +2007(aborts)X +2232(or)X +2328(the)X +2455(system)X +2706(crashes.)X +3012(We)X +3153(call)X +3298(this)X +3442(set)X +3560(the)X +3 f +3687(logging)X +3963(protocols)X +1 f +4279(.)X +555 2334(The)N +703(second)X +949(category)X +1249(avoids)X +1481(the)X +1602(use)X +1732(of)X +1822(a)X +1881(log)X +2006(by)X +2109(carefully)X +2418(controlling)X +2792(when)X +2989(data)X +3146(are)X +3268(written)X +3518(to)X +3603(disk.)X +3799(We)X +3934(call)X +4073(this)X +4210(set)X +555 2424(the)N +3 f +673(non-logging)X +1096(protocols)X +1 f +1412(.)X +755 2547(Non-logging)N +1185(protocols)X +1504(hold)X +1666(dirty)X +1837(buffers)X +2085(in)X +2167(main)X +2347(memory)X +2634(or)X +2721(temporary)X +3071(\256les)X +3224(until)X +3390(commit)X +3654(and)X +3790(then)X +3948(force)X +4134(these)X +555 2637(pages)N +769(to)X +862(disk)X +1026(at)X +1115(transaction)X +1498(commit.)X +1813(While)X +2040(we)X +2165(can)X +2308(use)X +2446(temporary)X +2807(\256les)X +2971(to)X +3064(hold)X +3237(dirty)X +3418(pages)X +3631(that)X +3781(may)X +3949(need)X +4131(to)X +4223(be)X +555 2727(evicted)N +810(from)X +988(memory)X +1277(during)X +1508(a)X +1566(long-running)X +2006(transaction,)X +2400(the)X +2520(only)X 
+2684(user-level)X +3023(mechanism)X +3410(to)X +3494(force)X +3682(pages)X +3887(to)X +3971(disk)X +4126(is)X +4201(the)X +3 f +555 2817(fsync)N +1 f +733(\(2\))X +850(system)X +1095(call.)X +1274(Unfortunately,)X +3 f +1767(fsync)X +1 f +1945(\(2\))X +2062(is)X +2138(an)X +2237(expensive)X +2581(system)X +2826(call)X +2965(in)X +3050(that)X +3193(it)X +3260(forces)X +3480(all)X +3583(pages)X +3789(of)X +3879(a)X +3938(\256le)X +4062(to)X +4146(disk,)X +555 2907(and)N +691(transactions)X +1094(that)X +1234(manage)X +1504(more)X +1689(than)X +1847(one)X +1983(\256le)X +2105(must)X +2280(issue)X +2460(one)X +2596(call)X +2732(per)X +2855(\256le.)X +755 3030(In)N +853(addition,)X +3 f +1166(fsync)X +1 f +1344(\(2\))X +1469(provides)X +1776(no)X +1887(way)X +2051(to)X +2143(control)X +2400(the)X +2528(order)X +2728(in)X +2820(which)X +3046(dirty)X +3227(pages)X +3440(are)X +3569(written)X +3826(to)X +3918(disk.)X +4121(Since)X +555 3120(non-logging)N +976(protocols)X +1304(must)X +1489(sometimes)X +1861(order)X +2061(writes)X +2287(carefully)X +2603([SULL92],)X +2987(they)X +3155(are)X +3284(dif\256cult)X +3567(to)X +3659(implement)X +4030(on)X +4139(Unix)X +555 3210(systems.)N +868(As)X +977(a)X +1033(result,)X +1251(we)X +1365(have)X +1537(chosen)X +1780(to)X +1862(implement)X +2224(a)X +2280(logging)X +2544(protocol.)X +755 3333(Logging)N +1050(protocols)X +1372(may)X +1534(be)X +1634(categorized)X +2029(based)X +2236(on)X +2340(how)X +2502(information)X +2904(is)X +2981(logged)X +3223(\(physically)X +3602(or)X +3692(logically\))X +4022(and)X +4161(how)X +555 3423(much)N +767(is)X +854(logged)X +1106(\(before)X +1373(images,)X +1654(after)X +1836(images)X +2097(or)X +2198(both\).)X +2441(In)X +3 f +2542(physical)X +2855(logging)X +1 f +3103(,)X +3157(images)X +3417(of)X +3517(complete)X +3844(physical)X +4144(units)X +555 3513(\(pages)N +786(or)X +874(buffers\))X +1150(are)X +1270(recorded,)X +1593(while)X +1792(in)X +3 f +1875(logical)X +2118(logging)X +1 f 
+2387(a)X +2444(description)X +2820(of)X +2907(the)X +3025(operation)X +3348(is)X +3421(recorded.)X +3763(Therefore,)X +4121(while)X +555 3603(we)N +675(may)X +839(record)X +1071(entire)X +1280(pages)X +1489(in)X +1577(a)X +1639(physical)X +1932(log,)X +2080(we)X +2200(need)X +2378(only)X +2546(record)X +2777(the)X +2900(records)X +3162(being)X +3365(modi\256ed)X +3674(in)X +3761(a)X +3822(logical)X +4065(log.)X +4232(In)X +555 3693(fact,)N +718(physical)X +1006(logging)X +1271(can)X +1404(be)X +1501(thought)X +1766(of)X +1854(as)X +1942(a)X +1999(special)X +2243(case)X +2403(of)X +2491(logical)X +2730(logging,)X +3015(since)X +3201(the)X +3320 0.3125(``records'')AX +3686(that)X +3827(we)X +3942(log)X +4065(in)X +4148(logi-)X +555 3783(cal)N +673(logging)X +941(might)X +1151(be)X +1251(physical)X +1542(pages.)X +1789(Since)X +1991(logical)X +2233(logging)X +2501(is)X +2578(both)X +2743(more)X +2931(space-ef\256cient)X +3423(and)X +3562(more)X +3750(general,)X +4030(we)X +4147(have)X +555 3873(chosen)N +798(it)X +862(for)X +976(our)X +1103(logging)X +1367(protocol.)X +755 3996(In)N +3 f +843(before-image)X +1315(logging)X +1 f +1563(,)X +1604(we)X +1719(log)X +1842(a)X +1899(copy)X +2076(of)X +2164(the)X +2283(data)X +2438(before)X +2665(the)X +2784(update,)X +3039(while)X +3238(in)X +3 f +3321(after-image)X +3739(logging)X +1 f +3987(,)X +4027(we)X +4141(log)X +4263(a)X +555 4086(copy)N +740(of)X +836(the)X +963(data)X +1126(after)X +1303(the)X +1429(update.)X +1711(If)X +1793(we)X +1915(log)X +2045(only)X +2215(before-images,)X +2723(then)X +2889(there)X +3078(is)X +3159(suf\256cient)X +3485(information)X +3891(in)X +3981(the)X +4107(log)X +4237(to)X +555 4176(allow)N +761(us)X +860(to)X +3 f +950(undo)X +1 f +1150(the)X +1276(transaction)X +1656(\(go)X +1791(back)X +1971(to)X +2061(the)X +2187(state)X +2361(represented)X +2759(by)X +2866(the)X +2991(before-image\).)X +3514(However,)X +3876(if)X +3952(the)X +4077(system)X +555 4266(crashes)N +814(and)X +952(a)X 
+1010(committed)X +1374(transaction's)X +1806(changes)X +2087(have)X +2261(not)X +2385(reached)X +2658(the)X +2778(disk,)X +2953(we)X +3068(have)X +3241(no)X +3342(means)X +3568(to)X +3 f +3651(redo)X +1 f +3828(the)X +3947(transaction)X +555 4356(\(reapply)N +849(the)X +973(updates\).)X +1311(Therefore,)X +1675(logging)X +1945(only)X +2113(before-images)X +2599(necessitates)X +3004(forcing)X +3262(dirty)X +3439(pages)X +3648(at)X +3732(commit)X +4002(time.)X +4210(As)X +555 4446(mentioned)N +913(above,)X +1145(forcing)X +1397(pages)X +1600(at)X +1678(commit)X +1942(is)X +2015(considered)X +2383(too)X +2505(costly.)X +755 4569(If)N +834(we)X +953(log)X +1080(only)X +1247(after-images,)X +1694(then)X +1857(there)X +2043(is)X +2121(suf\256cient)X +2444(information)X +2847(in)X +2934(the)X +3057(log)X +3184(to)X +3271(allow)X +3474(us)X +3570(to)X +3657(redo)X +3825(the)X +3947(transaction)X +555 4659(\(go)N +687(forward)X +967(to)X +1054(the)X +1177(state)X +1348(represented)X +1743(by)X +1847(the)X +1969(after-image\),)X +2411(but)X +2537(we)X +2655(do)X +2759(not)X +2885(have)X +3061(the)X +3183(information)X +3585(required)X +3877(to)X +3963(undo)X +4147(tran-)X +555 4749(sactions)N +845(which)X +1073(aborted)X +1346(after)X +1526(dirty)X +1709(pages)X +1924(were)X +2113(written)X +2372(to)X +2466(disk.)X +2670(Therefore,)X +3039(logging)X +3314(only)X +3487(after-images)X +3920(necessitates)X +555 4839(holding)N +819(all)X +919(dirty)X +1090(buffers)X +1338(in)X +1420(main)X +1600(memory)X +1887(until)X +2053(commit)X +2317(or)X +2404(writing)X +2655(them)X +2835(to)X +2917(a)X +2973(temporary)X +3323(\256le.)X +755 4962(Since)N +956(neither)X +1202(constraint)X +1541(\(forcing)X +1823(pages)X +2029(on)X +2132(commit)X +2399(or)X +2489(buffering)X +2811(pages)X +3016(until)X +3184(commit\))X +3477(was)X +3624(feasible,)X +3916(we)X +4032(chose)X +4237(to)X +555 5052(log)N +683(both)X +851(before)X +1083(and)X +1225(after)X +1399(images.)X +1672(The)X +1823(only)X 
+1991(remaining)X +2342(consideration)X +2800(is)X +2879(when)X +3079(changes)X +3363(get)X +3486(written)X +3738(to)X +3825(disk.)X +4023(Changes)X +555 5142(affect)N +764(both)X +931(data)X +1090(pages)X +1298(and)X +1438(the)X +1560(log.)X +1726(If)X +1804(the)X +1926(changed)X +2218(data)X +2376(page)X +2552(is)X +2629(written)X +2880(before)X +3110(the)X +3232(log)X +3358(page,)X +3554(and)X +3694(the)X +3816(system)X +4062(crashes)X +555 5232(before)N +787(the)X +911(log)X +1039(page)X +1217(is)X +1296(written,)X +1569(the)X +1693(log)X +1820(will)X +1969(contain)X +2230(insuf\256cient)X +2615(information)X +3018(to)X +3105(undo)X +3290(the)X +3413(change.)X +3706(This)X +3873(violates)X +4147(tran-)X +555 5322(saction)N +803(semantics,)X +1160(since)X +1346(some)X +1536(changed)X +1825(data)X +1980(pages)X +2184(may)X +2343(not)X +2466(have)X +2638(been)X +2810(written,)X +3077(and)X +3213(the)X +3331(database)X +3628(cannot)X +3862(be)X +3958(restored)X +4237(to)X +555 5412(its)N +650(pre-transaction)X +1152(state.)X +755 5535(The)N +914(log)X +1050(record)X +1290(describing)X +1658(an)X +1768(update)X +2016(must)X +2205(be)X +2315(written)X +2576(to)X +2672(stable)X +2893(storage)X +3159(before)X +3398(the)X +3529(modi\256ed)X +3846(page.)X +4071(This)X +4246(is)X +3 f +555 5625(write-ahead)N +992(logging)X +1 f +1240(.)X +1307(If)X +1388(log)X +1517(records)X +1781(are)X +1907(safely)X +2126(written)X +2380(to)X +2469(disk,)X +2649(data)X +2810(pages)X +3020(may)X +3185(be)X +3288(written)X +3542(at)X +3627(any)X +3770(time)X +3939(afterwards.)X +555 5715(This)N +721(means)X +950(that)X +1094(the)X +1216(only)X +1382(\256le)X +1508(that)X +1652(ever)X +1815(needs)X +2022(to)X +2108(be)X +2208(forced)X +2438(to)X +2524(disk)X +2681(is)X +2758(the)X +2880(log.)X +3046(Since)X +3248(the)X +3370(log)X +3495(is)X +3571(append-only,)X +4015(modi\256ed)X + +4 p +%%Page: 4 4 +10 s 10 xH 0 xS 1 f +3 f +1 f +555 630(pages)N +760(always)X +1005(appear)X +1242(at)X 
+1322(the)X +1442(end)X +1580(and)X +1718(may)X +1878(be)X +1976(written)X +2224(to)X +2307(disk)X +2461(ef\256ciently)X +2807(in)X +2890(any)X +3027(\256le)X +3150(system)X +3393(that)X +3534(favors)X +3756(sequential)X +4102(order-)X +555 720(ing)N +677(\()X +2 f +704(e.g.)X +1 f +820(,)X +860(FFS,)X +1032(log-structured)X +1502(\256le)X +1624(system,)X +1886(or)X +1973(an)X +2069(extent-based)X +2495(system\).)X +3 f +555 906(3.1.2.)N +775(Concurrency)X +1245(Control)X +1 f +755 1029(The)N +918(concurrency)X +1354(control)X +1619(protocol)X +1923(is)X +2013(responsible)X +2415(for)X +2546(maintaining)X +2965(consistency)X +3376(in)X +3475(the)X +3610(presence)X +3929(of)X +4033(multiple)X +555 1119(accesses.)N +897(There)X +1114(are)X +1242(several)X +1499(alternative)X +1867(solutions)X +2183(such)X +2358(as)X +2453(locking,)X +2741(optimistic)X +3088(concurrency)X +3514(control)X +3769([KUNG81],)X +4183(and)X +555 1209(timestamp)N +912(ordering)X +1208([BERN80].)X +1619(Since)X +1821(optimistic)X +2164(methods)X +2459(and)X +2599(timestamp)X +2956(ordering)X +3252(are)X +3374(generally)X +3696(more)X +3884(complex)X +4183(and)X +555 1299(restrict)N +804(concurrency)X +1228(without)X +1498(eliminating)X +1888(starvation)X +2230(or)X +2323(deadlocks,)X +2690(we)X +2810(chose)X +3018(two-phase)X +3373(locking)X +3638(\(2PL\).)X +3890(Strict)X +4088(2PL)X +4246(is)X +555 1389(suboptimal)N +935(for)X +1054(certain)X +1297(data)X +1455(structures)X +1791(such)X +1962(as)X +2053(B-trees)X +2309(because)X +2588(it)X +2656(can)X +2792(limit)X +2966(concurrency,)X +3408(so)X +3503(we)X +3621(use)X +3752(a)X +3812(special)X +4059(locking)X +555 1479(protocol)N +842(based)X +1045(on)X +1145(one)X +1281(described)X +1609(in)X +1691([LEHM81].)X +755 1602(The)N +901(B-tree)X +1123(locking)X +1384(protocol)X +1672(we)X +1787(implemented)X +2226(releases)X +2502(locks)X +2691(at)X +2769(internal)X +3034(nodes)X +3241(in)X +3323(the)X +3441(tree)X +3582(as)X +3669(it)X 
+3733(descends.)X +4083(A)X +4161(lock)X +555 1692(on)N +658(an)X +757(internal)X +1025(page)X +1200(is)X +1276(always)X +1522(released)X +1808(before)X +2036(a)X +2094(lock)X +2254(on)X +2356(its)X +2453(child)X +2635(is)X +2710(obtained)X +3008(\(that)X +3177(is,)X +3272(locks)X +3463(are)X +3584(not)X +3 f +3708(coupled)X +1 f +3996([BAY77])X +555 1782(during)N +786(descent\).)X +1116(When)X +1330(a)X +1388(leaf)X +1531(\(or)X +1647(internal\))X +1941(page)X +2115(is)X +2190(split,)X +2369(a)X +2427(write)X +2614(lock)X +2774(is)X +2849(acquired)X +3148(on)X +3250(the)X +3370(parent)X +3593(before)X +3821(the)X +3941(lock)X +4100(on)X +4201(the)X +555 1872(just-split)N +855(page)X +1028(is)X +1102(released)X +1387(\(locks)X +1604(are)X +3 f +1724(coupled)X +1 f +2011(during)X +2241(ascent\).)X +2530(Write)X +2734(locks)X +2924(on)X +3025(internal)X +3291(pages)X +3495(are)X +3615(released)X +3899(immediately)X +555 1962(after)N +723(the)X +841(page)X +1013(is)X +1086(updated,)X +1380(but)X +1502(locks)X +1691(on)X +1791(leaf)X +1932(pages)X +2135(are)X +2254(held)X +2412(until)X +2578(the)X +2696(end)X +2832(of)X +2919(the)X +3037(transaction.)X +755 2085(Since)N +964(locks)X +1164(are)X +1294(released)X +1589(during)X +1828(descent,)X +2119(the)X +2247(structure)X +2558(of)X +2655(the)X +2783(tree)X +2934(may)X +3102(change)X +3360(above)X +3582(a)X +3648(node)X +3834(being)X +4042(used)X +4219(by)X +555 2175(some)N +752(process.)X +1061(If)X +1143(that)X +1291(process)X +1560(must)X +1743(later)X +1914(ascend)X +2161(the)X +2287(tree)X +2435(because)X +2717(of)X +2811(a)X +2874(page)X +3053(split,)X +3237(any)X +3380(such)X +3554(change)X +3809(must)X +3991(not)X +4120(cause)X +555 2265(confusion.)N +938(We)X +1077(use)X +1211(the)X +1336(technique)X +1675(described)X +2010(in)X +2099([LEHM81])X +2487(which)X +2710(exploits)X +2989(the)X +3113(ordering)X +3411(of)X +3504(data)X +3664(on)X +3770(a)X +3832(B-tree)X +4059(page)X +4237(to)X +555 2355(guarantee)N 
+888(that)X +1028(no)X +1128(process)X +1389(ever)X +1548(gets)X +1697(lost)X +1832(as)X +1919(a)X +1975(result)X +2173(of)X +2260(internal)X +2525(page)X +2697(updates)X +2962(made)X +3156(by)X +3256(other)X +3441(processes.)X +755 2478(If)N +836(a)X +899(transaction)X +1278(that)X +1425(updates)X +1697(a)X +1760(B-tree)X +1988(aborts,)X +2231(the)X +2356(user-visible)X +2757(changes)X +3043(to)X +3131(the)X +3255(tree)X +3402(must)X +3583(be)X +3685(rolled)X +3898(back.)X +4116(How-)X +555 2568(ever,)N +735(changes)X +1015(to)X +1097(the)X +1215(internal)X +1480(nodes)X +1687(of)X +1774(the)X +1892(tree)X +2033(need)X +2205(not)X +2327(be)X +2423(rolled)X +2630(back,)X +2822(since)X +3007(these)X +3192(pages)X +3395(contain)X +3651(no)X +3751(user-visible)X +4145(data.)X +555 2658(When)N +771(rolling)X +1008(back)X +1184(a)X +1244(transaction,)X +1640(we)X +1758(roll)X +1893(back)X +2069(all)X +2173(leaf)X +2318(page)X +2494(updates,)X +2783(but)X +2909(no)X +3013(internal)X +3281(insertions)X +3615(or)X +3705(page)X +3880(splits.)X +4111(In)X +4201(the)X +555 2748(worst)N +759(case,)X +944(this)X +1085(will)X +1235(leave)X +1431(a)X +1493(leaf)X +1640(page)X +1818(less)X +1964(than)X +2128(half)X +2279(full.)X +2456(This)X +2624(may)X +2788(cause)X +2993(poor)X +3166(space)X +3371(utilization,)X +3741(but)X +3869(does)X +4042(not)X +4170(lose)X +555 2838(user)N +709(data.)X +755 2961(Holding)N +1038(locks)X +1228(on)X +1329(leaf)X +1471(pages)X +1675(until)X +1842(transaction)X +2215(commit)X +2480(guarantees)X +2845(that)X +2986(no)X +3087(other)X +3273(process)X +3535(can)X +3668(insert)X +3866(or)X +3953(delete)X +4165(data)X +555 3051(that)N +711(has)X +854(been)X +1042(touched)X +1332(by)X +1448(this)X +1598(process.)X +1914(Rolling)X +2188(back)X +2375(insertions)X +2721(and)X +2872(deletions)X +3196(on)X +3311(leaf)X +3467(pages)X +3685(guarantees)X +4064(that)X +4219(no)X +555 3141(aborted)N +819(updates)X +1087(are)X +1209(ever)X +1371(visible)X 
+1607(to)X +1692(other)X +1880(transactions.)X +2326(Leaving)X +2612(page)X +2787(splits)X +2978(intact)X +3179(permits)X +3442(us)X +3536(to)X +3621(release)X +3867(internal)X +4134(write)X +555 3231(locks)N +744(early.)X +965(Thus)X +1145(transaction)X +1517(semantics)X +1853(are)X +1972(preserved,)X +2325(and)X +2461(locks)X +2650(are)X +2769(held)X +2927(for)X +3041(shorter)X +3284(periods.)X +755 3354(The)N +901(extra)X +1083(complexity)X +1464(introduced)X +1828(by)X +1929(this)X +2065(locking)X +2326(protocol)X +2614(appears)X +2881(substantial,)X +3264(but)X +3387(it)X +3452(is)X +3525(important)X +3856(for)X +3970(multi-user)X +555 3444(execution.)N +950(The)X +1118(bene\256ts)X +1410(of)X +1520(non-two-phase)X +2040(locking)X +2323(on)X +2446(B-trees)X +2721(are)X +2863(well)X +3044(established)X +3443(in)X +3548(the)X +3689(database)X +4009(literature)X +555 3534([BAY77],)N +899([LEHM81].)X +1320(If)X +1394(a)X +1450(process)X +1711(held)X +1869(locks)X +2058(until)X +2224(it)X +2288(committed,)X +2670(then)X +2828(a)X +2884(long-running)X +3322(update)X +3556(could)X +3754(lock)X +3912(out)X +4034(all)X +4134(other)X +555 3624(transactions)N +967(by)X +1076(preventing)X +1448(any)X +1593(other)X +1787(process)X +2057(from)X +2241(locking)X +2509(the)X +2635(root)X +2792(page)X +2972(of)X +3067(the)X +3193(tree.)X +3382(The)X +3535(B-tree)X +3764(locking)X +4032(protocol)X +555 3714(described)N +884(above)X +1096(guarantees)X +1460(that)X +1600(locks)X +1789(on)X +1889(internal)X +2154(pages)X +2357(are)X +2476(held)X +2634(for)X +2748(extremely)X +3089(short)X +3269(periods,)X +3545(thereby)X +3806(increasing)X +4156(con-)X +555 3804(currency.)N +3 f +555 3990(3.1.3.)N +775(Management)X +1245(of)X +1332(Shared)X +1596(Data)X +1 f +755 4113(Database)N +1075(systems)X +1353(permit)X +1587(many)X +1790(users)X +1980(to)X +2067(examine)X +2364(and)X +2505(update)X +2744(the)X +2866(same)X +3055(data)X +3213(concurrently.)X +3683(In)X +3774(order)X 
+3968(to)X +4054(provide)X +555 4203(this)N +702(concurrent)X +1078(access)X +1316(and)X +1464(enforce)X +1738(the)X +1868(write-ahead)X +2280(logging)X +2556(protocol)X +2855(described)X +3195(in)X +3289(section)X +3548(3.1.1,)X +3759(we)X +3884(use)X +4022(a)X +4089(shared)X +555 4293(memory)N +848(buffer)X +1071(manager.)X +1414(Not)X +1559(only)X +1726(does)X +1898(this)X +2038(provide)X +2308(the)X +2431(guarantees)X +2800(we)X +2919(require,)X +3192(but)X +3319(a)X +3380(user-level)X +3722(buffer)X +3944(manager)X +4246(is)X +555 4383(frequently)N +916(faster)X +1126(than)X +1295(using)X +1498(the)X +1626(\256le)X +1758(system)X +2010(buffer)X +2237(cache.)X +2491(Reads)X +2717(or)X +2814(writes)X +3040(involving)X +3376(the)X +3504(\256le)X +3636(system)X +3888(buffer)X +4115(cache)X +555 4473(often)N +746(require)X +1000(copying)X +1284(data)X +1444(between)X +1738(user)X +1898(and)X +2040(kernel)X +2266(space)X +2470(while)X +2673(a)X +2734(user-level)X +3076(buffer)X +3298(manager)X +3600(can)X +3737(return)X +3954(pointers)X +4237(to)X +555 4563(data)N +709(pages)X +912(directly.)X +1217(Additionally,)X +1661(if)X +1730(more)X +1915(than)X +2073(one)X +2209(process)X +2470(uses)X +2628(the)X +2746(same)X +2931(page,)X +3123(then)X +3281(fewer)X +3485(copies)X +3710(may)X +3868(be)X +3964(required.)X +3 f +555 4749(3.2.)N +715(Module)X +997(Architecture)X +1 f +755 4872(The)N +913(preceding)X +1262(sections)X +1552(described)X +1892(modules)X +2195(for)X +2321(managing)X +2669(the)X +2799(transaction)X +3183(log,)X +3337(locks,)X +3558(and)X +3706(a)X +3774(cache)X +3990(of)X +4089(shared)X +555 4962(buffers.)N +847(In)X +938(addition,)X +1244(we)X +1362(need)X +1538(to)X +1624(provide)X +1893(functionality)X +2326(for)X +2444(transaction)X +2 f +2819(begin)X +1 f +2997(,)X +2 f +3040(commit)X +1 f +3276(,)X +3319(and)X +2 f +3458(abort)X +1 f +3654(processing,)X +4040(necessi-)X +555 5052(tating)N +769(a)X +837(transaction)X +1221(manager.)X +1570(In)X 
+1669(order)X +1871(to)X +1965(arbitrate)X +2265(concurrent)X +2641(access)X +2879(to)X +2973(locks)X +3173(and)X +3320(buffers,)X +3599(we)X +3724(include)X +3991(a)X +4058(process)X +555 5142(management)N +995(module)X +1264(which)X +1489(manages)X +1799(a)X +1864(collection)X +2209(of)X +2305(semaphores)X +2713(used)X +2889(to)X +2980(block)X +3187(and)X +3332(release)X +3585(processes.)X +3962(Finally,)X +4237(in)X +555 5232(order)N +752(to)X +841(provide)X +1113(a)X +1176(simple,)X +1436(standard)X +1735(interface)X +2044(we)X +2165(have)X +2344(modi\256ed)X +2655(the)X +2780(database)X +3084(access)X +3317(routines)X +3602(\()X +3 f +3629(db)X +1 f +3717(\(3\)\).)X +3904(For)X +4041(the)X +4165(pur-)X +555 5322(poses)N +758(of)X +850(this)X +990(paper)X +1194(we)X +1313(call)X +1453(the)X +1575(modi\256ed)X +1883(package)X +2171(the)X +3 f +2293(Record)X +2567(Manager)X +1 f +2879(.)X +2943(Figure)X +3176(one)X +3316(shows)X +3540(the)X +3662(main)X +3846(interfaces)X +4183(and)X +555 5412(architecture)N +955(of)X +1042(LIBTP.)X + +5 p +%%Page: 5 5 +10 s 10 xH 0 xS 1 f +3 f +1 f +11 s +1851 1520(log_commit)N +2764 2077(buf_unpin)N +2764 1987(buf_get)N +3633 1408(buf_unpin)N +3633 1319(buf_pin)N +3633 1230(buf_get)N +3 f +17 s +1163 960(Txn)N +1430(M)X +1559(anager)X +2582(Record)X +3040(M)X +3169(anager)X +1 Dt +2363 726 MXY +0 355 Dl +1426 0 Dl +0 -355 Dl +-1426 0 Dl +3255 1616 MXY +0 535 Dl +534 0 Dl +0 -535 Dl +-534 0 Dl +2185 MX +0 535 Dl +535 0 Dl +0 -535 Dl +-535 0 Dl +1116 MX +0 535 Dl +534 0 Dl +0 -535 Dl +-534 0 Dl +726 MY +0 355 Dl +891 0 Dl +0 -355 Dl +-891 0 Dl +1 f +11 s +2207 1297(lock)N +2564 1386(log)N +865(unlock_all)X +1851 1609(log_unroll)N +1650 2508 MXY +0 178 Dl +1605 0 Dl +0 -178 Dl +-1605 0 Dl +1294 1616 MXY +19 -30 Dl +-19 11 Dl +-20 -11 Dl +20 30 Dl +0 -535 Dl +2319 2508 MXY +-22 -30 Dl +4 23 Dl +-18 14 Dl +36 -7 Dl +-936 -357 Dl +3277 2455(sleep_on)N +1405 1616 MXY +36 4 Dl +-18 -13 Dl +1 -22 Dl +-19 31 Dl +1070 -535 Dl +2631 2508 
MXY +36 6 Dl +-18 -14 Dl +3 -22 Dl +-21 30 Dl +891 -357 Dl +1426 2455(sleep_on)N +3255 1884 MXY +-31 -20 Dl +11 20 Dl +-11 19 Dl +31 -19 Dl +-535 0 Dl +1554 2366(wake)N +3277(wake)X +2185 1884 MXY +-31 -20 Dl +12 20 Dl +-12 19 Dl +31 -19 Dl +-356 0 Dl +0 -803 Dl +3 f +17 s +1236 1851(Lock)N +1118 2030(M)N +1247(anager)X +2339 1851(Log)N +2187 2030(M)N +2316(anager)X +3333 1851(Buffer)N +3257 2030(M)N +3386(anager)X +3522 1616 MXY +20 -30 Dl +-20 11 Dl +-20 -11 Dl +20 30 Dl +0 -535 Dl +1950 2654(Process)N +2424(M)X +2553(anager)X +2542 1616 MXY +19 -30 Dl +-19 11 Dl +-20 -11 Dl +20 30 Dl +0 -535 Dl +1 f +11 s +2207 1364(unlock)N +2452 2508 MXY +20 -31 Dl +-20 11 Dl +-19 -11 Dl +19 31 Dl +0 -357 Dl +2497 2322(sleep_on)N +2497 2233(wake)N +3 Dt +-1 Ds +3 f +10 s +1790 2830(Figure)N +2037(1:)X +2144(Library)X +2435(module)X +2708(interfaces.)X +1 f +10 f +555 3010(h)N +579(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)X +3 f +555 3286(3.2.1.)N +775(The)X +928(Log)X +1081(Manager)X +1 f +755 3409(The)N +3 f +907(Log)X +1067(Manager)X +1 f +1406(enforces)X +1706(the)X +1831(write-ahead)X +2238(logging)X +2509(protocol.)X +2843(Its)X +2949(primitive)X +3268(operations)X +3628(are)X +2 f +3753(log)X +1 f +3855(,)X +2 f +3901(log_commit)X +1 f +4279(,)X +2 f +555 3499(log_read)N +1 f +844(,)X +2 f +889(log_roll)X +1 f +1171(and)X +2 f +1312(log_unroll)X +1 f +1649(.)X +1714(The)X +2 f +1864(log)X +1 f +1991(call)X +2132(performs)X +2447(a)X +2508(buffered)X +2806(write)X +2996(of)X +3088(the)X +3211(speci\256ed)X +3520(log)X +3646(record)X +3876(and)X +4016(returns)X +4263(a)X +555 3589(unique)N +809(log)X +947(sequence)X +1278(number)X +1559(\(LSN\).)X +1840(This)X +2017(LSN)X +2203(may)X +2376(then)X +2549(be)X +2660(used)X +2842(to)X +2939(retrieve)X +3220(a)X +3291(record)X +3532(from)X +3723(the)X +3856(log)X +3993(using)X +4201(the)X +2 f +555 3679(log_read)N +1 f +865(call.)X +1042(The)X +2 f +1188(log)X +1 f 
+1311(interface)X +1614(knows)X +1844(very)X +2008(little)X +2175(about)X +2374(the)X +2493(internal)X +2759(format)X +2993(of)X +3080(the)X +3198(log)X +3320(records)X +3577(it)X +3641(receives.)X +3965(Rather,)X +4219(all)X +555 3769(log)N +681(records)X +942(are)X +1065 0.4028(referenced)AX +1430(by)X +1534(a)X +1594(header)X +1833(structure,)X +2158(a)X +2218(log)X +2344(record)X +2574(type,)X +2756(and)X +2896(a)X +2956(character)X +3276(buffer)X +3497(containing)X +3859(the)X +3981(data)X +4138(to)X +4223(be)X +555 3859(logged.)N +834(The)X +980(log)X +1103(record)X +1330(type)X +1489(is)X +1563(used)X +1731(to)X +1814(call)X +1951(the)X +2070(appropriate)X +2457(redo)X +2621(and)X +2758(undo)X +2939(routines)X +3217(during)X +2 f +3446(abort)X +1 f +3639(and)X +2 f +3775(commit)X +1 f +4031(process-)X +555 3949(ing.)N +721(While)X +941(we)X +1059(have)X +1235(used)X +1406(the)X +3 f +1528(Log)X +1684(Manager)X +1 f +2019(to)X +2104(provide)X +2372(before)X +2601(and)X +2740(after)X +2911(image)X +3130(logging,)X +3417(it)X +3484(may)X +3645(also)X +3797(be)X +3896(used)X +4066(for)X +4183(any)X +555 4039(of)N +642(the)X +760(logging)X +1024(algorithms)X +1386(discussed.)X +755 4162(The)N +2 f +905(log_commit)X +1 f +1308(operation)X +1636(behaves)X +1920(exactly)X +2177(like)X +2322(the)X +2 f +2445(log)X +1 f +2572(operation)X +2900(but)X +3026(guarantees)X +3394(that)X +3538(the)X +3660(log)X +3786(has)X +3917(been)X +4093(forced)X +555 4252(to)N +643(disk)X +802(before)X +1034(returning.)X +1394(A)X +1478(discussion)X +1837(of)X +1930(our)X +2063(commit)X +2333(strategy)X +2613(appears)X +2884(in)X +2971(the)X +3094(implementation)X +3621(section)X +3873(\(section)X +4152(4.2\).)X +2 f +555 4342(Log_unroll)N +1 f +935(reads)X +1126(log)X +1249(records)X +1507(from)X +1684(the)X +1803(log,)X +1946(following)X +2278(backward)X +2611(transaction)X +2983(pointers)X +3261(and)X +3397(calling)X +3635(the)X +3753(appropriate)X +4139(undo)X +555 4432(routines)N 
+839(to)X +927(implement)X +1295(transaction)X +1673(abort.)X +1904(In)X +1997(a)X +2059(similar)X +2307(manner,)X +2 f +2594(log_roll)X +1 f +2877(reads)X +3073(log)X +3201(records)X +3464(sequentially)X +3877(forward,)X +4178(cal-)X +555 4522(ling)N +699(the)X +817(appropriate)X +1203(redo)X +1366(routines)X +1644(to)X +1726(recover)X +1988(committed)X +2350(transactions)X +2753(after)X +2921(a)X +2977(system)X +3219(crash.)X +3 f +555 4708(3.2.2.)N +775(The)X +928(Buffer)X +1171(Manager)X +1 f +755 4831(The)N +3 f +912(Buffer)X +1167(Manager)X +1 f +1511(uses)X +1681(a)X +1749(pool)X +1923(of)X +2022(shared)X +2264(memory)X +2563(to)X +2657(provide)X +2934(a)X +3002(least-recently-used)X +3641(\(LRU\))X +3886(block)X +4095(cache.)X +555 4921(Although)N +886(the)X +1013(current)X +1270(library)X +1513(provides)X +1818(an)X +1923(LRU)X +2112(cache,)X +2345(it)X +2418(would)X +2647(be)X +2752(simple)X +2994(to)X +3085(add)X +3229(alternate)X +3534(replacement)X +3955(policies)X +4232(as)X +555 5011(suggested)N +903(by)X +1015([CHOU85])X +1408(or)X +1507(to)X +1601(provide)X +1878(multiple)X +2176(buffer)X +2405(pools)X +2610(with)X +2784(different)X +3092(policies.)X +3412(Transactions)X +3853(request)X +4116(pages)X +555 5101(from)N +736(the)X +859(buffer)X +1081(manager)X +1383(and)X +1524(keep)X +1701(them)X +3 f +1886(pinned)X +1 f +2145(to)X +2232(ensure)X +2466(that)X +2610(they)X +2772(are)X +2895(not)X +3021(written)X +3272(to)X +3358(disk)X +3515(while)X +3717(they)X +3879(are)X +4002(in)X +4088(a)X +4148(logi-)X +555 5191(cally)N +732(inconsistent)X +1135(state.)X +1343(When)X +1556(page)X +1729(replacement)X +2143(is)X +2217(necessary,)X +2571(the)X +3 f +2689(Buffer)X +2932(Manager)X +1 f +3264(\256nds)X +3439(an)X +3535(unpinned)X +3853(page)X +4025(and)X +4161(then)X +555 5281(checks)N +794(with)X +956(the)X +3 f +1074(Log)X +1227(Manager)X +1 f +1559(to)X +1641(ensure)X +1871(that)X +2011(the)X +2129(write-ahead)X +2529(protocol)X +2816(is)X 
+2889(enforced.)X +3 f +555 5467(3.2.3.)N +775(The)X +928(Lock)X +1121(Manager)X +1 f +755 5590(The)N +3 f +901(Lock)X +1095(Manager)X +1 f +1428(supports)X +1720(general)X +1978(purpose)X +2253(locking)X +2514(\(single)X +2753(writer,)X +2986(multiple)X +3273(readers\))X +3553(which)X +3769(is)X +3842(currently)X +4152(used)X +555 5680(to)N +638(provide)X +904(two-phase)X +1254(locking)X +1514(and)X +1650(high)X +1812(concurrency)X +2230(B-tree)X +2451(locking.)X +2751(However,)X +3086(the)X +3204(general)X +3461(purpose)X +3735(nature)X +3956(of)X +4043(the)X +4161(lock)X + +6 p +%%Page: 6 6 +10 s 10 xH 0 xS 1 f +3 f +1 f +555 630(manager)N +857(provides)X +1158(the)X +1281(ability)X +1510(to)X +1597(support)X +1862(a)X +1923(variety)X +2171(of)X +2263(locking)X +2528(protocols.)X +2890(Currently,)X +3241(all)X +3345(locks)X +3538(are)X +3661(issued)X +3885(at)X +3967(the)X +4089(granu-)X +555 720(larity)N +747(of)X +837(a)X +896(page)X +1071(\(the)X +1219(size)X +1367(of)X +1457(a)X +1516(buffer)X +1736(in)X +1821(the)X +1942(buffer)X +2161(pool\))X +2352(which)X +2570(is)X +2645(identi\256ed)X +2969(by)X +3071(two)X +3213(4-byte)X +3440(integers)X +3716(\(a)X +3801(\256le)X +3925(id)X +4009(and)X +4147(page)X +555 810(number\).)N +898(This)X +1071(provides)X +1378(the)X +1507(necessary)X +1851(information)X +2259(to)X +2351(extend)X +2595(the)X +3 f +2723(Lock)X +2926(Manager)X +1 f +3268(to)X +3360(perform)X +3649(hierarchical)X +4059(locking)X +555 900([GRAY76].)N +982(The)X +1133(current)X +1387(implementation)X +1915(does)X +2088(not)X +2216(support)X +2482(locks)X +2677(at)X +2760(other)X +2950(granularities)X +3376(and)X +3517(does)X +3689(not)X +3816(promote)X +4108(locks;)X +555 990(these)N +740(are)X +859(obvious)X +1132(future)X +1344(additions)X +1657(to)X +1739(the)X +1857(system.)X +755 1113(If)N +831(an)X +929(incoming)X +1253(lock)X +1413(request)X +1667(cannot)X +1903(be)X +2001(granted,)X +2284(the)X +2404(requesting)X +2760(process)X 
+3023(is)X +3098(queued)X +3352(for)X +3467(the)X +3586(lock)X +3745(and)X +3882(descheduled.)X +555 1203(When)N +769(a)X +827(lock)X +987(is)X +1062(released,)X +1368(the)X +1488(wait)X +1647(queue)X +1860(is)X +1934(traversed)X +2250(and)X +2387(any)X +2524(newly)X +2741(compatible)X +3118(locks)X +3308(are)X +3428(granted.)X +3730(Locks)X +3947(are)X +4067(located)X +555 1293(via)N +680(a)X +743(\256le)X +872(and)X +1015(page)X +1194(hash)X +1368(table)X +1551(and)X +1694(are)X +1820(chained)X +2097(both)X +2266(by)X +2373(object)X +2595(and)X +2737(by)X +2843(transaction,)X +3241(facilitating)X +3614(rapid)X +3805(traversal)X +4108(of)X +4201(the)X +555 1383(lock)N +713(table)X +889(during)X +1118(transaction)X +1490(commit)X +1754(and)X +1890(abort.)X +755 1506(The)N +907(primary)X +1188(interfaces)X +1528(to)X +1617(the)X +1742(lock)X +1907(manager)X +2211(are)X +2 f +2337(lock)X +1 f +2471(,)X +2 f +2518(unlock)X +1 f +2732(,)X +2779(and)X +2 f +2922(lock_unlock_all)X +1 f +3434(.)X +2 f +3500(Lock)X +1 f +3682(obtains)X +3939(a)X +4001(new)X +4161(lock)X +555 1596(for)N +680(a)X +747(speci\256c)X +1023(object.)X +1290(There)X +1509(are)X +1638(also)X +1797(two)X +1947(variants)X +2231(of)X +2328(the)X +2 f +2456(lock)X +1 f +2620(request,)X +2 f +2902(lock_upgrade)X +1 f +3373(and)X +2 f +3519(lock_downgrade)X +1 f +4053(,)X +4103(which)X +555 1686(allow)N +755(the)X +875(caller)X +1076(to)X +1160(atomically)X +1519(trade)X +1701(a)X +1758(lock)X +1917(of)X +2005(one)X +2142(type)X +2301(for)X +2416(a)X +2473(lock)X +2632(of)X +2720(another.)X +2 f +3022(Unlock)X +1 f +3275(releases)X +3551(a)X +3608(speci\256c)X +3874(mode)X +4073(of)X +4161(lock)X +555 1776(on)N +655(a)X +711(speci\256c)X +976(object.)X +2 f +1232(Lock_unlock_all)X +1 f +1786(releases)X +2061(all)X +2161(the)X +2279(locks)X +2468(associated)X +2818(with)X +2980(a)X +3036(speci\256c)X +3301(transaction.)X +3 f +555 1962(3.2.4.)N +775(The)X +928(Process)X +1207(Manager)X +1 f +755 
2085(The)N +3 f +900(Process)X +1179(Manager)X +1 f +1511(acts)X +1656(as)X +1743(a)X +1799(user-level)X +2136(scheduler)X +2464(to)X +2546(make)X +2740(processes)X +3068(wait)X +3226(on)X +3326(unavailable)X +3716(locks)X +3905(and)X +4041(pending)X +555 2175(buffer)N +778(cache)X +988(I/O.)X +1161(For)X +1297(each)X +1470(process,)X +1756(a)X +1817(semaphore)X +2190(is)X +2268(maintained)X +2649(upon)X +2834(which)X +3055(that)X +3200(process)X +3466(waits)X +3660(when)X +3859(it)X +3928(needs)X +4136(to)X +4223(be)X +555 2265(descheduled.)N +1014(When)X +1228(a)X +1286(process)X +1549(needs)X +1754(to)X +1838(be)X +1936(run,)X +2084(its)X +2180(semaphore)X +2549(is)X +2623(cleared,)X +2897(and)X +3034(the)X +3153(operating)X +3477(system)X +3720(reschedules)X +4116(it.)X +4201(No)X +555 2355(sophisticated)N +1002(scheduling)X +1378(algorithm)X +1718(is)X +1799(applied;)X +2085(if)X +2162(the)X +2288(lock)X +2454(for)X +2576(which)X +2800(a)X +2864(process)X +3133(was)X +3286(waiting)X +3554(becomes)X +3863(available,)X +4201(the)X +555 2445(process)N +824(is)X +905(made)X +1107(runnable.)X +1456(It)X +1533(would)X +1761(have)X +1941(been)X +2121(possible)X +2411(to)X +2501(change)X +2757(the)X +2883(kernel's)X +3170(process)X +3439(scheduler)X +3775(to)X +3865(interact)X +4134(more)X +555 2535(ef\256ciently)N +900(with)X +1062(the)X +1180(lock)X +1338(manager,)X +1655(but)X +1777(doing)X +1979(so)X +2070(would)X +2290(have)X +2462(compromised)X +2918(our)X +3045(commitment)X +3469(to)X +3551(a)X +3607(user-level)X +3944(package.)X +3 f +555 2721(3.2.5.)N +775(The)X +928(Transaction)X +1361(Manager)X +1 f +755 2844(The)N +3 f +901(Transaction)X +1335(Manager)X +1 f +1668(provides)X +1965(the)X +2084(standard)X +2377(interface)X +2680(of)X +2 f +2768(txn_begin)X +1 f +3084(,)X +2 f +3125(txn_commit)X +1 f +3499(,)X +3540(and)X +2 f +3676(txn_abort)X +1 f +3987(.)X +4047(It)X +4116(keeps)X +555 2934(track)N +742(of)X +835(all)X +941(active)X +1159(transactions,)X 
+1588(assigns)X +1845(unique)X +2089(transaction)X +2467(identi\256ers,)X +2833(and)X +2974(directs)X +3213(the)X +3336(abort)X +3526(and)X +3667(commit)X +3936(processing.)X +555 3024(When)N +772(a)X +2 f +833(txn_begin)X +1 f +1174(is)X +1252(issued,)X +1497(the)X +3 f +1620(Transaction)X +2058(Manager)X +1 f +2395(assigns)X +2651(the)X +2773(next)X +2935(available)X +3249(transaction)X +3625(identi\256er,)X +3958(allocates)X +4263(a)X +555 3114(per-process)N +948(transaction)X +1322(structure)X +1625(in)X +1709(shared)X +1941(memory,)X +2249(increments)X +2622(the)X +2741(count)X +2940(of)X +3028(active)X +3241(transactions,)X +3665(and)X +3802(returns)X +4046(the)X +4165(new)X +555 3204(transaction)N +937(identi\256er)X +1256(to)X +1348(the)X +1476(calling)X +1724(process.)X +2034(The)X +2188(in-memory)X +2573(transaction)X +2954(structure)X +3264(contains)X +3560(a)X +3625(pointer)X +3881(into)X +4034(the)X +4161(lock)X +555 3294(table)N +734(for)X +851(locks)X +1043(held)X +1204(by)X +1307(this)X +1445(transaction,)X +1840(the)X +1961(last)X +2095(log)X +2220(sequence)X +2538(number,)X +2826(a)X +2885(transaction)X +3260(state)X +3430(\()X +2 f +3457(idle)X +1 f +(,)S +2 f +3620(running)X +1 f +3873(,)X +2 f +3915(aborting)X +1 f +4190(,)X +4232(or)X +2 f +555 3384(committing\))N +1 f +942(,)X +982(an)X +1078(error)X +1255(code,)X +1447(and)X +1583(a)X +1639(semaphore)X +2007(identi\256er.)X +755 3507(At)N +859(commit,)X +1147(the)X +3 f +1269(Transaction)X +1706(Manager)X +1 f +2042(calls)X +2 f +2213(log_commit)X +1 f +2615(to)X +2700(record)X +2929(the)X +3050(end)X +3189(of)X +3279(transaction)X +3654(and)X +3793(to)X +3878(\257ush)X +4056(the)X +4177(log.)X +555 3597(Then)N +743(it)X +810(directs)X +1047(the)X +3 f +1168(Lock)X +1364(Manager)X +1 f +1699(to)X +1784(release)X +2031(all)X +2134(locks)X +2325(associated)X +2677(with)X +2841(the)X +2961(given)X +3161(transaction.)X +3575(If)X +3651(a)X +3709(transaction)X +4083(aborts,)X +555 3687(the)N +3 f 
+680(Transaction)X +1120(Manager)X +1 f +1459(calls)X +1633(on)X +2 f +1739(log_unroll)X +1 f +2102(to)X +2190(read)X +2355(the)X +2479(transaction's)X +2915(log)X +3043(records)X +3306(and)X +3448(undo)X +3634(any)X +3776(modi\256cations)X +4237(to)X +555 3777(the)N +673(database.)X +1010(As)X +1119(in)X +1201(the)X +1319(commit)X +1583(case,)X +1762(it)X +1826(then)X +1984(calls)X +2 f +2151(lock_unlock_all)X +1 f +2683(to)X +2765(release)X +3009(the)X +3127(transaction's)X +3557(locks.)X +3 f +555 3963(3.2.6.)N +775(The)X +928(Record)X +1198(Manager)X +1 f +755 4086(The)N +3 f +919(Record)X +1208(Manager)X +1 f +1559(supports)X +1869(the)X +2006(abstraction)X +2397(of)X +2503(reading)X +2783(and)X +2938(writing)X +3208(records)X +3484(to)X +3585(a)X +3660(database.)X +3996(We)X +4147(have)X +555 4176(modi\256ed)N +861(the)X +1101(database)X +1399(access)X +1626(routines)X +3 f +1905(db)X +1 f +1993(\(3\))X +2108([BSD91])X +2418(to)X +2501(call)X +2638(the)X +2757(log,)X +2900(lock,)X +3079(and)X +3216(buffer)X +3434(managers.)X +3803(In)X +3891(order)X +4082(to)X +4165(pro-)X +555 4266(vide)N +718(functionality)X +1152(to)X +1239(perform)X +1523(undo)X +1708(and)X +1849(redo,)X +2037(the)X +3 f +2160(Record)X +2434(Manager)X +1 f +2770(de\256nes)X +3021(a)X +3081(collection)X +3421(of)X +3512(log)X +3638(record)X +3868(types)X +4061(and)X +4201(the)X +555 4356(associated)N +920(undo)X +1115(and)X +1266(redo)X +1444(routines.)X +1777(The)X +3 f +1937(Log)X +2105(Manager)X +1 f +2452(performs)X +2777(a)X +2848(table)X +3039(lookup)X +3296(on)X +3411(the)X +3543(record)X +3783(type)X +3955(to)X +4051(call)X +4201(the)X +555 4446(appropriate)N +951(routines.)X +1299(For)X +1440(example,)X +1762(the)X +1890(B-tree)X +2121(access)X +2356(method)X +2625(requires)X +2913(two)X +3062(log)X +3193(record)X +3428(types:)X +3648(insert)X +3855(and)X +4000(delete.)X +4241(A)X +555 4536(replace)N +808(operation)X +1131(is)X +1204(implemented)X +1642(as)X +1729(a)X
+1785(delete)X +1997(followed)X +2302(by)X +2402(an)X +2498(insert)X +2696(and)X +2832(is)X +2905(logged)X +3143(accordingly.)X +3 f +555 4722(3.3.)N +715(Application)X +1134(Architectures)X +1 f +755 4845(The)N +907(structure)X +1215(of)X +1309(LIBTP)X +1558(allows)X +1794(application)X +2177(designers)X +2507(to)X +2596(trade)X +2784(off)X +2905(performance)X +3339(and)X +3481(protection.)X +3872(Since)X +4076(a)X +4138(large)X +555 4935(portion)N +810(of)X +901(LIBTP's)X +1205(functionality)X +1638(is)X +1715(provided)X +2024(by)X +2128(managing)X +2468(structures)X +2804(in)X +2889(shared)X +3122(memory,)X +3432(its)X +3530(structures)X +3865(are)X +3987(subject)X +4237(to)X +555 5025(corruption)N +926(by)X +1043(applications)X +1467(when)X +1678(the)X +1813(library)X +2064(is)X +2154(linked)X +2391(directly)X +2673(with)X +2852(the)X +2987(application.)X +3420(For)X +3568(this)X +3720(reason,)X +3987(LIBTP)X +4246(is)X +555 5115(designed)N +864(to)X +950(allow)X +1152(compilation)X +1558(into)X +1706(a)X +1766(separate)X +2053(server)X +2273(process)X +2537(which)X +2756(may)X +2917(be)X +3016(accessed)X +3321(via)X +3442(a)X +3501(socket)X +3729(interface.)X +4094(In)X +4184(this)X +555 5205(way)N +712(LIBTP's)X +1015(data)X +1172(structures)X +1507(are)X +1629(protected)X +1951(from)X +2130(application)X +2509(code,)X +2704(but)X +2829(communication)X +3349(overhead)X +3666(is)X +3741(increased.)X +4107(When)X +555 5295(applications)N +975(are)X +1107(trusted,)X +1377(LIBTP)X +1631(may)X +1801(be)X +1909(compiled)X +2239(directly)X +2516(into)X +2672(the)X +2802(application)X +3190(providing)X +3533(improved)X +3872(performance.)X +555 5385(Figures)N +815(two)X +955(and)X +1091(three)X +1272(show)X +1461(the)X +1579(two)X +1719(alternate)X +2016(application)X +2392(architectures.)X +755 5508(There)N +964(are)X +1084(potentially)X +1447(two)X +1588(modes)X +1818(in)X +1901(which)X +2118(one)X +2255(might)X +2462(use)X +2590(LIBTP)X +2833(in)X +2916(a)X 
+2972(server)X +3189(based)X +3392(architecture.)X +3832(In)X +3919(the)X +4037(\256rst,)X +4201(the)X +555 5598(server)N +778(would)X +1004(provide)X +1275(the)X +1399(capability)X +1741(to)X +1829(respond)X +2109(to)X +2197(requests)X +2486(to)X +2574(each)X +2747(of)X +2839(the)X +2962(low)X +3107(level)X +3288(modules)X +3584(\(lock,)X +3794(log,)X +3941(buffer,)X +4183(and)X +555 5688(transaction)N +944(managers\).)X +1356(Unfortunately,)X +1863(the)X +1998(performance)X +2442(of)X +2546(such)X +2730(a)X +2803(system)X +3062(is)X +3152(likely)X +3371(to)X +3470(be)X +3583(blindingly)X +3947(slow)X +4134(since)X + +7 p +%%Page: 7 7 +10 s 10 xH 0 xS 1 f +3 f +1 f +1 Dt +1864 1125 MXY +15 -26 Dl +-15 10 Dl +-14 -10 Dl +14 26 Dl +0 -266 Dl +1315 1125 MXY +15 -26 Dl +-15 10 Dl +-14 -10 Dl +14 26 Dl +0 -266 Dl +3 Dt +1133 1125 MXY +0 798 Dl +931 0 Dl +0 -798 Dl +-931 0 Dl +1 Dt +1266 1257 MXY +0 133 Dl +665 0 Dl +0 -133 Dl +-665 0 Dl +3 f +8 s +1513 1351(driver)N +1502 1617(LIBTP)N +1266 1390 MXY +0 400 Dl +665 0 Dl +0 -400 Dl +-665 0 Dl +3 Dt +1133 726 MXY +0 133 Dl +931 0 Dl +0 -133 Dl +-931 0 Dl +1 f +1029 1098(txn_abort)N +964 1015(txn_commit)N +1018 932(txn_begin)N +1910 1015(db_ops)N +3 f +1308 820(Application)N +1645(Program)X +1398 1218(Server)N +1594(Process)X +1 f +1390 986(socket)N +1569(interface)X +1 Dt +1848 967 MXY +-23 -14 Dl +8 14 Dl +-8 15 Dl +23 -15 Dl +-50 0 Dl +1324 MX +23 15 Dl +-9 -15 Dl +9 -14 Dl +-23 14 Dl +50 0 Dl +3 Dt +2862 859 MXY +0 1064 Dl +932 0 Dl +0 -1064 Dl +-932 0 Dl +1 Dt +3178 1390 MXY +24 -12 Dl +-17 0 Dl +-8 -15 Dl +1 27 Dl +150 -265 Dl +3494 1390 MXY +0 -27 Dl +-8 15 Dl +-16 1 Dl +24 11 Dl +-166 -265 Dl +3 f +3232 1617(LIBTP)N +2995 1390 MXY +0 400 Dl +666 0 Dl +0 -400 Dl +-666 0 Dl +992 MY +0 133 Dl +666 0 Dl +0 -133 Dl +-666 0 Dl +3168 1086(Application)N +1 f +2939 1201(txn_begin)N +2885 1284(txn_commit)N +2950 1368(txn_abort)N +3465 1284(db_ops)N +3 f +3155 766(Single)N +3339(Process)X +3 Dt +-1 Ds +811 2100(Figure)N 
+1023(2:)X +1107(Server)X +1318(Architecture.)X +1 f +1727(In)X +1811(this)X +1934(con\256guration,)X +811 2190(the)N +916(library)X +1113(is)X +1183(loaded)X +1380(into)X +1507(a)X +1562(server)X +1744(process)X +1962(which)X +2145(is)X +2214(ac-)X +811 2280(cessed)N +993(via)X +1087(a)X +1131(socket)X +1310(interface.)X +3 f +2563 2100(Figure)N +2803(3:)X +2914(Single)X +3140(Process)X +3403(Architecture.)X +1 f +3839(In)X +3950(this)X +2563 2190(con\256guration,)N +2948(the)X +3053(library)X +3250(routines)X +3483(are)X +3587(loaded)X +3784(as)X +3864(part)X +3990(of)X +2563 2280(the)N +2657(application)X +2957(and)X +3065(accessed)X +3303(via)X +3397(a)X +3441(subroutine)X +3727(interface.)X +10 s +10 f +555 2403(h)N +579(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)X +1 f +555 2679(modifying)N +909(a)X +966(piece)X +1157(of)X +1245(data)X +1400(would)X +1621(require)X +1870(three)X +2051(or)X +2138(possibly)X +2424(four)X +2578(separate)X +2862(communications:)X +3433(one)X +3569(to)X +3651(lock)X +3809(the)X +3927(data,)X +4101(one)X +4237(to)X +555 2769(obtain)N +781(the)X +905(data,)X +1085(one)X +1227(to)X +1315(log)X +1443(the)X +1567(modi\256cation,)X +2017(and)X +2159(possibly)X +2451(one)X +2593(to)X +2681(transmit)X +2969(the)X +3093(modi\256ed)X +3403(data.)X +3583(Figure)X +3817(four)X +3976(shows)X +4201(the)X +555 2859(relative)N +826(performance)X +1263(for)X +1387(retrieving)X +1728(a)X +1793(single)X +2013(record)X +2248(using)X +2450(the)X +2577(record)X +2812(level)X +2997(call)X +3142(versus)X +3376(using)X +3578(the)X +3705(lower)X +3917(level)X +4102(buffer)X +555 2949(management)N +987(and)X +1125(locking)X +1387(calls.)X +1616(The)X +1763(2:1)X +1887(ratio)X +2056(observed)X +2367(in)X +2450(the)X +2569(single)X +2781(process)X +3043(case)X +3203(re\257ects)X +3456(the)X +3575(additional)X +3916(overhead)X +4232(of)X +555 3039(parsing)N +819(eight)X +1006(commands)X +1380(rather)X 
+1595(than)X +1760(one)X +1903(while)X +2108(the)X +2233(3:1)X +2362(ratio)X +2536(observed)X +2853(in)X +2942(the)X +3067(client/server)X +3491(architecture)X +3898(re\257ects)X +4157(both)X +555 3129(the)N +679(parsing)X +941(and)X +1083(the)X +1207(communication)X +1731(overhead.)X +2118(Although)X +2445(there)X +2631(may)X +2794(be)X +2895(applications)X +3307(which)X +3528(could)X +3731(tolerate)X +3997(such)X +4169(per-)X +555 3219(formance,)N +904(it)X +973(seems)X +1194(far)X +1309(more)X +1499(feasible)X +1774(to)X +1861(support)X +2126(a)X +2187(higher)X +2417(level)X +2597(interface,)X +2923(such)X +3094(as)X +3185(that)X +3329(provided)X +3638(by)X +3742(a)X +3802(query)X +4009(language)X +555 3309(\()N +2 f +582(e.g.)X +1 f +718(SQL)X +889([SQL86]\).)X +755 3432(Although)N +1081(LIBTP)X +1327(does)X +1498(not)X +1624(have)X +1800(an)X +1900(SQL)X +2075(parser,)X +2316(we)X +2433(have)X +2608(built)X +2777(a)X +2836(server)X +3056(application)X +3435(using)X +3631(the)X +3752(toolkit)X +3983(command)X +555 3522(language)N +882(\(TCL\))X +1124([OUST90].)X +1544(The)X +1706(server)X +1940(supports)X +2248(a)X +2321(command)X +2674(line)X +2831(interface)X +3150(similar)X +3409(to)X +3508(the)X +3643(subroutine)X +4017(interface)X +555 3612(de\256ned)N +811(in)X +3 f +893(db)X +1 f +981(\(3\).)X +1135(Since)X +1333(it)X +1397(is)X +1470(based)X +1673(on)X +1773(TCL,)X +1964(it)X +2028(provides)X +2324(control)X +2571(structures)X +2903(as)X +2990(well.)X +3 f +555 3798(4.)N +655(Implementation)X +1 f +3 f +555 3984(4.1.)N +715(Locking)X +1014(and)X +1162(Deadlock)X +1502(Detection)X +1 f +755 4107(LIBTP)N +1007(uses)X +1175(two-phase)X +1535(locking)X +1805(for)X +1929(user)X +2093(data.)X +2297(Strictly)X +2562(speaking,)X +2897(the)X +3024(two)X +3173(phases)X +3416(in)X +3507(two-phase)X +3866(locking)X +4135(are)X +4263(a)X +3 f +555 4197(grow)N +1 f +756(phase,)X +986(during)X +1221(which)X +1443(locks)X +1638(are)X +1763(acquired,)X +2086(and)X
+2228(a)X +3 f +2290(shrink)X +1 f +2537(phase,)X +2766(during)X +3001(which)X +3223(locks)X +3418(are)X +3543(released.)X +3873(No)X +3997(lock)X +4161(may)X +555 4287(ever)N +720(be)X +822(acquired)X +1124(during)X +1358(the)X +1481(shrink)X +1706(phase.)X +1954(The)X +2104(grow)X +2294(phase)X +2502(lasts)X +2669(until)X +2840(the)X +2963(\256rst)X +3112(release,)X +3381(which)X +3602(marks)X +3823(the)X +3946(start)X +4109(of)X +4201(the)X +555 4377(shrink)N +780(phase.)X +1028(In)X +1120(practice,)X +1420(the)X +1543(grow)X +1733(phase)X +1941(lasts)X +2108(for)X +2227(the)X +2350(duration)X +2642(of)X +2734(a)X +2795(transaction)X +3172(in)X +3259(LIBTP)X +3506(and)X +3647(in)X +3734(commercial)X +4138(data-)X +555 4467(base)N +721(systems.)X +1037(The)X +1184(shrink)X +1406(phase)X +1611(takes)X +1798(place)X +1990(during)X +2221(transaction)X +2595(commit)X +2861(or)X +2950(abort.)X +3177(This)X +3341(means)X +3568(that)X +3710(locks)X +3901(are)X +4022(acquired)X +555 4557(on)N +655(demand)X +929(during)X +1158(the)X +1276(lifetime)X +1545(of)X +1632(a)X +1688(transaction,)X +2080(and)X +2216(held)X +2374(until)X +2540(commit)X +2804(time,)X +2986(at)X +3064(which)X +3280(point)X +3464(all)X +3564(locks)X +3753(are)X +3872(released.)X +755 4680(If)N +832(multiple)X +1121(transactions)X +1527(are)X +1649(active)X +1864(concurrently,)X +2313(deadlocks)X +2657(can)X +2792(occur)X +2994(and)X +3133(must)X +3311(be)X +3410(detected)X +3701(and)X +3840(resolved.)X +4174(The)X +555 4770(lock)N +715(table)X +893(can)X +1027(be)X +1125(thought)X +1391(of)X +1480(as)X +1569(a)X +1627(representation)X +2104(of)X +2193(a)X +2251(directed)X +2532(graph.)X +2777(The)X +2924(nodes)X +3133(in)X +3216(the)X +3335(graph)X +3539(are)X +3659(transactions.)X +4103(Edges)X +555 4860(represent)N +878(the)X +3 f +1004(waits-for)X +1 f +1340(relation)X +1613(between)X +1909(transactions;)X +2342(if)X +2419(transaction)X +2 f +2799(A)X +1 f +2876(is)X +2957(waiting)X +3225(for)X 
+3347(a)X +3411(lock)X +3577(held)X +3743(by)X +3851(transaction)X +2 f +4230(B)X +1 f +4279(,)X +555 4950(then)N +716(a)X +775(directed)X +1057(edge)X +1232(exists)X +1437(from)X +2 f +1616(A)X +1 f +1687(to)X +2 f +1771(B)X +1 f +1842(in)X +1926(the)X +2046(graph.)X +2291(A)X +2371(deadlock)X +2683(exists)X +2887(if)X +2958(a)X +3016(cycle)X +3208(appears)X +3476(in)X +3560(the)X +3680(graph.)X +3925(By)X +4040(conven-)X +555 5040(tion,)N +719(no)X +819(transaction)X +1191(ever)X +1350(waits)X +1539(for)X +1653(a)X +1709(lock)X +1867(it)X +1931(already)X +2188(holds,)X +2401(so)X +2492(re\257exive)X +2793(edges)X +2996(are)X +3115(impossible.)X +755 5163(A)N +836(distinguished)X +1285(process)X +1549(monitors)X +1856(the)X +1977(lock)X +2138(table,)X +2337(searching)X +2668(for)X +2785(cycles.)X +3048(The)X +3195(frequency)X +3539(with)X +3703(which)X +3921(this)X +4058(process)X +555 5253(runs)N +716(is)X +792(user-settable;)X +1243(for)X +1360(the)X +1481(multi-user)X +1833(tests)X +1998(discussed)X +2328(in)X +2413(section)X +2663(5.1.2,)X +2866(it)X +2933(has)X +3063(been)X +3238(set)X +3350(to)X +3435(wake)X +3628(up)X +3731(every)X +3932(second,)X +4197(but)X +555 5343(more)N +742(sophisticated)X +1182(schedules)X +1516(are)X +1636(certainly)X +1938(possible.)X +2261(When)X +2474(a)X +2531(cycle)X +2722(is)X +2796(detected,)X +3105(one)X +3242(of)X +3330(the)X +3449(transactions)X +3853(in)X +3936(the)X +4055(cycle)X +4246(is)X +555 5433(nominated)N +917(and)X +1057(aborted.)X +1362(When)X +1578(the)X +1700(transaction)X +2076(aborts,)X +2315(it)X +2382(rolls)X +2547(back)X +2722(its)X +2820(changes)X +3102(and)X +3241(releases)X +3519(its)X +3617(locks,)X +3829(thereby)X +4093(break-)X +555 5523(ing)N +677(the)X +795(cycle)X +985(in)X +1067(the)X +1185(graph.)X + +8 p +%%Page: 8 8 +10 s 10 xH 0 xS 1 f +3 f +1 f +4 Ds +1 Dt +1866 865 MXY +1338 0 Dl +1866 1031 MXY +1338 0 Dl +1866 1199 MXY +1338 0 Dl +1866 1366 MXY +1338 0 Dl +1866 1533 MXY +1338 0 Dl +1866 
1701 MXY +1338 0 Dl +-1 Ds +5 Dt +1866 1868 MXY +1338 0 Dl +1 Dt +1 Di +2981 MX + 2981 1868 lineto + 2981 1575 lineto + 3092 1575 lineto + 3092 1868 lineto + 2981 1868 lineto +closepath 21 2981 1575 3092 1868 Dp +2646 MX + 2646 1868 lineto + 2646 949 lineto + 2758 949 lineto + 2758 1868 lineto + 2646 1868 lineto +closepath 14 2646 949 2758 1868 Dp +2312 MX + 2312 1868 lineto + 2312 1701 lineto + 2423 1701 lineto + 2423 1868 lineto + 2312 1868 lineto +closepath 3 2312 1701 2423 1868 Dp +1977 MX + 1977 1868 lineto + 1977 1512 lineto + 2089 1512 lineto + 2089 1868 lineto + 1977 1868 lineto +closepath 19 1977 1512 2089 1868 Dp +3 f +2640 2047(Client/Server)N +1957(Single)X +2185(Process)X +7 s +2957 1957(record)N +2570(component)X +2289(record)X +1890(components)X +1733 1724(.1)N +1733 1556(.2)N +1733 1389(.3)N +1733 1222(.4)N +1733 1055(.5)N +1733 889(.6)N +1590 726(Elapsed)N +1794(Time)X +1613 782(\(in)N +1693(seconds\))X +3 Dt +-1 Ds +8 s +555 2255(Figure)N +756(4:)X +829(Comparison)X +1187(of)X +1260(High)X +1416(and)X +1540(Low)X +1681(Level)X +1850(Interfaces.)X +1 f +2174(Elapsed)X +2395(time)X +2528(in)X +2597(seconds)X +2818(to)X +2887(perform)X +3111(a)X +3158(single)X +3330(record)X +3511(retrieval)X +3742(from)X +3885(a)X +3932(command)X +4203(line)X +555 2345(\(rather)N +751(than)X +888(a)X +943(procedural)X +1241(interface\))X +1510(is)X +1579(shown)X +1772(on)X +1862(the)X +1966(y)X +2024(axis.)X +2185(The)X +2310(``component'')X +2704(numbers)X +2950(re\257ect)X +3135(the)X +3239(timings)X +3458(when)X +3622(the)X +3726(record)X +3914(is)X +3983(retrieved)X +4235(by)X +555 2435(separate)N +785(calls)X +924(to)X +996(the)X +1096(lock)X +1228(manager)X +1469(and)X +1583(buffer)X +1760(manager)X +2001(while)X +2165(the)X +2264(``record'')X +2531(timings)X +2745(were)X +2889(obtained)X +3130(by)X +3215(using)X +3375(a)X +3424(single)X +3598(call)X +3711(to)X +3782(the)X +3881(record)X +4064(manager.)X +555 2525(The)N +674(2:1)X +776(ratio)X +913(observed)X 
+1163(for)X +1257(the)X +1355(single)X +1528(process)X +1739(case)X +1868(is)X +1930(a)X +1977(re\257ection)X +2237(of)X +2309(the)X +2406(parsing)X +2613(overhead)X +2865(for)X +2958(executing)X +3225(eight)X +3372(separate)X +3599(commands)X +3895(rather)X +4062(than)X +4191(one.)X +555 2615(The)N +673(additional)X +948(factor)X +1115(of)X +1187(one)X +1298(re\257ected)X +1536(in)X +1605(the)X +1702(3:1)X +1803(ratio)X +1939(for)X +2031(the)X +2127(client/server)X +2460(architecture)X +2794(is)X +2855(due)X +2965(to)X +3033(the)X +3129(communication)X +3545(overhead.)X +3828(The)X +3945(true)X +4062(ratios)X +4222(are)X +555 2705(actually)N +775(worse)X +945(since)X +1094(the)X +1190(component)X +1492(timings)X +1703(do)X +1785(not)X +1884(re\257ect)X +2060(the)X +2155(search)X +2334(times)X +2490(within)X +2671(each)X +2804(page)X +2941(or)X +3011(the)X +3106(time)X +3237(required)X +3466(to)X +3533(transmit)X +3760(the)X +3855(page)X +3992(between)X +4221(the)X +555 2795(two)N +667(processes.)X +10 s +10 f +555 2885(h)N +579(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)X +3 f +555 3161(4.2.)N +715(Group)X +961(Commit)X +1 f +755 3284(Since)N +959(the)X +1083(log)X +1211(must)X +1392(be)X +1494(\257ushed)X +1751(to)X +1839(disk)X +1997(at)X +2080(commit)X +2349(time,)X +2536(disk)X +2694(bandwidth)X +3057(fundamentally)X +3545(limits)X +3751(the)X +3874(rate)X +4020(at)X +4103(which)X +555 3374(transactions)N +959(complete.)X +1314(Since)X +1513(most)X +1688(transactions)X +2091(write)X +2276(only)X +2438(a)X +2494(few)X +2635(small)X +2828(records)X +3085(to)X +3167(the)X +3285(log,)X +3427(the)X +3545(last)X +3676(page)X +3848(of)X +3935(the)X +4053(log)X +4175(will)X +555 3464(be)N +658(\257ushed)X +916(once)X +1095(by)X +1202(every)X +1408(transaction)X +1787(which)X +2010(writes)X +2233(to)X +2322(it.)X +2433(In)X +2527(the)X +2652(naive)X +2853(implementation,)X +3402(these)X +3593(\257ushes)X +3841(would)X 
+4067(happen)X +555 3554(serially.)N +755 3677(LIBTP)N +1008(uses)X +3 f +1177(group)X +1412(commit)X +1 f +1702([DEWI84])X +2077(in)X +2170(order)X +2371(to)X +2464(amortize)X +2775(the)X +2903(cost)X +3062(of)X +3159(one)X +3305(synchronous)X +3740(disk)X +3903(write)X +4098(across)X +555 3767(multiple)N +851(transactions.)X +1304(Group)X +1539(commit)X +1812(provides)X +2117(a)X +2182(way)X +2345(for)X +2468(a)X +2533(group)X +2749(of)X +2845(transactions)X +3257(to)X +3348(commit)X +3621(simultaneously.)X +4174(The)X +555 3857(\256rst)N +709(several)X +967(transactions)X +1380(to)X +1472(commit)X +1745(write)X +1939(their)X +2115(changes)X +2403(to)X +2494(the)X +2621(in-memory)X +3006(log)X +3137(page,)X +3338(then)X +3505(sleep)X +3699(on)X +3808(a)X +3873(distinguished)X +555 3947(semaphore.)N +966(Later,)X +1179(a)X +1238(committing)X +1629(transaction)X +2004(\257ushes)X +2249(the)X +2370(page)X +2545(to)X +2630(disk,)X +2805(and)X +2943(wakes)X +3166(up)X +3268(all)X +3370(its)X +3467(sleeping)X +3756(peers.)X +3988(The)X +4135(point)X +555 4037(at)N +635(which)X +853(changes)X +1134(are)X +1255(actually)X +1531(written)X +1780(is)X +1855(determined)X +2238(by)X +2340(three)X +2523(thresholds.)X +2914(The)X +3061(\256rst)X +3207(is)X +3281(the)X +2 f +3400(group)X +3612(threshold)X +1 f +3935(and)X +4072(de\256nes)X +555 4127(the)N +674(minimum)X +1005(number)X +1271(of)X +1359(transactions)X +1763(which)X +1979(must)X +2154(be)X +2250(active)X +2462(in)X +2544(the)X +2662(system)X +2904(before)X +3130(transactions)X +3533(are)X +3652(forced)X +3878(to)X +3960(participate)X +555 4217(in)N +646(a)X +711(group)X +927(commit.)X +1240(The)X +1394(second)X +1646(is)X +1728(the)X +2 f +1855(wait)X +2021(threshold)X +1 f +2352(which)X +2577(is)X +2658(expressed)X +3003(as)X +3098(the)X +3224(percentage)X +3601(of)X +3696(active)X +3916(transactions)X +555 4307(waiting)N +826(to)X +919(be)X +1026(committed.)X +1439(The)X +1595(last)X +1737(is)X +1821(the)X +2 f 
+1950(logdelay)X +2257(threshold)X +1 f +2590(which)X +2816(indicates)X +3131(how)X +3299(much)X +3507(un\257ushed)X +3848(log)X +3980(should)X +4223(be)X +555 4397(allowed)N +829(to)X +911(accumulate)X +1297(before)X +1523(a)X +1579(waiting)X +1839(transaction's)X +2289(commit)X +2553(record)X +2779(is)X +2852(\257ushed.)X +755 4520(Group)N +981(commit)X +1246(can)X +1379(substantially)X +1803(improve)X +2090(performance)X +2517(for)X +2631(high-concurrency)X +3218(environments.)X +3714(If)X +3788(only)X +3950(a)X +4006(few)X +4147(tran-)X +555 4610(sactions)N +836(are)X +957(running,)X +1248(it)X +1314(is)X +1389(unlikely)X +1673(to)X +1757(improve)X +2046(things)X +2263(at)X +2343(all.)X +2485(The)X +2632(crossover)X +2962(point)X +3148(is)X +3223(the)X +3343(point)X +3529(at)X +3609(which)X +3827(the)X +3947(transaction)X +555 4700(commit)N +823(rate)X +968(is)X +1045(limited)X +1295(by)X +1399(the)X +1521(bandwidth)X +1883(of)X +1974(the)X +2096(device)X +2330(on)X +2434(which)X +2654(the)X +2776(log)X +2902(resides.)X +3189(If)X +3267(processes)X +3599(are)X +3722(trying)X +3937(to)X +4023(\257ush)X +4201(the)X +555 4790(log)N +677(faster)X +876(than)X +1034(the)X +1152(log)X +1274(disk)X +1427(can)X +1559(accept)X +1785(data,)X +1959(then)X +2117(group)X +2324(commit)X +2588(will)X +2732(increase)X +3016(the)X +3134(commit)X +3398(rate.)X +3 f +555 4976(4.3.)N +715(Kernel)X +971(Intervention)X +1418(for)X +1541(Synchronization)X +1 f +755 5099(Since)N +954(LIBTP)X +1197(uses)X +1356(data)X +1511(in)X +1594(shared)X +1825(memory)X +2113(\()X +2 f +2140(e.g.)X +1 f +2277(the)X +2395(lock)X +2553(table)X +2729(and)X +2865(buffer)X +3082(pool\))X +3271(it)X +3335(must)X +3510(be)X +3606(possible)X +3888(for)X +4002(a)X +4058(process)X +555 5189(to)N +640(acquire)X +900(exclusive)X +1226(access)X +1454(to)X +1538(shared)X +1770(data)X +1926(in)X +2010(order)X +2202(to)X +2286(prevent)X +2549(corruption.)X +2945(In)X +3034(addition,)X +3338(the)X +3458(process)X 
+3721(manager)X +4020(must)X +4197(put)X +555 5279(processes)N +886(to)X +971(sleep)X +1159(when)X +1356(the)X +1477(lock)X +1638(or)X +1728(buffer)X +1948(they)X +2109(request)X +2364(is)X +2440(in)X +2525(use)X +2655(by)X +2758(some)X +2950(other)X +3138(process.)X +3441(In)X +3530(the)X +3650(LIBTP)X +3894(implementa-)X +555 5385(tion)N +705(under)X +914(Ultrix)X +1131(4.0)X +7 s +5353(2)Y +10 s +5385(,)Y +1305(we)X +1424(use)X +1556(System)X +1816(V)X +1899(semaphores)X +2303(to)X +2390(provide)X +2660(this)X +2800(synchronization.)X +3377(Semaphores)X +3794(implemented)X +4237(in)X +555 5475(this)N +701(fashion)X +968(turn)X +1128(out)X +1261(to)X +1354(be)X +1461(an)X +1568(expensive)X +1920(choice)X +2161(for)X +2285(synchronization,)X +2847(because)X +3132(each)X +3310(access)X +3546(traps)X +3732(to)X +3824(the)X +3952(kernel)X +4183(and)X +8 s +10 f +555 5547(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)N +5 s +1 f +727 5625(2)N +8 s +763 5650(Ultrix)N +932(and)X +1040(DEC)X +1184(are)X +1277(trademarks)X +1576(of)X +1645(Digital)X +1839(Equipment)X +2136(Corporation.)X + +9 p +%%Page: 9 9 +8 s 8 xH 0 xS 1 f +10 s +3 f +1 f +555 630(executes)N +852(atomically)X +1210(there.)X +755 753(On)N +878(architectures)X +1314(that)X +1459(support)X +1724(atomic)X +1967(test-and-set,)X +2382(a)X +2443(much)X +2646(better)X +2854(choice)X +3089(would)X +3314(be)X +3415(to)X +3502(attempt)X +3767(to)X +3854(obtain)X +4079(a)X +4139(spin-)X +555 843(lock)N +714(with)X +877(a)X +934(test-and-set,)X +1345(and)X +1482(issue)X +1663(a)X +1720(system)X +1963(call)X +2100(only)X +2263(if)X +2333(the)X +2452(spinlock)X +2744(is)X +2818(unavailable.)X +3249(Since)X +3447(virtually)X +3738(all)X +3838(semaphores)X +4237(in)X +555 933(LIBTP)N +801(are)X +924(uncontested)X +1330(and)X +1469(are)X +1591(held)X +1752(for)X +1869(very)X +2035(short)X +2218(periods)X +2477(of)X +2567(time,)X +2752(this)X +2890(would)X +3113(improve)X +3403(performance.)X +3873(For)X +4007(example,)X +555 
1023(processes)N +885(must)X +1062(acquire)X +1321(exclusive)X +1646(access)X +1874(to)X +1958(buffer)X +2177(pool)X +2341(metadata)X +2653(in)X +2737(order)X +2929(to)X +3013(\256nd)X +3159(and)X +3297(pin)X +3421(a)X +3479(buffer)X +3698(in)X +3781(shared)X +4012(memory.)X +555 1113(This)N +721(semaphore)X +1093(is)X +1170(requested)X +1502(most)X +1681(frequently)X +2034(in)X +2119(LIBTP.)X +2404(However,)X +2742(once)X +2917(it)X +2984(is)X +3060(acquired,)X +3380(only)X +3545(a)X +3604(few)X +3748(instructions)X +4144(must)X +555 1203(be)N +656(executed)X +966(before)X +1196(it)X +1264(is)X +1341(released.)X +1669(On)X +1791(one)X +1931(architecture)X +2335(for)X +2453(which)X +2673(we)X +2791(were)X +2972(able)X +3130(to)X +3216(gather)X +3441(detailed)X +3719(pro\256ling)X +4018(informa-)X +555 1293(tion,)N +729(the)X +857(cost)X +1015(of)X +1111(the)X +1238(semaphore)X +1615(calls)X +1791(accounted)X +2146(for)X +2269(25%)X +2445(of)X +2541(the)X +2668(total)X +2839(time)X +3010(spent)X +3208(updating)X +3517(the)X +3644(metadata.)X +4003(This)X +4174(was)X +555 1383(fairly)N +749(consistent)X +1089(across)X +1310(most)X +1485(of)X +1572(the)X +1690(critical)X +1933(sections.)X +755 1506(In)N +848(an)X +950(attempt)X +1216(to)X +1304(quantify)X +1597(the)X +1720(overhead)X +2040(of)X +2132(kernel)X +2358(synchronization,)X +2915(we)X +3034(ran)X +3162(tests)X +3329(on)X +3434(a)X +3495(version)X +3756(of)X +3848(4.3BSD-Reno)X +555 1596(which)N +786(had)X +937(been)X +1123(modi\256ed)X +1441(to)X +1537(support)X +1811(binary)X +2050(semaphore)X +2432(facilities)X +2742(similar)X +2998(to)X +3094(those)X +3297(described)X +3639(in)X +3735([POSIX91].)X +4174(The)X +555 1686(hardware)N +880(platform)X +1181(consisted)X +1504(of)X +1595(an)X +1695(HP300)X +1941(\(33MHz)X +2237(MC68030\))X +2612(workstation)X +3014(with)X +3180(16MBytes)X +3537(of)X +3628(main)X +3812(memory,)X +4123(and)X +4263(a)X +555 1776(600MByte)N +920(HP7959)X +1205(SCSI)X +1396(disk)X 
+1552(\(17)X +1682(ms)X +1798(average)X +2072(seek)X +2237(time\).)X +2468(We)X +2602(ran)X +2727(three)X +2910(sets)X +3052(of)X +3141(comparisons)X +3568(which)X +3786(are)X +3907(summarized)X +555 1866(in)N +645(\256gure)X +860(\256ve.)X +1028(In)X +1123(each)X +1299(comparison)X +1701(we)X +1823(ran)X +1954(two)X +2102(tests,)X +2292(one)X +2436(using)X +2637(hardware)X +2965(spinlocks)X +3295(and)X +3438(the)X +3563(other)X +3755(using)X +3955(kernel)X +4183(call)X +555 1956(synchronization.)N +1135(Since)X +1341(the)X +1467(test)X +1606(was)X +1758(run)X +1892(single-user,)X +2291(none)X +2474(of)X +2568(the)X +2818(locks)X +3014(were)X +3198(contested.)X +3568(In)X +3662(the)X +3787(\256rst)X +3938(two)X +4085(sets)X +4232(of)X +555 2046(tests,)N +743(we)X +863(ran)X +992(the)X +1116(full)X +1253(transaction)X +1631(processing)X +2000(benchmark)X +2383(described)X +2717(in)X +2805(section)X +3058(5.1.)X +3223(In)X +3315(one)X +3456(case)X +3620(we)X +3739(ran)X +3867(with)X +4034(both)X +4201(the)X +555 2136(database)N +854(and)X +992(log)X +1116(on)X +1218(the)X +1338(same)X +1525(disk)X +1680(\(1)X +1769(Disk\))X +1969(and)X +2107(in)X +2191(the)X +2311(second,)X +2576(we)X +2692(ran)X +2817(with)X +2981(the)X +3101(database)X +3400(and)X +3538(log)X +3661(on)X +3762(separate)X +4047(disks)X +4232(\(2)X +555 2226(Disk\).)N +800(In)X +894(the)X +1019(last)X +1157(test,)X +1315(we)X +1436(wanted)X +1695(to)X +1784(create)X +2004(a)X +2067(CPU)X +2249(bound)X +2476(environment,)X +2928(so)X +3026(we)X +3146(used)X +3319(a)X +3381(database)X +3684(small)X +3883(enough)X +4145(to)X +4233(\256t)X +555 2316(completely)N +941(in)X +1033(the)X +1161(cache)X +1375(and)X +1521(issued)X +1751(read-only)X +2089(transactions.)X +2541(The)X +2695(results)X +2933(in)X +3024(\256gure)X +3240(\256ve)X +3389(express)X +3659(the)X +3786(kernel)X +4016(call)X +4161(syn-)X +555 2406(chronization)N +980(performance)X +1411(as)X +1502(a)X +1562(percentage)X +1935(of)X
+2026(the)X +2148(spinlock)X +2443(performance.)X +2914(For)X +3049(example,)X +3365(in)X +3451(the)X +3573(1)X +3637(disk)X +3794(case,)X +3977(the)X +4098(kernel)X +555 2496(call)N +697(implementation)X +1225(achieved)X +1537(4.4)X +1662(TPS)X +1824(\(transactions)X +2259(per)X +2387(second\))X +2662(while)X +2865(the)X +2988(semaphore)X +3361(implementation)X +3888(achieved)X +4199(4.6)X +555 2586(TPS,)N +735(and)X +874(the)X +995(relative)X +1259(performance)X +1689(of)X +1779(the)X +1900(kernel)X +2123(synchronization)X +2657(is)X +2732(96%)X +2901(that)X +3043(of)X +3132(the)X +3252(spinlock)X +3545(\(100)X +3714(*)X +3776(4.4)X +3898(/)X +3942(4.6\).)X +4111(There)X +555 2676(are)N +674(two)X +814(striking)X +1078(observations)X +1503(from)X +1679(these)X +1864(results:)X +10 f +635 2799(g)N +1 f +755(even)X +927(when)X +1121(the)X +1239(system)X +1481(is)X +1554(disk)X +1707(bound,)X +1947(the)X +2065(CPU)X +2240(cost)X +2389(of)X +2476(synchronization)X +3008(is)X +3081(noticeable,)X +3451(and)X +10 f +635 2922(g)N +1 f +755(when)X +949(we)X +1063(are)X +1182(CPU)X +1357(bound,)X +1597(the)X +1715(difference)X +2062(is)X +2135(dramatic)X +2436(\(67%\).)X +3 f +555 3108(4.4.)N +715(Transaction)X +1148(Protected)X +1499(Access)X +1747(Methods)X +1 f +755 3231(The)N +903(B-tree)X +1127(and)X +1266(\256xed)X +1449(length)X +1671(recno)X +1872(\(record)X +2127(number\))X +2421(access)X +2649(methods)X +2942(have)X +3116(been)X +3290(modi\256ed)X +3596(to)X +3680(provide)X +3947(transaction)X +555 3321(protection.)N +941(Whereas)X +1244(the)X +1363(previously)X +1722(published)X +2054(interface)X +2357(to)X +2440(the)X +2559(access)X +2786(routines)X +3065(had)X +3202(separate)X +3487(open)X +3664(calls)X +3832(for)X +3946(each)X +4114(of)X +4201(the)X +10 f +555 3507(h)N +579(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)X +1 Dt +2978 5036 MXY + 2978 5036 lineto + 2978 4662 lineto + 3093 4662 lineto + 3093 5036 
lineto + 2978 5036 lineto +closepath 21 2978 4662 3093 5036 Dp +2518 MX + 2518 5036 lineto + 2518 3960 lineto + 2633 3960 lineto + 2633 5036 lineto + 2518 5036 lineto +closepath 3 2518 3960 2633 5036 Dp +2059 MX + 2059 5036 lineto + 2059 3946 lineto + 2174 3946 lineto + 2174 5036 lineto + 2059 5036 lineto +closepath 1 2059 3946 2174 5036 Dp +3 f +7 s +2912 5141(Read-only)N +1426 3767(of)N +1487(Spinlock)X +1710(Throughput)X +1480 3710(Throughput)N +1786(as)X +1850(a)X +1892(%)X +11 s +1670 4843(20)N +1670 4614(40)N +1670 4384(60)N +1670 4155(80)N +1648 3925(100)N +7 s +2041 5141(1)N +2083(Disk)X +2490(2)X +2532(Disks)X +5 Dt +1829 5036 MXY +1494 0 Dl +4 Ds +1 Dt +1829 4806 MXY +1494 0 Dl +1829 4577 MXY +1494 0 Dl +1829 4347 MXY +1494 0 Dl +1829 4118 MXY +1494 0 Dl +1829 3888 MXY +1494 0 Dl +3 Dt +-1 Ds +8 s +555 5360(Figure)N +753(5:)X +823(Kernel)X +1028(Overhead)X +1315(for)X +1413(System)X +1625(Call)X +1756(Synchronization.)X +1 f +2254(The)X +2370(performance)X +2708(of)X +2778(the)X +2873(kernel)X +3049(call)X +3158(synchronization)X +3583(is)X +3643(expressed)X +3911(as)X +3980(a)X +4024(percentage)X +555 5450(of)N +625(the)X +720(spinlock)X +954(synchronization)X +1379(performance.)X +1749(In)X +1819(disk)X +1943(bound)X +2120(cases)X +2271(\(1)X +2341(Disk)X +2479(and)X +2588(2)X +2637(Disks\),)X +2837(we)X +2928(see)X +3026(that)X +3139(4-6%)X +3294(of)X +3364(the)X +3459(performance)X +3797(is)X +3857(lost)X +3966(due)X +4074(to)X +4140(kernel)X +555 5540(calls)N +688(while)X +846(in)X +912(the)X +1006(CPU)X +1147(bound)X +1323(case,)X +1464(we)X +1554(have)X +1690(lost)X +1799(67%)X +1932(of)X +2001(the)X +2095(performance)X +2432(due)X +2540(to)X +2606(kernel)X +2781(calls.)X + +10 p +%%Page: 10 10 +8 s 8 xH 0 xS 1 f +10 s +3 f +1 f +555 630(access)N +781(methods,)X +1092(we)X +1206(now)X +1364(have)X +1536(an)X +1632(integrated)X +1973(open)X +2149(call)X +2285(with)X +2447(the)X +2565(following)X +2896(calling)X +3134(conventions:)X +7 f +715 
753(DB)N +859(*dbopen)X +1243(\(const)X +1579(char)X +1819(*file,)X +2155(int)X +2347(flags,)X +2683(int)X +2875(mode,)X +3163(DBTYPE)X +3499(type,)X +1291 843(int)N +1483(dbflags,)X +1915(const)X +2203(void)X +2443(*openinfo\))X +1 f +555 966(where)N +2 f +774(\256le)X +1 f +894(is)X +969(the)X +1089(name)X +1285(of)X +1374(the)X +1494(\256le)X +1618(being)X +1818(opened,)X +2 f +2092(\257ags)X +1 f +2265(and)X +2 f +2402(mode)X +1 f +2597(are)X +2717(the)X +2836(standard)X +3129(arguments)X +3484(to)X +3 f +3567(open)X +1 f +3731(\(2\),)X +2 f +3866(type)X +1 f +4021(is)X +4095(one)X +4232(of)X +555 1056(the)N +680(access)X +913(method)X +1180(types,)X +2 f +1396(db\257ags)X +1 f +1654(indicates)X +1966(the)X +2091(mode)X +2296(of)X +2390(the)X +2515(buffer)X +2739(pool)X +2907(and)X +3049(transaction)X +3427(protection,)X +3798(and)X +2 f +3940(openinfo)X +1 f +4246(is)X +555 1146(the)N +681(access)X +915(method)X +1183(speci\256c)X +1456(information.)X +1902(Currently,)X +2257(the)X +2383(possible)X +2673(values)X +2906(for)X +2 f +3028(db\257ags)X +1 f +3287(are)X +3414(DB_SHARED)X +3912(and)X +4055(DB_TP)X +555 1236(indicating)N +895(that)X +1035(buffers)X +1283(should)X +1516(be)X +1612(kept)X +1770(in)X +1852(a)X +1908(shared)X +2138(buffer)X +2355(pool)X +2517(and)X +2653(that)X +2793(the)X +2911(\256le)X +3033(should)X +3266(be)X +3362(transaction)X +3734(protected.)X +755 1359(The)N +900(modi\256cations)X +1355(required)X +1643(to)X +1725(add)X +1861(transaction)X +2233(protection)X +2578(to)X +2660(an)X +2756(access)X +2982(method)X +3242(are)X +3361(quite)X +3541(simple)X +3774(and)X +3910(localized.)X +715 1482(1.)N +795(Replace)X +1074(\256le)X +2 f +1196(open)X +1 f +1372(with)X +2 f +1534(buf_open)X +1 f +1832(.)X +715 1572(2.)N +795(Replace)X +1074(\256le)X +2 f +1196(read)X +1 f +1363(and)X +2 f +1499(write)X +1 f +1683(calls)X +1850(with)X +2012(buffer)X +2229(manager)X +2526(calls)X +2693(\()X +2 f +2720(buf_get)X +1 f +(,)S +2 f 
+3000(buf_unpin)X +1 f +3324(\).)X +715 1662(3.)N +795(Precede)X +1070(buffer)X +1287(manager)X +1584(calls)X +1751(with)X +1913(an)X +2009(appropriate)X +2395(\(read)X +2581(or)X +2668(write\))X +2880(lock)X +3038(call.)X +715 1752(4.)N +795(Before)X +1034(updates,)X +1319(issue)X +1499(a)X +1555(logging)X +1819(operation.)X +715 1842(5.)N +795(After)X +985(data)X +1139(have)X +1311(been)X +1483(accessed,)X +1805(release)X +2049(the)X +2167(buffer)X +2384(manager)X +2681(pin.)X +715 1932(6.)N +795(Provide)X +1064(undo/redo)X +1409(code)X +1581(for)X +1695(each)X +1863(type)X +2021(of)X +2108(log)X +2230(record)X +2456(de\256ned.)X +555 2071(The)N +702(following)X +1035(code)X +1209(fragments)X +1552(show)X +1743(how)X +1903(to)X +1987(transaction)X +2361(protect)X +2606(several)X +2856(updates)X +3123(to)X +3206(a)X +3263(B-tree.)X +7 s +3484 2039(3)N +10 s +3533 2071(In)N +3621(the)X +3740(unprotected)X +4140(case,)X +555 2161(an)N +652(open)X +829(call)X +966(is)X +1040(followed)X +1346(by)X +1447(a)X +1504(read)X +1664(call)X +1801(to)X +1884(obtain)X +2105(the)X +2224(meta-data)X +2562(for)X +2677(the)X +2796(B-tree.)X +3058(Instead,)X +3331(we)X +3446(issue)X +3627(an)X +3724(open)X +3901(to)X +3984(the)X +4102(buffer)X +555 2251(manager)N +852(to)X +934(obtain)X +1154(a)X +1210(\256le)X +1332(id)X +1414(and)X +1550(a)X +1606(buffer)X +1823(request)X +2075(to)X +2157(obtain)X +2377(the)X +2495(meta-data)X +2832(as)X +2919(shown)X +3148(below.)X +7 f +715 2374(char)N +955(*path;)X +715 2464(int)N +907(fid,)X +1147(flags,)X +1483(len,)X +1723(mode;)X +715 2644(/*)N +859(Obtain)X +1195(a)X +1291(file)X +1531(id)X +1675(with)X +1915(which)X +2203(to)X +2347(access)X +2683(the)X +2875(buffer)X +3211(pool)X +3451(*/)X +715 2734(fid)N +907(=)X +1003(buf_open\(path,)X +1723(flags,)X +2059(mode\);)X +715 2914(/*)N +859(Read)X +1099(the)X +1291(meta)X +1531(data)X +1771(\(page)X +2059(0\))X +2203(for)X +2395(the)X +2587(B-tree)X +2923(*/)X +715 3004(if)N 
+859(\(tp_lock\(fid,)X +1531(0,)X +1675(READ_LOCK\)\))X +1003 3094(return)N +1339(error;)X +715 3184(meta_data_ptr)N +1387(=)X +1483(buf_get\(fid,)X +2107(0,)X +2251(BF_PIN,)X +2635(&len\);)X +1 f +555 3307(The)N +714(BF_PIN)X +1014(argument)X +1350(to)X +2 f +1445(buf_get)X +1 f +1718(indicates)X +2036(that)X +2189(we)X +2316(wish)X +2500(to)X +2595(leave)X +2798(this)X +2946(page)X +3131(pinned)X +3382(in)X +3477(memory)X +3777(so)X +3881(that)X +4034(it)X +4111(is)X +4197(not)X +555 3397(swapped)N +862(out)X +990(while)X +1194(we)X +1314(are)X +1439(accessing)X +1772(it.)X +1881(The)X +2031(last)X +2167(argument)X +2495(to)X +2 f +2582(buf_get)X +1 f +2847(returns)X +3095(the)X +3218(number)X +3488(of)X +3580(bytes)X +3774(on)X +3879(the)X +4002(page)X +4179(that)X +555 3487(were)N +732(valid)X +912(so)X +1003(that)X +1143(the)X +1261(access)X +1487(method)X +1747(may)X +1905(initialize)X +2205(the)X +2323(page)X +2495(if)X +2564(necessary.)X +755 3610(Next,)N +955(consider)X +1251(inserting)X +1555(a)X +1615(record)X +1845(on)X +1949(a)X +2009(particular)X +2341(page)X +2517(of)X +2608(a)X +2668(B-tree.)X +2932(In)X +3022(the)X +3143(unprotected)X +3545(case,)X +3727(we)X +3844(read)X +4006(the)X +4127(page,)X +555 3700(call)N +2 f +693(_bt_insertat)X +1 f +1079(,)X +1121(and)X +1258(write)X +1444(the)X +1563(page.)X +1776(Instead,)X +2049(we)X +2164(lock)X +2323(the)X +2442(page,)X +2635(request)X +2888(the)X +3007(buffer,)X +3245(log)X +3368(the)X +3487(change,)X +3756(modify)X +4008(the)X +4127(page,)X +555 3790(and)N +691(release)X +935(the)X +1053(buffer.)X +7 f +715 3913(int)N +907(fid,)X +1147(len,)X +1387(pageno;)X +1867(/*)X +2011(Identifies)X +2539(the)X +2731(buffer)X +3067(*/)X +715 4003(int)N +907(index;)X +1867(/*)X +2011(Location)X +2443(at)X +2587(which)X +2875(to)X +3019(insert)X +3355(the)X +3547(new)X +3739(pair)X +3979(*/)X +715 4093(DBT)N +907(*keyp,)X +1243(*datap;)X +1867(/*)X +2011(Key/Data)X +2443(pair)X +2683(to)X +2827(be)X 
+2971(inserted)X +3403(*/)X +715 4183(DATUM)N +1003(*d;)X +1867(/*)X +2011(Key/data)X +2443(structure)X +2923(to)X +3067(insert)X +3403(*/)X +715 4363(/*)N +859(Lock)X +1099(and)X +1291(request)X +1675(the)X +1867(buffer)X +2203(*/)X +715 4453(if)N +859(\(tp_lock\(fid,)X +1531(pageno,)X +1915(WRITE_LOCK\)\))X +1003 4543(return)N +1339(error;)X +715 4633(buffer_ptr)N +1243(=)X +1339(buf_get\(fid,)X +1963(pageno,)X +2347(BF_PIN,)X +2731(&len\);)X +715 4813(/*)N +859(Log)X +1051(and)X +1243(perform)X +1627(the)X +1819(update)X +2155(*/)X +715 4903(log_insdel\(BTREE_INSERT,)N +1915(fid,)X +2155(pageno,)X +2539(keyp,)X +2827(datap\);)X +715 4993(_bt_insertat\(buffer_ptr,)N +1915(d,)X +2059(index\);)X +715 5083(buf_unpin\(buffer_ptr\);)N +1 f +555 5206(Succinctly,)N +942(the)X +1068(algorithm)X +1407(for)X +1529(turning)X +1788(unprotected)X +2195(code)X +2375(into)X +2527(protected)X +2854(code)X +3034(is)X +3115(to)X +3205(replace)X +3466(read)X +3633(operations)X +3995(with)X +2 f +4165(lock)X +1 f +555 5296(and)N +2 f +691(buf_get)X +1 f +951(operations)X +1305(and)X +1441(write)X +1626(operations)X +1980(with)X +2 f +2142(log)X +1 f +2264(and)X +2 f +2400(buf_unpin)X +1 f +2744(operations.)X +8 s +10 f +555 5458(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)N +5 s +1 f +727 5536(3)N +8 s +766 5561(The)N +884(following)X +1152(code)X +1291(fragments)X +1565(are)X +1661(examples,)X +1937(but)X +2038(do)X +2120(not)X +2220(de\256ne)X +2394(the)X +2490(\256nal)X +2622(interface.)X +2894(The)X +3011(\256nal)X +3143(interface)X +3383(will)X +3501(be)X +3579(determined)X +3884(after)X +4018(LIBTP)X +4214(has)X +555 5633(been)N +691(fully)X +828(integrated)X +1099(with)X +1229(the)X +1323(most)X +1464(recent)X +3 f +1635(db)X +1 f +1707(\(3\))X +1797(release)X +1989(from)X +2129(the)X +2223(Computer)X +2495(Systems)X +2725(Research)X +2974(Group)X +3153(at)X +3215(University)X +3501(of)X +3570(California,)X +3861(Berkeley.)X + +11 p +%%Page: 11 11 +8 s 8 xH 0 xS 1 f +10 s +3 f +555 
630(5.)N +655(Performance)X +1 f +755 753(In)N +845(this)X +983(section,)X +1253(we)X +1370(present)X +1625(the)X +1746(results)X +1978(of)X +2067(two)X +2209(very)X +2374(different)X +2673(benchmarks.)X +3103(The)X +3250(\256rst)X +3396(is)X +3471(an)X +3569(online)X +3791(transaction)X +4165(pro-)X +555 843(cessing)N +824(benchmark,)X +1234(similar)X +1489(to)X +1584(the)X +1715(standard)X +2020(TPCB,)X +2272(but)X +2407(has)X +2547(been)X +2732(adapted)X +3015(to)X +3110(run)X +3250(in)X +3345(a)X +3414(desktop)X +3696(environment.)X +4174(The)X +555 933(second)N +798(emulates)X +1103(a)X +1159(computer-aided)X +1683(design)X +1912(environment)X +2337(and)X +2473(provides)X +2769(more)X +2954(complex)X +3250(query)X +3453(processing.)X +3 f +555 1119(5.1.)N +715(Transaction)X +1148(Processing)X +1533(Benchmark)X +1 f +755 1242(For)N +887(this)X +1023(section,)X +1291(all)X +1392(performance)X +1820(numbers)X +2117(shown)X +2346(except)X +2576(for)X +2690(the)X +2808(commercial)X +3207(database)X +3504(system)X +3746(were)X +3923(obtained)X +4219(on)X +555 1332(a)N +614(DECstation)X +1009(5000/200)X +1333(with)X +1497(32MBytes)X +1852(of)X +1941(memory)X +2230(running)X +2501(Ultrix)X +2714(V4.0,)X +2914(accessing)X +3244(a)X +3302(DEC)X +3484(RZ57)X +3688(1GByte)X +3959(disk)X +4114(drive.)X +555 1422(The)N +720(commercial)X +1139(relational)X +1482(database)X +1799(system)X +2061(tests)X +2242(were)X +2438(run)X +2584(on)X +2703(a)X +2778(comparable)X +3192(machine,)X +3523(a)X +3598(Sparcstation)X +4033(1+)X +4157(with)X +555 1512(32MBytes)N +915(memory)X +1209(and)X +1352(a)X +1415(1GByte)X +1691(external)X +1976(disk)X +2135(drive.)X +2366(The)X +2517(database,)X +2840(binaries)X +3120(and)X +3262(log)X +3390(resided)X +3648(on)X +3754(the)X +3878(same)X +4069(device.)X +555 1602(Reported)N +869(times)X +1062(are)X +1181(the)X +1299(means)X +1524(of)X +1611(\256ve)X +1751(tests)X +1913(and)X +2049(have)X +2221(standard)X +2513(deviations)X +2862(within)X 
+3086(two)X +3226(percent)X +3483(of)X +3570(the)X +3688(mean.)X +755 1725(The)N +905(test)X +1041(database)X +1343(was)X +1493(con\256gured)X +1861(according)X +2203(to)X +2290(the)X +2413(TPCB)X +2637(scaling)X +2889(rules)X +3070(for)X +3189(a)X +3250(10)X +3355(transaction)X +3732(per)X +3860(second)X +4108(\(TPS\))X +555 1815(system)N +817(with)X +999(1,000,000)X +1359(account)X +1649(records,)X +1946(100)X +2106(teller)X +2311(records,)X +2607(and)X +2762(10)X +2881(branch)X +3139(records.)X +3455(Where)X +3709(TPS)X +3885(numbers)X +4200(are)X +555 1905(reported,)N +865(we)X +981(are)X +1102(running)X +1373(a)X +1431(modi\256ed)X +1737(version)X +1995(of)X +2084(the)X +2203(industry)X +2486(standard)X +2779(transaction)X +3152(processing)X +3516(benchmark,)X +3914(TPCB.)X +4174(The)X +555 1995(TPCB)N +780(benchmark)X +1163(simulates)X +1491(a)X +1553(withdrawal)X +1940(performed)X +2301(by)X +2407(a)X +2469(hypothetical)X +2891(teller)X +3082(at)X +3166(a)X +3228(hypothetical)X +3650(bank.)X +3872(The)X +4022(database)X +555 2085(consists)N +831(of)X +921(relations)X +1220(\(\256les\))X +1430(for)X +1547(accounts,)X +1871(branches,)X +2200(tellers,)X +2439(and)X +2578(history.)X +2863(For)X +2997(each)X +3168(transaction,)X +3563(the)X +3684(account,)X +3976(teller,)X +4183(and)X +555 2175(branch)N +795(balances)X +1093(must)X +1269(be)X +1366(updated)X +1641(to)X +1724(re\257ect)X +1946(the)X +2065(withdrawal)X +2447(and)X +2584(a)X +2640(history)X +2882(record)X +3108(is)X +3181(written)X +3428(which)X +3644(contains)X +3931(the)X +4049(account)X +555 2265(id,)N +657(branch)X +896(id,)X +998(teller)X +1183(id,)X +1285(and)X +1421(the)X +1539(amount)X +1799(of)X +1886(the)X +2004(withdrawal)X +2385([TPCB90].)X +755 2388(Our)N +914(implementation)X +1450(of)X +1551(the)X +1683(benchmark)X +2074(differs)X +2317(from)X +2506(the)X +2637(speci\256cation)X +3075(in)X +3170(several)X +3431(aspects.)X +3736(The)X +3894(speci\256cation)X +555 2478(requires)N 
+840(that)X +985(the)X +1108(database)X +1410(keep)X +1587(redundant)X +1933(logs)X +2091(on)X +2196(different)X +2498(devices,)X +2784(but)X +2911(we)X +3030(use)X +3162(a)X +3223(single)X +3439(log.)X +3606(Furthermore,)X +4052(all)X +4157(tests)X +555 2568(were)N +734(run)X +863(on)X +965(a)X +1023(single,)X +1256(centralized)X +1631(system)X +1875(so)X +1968(there)X +2151(is)X +2226(no)X +2328(notion)X +2553(of)X +2641(remote)X +2885(accesses.)X +3219(Finally,)X +3486(we)X +3601(calculated)X +3948(throughput)X +555 2658(by)N +662(dividing)X +955(the)X +1080(total)X +1249(elapsed)X +1517(time)X +1686(by)X +1793(the)X +1918(number)X +2190(of)X +2284(transactions)X +2694(processed)X +3038(rather)X +3253(than)X +3418(by)X +3525(computing)X +3894(the)X +4018(response)X +555 2748(time)N +717(for)X +831(each)X +999(transaction.)X +755 2871(The)N +912(performance)X +1351(comparisons)X +1788(focus)X +1993(on)X +2104(traditional)X +2464(Unix)X +2655(techniques)X +3029(\(unprotected,)X +3486(using)X +3 f +3690(\257ock)X +1 f +3854(\(2\))X +3979(and)X +4126(using)X +3 f +555 2961(fsync)N +1 f +733(\(2\)\))X +884(and)X +1030(a)X +1096(commercial)X +1504(relational)X +1836(database)X +2142(system.)X +2433(Well-behaved)X +2913(applications)X +3329(using)X +3 f +3531(\257ock)X +1 f +3695(\(2\))X +3818(are)X +3946(guaranteed)X +555 3051(that)N +704(concurrent)X +1077(processes')X +1441(updates)X +1715(do)X +1824(not)X +1955(interact)X +2225(with)X +2396(one)X +2541(another,)X +2831(but)X +2962(no)X +3070(guarantees)X +3442(about)X +3648(atomicity)X +3978(are)X +4105(made.)X +555 3141(That)N +731(is,)X +833(if)X +911(the)X +1038(system)X +1289(crashes)X +1555(in)X +1646(mid-transaction,)X +2198(only)X +2369(parts)X +2554(of)X +2649(that)X +2797(transaction)X +3177(will)X +3329(be)X +3433(re\257ected)X +3738(in)X +3828(the)X +3954 0.3125(after-crash)AX +555 3231(state)N +725(of)X +815(the)X +936(database.)X +1276(The)X +1424(use)X +1554(of)X +3 f +1643(fsync)X +1 f +1821(\(2\))X 
+1937(at)X +2017(transaction)X +2391(commit)X +2657(time)X +2821(provides)X +3119(guarantees)X +3485(of)X +3574(durability)X +3907(after)X +4077(system)X +555 3321(failure.)N +825(However,)X +1160(there)X +1341(is)X +1414(no)X +1514(mechanism)X +1899(to)X +1981(perform)X +2260(transaction)X +2632(abort.)X +3 f +555 3507(5.1.1.)N +775(Single-User)X +1191(Tests)X +1 f +755 3630(These)N +978(tests)X +1151(compare)X +1459(LIBTP)X +1712(in)X +1804(a)X +1870(variety)X +2123(of)X +2220(con\256gurations)X +2708(to)X +2800(traditional)X +3159(UNIX)X +3390(solutions)X +3708(and)X +3854(a)X +3920(commercial)X +555 3720(relational)N +884(database)X +1187(system)X +1435(\(RDBMS\).)X +1814(To)X +1929(demonstrate)X +2347(the)X +2471(server)X +2694(architecture)X +3100(we)X +3220(built)X +3392(a)X +3454(front)X +3636(end)X +3777(test)X +3913(process)X +4179(that)X +555 3810(uses)N +732(TCL)X +922([OUST90])X +1304(to)X +1405(parse)X +1614(database)X +1930(access)X +2175(commands)X +2561(and)X +2716(call)X +2870(the)X +3006(database)X +3321(access)X +3565(routines.)X +3901(In)X +4006(one)X +4160(case)X +555 3900(\(SERVER\),)N +956(frontend)X +1249(and)X +1386(backend)X +1675(processes)X +2004(were)X +2181(created)X +2434(which)X +2650(communicated)X +3142(via)X +3260(an)X +3356(IP)X +3447(socket.)X +3712(In)X +3799(the)X +3917(second)X +4160(case)X +555 3990(\(TCL\),)N +802(a)X +860(single)X +1073(process)X +1336(read)X +1497(queries)X +1751(from)X +1929(standard)X +2223(input,)X +2429(parsed)X +2660(them,)X +2861(and)X +2998(called)X +3211(the)X +3330(database)X +3628(access)X +3855(routines.)X +4174(The)X +555 4080(performance)N +987(difference)X +1338(between)X +1630(the)X +1752(TCL)X +1927(and)X +2067(SERVER)X +2397(tests)X +2563(quanti\256es)X +2898(the)X +3020(communication)X +3542(overhead)X +3861(of)X +3952(the)X +4074(socket.)X +555 4170(The)N +732(RDBMS)X +1063(implementation)X +1617(used)X +1816(embedded)X +2198(SQL)X +2401(in)X +2515(C)X +2620(with)X +2814(stored)X 
+3062(database)X +3391(procedures.)X +3835(Therefore,)X +4224(its)X +555 4260(con\256guration)N +1003(is)X +1076(a)X +1132(hybrid)X +1361(of)X +1448(the)X +1566(single)X +1777(process)X +2038(architecture)X +2438(and)X +2574(the)X +2692(server)X +2909(architecture.)X +3349(The)X +3494(graph)X +3697(in)X +3779(\256gure)X +3986(six)X +4099(shows)X +555 4350(a)N +611(comparison)X +1005(of)X +1092(the)X +1210(following)X +1541(six)X +1654(con\256gurations:)X +1126 4506(LIBTP)N +1552(Uses)X +1728(the)X +1846(LIBTP)X +2088(library)X +2322(in)X +2404(a)X +2460(single)X +2671(application.)X +1126 4596(TCL)N +1552(Uses)X +1728(the)X +1846(LIBTP)X +2088(library)X +2322(in)X +2404(a)X +2460(single)X +2671(application,)X +3067(requires)X +3346(query)X +3549(parsing.)X +1126 4686(SERVER)N +1552(Uses)X +1728(the)X +1846(LIBTP)X +2088(library)X +2322(in)X +2404(a)X +2460(server)X +2677(con\256guration,)X +3144(requires)X +3423(query)X +3626(parsing.)X +1126 4776(NOTP)N +1552(Uses)X +1728(no)X +1828(locking,)X +2108(logging,)X +2392(or)X +2479(concurrency)X +2897(control.)X +1126 4866(FLOCK)N +1552(Uses)X +3 f +1728(\257ock)X +1 f +1892(\(2\))X +2006(for)X +2120(concurrency)X +2538(control)X +2785(and)X +2921(nothing)X +3185(for)X +3299(durability.)X +1126 4956(FSYNC)N +1552(Uses)X +3 f +1728(fsync)X +1 f +1906(\(2\))X +2020(for)X +2134(durability)X +2465(and)X +2601(nothing)X +2865(for)X +2979(concurrency)X +3397(control.)X +1126 5046(RDBMS)N +1552(Uses)X +1728(a)X +1784(commercial)X +2183(relational)X +2506(database)X +2803(system.)X +755 5235(The)N +902(results)X +1133(show)X +1324(that)X +1466(LIBTP,)X +1730(both)X +1894(in)X +1978(the)X +2098(procedural)X +2464(and)X +2602(parsed)X +2834(environments,)X +3312(is)X +3387(competitive)X +3787(with)X +3951(a)X +4009(commer-)X +555 5325(cial)N +692(system)X +935(\(comparing)X +1326(LIBTP,)X +1589(TCL,)X +1781(and)X +1917(RDBMS\).)X +2263(Compared)X +2617(to)X +2699(existing)X +2972(UNIX)X +3193(solutions,)X +3521(LIBTP)X 
+3763(is)X +3836(approximately)X +555 5415(15%)N +738(slower)X +988(than)X +1162(using)X +3 f +1371(\257ock)X +1 f +1535(\(2\))X +1665(or)X +1768(no)X +1884(protection)X +2245(but)X +2383(over)X +2562(80%)X +2745(better)X +2964(than)X +3137(using)X +3 f +3345(fsync)X +1 f +3523(\(2\))X +3652(\(comparing)X +4057(LIBTP,)X +555 5505(FLOCK,)N +857(NOTP,)X +1106(and)X +1242(FSYNC\).)X + +12 p +%%Page: 12 12 +10 s 10 xH 0 xS 1 f +3 f +8 s +3500 2184(RDBMS)N +1 Dt +3553 2085 MXY + 3553 2085 lineto + 3676 2085 lineto + 3676 1351 lineto + 3553 1351 lineto + 3553 2085 lineto +closepath 16 3553 1351 3676 2085 Dp +2018 2184(SERVER)N +1720 1168 MXY +0 917 Dl +122 0 Dl +0 -917 Dl +-122 0 Dl +1715 2184(TCL)N +2087 1534 MXY + 2087 1534 lineto + 2209 1534 lineto + 2209 2085 lineto + 2087 2085 lineto + 2087 1534 lineto +closepath 12 2087 1534 2209 2085 Dp +3187 MX + 3187 1534 lineto + 3309 1534 lineto + 3309 2085 lineto + 3187 2085 lineto + 3187 1534 lineto +closepath 19 3187 1534 3309 2085 Dp +3142 2184(FSYNC)N +2425(NOTP)X +2453 955 MXY + 2453 955 lineto + 2576 955 lineto + 2576 2085 lineto + 2453 2085 lineto + 2453 955 lineto +closepath 21 2453 955 2576 2085 Dp +2820 1000 MXY + 2820 1000 lineto + 2942 1000 lineto + 2942 2085 lineto + 2820 2085 lineto + 2820 1000 lineto +closepath 14 2820 1000 2942 2085 Dp +5 Dt +1231 2085 MXY +2567 0 Dl +4 Ds +1 Dt +1231 1840 MXY +2567 0 Dl +1231 1596 MXY +2567 0 Dl +1231 1351 MXY +2567 0 Dl +1231 1108 MXY +2567 0 Dl +1231 863 MXY +2567 0 Dl +11 s +1087 1877(2)N +1087 1633(4)N +1087 1388(6)N +1087 1145(8)N +1065 900(10)N +1028 763(TPS)N +-1 Ds +1353 2085 MXY + 1353 2085 lineto + 1353 1151 lineto + 1476 1151 lineto + 1476 2085 lineto + 1353 2085 lineto +closepath 3 1353 1151 1476 2085 Dp +8 s +1318 2184(LIBTP)N +2767(FLOCK)X +3 Dt +-1 Ds +10 s +1597 2399(Figure)N +1844(6:)X +1931(Single-User)X +2347(Performance)X +2814(Comparison.)X +1 f +10 f +555 2579(h)N 
+579(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)X +3 f +555 2855(5.1.2.)N +775(Multi-User)X +1174(Tests)X +1 f +755 2978(While)N +975(the)X +1097(single-user)X +1473(tests)X +1639(form)X +1819(a)X +1878(basis)X +2061(for)X +2178(comparing)X +2544(LIBTP)X +2789(to)X +2874(other)X +3062(systems,)X +3358(our)X +3488(goal)X +3649(in)X +3734(multi-user)X +4086(testing)X +555 3068(was)N +714(to)X +810(analyze)X +1089(its)X +1197(scalability.)X +1579(To)X +1701(this)X +1849(end,)X +2018(we)X +2145(have)X +2330(run)X +2470(the)X +2601(benchmark)X +2991(in)X +3086(three)X +3280(modes,)X +3542(the)X +3673(normal)X +3933(disk)X +4099(bound)X +555 3158(con\256guration)N +1010(\(\256gure)X +1252(seven\),)X +1510(a)X +1573(CPU)X +1755(bound)X +1982(con\256guration)X +2436(\(\256gure)X +2677(eight,)X +2884(READ-ONLY\),)X +3426(and)X +3569(lock)X +3734(contention)X +4099(bound)X +555 3248(\(\256gure)N +796(eight,)X +1003(NO_FSYNC\).)X +1510(Since)X +1715(the)X +1840(normal)X +2094(con\256guration)X +2548(is)X +2628(completely)X +3011(disk)X +3171(bound)X +3398(\(each)X +3600(transaction)X +3978(requires)X +4263(a)X +555 3354(random)N +823(read,)X +1005(a)X +1064(random)X +1332(write,)X +1540(and)X +1679(a)X +1738(sequential)X +2086(write)X +7 s +2251 3322(4)N +10 s +3354(\))Y +2329(we)X +2446(expect)X +2679(to)X +2764(see)X +2890(little)X +3059(performance)X +3489(improvement)X +3939(as)X +4028(the)X +4148(mul-)X +555 3444(tiprogramming)N +1064(level)X +1249(increases.)X +1613(In)X +1709(fact,)X +1879(\256gure)X +2095(seven)X +2307(reveals)X +2564(that)X +2713(we)X +2836(are)X +2964(able)X +3127(to)X +3218(overlap)X +3487(CPU)X +3670(and)X +3814(disk)X +3975(utilization)X +555 3534(slightly)N +825(producing)X +1181(approximately)X +1674(a)X +1740(10%)X +1917(performance)X +2354(improvement)X +2811(with)X +2983(two)X +3133(processes.)X +3511(After)X +3711(that)X +3861(point,)X +4075(perfor-)X +555 3624(mance)N +785(drops)X 
+983(off,)X +1117(and)X +1253(at)X +1331(a)X +1387(multi-programming)X +2038(level)X +2214(of)X +2301(4,)X +2381(we)X +2495(are)X +2614(performing)X +2995(worse)X +3207(than)X +3365(in)X +3447(the)X +3565(single)X +3776(process)X +4037(case.)X +755 3747(Similar)N +1021(behavior)X +1333(was)X +1489(reported)X +1787(on)X +1897(the)X +2025(commercial)X +2434(relational)X +2767(database)X +3074(system)X +3326(using)X +3529(the)X +3657(same)X +3852(con\256guration.)X +555 3837(The)N +707(important)X +1045(conclusion)X +1419(to)X +1508(draw)X +1696(from)X +1879(this)X +2021(is)X +2101(that)X +2248(you)X +2395(cannot)X +2636(attain)X +2841(good)X +3028(multi-user)X +3384(scaling)X +3638(on)X +3745(a)X +3808(badly)X +4013(balanced)X +555 3927(system.)N +839(If)X +915(multi-user)X +1266(performance)X +1695(on)X +1797(applications)X +2205(of)X +2293(this)X +2429(sort)X +2570(is)X +2644(important,)X +2996(one)X +3133(must)X +3309(have)X +3482(a)X +3539(separate)X +3824(logging)X +4089(device)X +555 4017(and)N +697(horizontally)X +1110(partition)X +1407(the)X +1531(database)X +1834(to)X +1921(allow)X +2124(a)X +2185(suf\256ciently)X +2570(high)X +2737(degree)X +2977(of)X +3069(multiprogramming)X +3698(that)X +3843(group)X +4055(commit)X +555 4107(can)N +687(amortize)X +988(the)X +1106(cost)X +1255(of)X +1342(log)X +1464(\257ushing.)X +755 4230(By)N +871(using)X +1067(a)X +1126(very)X +1292(small)X +1488(database)X +1788(\(one)X +1954(that)X +2097(can)X +2232(be)X +2331(entirely)X +2599(cached)X +2846(in)X +2930(main)X +3112(memory\))X +3428(and)X +3566(read-only)X +3896(transactions,)X +555 4320(we)N +670(generated)X +1004(a)X +1061(CPU)X +1236(bound)X +1456(environment.)X +1921(By)X +2034(using)X +2227(the)X +2345(same)X +2530(small)X +2723(database,)X +3040(the)X +3158(complete)X +3472(TPCB)X +3691(transaction,)X +4083(and)X +4219(no)X +3 f +555 4410(fsync)N +1 f +733(\(2\))X +862(on)X +977(the)X +1110(log)X +1247(at)X +1340(commit,)X +1639(we)X +1768(created)X +2036(a)X 
+2107(lock)X +2280(contention)X +2652(bound)X +2886(environment.)X +3365(The)X +3524(small)X +3731(database)X +4042(used)X +4223(an)X +555 4500(account)N +828(\256le)X +953(containing)X +1314(only)X +1479(1000)X +1662(records)X +1922(rather)X +2133(than)X +2294(the)X +2415(full)X +2549(1,000,000)X +2891(records)X +3150(and)X +3288(ran)X +3413(enough)X +3671(transactions)X +4076(to)X +4160(read)X +555 4590(the)N +677(entire)X +883(database)X +1183(into)X +1330(the)X +1451(buffer)X +1671(pool)X +1836(\(2000\))X +2073(before)X +2302(beginning)X +2645(measurements.)X +3147(The)X +3295(read-only)X +3626(transaction)X +4001(consisted)X +555 4680(of)N +646(three)X +831(database)X +1132(reads)X +1326(\(from)X +1533(the)X +1655(1000)X +1839(record)X +2069(account)X +2343(\256le,)X +2489(the)X +2611(100)X +2754(record)X +2983(teller)X +3171(\256le,)X +3316(and)X +3455(the)X +3576(10)X +3679(record)X +3908(branch)X +4150(\256le\).)X +555 4770(Since)N +759(no)X +865(data)X +1025(were)X +1208(modi\256ed)X +1518(and)X +1660(no)X +1766(history)X +2014(records)X +2277(were)X +2460(written,)X +2733(no)X +2839(log)X +2966(records)X +3228(were)X +3410(written.)X +3702(For)X +3838(the)X +3961(contention)X +555 4860(bound)N +780(con\256guration,)X +1252(we)X +1371(used)X +1543(the)X +1666(normal)X +1918(TPCB)X +2142(transaction)X +2519(\(against)X +2798(the)X +2920(small)X +3117(database\))X +3445(and)X +3585(disabled)X +3876(the)X +3998(log)X +4124(\257ush.)X +555 4950(Figure)N +784(eight)X +964(shows)X +1184(both)X +1346(of)X +1433(these)X +1618(results.)X +755 5073(The)N +902(read-only)X +1231(test)X +1363(indicates)X +1669(that)X +1810(we)X +1925(barely)X +2147(scale)X +2329(at)X +2408(all)X +2509(in)X +2592(the)X +2711(CPU)X +2887(bound)X +3108(case.)X +3308(The)X +3454(explanation)X +3849(for)X +3964(that)X +4105(is)X +4179(that)X +555 5163(even)N +735(with)X +905(a)X +969(single)X +1188(process,)X +1477(we)X +1599(are)X +1726(able)X +1888(to)X +1978(drive)X +2171(the)X 
+2297(CPU)X +2480(utilization)X +2832(to)X +2922(96%.)X +3137(As)X +3254(a)X +3317(result,)X +3542(that)X +3689(gives)X +3885(us)X +3983(very)X +4153(little)X +555 5253(room)N +753(for)X +876(improvement,)X +1352(and)X +1497(it)X +1570(takes)X +1764(a)X +1829(multiprogramming)X +2462(level)X +2647(of)X +2743(four)X +2906(to)X +2997(approach)X +3321(100%)X +3537(CPU)X +3721(saturation.)X +4106(In)X +4201(the)X +555 5343(case)N +718(where)X +939(we)X +1057(do)X +1161(perform)X +1444(writes,)X +1684(we)X +1802(are)X +1925(interested)X +2261(in)X +2347(detecting)X +2665(when)X +2863(lock)X +3025(contention)X +3387(becomes)X +3691(a)X +3750(dominant)X +4075(perfor-)X +555 5433(mance)N +787(factor.)X +1037(Contention)X +1414(will)X +1560(cause)X +1761(two)X +1903(phenomena;)X +2317(we)X +2433(will)X +2579(see)X +2704(transactions)X +3109(queueing)X +3425(behind)X +3665(frequently)X +4017(accessed)X +555 5523(data,)N +731(and)X +869(we)X +985(will)X +1131(see)X +1256(transaction)X +1629(abort)X +1815(rates)X +1988(increasing)X +2339(due)X +2476(to)X +2559(deadlock.)X +2910(Given)X +3127(that)X +3268(the)X +3387(branch)X +3627(\256le)X +3750(contains)X +4038(only)X +4201(ten)X +8 s +10 f +555 5595(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)N +5 s +1 f +727 5673(4)N +8 s +763 5698(Although)N +1021(the)X +1115(log)X +1213(is)X +1272(written)X +1469(sequentially,)X +1810(we)X +1900(do)X +1980(not)X +2078(get)X +2172(the)X +2266(bene\256t)X +2456(of)X +2525(sequentiality)X +2868(since)X +3015(the)X +3109(log)X +3207(and)X +3315(database)X +3550(reside)X +3718(on)X +3798(the)X +3892(same)X +4039(disk.)X + +13 p +%%Page: 13 13 +8 s 8 xH 0 xS 1 f +10 s +3 f +1 f +3187 2051 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +3286 2028 MXY +0 17 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +3384 1926 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +3483 1910 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +3581 1910 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +3680 1832 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +3778 1909 MXY +0 18 Dl 
+0 -9 Dl +9 0 Dl +-18 0 Dl +3877 1883 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +3975 1679 MXY +0 17 Dl +0 -8 Dl +9 0 Dl +-18 0 Dl +4074 1487 MXY +0 17 Dl +0 -8 Dl +9 0 Dl +-18 0 Dl +5 Dt +3187 2060 MXY +99 -24 Dl +98 -101 Dl +99 -16 Dl +98 0 Dl +99 -78 Dl +98 77 Dl +99 -26 Dl +98 -204 Dl +99 -192 Dl +3 f +6 s +4088 1516(SMALL)N +3 Dt +3187 2051 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +3286 2051 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +3384 2041 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +3483 1990 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +3581 1843 MXY +0 17 Dl +0 -8 Dl +9 0 Dl +-18 0 Dl +3680 1578 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +3778 1496 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +3877 1430 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +3975 1269 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +4074 1070 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +1 Dt +3187 2060 MXY +99 0 Dl +98 -10 Dl +99 -51 Dl +98 -147 Dl +99 -265 Dl +98 -82 Dl +99 -66 Dl +98 -161 Dl +99 -199 Dl +4088 1099(LARGE)N +5 Dt +3089 2060 MXY +985 0 Dl +3089 MX +0 -1174 Dl +4 Ds +1 Dt +3581 2060 MXY +0 -1174 Dl +4074 2060 MXY +0 -1174 Dl +3089 1825 MXY +985 0 Dl +9 s +2993 1855(25)N +3089 1591 MXY +985 0 Dl +2993 1621(50)N +3089 1356 MXY +985 0 Dl +2993 1386(75)N +3089 1121 MXY +985 0 Dl +2957 1151(100)N +3089 886 MXY +985 0 Dl +2957 916(125)N +3281 2199(Multiprogramming)N +3071 2152(0)N +3569(5)X +4038(10)X +2859 787(Aborts)N +3089(per)X +3211(500)X +2901 847(transactions)N +-1 Ds +3 Dt +2037 1342 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +2125 1358 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +2213 1341 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +2301 1191 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +2388 1124 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-17 0 Dl +2476 1157 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +2564 1157 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +2652 1161 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +2740 1153 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +2828 1150 MXY +0 18 Dl +0 -9 Dl +8 0 Dl +-17 0 Dl +5 Dt +2037 1351 MXY +88 16 
Dl +88 -17 Dl +88 -150 Dl +87 -67 Dl +88 33 Dl +88 0 Dl +88 4 Dl +88 -8 Dl +88 -3 Dl +6 s +2685 1234(READ-ONLY)N +3 Dt +2037 1464 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +2125 1640 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +2213 1854 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +2301 1872 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +2388 1871 MXY +0 17 Dl +0 -9 Dl +9 0 Dl +-17 0 Dl +2476 1933 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +2564 1914 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +2652 1903 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +2740 1980 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +2828 2004 MXY +0 18 Dl +0 -9 Dl +8 0 Dl +-17 0 Dl +1 Dt +2037 1473 MXY +88 176 Dl +88 214 Dl +88 18 Dl +87 -2 Dl +88 63 Dl +88 -19 Dl +88 -11 Dl +88 77 Dl +88 24 Dl +2759 1997(NO-FSYNC)N +5 Dt +1949 2060 MXY +879 0 Dl +1949 MX +0 -1174 Dl +4 Ds +1 Dt +2388 2060 MXY +0 -1174 Dl +2828 2060 MXY +0 -1174 Dl +1949 1825 MXY +879 0 Dl +9 s +1842 1855(40)N +1949 1591 MXY +879 0 Dl +1842 1621(80)N +1949 1356 MXY +879 0 Dl +1806 1386(120)N +1949 1121 MXY +879 0 Dl +1806 1151(160)N +1949 886 MXY +879 0 Dl +1806 916(200)N +2088 2199(Multiprogramming)N +1844 863(in)N +1922(TPS)X +1761 792(Throughput)N +1931 2121(0)N +2370 2133(5)N +2792(10)X +6 s +1679 1833(LIBTP)N +-1 Ds +3 Dt +837 1019 MXY +0 17 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +929 878 MXY +0 17 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +1021 939 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +1113 1043 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +1205 1314 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +1297 1567 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +1389 1665 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +1481 1699 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +1573 1828 MXY +0 18 Dl +0 -9 Dl +9 0 Dl +-18 0 Dl +1665 1804 MXY +0 18 Dl +0 -9 Dl +8 0 Dl +-17 0 Dl +5 Dt +837 1027 MXY +92 -141 Dl +92 62 Dl +92 104 Dl +92 271 Dl +92 253 Dl +92 98 Dl +92 34 Dl +92 129 Dl +92 -24 Dl +745 2060 MXY +920 0 Dl +745 MX +0 -1174 Dl +4 Ds +1 Dt +1205 2060 MXY +0 -1174 Dl +1665 2060 MXY +0 -1174 Dl +745 1766 MXY 
+920 0 Dl +9 s +673 1796(3)N +745 1473 MXY +920 0 Dl +673 1503(5)N +745 1180 MXY +920 0 Dl +673 1210(8)N +745 886 MXY +920 0 Dl +637 916(10)N +905 2199(Multiprogramming)N +622 851(in)N +700(TPS)X +575 792(Throughput)N +733 2152(0)N +1196(5)X +1629(10)X +3 Dt +-1 Ds +8 s +655 2441(Figure)N +872(7:)X +960(Multi-user)X +1286(Performance.)X +1 f +655 2531(Since)N +825(the)X +931(con\256guration)X +1300(is)X +1371(completely)X +655 2621(disk)N +790(bound,)X +994(we)X +1096(see)X +1204(only)X +1345(a)X +1400(small)X +1566(im-)X +655 2711(provement)N +964(by)X +1064(adding)X +1274(a)X +1337(second)X +1549(pro-)X +655 2801(cess.)N +849(Adding)X +1081(any)X +1213(more)X +1383(concurrent)X +655 2891(processes)N +935(causes)X +1137(performance)X +1493(degra-)X +655 2981(dation.)N +3 f +1927 2441(Figure)N +2149(8:)X +2243(Multi-user)X +2574(Performance)X +1927 2531(on)N +2021(a)X +2079(small)X +2251(database.)X +1 f +2551(With)X +2704(one)X +2821(pro-)X +1927 2621(cess,)N +2075(we)X +2174(are)X +2276(driving)X +2486(the)X +2589(CPU)X +2739(at)X +2810(96%)X +1927 2711(utilization)N +2215(leaving)X +2430(little)X +2575(room)X +2737(for)X +2838(im-)X +1927 2801(provement)N +2238(as)X +2328(the)X +2443(multiprogramming)X +1927 2891(level)N +2091(increases.)X +2396(In)X +2489(the)X +2607(NO-FSYNC)X +1927 2981(case,)N +2076(lock)X +2209(contention)X +2502(degrades)X +2751(perfor-)X +1927 3071(mance)N +2117(as)X +2194(soon)X +2339(as)X +2416(a)X +2468(second)X +2669(process)X +2884(is)X +1927 3161(added.)N +3 f +3199 2441(Figure)N +3405(9:)X +3482(Abort)X +3669(rates)X +3827(on)X +3919(the)X +4028(TPCB)X +3199 2531(Benchmark.)N +1 f +3589(The)X +3726(abort)X +3895(rate)X +4028(climbs)X +3199 2621(more)N +3366(quickly)X +3594(for)X +3704(the)X +3818(large)X +3980(database)X +3199 2711(test)N +3324(since)X +3491(processes)X +3771(are)X +3884(descheduled)X +3199 2801(more)N +3409(frequently,)X +3766(allowing)X +4068(more)X +3199 2891(processes)N +3459(to)X +3525(vie)X +3619(for)X 
+3709(the)X +3803(same)X +3950(locks.)X +10 s +10 f +555 3284(h)N +579(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)X +1 f +555 3560(records,)N +835(we)X +952(expect)X +1185(contention)X +1546(to)X +1631(become)X +1904(a)X +1963(factor)X +2174(quickly)X +2437(and)X +2576(the)X +2697(NO-FSYNC)X +3120(line)X +3263(in)X +3348(\256gure)X +3557(eight)X +3739(demonstrates)X +4184(this)X +555 3650(dramatically.)N +1022(Each)X +1209(additional)X +1555(process)X +1822(causes)X +2058(both)X +2226(more)X +2417(waiting)X +2682(and)X +2823(more)X +3013(deadlocking.)X +3470(Figure)X +3704(nine)X +3867(shows)X +4092(that)X +4237(in)X +555 3740(the)N +681(small)X +882(database)X +1187(case)X +1353(\(SMALL\),)X +1725(waiting)X +1992(is)X +2072(the)X +2197(dominant)X +2526(cause)X +2732(of)X +2826(declining)X +3151(performance)X +3585(\(the)X +3737(number)X +4009(of)X +4103(aborts)X +555 3830(increases)N +878(less)X +1026(steeply)X +1281(than)X +1447(the)X +1573(performance)X +2008(drops)X +2214(off)X +2336(in)X +2426(\256gure)X +2641(eight\),)X +2876(while)X +3082(in)X +3172(the)X +3298(large)X +3487(database)X +3792(case)X +3958(\(LARGE\),)X +555 3920(deadlocking)N +967(contributes)X +1343(more)X +1528(to)X +1610(the)X +1728(declining)X +2046(performance.)X +755 4043(Deadlocks)N +1116(are)X +1237(more)X +1424(likely)X +1628(to)X +1712(occur)X +1913(in)X +1997(the)X +2116(LARGE)X +2404(test)X +2536(than)X +2695(in)X +2778(the)X +2897(SMALL)X +3189(test)X +3321(because)X +3597(there)X +3779(are)X +3899(more)X +4085(oppor-)X +555 4133(tunities)N +814(to)X +900(wait.)X +1082(In)X +1173(the)X +1295(SMALL)X +1590(case,)X +1773(processes)X +2105(never)X +2307(do)X +2410(I/O)X +2540(and)X +2679(are)X +2801(less)X +2944(likely)X +3149(to)X +3234(be)X +3333(descheduled)X +3753(during)X +3985(a)X +4044(transac-)X +555 4223(tion.)N +740(In)X +828(the)X +947(LARGE)X +1235(case,)X +1415(processes)X +1744(will)X +1889(frequently)X +2240(be)X 
+2337(descheduled)X +2755(since)X +2941(they)X +3100(have)X +3273(to)X +3356(perform)X +3636(I/O.)X +3804(This)X +3967(provides)X +4263(a)X +555 4313(window)N +837(where)X +1058(a)X +1118(second)X +1365(process)X +1630(can)X +1766(request)X +2022(locks)X +2215(on)X +2318(already)X +2578(locked)X +2815(pages,)X +3041(thus)X +3197(increasing)X +3550(the)X +3671(likelihood)X +4018(of)X +4108(build-)X +555 4403(ing)N +677(up)X +777(long)X +939(chains)X +1164(of)X +1251(waiting)X +1511(processes.)X +1879(Eventually,)X +2266(this)X +2401(leads)X +2586(to)X +2668(deadlock.)X +3 f +555 4589(5.2.)N +715(The)X +868(OO1)X +1052(Benchmark)X +1 f +755 4712(The)N +903(TPCB)X +1125(benchmark)X +1505(described)X +1836(in)X +1921(the)X +2042(previous)X +2341(section)X +2591(measures)X +2913(performance)X +3343(under)X +3549(a)X +3608(conventional)X +4044(transac-)X +555 4802(tion)N +706(processing)X +1076(workload.)X +1446(Other)X +1656(application)X +2039(domains,)X +2357(such)X +2531(as)X +2625(computer-aided)X +3156(design,)X +3412(have)X +3591(substantially)X +4022(different)X +555 4892(access)N +786(patterns.)X +1105(In)X +1197(order)X +1392(to)X +1479(measure)X +1772(the)X +1895(performance)X +2327(of)X +2418(LIBTP)X +2664(under)X +2871(workloads)X +3229(of)X +3320(this)X +3459(type,)X +3641(we)X +3759(implemented)X +4201(the)X +555 4982(OO1)N +731(benchmark)X +1108(described)X +1436(in)X +1518([CATT91].)X +755 5105(The)N +908(database)X +1213(models)X +1472(a)X +1535(set)X +1651(of)X +1745(electronics)X +2120(components)X +2534(with)X +2703(connections)X +3113(among)X +3358(them.)X +3585(One)X +3746(table)X +3929(stores)X +4143(parts)X +555 5195(and)N +696(another)X +962(stores)X +1174(connections.)X +1622(There)X +1835(are)X +1959(three)X +2145(connections)X +2552(originating)X +2927(at)X +3009(any)X +3149(given)X +3351(part.)X +3540(Ninety)X +3782(percent)X +4043(of)X +4134(these)X +555 5285(connections)N +960(are)X +1081(to)X +1165(nearby)X +1406(parts)X +1584(\(those)X 
+1802(with)X +1966(nearby)X +2 f +2207(ids)X +1 f +2300(\))X +2348(to)X +2431(model)X +2652(the)X +2771(spatial)X +3001(locality)X +3262(often)X +3448(exhibited)X +3767(in)X +3850(CAD)X +4040(applica-)X +555 5375(tions.)N +779(Ten)X +933(percent)X +1198(of)X +1293(the)X +1419(connections)X +1830(are)X +1957(randomly)X +2292(distributed)X +2662(among)X +2908(all)X +3016(other)X +3209(parts)X +3393(in)X +3483(the)X +3609(database.)X +3954(Every)X +4174(part)X +555 5465(appears)N +829(exactly)X +1089(three)X +1278(times)X +1479(in)X +1569(the)X +2 f +1695(from)X +1 f +1874(\256eld)X +2043(of)X +2137(a)X +2200(connection)X +2579(record,)X +2832(and)X +2975(zero)X +3141(or)X +3235(more)X +3427(times)X +3627(in)X +3716(the)X +2 f +3841(to)X +1 f +3930(\256eld.)X +4139(Parts)X +555 5555(have)N +2 f +727(x)X +1 f +783(and)X +2 f +919(y)X +1 f +975(locations)X +1284(set)X +1393(randomly)X +1720(in)X +1802(an)X +1898(appropriate)X +2284(range.)X + +14 p +%%Page: 14 14 +10 s 10 xH 0 xS 1 f +3 f +1 f +755 630(The)N +900(intent)X +1102(of)X +1189(OO1)X +1365(is)X +1438(to)X +1520(measure)X +1808(the)X +1926(overall)X +2169(cost)X +2318(of)X +2405(a)X +2461(query)X +2664(mix)X +2808(characteristic)X +3257(of)X +3344(engineering)X +3743(database)X +4040(applica-)X +555 720(tions.)N +770(There)X +978(are)X +1097(three)X +1278(tests:)X +10 f +635 843(g)N +2 f +755(Lookup)X +1 f +1022(generates)X +1353(1,000)X +1560(random)X +1832(part)X +2 f +1984(ids)X +1 f +2077(,)X +2124(fetches)X +2378(the)X +2502(corresponding)X +2987(parts)X +3169(from)X +3351(the)X +3475(database,)X +3798(and)X +3940(calls)X +4113(a)X +4175(null)X +755 933(procedure)N +1097(in)X +1179(the)X +1297(host)X +1450(programming)X +1906(language)X +2216(with)X +2378(the)X +2496(parts')X +2 f +2699(x)X +1 f +2755(and)X +2 f +2891(y)X +1 f +2947(positions.)X +10 f +635 1056(g)N +2 f +755(Traverse)X +1 f +1067(retrieves)X +1371(a)X +1434(random)X +1706(part)X +1858(from)X +2041(the)X +2166(database)X +2470(and)X 
+2613(follows)X +2880(connections)X +3290(from)X +3473(it)X +3544(to)X +3632(other)X +3823(parts.)X +4045(Each)X +4232(of)X +755 1146(those)N +947(parts)X +1126(is)X +1202(retrieved,)X +1531(and)X +1670(all)X +1773(connections)X +2179(from)X +2358(it)X +2424(followed.)X +2771(This)X +2935(procedure)X +3279(is)X +3354(repeated)X +3649(depth-\256rst)X +4000(for)X +4116(seven)X +755 1236(hops)N +930(from)X +1110(the)X +1232(original)X +1505(part,)X +1674(for)X +1792(a)X +1852(total)X +2018(of)X +2109(3280)X +2293(parts.)X +2513(Backward)X +2862(traversal)X +3162(also)X +3314(exists,)X +3539(and)X +3678(follows)X +3941(all)X +4044(connec-)X +755 1326(tions)N +930(into)X +1074(a)X +1130(given)X +1328(part)X +1473(to)X +1555(their)X +1722(origin.)X +10 f +635 1449(g)N +2 f +755(Insert)X +1 f +962(adds)X +1129(100)X +1269(new)X +1423(parts)X +1599(and)X +1735(their)X +1902(connections.)X +755 1572(The)N +913(benchmark)X +1303(is)X +1389(single-user,)X +1794(but)X +1929(multi-user)X +2291(access)X +2530(controls)X +2821(\(locking)X +3120(and)X +3268(transaction)X +3652(protection\))X +4036(must)X +4223(be)X +555 1662(enforced.)N +898(It)X +968(is)X +1042(designed)X +1348(to)X +1431(be)X +1528(run)X +1656(on)X +1757(a)X +1814(database)X +2112(with)X +2275(20,000)X +2516(parts,)X +2713(and)X +2850(on)X +2951(one)X +3087(with)X +3249(200,000)X +3529(parts.)X +3745(Because)X +4033(we)X +4147(have)X +555 1752(insuf\256cient)N +935(disk)X +1088(space)X +1287(for)X +1401(the)X +1519(larger)X +1727(database,)X +2044(we)X +2158(report)X +2370(results)X +2599(only)X +2761(for)X +2875(the)X +2993(20,000)X +3233(part)X +3378(database.)X +3 f +555 1938(5.2.1.)N +775(Implementation)X +1 f +755 2061(The)N +920(LIBTP)X +1182(implementation)X +1724(of)X +1831(OO1)X +2027(uses)X +2205(the)X +2342(TCL)X +2532([OUST90])X +2914(interface)X +3235(described)X +3582(earlier.)X +3867(The)X +4031(backend)X +555 2151(accepts)N +813(commands)X +1181(over)X +1345(an)X +1442(IP)X +1534(socket)X 
+1760(and)X +1897(performs)X +2208(the)X +2327(requested)X +2656(database)X +2954(actions.)X +3242(The)X +3387(frontend)X +3679(opens)X +3886(and)X +4022(executes)X +555 2241(a)N +618(TCL)X +796(script.)X +1041(This)X +1210(script)X +1415(contains)X +1709(database)X +2013(accesses)X +2313(interleaved)X +2697(with)X +2866(ordinary)X +3165(program)X +3463(control)X +3716(statements.)X +4120(Data-)X +555 2331(base)N +718(commands)X +1085(are)X +1204(submitted)X +1539(to)X +1621(the)X +1739(backend)X +2027(and)X +2163(results)X +2392(are)X +2511(bound)X +2731(to)X +2813(program)X +3105(variables.)X +755 2454(The)N +903(parts)X +1082(table)X +1261(was)X +1409(stored)X +1628(as)X +1718(a)X +1776(B-tree)X +1999(indexed)X +2275(by)X +2 f +2377(id)X +1 f +2439(.)X +2501(The)X +2648(connection)X +3022(table)X +3200(was)X +3347(stored)X +3565(as)X +3654(a)X +3712(set)X +3823(of)X +3912(\256xed-length)X +555 2544(records)N +824(using)X +1029(the)X +1159(4.4BSD)X +1446(recno)X +1657(access)X +1895(method.)X +2207(In)X +2306(addition,)X +2620(two)X +2771(B-tree)X +3003(indices)X +3261(were)X +3449(maintained)X +3836(on)X +3947(connection)X +555 2634(table)N +732(entries.)X +1007(One)X +1162(index)X +1360(mapped)X +1634(the)X +2 f +1752(from)X +1 f +1923(\256eld)X +2085(to)X +2167(a)X +2223(connection)X +2595(record)X +2821(number,)X +3106(and)X +3242(the)X +3360(other)X +3545(mapped)X +3819(the)X +2 f +3937(to)X +1 f +4019(\256eld)X +4181(to)X +4263(a)X +555 2724(connection)N +932(record)X +1163(number.)X +1473(These)X +1690(indices)X +1941(support)X +2205(fast)X +2345(lookups)X +2622(on)X +2726(connections)X +3133(in)X +3219(both)X +3385(directions.)X +3765(For)X +3900(the)X +4022(traversal)X +555 2814(tests,)N +743(the)X +867(frontend)X +1165(does)X +1338(an)X +1439(index)X +1642(lookup)X +1889(to)X +1976(discover)X +2273(the)X +2396(connected)X +2747(part's)X +2 f +2955(id)X +1 f +3017(,)X +3062(and)X +3203(then)X +3366(does)X +3538(another)X +3804(lookup)X +4051(to)X 
+4138(fetch)X +555 2904(the)N +673(part)X +818(itself.)X +3 f +555 3090(5.2.2.)N +775(Performance)X +1242(Measurements)X +1766(for)X +1889(OO1)X +1 f +755 3213(We)N +888(compare)X +1186(LIBTP's)X +1487(OO1)X +1664(performance)X +2092(to)X +2174(that)X +2314(reported)X +2602(in)X +2684([CATT91].)X +3087(Those)X +3303(results)X +3532(were)X +3709(collected)X +4019(on)X +4119(a)X +4175(Sun)X +555 3303(3/280)N +759(\(25)X +888(MHz)X +1075(MC68020\))X +1448(with)X +1612(16)X +1714(MBytes)X +1989(of)X +2078(memory)X +2367(and)X +2505(two)X +2647(Hitachi)X +2904(892MByte)X +3267(disks)X +3452(\(15)X +3580(ms)X +3694(average)X +3966(seek)X +4130(time\))X +555 3393(behind)N +793(an)X +889(SMD-4)X +1149(controller.)X +1521(Frontends)X +1861(ran)X +1984(on)X +2084(an)X +2180(8MByte)X +2462(Sun)X +2606(3/260.)X +755 3516(In)N +844(order)X +1036(to)X +1120(measure)X +1410(performance)X +1839(on)X +1941(a)X +1999(machine)X +2293(of)X +2382(roughly)X +2653(equivalent)X +3009(processor)X +3339(power,)X +3582(we)X +3698(ran)X +3822(one)X +3959(set)X +4069(of)X +4157(tests)X +555 3606(on)N +666(a)X +733(standalone)X +1107(MC68030-based)X +1671(HP300)X +1923(\(33MHz)X +2225(MC68030\).)X +2646(The)X +2801(database)X +3108(was)X +3263(stored)X +3489(on)X +3599(a)X +3665(300MByte)X +4037(HP7959)X +555 3696(SCSI)N +744(disk)X +898(\(17)X +1026(ms)X +1139(average)X +1410(seek)X +1573(time\).)X +1802(Since)X +2000(this)X +2135(machine)X +2427(is)X +2500(not)X +2622(connected)X +2968(to)X +3050(a)X +3106(network,)X +3409(we)X +3523(ran)X +3646(local)X +3822(tests)X +3984(where)X +4201(the)X +555 3786(frontend)N +855(and)X +999(backend)X +1295(run)X +1430(on)X +1538(the)X +1664(same)X +1856(machine.)X +2195(We)X +2334(compare)X +2638(these)X +2830(measurements)X +3316(with)X +3485(Cattell's)X +3783(local)X +3966(Sun)X +4117(3/280)X +555 3876(numbers.)N +755 3999(Because)N +1051(the)X +1177(benchmark)X +1562(requires)X +1849(remote)X +2100(access,)X +2354(we)X +2476(ran)X +2607(another)X 
+2876(set)X +2993(of)X +3088(tests)X +3258(on)X +3365(a)X +3428(DECstation)X +3828(5000/200)X +4157(with)X +555 4089(32M)N +732(of)X +825(memory)X +1118(running)X +1393(Ultrix)X +1610(V4.0)X +1794(and)X +1936(a)X +1998(DEC)X +2184(1GByte)X +2459(RZ57)X +2666(SCSI)X +2859(disk.)X +3057(We)X +3194(compare)X +3496(the)X +3619(local)X +3800(performance)X +4232(of)X +555 4179(OO1)N +734(on)X +837(the)X +958(DECstation)X +1354(to)X +1439(its)X +1536(remote)X +1781(performance.)X +2250(For)X +2383(the)X +2503(remote)X +2748(case,)X +2929(we)X +3045(ran)X +3170(the)X +3290(frontend)X +3584(on)X +3686(a)X +3744(DECstation)X +4139(3100)X +555 4269(with)N +717(16)X +817(MBytes)X +1090(of)X +1177(main)X +1357(memory.)X +755 4392(The)N +900(databases)X +1228(tested)X +1435(in)X +1517([CATT91])X +1880(are)X +10 f +635 4515(g)N +1 f +755(INDEX,)X +1045(a)X +1101(highly-optimized)X +1672(access)X +1898(method)X +2158(package)X +2442(developed)X +2792(at)X +2870(Sun)X +3014(Microsystems.)X +10 f +635 4638(g)N +1 f +755(OODBMS,)X +1137(a)X +1193(beta)X +1347(release)X +1591(of)X +1678(a)X +1734(commercial)X +2133(object-oriented)X +2639(database)X +2936(management)X +3366(system.)X +10 f +635 4761(g)N +1 f +755(RDBMS,)X +1076(a)X +1133(UNIX-based)X +1565(commercial)X +1965(relational)X +2289(data)X +2444(manager)X +2742(at)X +2821(production)X +3189(release.)X +3474(The)X +3620(OO1)X +3797(implementation)X +755 4851(used)N +922(embedded)X +1272(SQL)X +1443(in)X +1525(C.)X +1638(Stored)X +1867(procedures)X +2240(were)X +2417(de\256ned)X +2673(to)X +2755(reduce)X +2990(client-server)X +3412(traf\256c.)X +755 4974(Table)N +974(two)X +1130(shows)X +1366(the)X +1500(measurements)X +1995(from)X +2187([CATT91])X +2566(and)X +2718(LIBTP)X +2976(for)X +3106(a)X +3178(local)X +3370(test)X +3517(on)X +3632(the)X +3765(MC680x0-based)X +555 5064(hardware.)N +915(All)X +1037(caches)X +1272(are)X +1391(cleared)X +1644(before)X +1870(each)X +2038(test.)X +2209(All)X +2331(times)X +2524(are)X 
+2643(in)X +2725(seconds.)X +755 5187(Table)N +960(two)X +1102(shows)X +1324(that)X +1466(LIBTP)X +1710(outperforms)X +2123(the)X +2242(commercial)X +2642(relational)X +2966(system,)X +3229(but)X +3352(is)X +3426(slower)X +3661(than)X +3820(OODBMS)X +4183(and)X +555 5277(INDEX.)N +872(Since)X +1077(the)X +1202(caches)X +1444(were)X +1628(cleared)X +1888(at)X +1973(the)X +2098(start)X +2263(of)X +2356(each)X +2530(test,)X +2687(disk)X +2846(throughput)X +3223(is)X +3302(critical)X +3551(in)X +3639(this)X +3780(test.)X +3957(The)X +4108(single)X +555 5367(SCSI)N +749(HP)X +877(drive)X +1068(used)X +1241(by)X +1347(LIBTP)X +1595(is)X +1674(approximately)X +2163(13%)X +2336(slower)X +2576(than)X +2739(the)X +2862(disks)X +3051(used)X +3223(in)X +3310([CATT91])X +3678(which)X +3899(accounts)X +4205(for)X +555 5457(part)N +700(of)X +787(the)X +905(difference.)X +755 5580(OODBMS)N +1118(and)X +1255(INDEX)X +1525(outperform)X +1906(LIBTP)X +2148(most)X +2323(dramatically)X +2744(on)X +2844(traversal.)X +3181(This)X +3343(is)X +3416(because)X +3691(we)X +3805(use)X +3932(index)X +4130(look-)X +555 5670(ups)N +689(to)X +774(\256nd)X +921(connections,)X +1347(whereas)X +1634(the)X +1755(other)X +1942(two)X +2084(systems)X +2359(use)X +2488(a)X +2546(link)X +2692(access)X +2920(method.)X +3222(The)X +3369(index)X +3569(requires)X +3850(us)X +3943(to)X +4027(examine)X + +15 p +%%Page: 15 15 +10 s 10 xH 0 xS 1 f +3 f +1 f +10 f +555 679(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)N +2 f +606 769(Measure)N +1 f +1019(INDEX)X +1389(OODBMS)X +1851(RDBMS)X +2250(LIBTP)X +10 f +555 771(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)N +555 787(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)N +1 f +595 869(Lookup)N +1114(5.4)X +1490(12.9)X +1950(27)X +2291(27.2)X +595 959(Traversal)N +1074(13)X +1530(9.8)X +1950(90)X +2291(47.3)X +595 1049(Insert)N +1114(7.4)X +1530(1.5)X +1950(22)X +2331(9.7)X +10 f +555 1059(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)N +555(c)X 
+999(c)Y +919(c)Y +839(c)Y +759(c)Y +959 1059(c)N +999(c)Y +919(c)Y +839(c)Y +759(c)Y +1329 1059(c)N +999(c)Y +919(c)Y +839(c)Y +759(c)Y +1791 1059(c)N +999(c)Y +919(c)Y +839(c)Y +759(c)Y +2190 1059(c)N +999(c)Y +919(c)Y +839(c)Y +759(c)Y +2512 1059(c)N +999(c)Y +919(c)Y +839(c)Y +759(c)Y +2618 679(i)N +2629(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +2 f +2829 769(Measure)N +3401(Cache)X +3726(Local)X +4028(Remote)X +1 f +10 f +2618 771(i)N +2629(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +2618 787(i)N +2629(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +2658 869(Lookup)N +3401(cold)X +3747(15.7)X +4078(20.6)X +3401 959(warm)N +3787(7.8)X +4078(12.4)X +10 f +2618 969(i)N +2629(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +2658 1059(Forward)N +2950(traversal)X +3401(cold)X +3747(28.4)X +4078(52.6)X +3401 1149(warm)N +3747(23.5)X +4078(47.4)X +10 f +2618 1159(i)N +2629(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +2658 1249(Backward)N +3004(traversal)X +3401(cold)X +3747(24.2)X +4078(47.4)X +3401 1339(warm)N +3747(24.3)X +4078(47.6)X +10 f +2618 1349(i)N +2629(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +1 f +2658 1439(Insert)N +3401(cold)X +3787(7.5)X +4078(10.3)X +3401 1529(warm)N +3787(6.7)X +4078(10.9)X +10 f +2618 1539(i)N +2629(iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii)X +2618(c)X +1479(c)Y +1399(c)Y +1319(c)Y +1239(c)Y +1159(c)Y +1079(c)Y +999(c)Y +919(c)Y +839(c)Y +759(c)Y +3341 1539(c)N +1479(c)Y +1399(c)Y +1319(c)Y +1239(c)Y +1159(c)Y +1079(c)Y +999(c)Y +919(c)Y +839(c)Y +759(c)Y +3666 1539(c)N +1479(c)Y +1399(c)Y +1319(c)Y +1239(c)Y +1159(c)Y +1079(c)Y +999(c)Y +919(c)Y +839(c)Y +759(c)Y +3968 1539(c)N +1479(c)Y +1399(c)Y +1319(c)Y +1239(c)Y +1159(c)Y +1079(c)Y +999(c)Y +919(c)Y +839(c)Y +759(c)Y +4309 1539(c)N +1479(c)Y +1399(c)Y +1319(c)Y +1239(c)Y +1159(c)Y +1079(c)Y +999(c)Y +919(c)Y +839(c)Y +759(c)Y +3 f +587 1785(Table)N +823(2:)X +931(Local)X +1163(MC680x0)X +1538(Performance)X +2026(of)X +2133(Several)X +587 1875(Systems)N 
+883(on)X +987(OO1.)X +2667 1785(Table)N +2909(3:)X +3023(Local)X +3260(vs.)X +3397(Remote)X +3707(Performance)X +4200(of)X +2667 1875(LIBTP)N +2926(on)X +3030(OO1.)X +1 f +10 f +555 1998(h)N +579(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)X +1 f +555 2274(two)N +696(disk)X +850(pages,)X +1074(but)X +1197(the)X +1316(links)X +1492(require)X +1741(only)X +1904(one,)X +2061(regardless)X +2408(of)X +2496(database)X +2794(size.)X +2980(Cattell)X +3214(reports)X +3458(that)X +3599(lookups)X +3873(using)X +4067(B-trees)X +555 2364(instead)N +808(of)X +901(links)X +1082(makes)X +1313(traversal)X +1616(take)X +1776(twice)X +1976(as)X +2069(long)X +2237(in)X +2325(INDEX.)X +2641(Adding)X +2907(a)X +2969(link)X +3119(access)X +3351(method)X +3617(to)X +3 f +3704(db)X +1 f +3792(\(3\))X +3911(or)X +4003(using)X +4201(the)X +555 2454(existing)N +828(hash)X +995(method)X +1255(would)X +1475(apparently)X +1834(be)X +1930(a)X +1986(good)X +2166(idea.)X +755 2577(Both)N +936(OODBMS)X +1304(and)X +1446(INDEX)X +1722(issue)X +1908 0.1944(coarser-granularity)AX +2545(locks)X +2739(than)X +2902(LIBTP.)X +3189(This)X +3356(limits)X +3562(concurrency)X +3985(for)X +4104(multi-)X +555 2667(user)N +711(applications,)X +1140(but)X +1264(helps)X +1455(single-user)X +1829(applications.)X +2278(In)X +2367(addition,)X +2671(the)X +2791(fact)X +2934(that)X +3076(LIBTP)X +3319(releases)X +3595(B-tree)X +3817(locks)X +4007(early)X +4189(is)X +4263(a)X +555 2757(drawback)N +896(in)X +986(OO1.)X +1210(Since)X +1416(there)X +1605(is)X +1686(no)X +1793(concurrency)X +2218(in)X +2307(the)X +2432(benchmark,)X +2836(high-concurrency)X +3430(strategies)X +3760(only)X +3929(show)X +4125(up)X +4232(as)X +555 2847(increased)N +882(locking)X +1145(overhead.)X +1503(Finally,)X +1772(the)X +1892(architecture)X +2294(of)X +2383(the)X +2503(LIBTP)X +2747(implementation)X +3271(was)X +3418(substantially)X +3844(different)X +4143(from)X +555 2937(that)N +702(of)X 
+796(either)X +1006(OODBMS)X +1375(or)X +1469(INDEX.)X +1786(Both)X +1968(of)X +2062(those)X +2258(systems)X +2538(do)X +2645(the)X +2770(searches)X +3070(in)X +3159(the)X +3284(user's)X +3503(address)X +3771(space,)X +3997(and)X +4139(issue)X +555 3027(requests)N +844(for)X +964(pages)X +1173(to)X +1260(the)X +1383(server)X +1605(process.)X +1911(Pages)X +2123(are)X +2247(cached)X +2496(in)X +2583(the)X +2706(client,)X +2929(and)X +3070(many)X +3273(queries)X +3530(can)X +3667(be)X +3768(satis\256ed)X +4055(without)X +555 3117(contacting)N +910(the)X +1029(server)X +1247(at)X +1326(all.)X +1467(LIBTP)X +1710(submits)X +1979(all)X +2080(the)X +2199(queries)X +2452(to)X +2535(the)X +2653(server)X +2870(process,)X +3151(and)X +3287(receives)X +3571(database)X +3868(records)X +4125(back;)X +555 3207(it)N +619(does)X +786(no)X +886(client)X +1084(caching.)X +755 3330(The)N +911(RDBMS)X +1221(architecture)X +1632(is)X +1716(much)X +1925(closer)X +2148(to)X +2241(that)X +2392(of)X +2490(LIBTP.)X +2783(A)X +2872(server)X +3100(process)X +3372(receives)X +3667(queries)X +3930(and)X +4076(returns)X +555 3420(results)N +786(to)X +870(a)X +928(client.)X +1168(The)X +1315(timing)X +1545(results)X +1776(in)X +1860(table)X +2038(two)X +2180(clearly)X +2421(show)X +2612(that)X +2754(the)X +2874(conventional)X +3309(database)X +3607(client/server)X +4025(model)X +4246(is)X +555 3510(expensive.)N +941(LIBTP)X +1188(outperforms)X +1605(the)X +1728(RDBMS)X +2032(on)X +2136(traversal)X +2437(and)X +2577(insertion.)X +2921(We)X +3057(speculate)X +3380(that)X +3524(this)X +3663(is)X +3740(due)X +3880(in)X +3966(part)X +4115(to)X +4201(the)X +555 3600(overhead)N +870(of)X +957(query)X +1160(parsing,)X +1436(optimization,)X +1880(and)X +2016(repeated)X +2309(interpretation)X +2761(of)X +2848(the)X +2966(plan)X +3124(tree)X +3265(in)X +3347(the)X +3465(RDBMS')X +3791(query)X +3994(executor.)X +755 3723(Table)N +962(three)X +1147(shows)X +1371(the)X +1492(differences)X +1873(between)X 
+2164(local)X +2343(and)X +2482(remote)X +2728(execution)X +3063(of)X +3153(LIBTP's)X +3456(OO1)X +3635(implementation)X +4160(on)X +4263(a)X +555 3813(DECstation.)N +989(We)X +1122(measured)X +1451(performance)X +1879(with)X +2042(a)X +2099(populated)X +2436(\(warm\))X +2694(cache)X +2899(and)X +3036(an)X +3133(empty)X +3354(\(cold\))X +3567(cache.)X +3812(Reported)X +4126(times)X +555 3903(are)N +681(the)X +806(means)X +1037(of)X +1130(twenty)X +1374(tests,)X +1562(and)X +1704(are)X +1829(in)X +1917(seconds.)X +2237(Standard)X +2548(deviations)X +2903(were)X +3086(within)X +3316(seven)X +3525(percent)X +3788(of)X +3881(the)X +4005(mean)X +4205(for)X +555 3993(remote,)N +818(and)X +954(two)X +1094(percent)X +1351(of)X +1438(the)X +1556(mean)X +1750(for)X +1864(local.)X +755 4116(The)N +914(20ms)X +1121(overhead)X +1450(of)X +1551(TCP/IP)X +1824(on)X +1938(an)X +2048(Ethernet)X +2354(entirely)X +2633(accounts)X +2948(for)X +3076(the)X +3207(difference)X +3567(in)X +3662(speed.)X +3918(The)X +4076(remote)X +555 4206(traversal)N +857(times)X +1055(are)X +1179(nearly)X +1405(double)X +1648(the)X +1771(local)X +1952(times)X +2150(because)X +2430(we)X +2549(do)X +2653(index)X +2855(lookups)X +3132(and)X +3272(part)X +3421(fetches)X +3673(in)X +3759(separate)X +4047(queries.)X +555 4296(It)N +629(would)X +854(make)X +1053(sense)X +1252(to)X +1339(do)X +1444(indexed)X +1723(searches)X +2021(on)X +2126(the)X +2248(server,)X +2489(but)X +2615(we)X +2733(were)X +2914(unwilling)X +3244(to)X +3330(hard-code)X +3676(knowledge)X +4052(of)X +4143(OO1)X +555 4386(indices)N +803(into)X +948(our)X +1075(LIBTP)X +1317(TCL)X +1488(server.)X +1745(Cold)X +1920(and)X +2056(warm)X +2259(insertion)X +2559(times)X +2752(are)X +2871(identical)X +3167(since)X +3352(insertions)X +3683(do)X +3783(not)X +3905(bene\256t)X +4143(from)X +555 4476(caching.)N +755 4599(One)N +915(interesting)X +1279(difference)X +1632(shown)X +1867(by)X +1973(table)X +2155(three)X +2342(is)X +2421(the)X +2545(cost)X 
+2700(of)X +2793(forward)X +3074(versus)X +3305(backward)X +3644(traversal.)X +3987(When)X +4205(we)X +555 4689(built)N +725(the)X +847(database,)X +1168(we)X +1285(inserted)X +1562(parts)X +1741(in)X +1826(part)X +2 f +1974(id)X +1 f +2059(order.)X +2292(We)X +2427(built)X +2596(the)X +2717(indices)X +2967(at)X +3048(the)X +3169(same)X +3357(time.)X +3562(Therefore,)X +3923(the)X +4044(forward)X +555 4779(index)N +757(had)X +897(keys)X +1068(inserted)X +1346(in)X +1432(order,)X +1646(while)X +1848(the)X +1970(backward)X +2307(index)X +2509(had)X +2649(keys)X +2820(inserted)X +3098(more)X +3286(randomly.)X +3656(In-order)X +3943(insertion)X +4246(is)X +555 4885(pessimal)N +858(for)X +975(B-tree)X +1199(indices,)X +1469(so)X +1563(the)X +1684(forward)X +1962(index)X +2163(is)X +2239(much)X +2440(larger)X +2651(than)X +2812(the)X +2933(backward)X +3269(one)X +7 s +3385 4853(5)N +10 s +4885(.)Y +3476(This)X +3640(larger)X +3850(size)X +3997(shows)X +4219(up)X +555 4975(as)N +642(extra)X +823(disk)X +976(reads)X +1166(in)X +1248(the)X +1366(cold)X +1524(benchmark.)X +3 f +555 5161(6.)N +655(Conclusions)X +1 f +755 5284(LIBTP)N +1006(provides)X +1311(the)X +1438(basic)X +1632(building)X +1927(blocks)X +2165(to)X +2256(support)X +2525(transaction)X +2906(protection.)X +3300(In)X +3396(comparison)X +3799(with)X +3970(traditional)X +555 5374(Unix)N +746(libraries)X +1040(and)X +1187(commercial)X +1597(systems,)X +1900(it)X +1974(offers)X +2192(a)X +2258(variety)X +2511(of)X +2608(tradeoffs.)X +2964(Using)X +3185(complete)X +3509(transaction)X +3891(protection)X +4246(is)X +555 5464(more)N +747(complicated)X +1166(than)X +1331(simply)X +1575(adding)X +3 f +1820(fsync)X +1 f +1998(\(2\))X +2119(and)X +3 f +2262(\257ock)X +1 f +2426(\(2\))X +2547(calls)X +2721(to)X +2810(code,)X +3008(but)X +3136(it)X +3206(is)X +3285(faster)X +3490(in)X +3578(some)X +3773(cases)X +3969(and)X +4111(offers)X +8 s +10 f +555 5536(hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh)N +5 s +1 f +727 5614(5)N +8 
s +763 5639(The)N +878(next)X +1004(release)X +1196(of)X +1265(the)X +1359(4.4BSD)X +1580(access)X +1758(method)X +1966(will)X +2082(automatically)X +2446(detect)X +2614(and)X +2722(compensate)X +3039(for)X +3129(in-order)X +3350(insertion,)X +3606(eliminating)X +3914(this)X +4023(problem.)X + +16 p +%%Page: 16 16 +8 s 8 xH 0 xS 1 f +10 s +3 f +1 f +555 630(stricter)N +801(guarantees)X +1168(\(atomicity,)X +1540(consistency,)X +1957(isolation,)X +2275(and)X +2414(durability\).)X +2815(If)X +2892(the)X +3013(data)X +3170(to)X +3255(be)X +3354(protected)X +3676(are)X +3798(already)X +4058(format-)X +555 720(ted)N +675(\()X +2 f +702(i.e.)X +1 f +821(use)X +949(one)X +1086(of)X +1174(the)X +1293(database)X +1591(access)X +1818(methods\),)X +2157(then)X +2316(adding)X +2555(transaction)X +2928(protection)X +3274(requires)X +3554(no)X +3655(additional)X +3996(complex-)X +555 810(ity,)N +679(but)X +801(incurs)X +1017(a)X +1073(performance)X +1500(penalty)X +1756(of)X +1843(approximately)X +2326(15%.)X +755 933(In)N +844(comparison)X +1240(with)X +1404(commercial)X +1805(database)X +2104(systems,)X +2399(the)X +2519(tradeoffs)X +2827(are)X +2948(more)X +3135(complex.)X +3473(LIBTP)X +3717(does)X +3886(not)X +4009(currently)X +555 1023(support)N +825(a)X +891(standard)X +1193(query)X +1406(language.)X +1766(The)X +1921(TCL-based)X +2312(server)X +2539(process)X +2810(allows)X +3049(a)X +3115(certain)X +3364(ease)X +3533(of)X +3630(use)X +3767(which)X +3993(would)X +4223(be)X +555 1113(enhanced)N +882(with)X +1047(a)X +1106(more)X +1294(user-friendly)X +1732(interface)X +2037(\()X +2 f +2064(e.g.)X +1 f +2203(a)X +2261(windows)X +2572(based)X +2777(query-by-form)X +3272(application\),)X +3697(for)X +3813(which)X +4031(we)X +4147(have)X +555 1203(a)N +620(working)X +916(prototype.)X +1292(When)X +1513(accesses)X +1815(do)X +1924(not)X +2055(require)X +2312(sophisticated)X +2758(query)X +2969(processing,)X +3360(the)X +3486(TCL)X +3665(interface)X +3975(is)X +4056(an)X 
+4160(ade-)X +555 1293(quate)N +756(solution.)X +1080(What)X +1281(LIBTP)X +1529(fails)X +1693(to)X +1781(provide)X +2052(in)X +2140(functionality,)X +2595(it)X +2665(makes)X +2896(up)X +3002(for)X +3122(in)X +3210(performance)X +3643(and)X +3785(\257exibility.)X +4161(Any)X +555 1383(application)N +931(may)X +1089(make)X +1283(use)X +1410(of)X +1497(its)X +1592(record)X +1818(interface)X +2120(or)X +2207(the)X +2325(more)X +2510(primitive)X +2823(log,)X +2965(lock,)X +3143(and)X +3279(buffer)X +3496(calls.)X +755 1506(Future)N +987(work)X +1175(will)X +1322(focus)X +1519(on)X +1621(overcoming)X +2026(some)X +2217(of)X +2306(the)X +2426(areas)X +2614(in)X +2698(which)X +2916(LIBTP)X +3160(is)X +3235(currently)X +3547(de\256cient)X +3845(and)X +3983(extending)X +555 1596(its)N +652(transaction)X +1026(model.)X +1288(The)X +1435(addition)X +1719(of)X +1808(an)X +1905(SQL)X +2077(parser)X +2295(and)X +2432(forms)X +2640(front)X +2817(end)X +2954(will)X +3099(improve)X +3387(the)X +3506(system's)X +3807(ease)X +3967(of)X +4055(use)X +4183(and)X +555 1686(make)N +750(it)X +815(more)X +1001(competitive)X +1400(with)X +1563(commercial)X +1963(systems.)X +2277(In)X +2365(the)X +2484(long)X +2647(term,)X +2835(we)X +2950(would)X +3170(like)X +3310(to)X +3392(add)X +3528(generalized)X +3919(hierarchical)X +555 1776(locking,)N +836(nested)X +1062(transactions,)X +1486(parallel)X +1748(transactions,)X +2171(passing)X +2431(of)X +2518(transactions)X +2921(between)X +3209(processes,)X +3557(and)X +3693(distributed)X +4055(commit)X +555 1866(handling.)N +900(In)X +992(the)X +1115(short)X +1300(term,)X +1492(the)X +1614(next)X +1776(step)X +1929(is)X +2006(to)X +2092(integrate)X +2397(LIBTP)X +2643(with)X +2809(the)X +2931(most)X +3110(recent)X +3331(release)X +3579(of)X +3670(the)X +3792(database)X +4093(access)X +555 1956(routines)N +833(and)X +969(make)X +1163(it)X +1227(freely)X +1435(available)X +1745(via)X +1863(anonymous)X +2252(ftp.)X +3 f +555 2142(7.)N 
+655(Acknowledgements)X +1 f +755 2265(We)N +888(would)X +1109(like)X +1250(to)X +1332(thank)X +1530(John)X +1701(Wilkes)X +1948(and)X +2084(Carl)X +2242(Staelin)X +2484(of)X +2571(Hewlett-Packard)X +3131(Laboratories)X +3557(and)X +3693(Jon)X +3824(Krueger.)X +4148(John)X +555 2355(and)N +694(Carl)X +855(provided)X +1162(us)X +1255(with)X +1419(an)X +1517(extra)X +1700(disk)X +1855(for)X +1971(the)X +2091(HP)X +2215(testbed)X +2464(less)X +2606(than)X +2766(24)X +2868(hours)X +3068(after)X +3238(we)X +3354(requested)X +3684(it.)X +3770(Jon)X +3903(spent)X +4094(count-)X +555 2445(less)N +699(hours)X +901(helping)X +1164(us)X +1258(understand)X +1633(the)X +1754(intricacies)X +2107(of)X +2197(commercial)X +2599(database)X +2899(products)X +3198(and)X +3337(their)X +3507(behavior)X +3811(under)X +4017(a)X +4076(variety)X +555 2535(of)N +642(system)X +884(con\256gurations.)X +3 f +555 2721(8.)N +655(References)X +1 f +555 2901([ANDR89])N +942(Andrade,)X +1265(J.,)X +1361(Carges,)X +1629(M.,)X +1765(Kovach,)X +2060(K.,)X +2183(``Building)X +2541(an)X +2642(On-Line)X +2939(Transaction)X +3343(Processing)X +3715(System)X +3975(On)X +4098(UNIX)X +727 2991(System)N +982(V'',)X +2 f +1134(CommUNIXations)X +1 f +1725(,)X +1765 0.2188(November/December)AX +2477(1989.)X +555 3171([BAY77])N +878(Bayer,)X +1110(R.,)X +1223(Schkolnick,)X +1623(M.,)X +1754(``Concurrency)X +2243(of)X +2330(Operations)X +2702(on)X +2802(B-Trees'',)X +2 f +3155(Acta)X +3322(Informatica)X +1 f +3700(,)X +3740(1977.)X +555 3351([BERN80])N +936(Bernstein,)X +1297(P.,)X +1415(Goodman,)X +1785(N.,)X +1917(``Timestamp)X +2365(Based)X +2595(Algorithms)X +2992(for)X +3119(Concurrency)X +3567(Control)X +3844(in)X +3939(Distributed)X +727 3441(Database)N +1042(Systems'',)X +2 f +1402(Proceedings)X +1823(6th)X +1945(International)X +2387(Conference)X +2777(on)X +2877(Very)X +3049(Large)X +3260(Data)X +3440(Bases)X +1 f +3627(,)X +3667(October)X +3946(1980.)X +555 3621([BSD91])N +864(DB\(3\),)X +2 f 
+1109(4.4BSD)X +1376(Unix)X +1552(Programmer's)X +2044(Manual)X +2313(Reference)X +2655(Guide)X +1 f +2851(,)X +2891(University)X +3249(of)X +3336(California,)X +3701(Berkeley,)X +4031(1991.)X +555 3801([CATT91])N +923(Cattell,)X +1181(R.G.G.,)X +1455(``An)X +1632(Engineering)X +2049(Database)X +2369(Benchmark'',)X +2 f +2838(The)X +2983(Benchmark)X +3373(Handbook)X +3731(for)X +3848(Database)X +4179(and)X +727 3891(Transaction)N +1133(Processing)X +1509(Systems)X +1 f +1763(,)X +1803(J.)X +1874(Gray,)X +2075(editor,)X +2302(Morgan)X +2576(Kaufman)X +2895(1991.)X +555 4071([CHEN91])N +929(Cheng,)X +1180(E.,)X +1291(Chang,)X +1542(E.,)X +1653(Klein,)X +1872(J.,)X +1964(Lee,)X +2126(D.,)X +2245(Lu,)X +2375(E.,)X +2485(Lutgardo,)X +2820(A.,)X +2939(Obermarck,)X +3342(R.,)X +3456(``An)X +3629(Open)X +3824(and)X +3961(Extensible)X +727 4161(Event-Based)N +1157(Transaction)X +1556(Manager'',)X +2 f +1936(Proceedings)X +2357(1991)X +2537(Summer)X +2820(Usenix)X +1 f +3043(,)X +3083(Nashville,)X +3430(TN,)X +3577(June)X +3744(1991.)X +555 4341([CHOU85])N +943(Chou,)X +1163(H.,)X +1288(DeWitt,)X +1570(D.,)X +1694(``An)X +1872(Evaluation)X +2245(of)X +2338(Buffer)X +2574(Management)X +3019(Strategies)X +3361(for)X +3481(Relational)X +3836(Database)X +4157(Sys-)X +727 4431(tems'',)N +2 f +972(Proceedings)X +1393(of)X +1475(the)X +1593(11th)X +1755(International)X +2197(Conference)X +2587(on)X +2687(Very)X +2859(Large)X +3070(Databases)X +1 f +3408(,)X +3448(1985.)X +555 4611([DEWI84])N +925(DeWitt,)X +1207(D.,)X +1331(Katz,)X +1529(R.,)X +1648(Olken,)X +1890(F.,)X +2000(Shapiro,)X +2295(L.,)X +2410(Stonebraker,)X +2843(M.,)X +2979(Wood,)X +3220(D.,)X +3343(``Implementation)X +3929(Techniques)X +727 4701(for)N +841(Main)X +1030(Memory)X +1326(Database)X +1641(Systems'',)X +2 f +2001(Proceedings)X +2422(of)X +2504(SIGMOD)X +1 f +2812(,)X +2852(pp.)X +2972(1-8,)X +3119(June)X +3286(1984.)X +555 4881([GRAY76])N +944(Gray,)X +1153(J.,)X +1252(Lorie,)X +1474(R.,)X +1595(Putzolu,)X 
+1887(F.,)X +1999(and)X +2143(Traiger,)X +2428(I.,)X +2522(``Granularity)X +2973(of)X +3067(locks)X +3263(and)X +3406(degrees)X +3679(of)X +3773(consistency)X +4174(in)X +4263(a)X +727 4971(large)N +909(shared)X +1140(data)X +1295(base'',)X +2 f +1533(Modeling)X +1861(in)X +1944(Data)X +2125(Base)X +2301(Management)X +2740(Systems)X +1 f +2994(,)X +3034(Elsevier)X +3317(North)X +3524(Holland,)X +3822(New)X +3994(York,)X +4199(pp.)X +727 5061(365-394.)N +555 5241([HAER83])N +931(Haerder,)X +1235(T.)X +1348(Reuter,)X +1606(A.)X +1728(``Principles)X +2126(of)X +2217(Transaction-Oriented)X +2928(Database)X +3246(Recovery'',)X +2 f +3651(Computing)X +4029(Surveys)X +1 f +4279(,)X +727 5331(15\(4\);)N +943(237-318,)X +1250(1983.)X +555 5511([KUNG81])N +943(Kung,)X +1162(H.)X +1261(T.,)X +1371(Richardson,)X +1777(J.,)X +1869(``On)X +2042(Optimistic)X +2400(Methods)X +2701(for)X +2816(Concurrency)X +3252(Control'',)X +2 f +3591(ACM)X +3781(Transactions)X +4219(on)X +727 5601(Database)N +1054(Systems)X +1 f +1328(6\(2\);)X +1504(213-226,)X +1811(1981.)X + +17 p +%%Page: 17 17 +10 s 10 xH 0 xS 1 f +3 f +1 f +555 630([LEHM81])N +939(Lehman,)X +1245(P.,)X +1352(Yao,)X +1529(S.,)X +1636(``Ef\256cient)X +1989(Locking)X +2279(for)X +2396(Concurrent)X +2780(Operations)X +3155(on)X +3258(B-trees'',)X +2 f +3587(ACM)X +3779(Transactions)X +4219(on)X +727 720(Database)N +1054(Systems)X +1 f +1308(,)X +1348(6\(4\),)X +1522(December)X +1873(1981.)X +555 900([MOHA91])N +964(Mohan,)X +1241(C.,)X +1364(Pirahesh,)X +1690(H.,)X +1818(``ARIES-RRH:)X +2366(Restricted)X +2721(Repeating)X +3076(of)X +3173(History)X +3442(in)X +3533(the)X +3660(ARIES)X +3920(Transaction)X +727 990(Recovery)N +1055(Method'',)X +2 f +1398(Proceedings)X +1819(7th)X +1941(International)X +2383(Conference)X +2773(on)X +2873(Data)X +3053(Engineering)X +1 f +3449(,)X +3489(Kobe,)X +3703(Japan,)X +3926(April)X +4115(1991.)X +555 1170([NODI90])N +914(Nodine,)X +1194(M.,)X +1328(Zdonik,)X +1602(S.,)X +1709(``Cooperative)X 
+2178(Transaction)X +2580(Hierarchies:)X +2996(A)X +3077(Transaction)X +3479(Model)X +3711(to)X +3796(Support)X +4072(Design)X +727 1260(Applications'',)N +2 f +1242(Proceedings)X +1675(16th)X +1849(International)X +2303(Conference)X +2704(on)X +2815(Very)X +2998(Large)X +3220(Data)X +3411(Bases)X +1 f +3598(,)X +3649(Brisbane,)X +3985(Australia,)X +727 1350(August)N +978(1990.)X +555 1530([OUST90])N +923(Ousterhout,)X +1324(J.,)X +1420(``Tcl:)X +1648(An)X +1771(Embeddable)X +2197(Command)X +2555(Language'',)X +2 f +2971(Proceedings)X +3396(1990)X +3580(Winter)X +3822(Usenix)X +1 f +4045(,)X +4089(Wash-)X +727 1620(ington,)N +971(D.C.,)X +1162(January)X +1432(1990.)X +555 1800([POSIX91])N +955(``Unapproved)X +1441(Draft)X +1645(for)X +1773(Realtime)X +2096(Extension)X +2450(for)X +2578(Portable)X +2879(Operating)X +3234(Systems'',)X +3608(Draft)X +3812(11,)X +3946(October)X +4239(7,)X +727 1890(1991,)N +927(IEEE)X +1121(Computer)X +1461(Society.)X +555 2070([ROSE91])N +925(Rosenblum,)X +1341(M.,)X +1484(Ousterhout,)X +1892(J.,)X +1995(``The)X +2206(Design)X +2464(and)X +2611(Implementation)X +3149(of)X +3247(a)X +3314(Log-Structured)X +3835(File)X +3990(System'',)X +2 f +727 2160(Proceedings)N +1148(of)X +1230(the)X +1348(13th)X +1510(Symposium)X +1895(on)X +1995(Operating)X +2344(Systems)X +2618(Principles)X +1 f +2947(,)X +2987(1991.)X +555 2340([SELT91])N +904(Seltzer,)X +1171(M.,)X +1306(Stonebraker,)X +1738(M.,)X +1873(``Read)X +2116(Optimized)X +2478(File)X +2626(Systems:)X +2938(A)X +3020(Performance)X +3454(Evaluation'',)X +2 f +3898(Proceedings)X +727 2430(7th)N +849(Annual)X +1100(International)X +1542(Conference)X +1932(on)X +2032(Data)X +2212(Engineering)X +1 f +2608(,)X +2648(Kobe,)X +2862(Japan,)X +3085(April)X +3274(1991.)X +555 2610([SPEC88])N +907(Spector,)X +1200(Rausch,)X +1484(Bruell,)X +1732(``Camelot:)X +2107(A)X +2192(Flexible,)X +2501(Distributed)X +2888(Transaction)X +3294(Processing)X +3668(System'',)X +2 f +4004(Proceed-)X +727 
2700(ings)N +880(of)X +962(Spring)X +1195(COMPCON)X +1606(1988)X +1 f +(,)S +1806(February)X +2116(1988.)X +555 2880([SQL86])N +862(American)X +1201(National)X +1499(Standards)X +1836(Institute,)X +2139(``Database)X +2509(Language)X +2847(SQL'',)X +3093(ANSI)X +3301(X3.135-1986)X +3747(\(ISO)X +3924(9075\),)X +4152(May)X +727 2970(1986.)N +555 3150([STON81])N +919(Stonebraker,)X +1348(M.,)X +1480(``Operating)X +1876(System)X +2132(Support)X +2406(for)X +2520(Database)X +2835(Management'',)X +2 f +3348(Communications)X +3910(of)X +3992(the)X +4110(ACM)X +1 f +4279(,)X +727 3240(1981.)N +555 3420([SULL92])N +925(Sullivan,)X +1247(M.,)X +1394(Olson,)X +1641(M.,)X +1788(``An)X +1976(Index)X +2195(Implementation)X +2737(Supporting)X +3127(Fast)X +3295(Recovery)X +3638(for)X +3767(the)X +3900(POSTGRES)X +727 3510(Storage)N +1014(System'',)X +1365(to)X +1469(appear)X +1726(in)X +2 f +1830(Proceedings)X +2272(8th)X +2415(Annual)X +2687(International)X +3150(Conference)X +3561(on)X +3682(Data)X +3883(Engineering)X +1 f +4279(,)X +727 3600(Tempe,)N +990(Arizona,)X +1289(February)X +1599(1992.)X +555 3780([TPCB90])N +914(Transaction)X +1319(Processing)X +1692(Performance)X +2129(Council,)X +2428(``TPC)X +2653(Benchmark)X +3048(B'',)X +3200(Standard)X +3510(Speci\256cation,)X +3973(Waterside)X +727 3870(Associates,)N +1110(Fremont,)X +1421(CA.,)X +1592(1990.)X +555 4050([YOUN91])N +947(Young,)X +1211(M.)X +1328(W.,)X +1470(Thompson,)X +1858(D.)X +1962(S.,)X +2072(Jaffe,)X +2274(E.,)X +2388(``A)X +2525(Modular)X +2826(Architecture)X +3253(for)X +3372(Distributed)X +3757(Transaction)X +4161(Pro-)X +727 4140(cessing'',)N +2 f +1057(Proceedings)X +1478(1991)X +1658(Winter)X +1896(Usenix)X +1 f +2119(,)X +2159(Dallas,)X +2404(TX,)X +2551(January)X +2821(1991.)X +3 f +755 4263(Margo)N +1008(I.)X +1080(Seltzer)X +1 f +1338(is)X +1411(a)X +1467(Ph.D.)X +1669(student)X +1920(in)X +2002(the)X +2120(Department)X +2519(of)X +2606(Electrical)X +2934(Engineering)X +3346(and)X 
+3482(Computer)X +3822(Sciences)X +4123(at)X +4201(the)X +555 4353(University)N +919(of)X +1012(California,)X +1383(Berkeley.)X +1739(Her)X +1886(research)X +2181(interests)X +2474(include)X +2735(\256le)X +2862(systems,)X +3160(databases,)X +3513(and)X +3654(transaction)X +4031(process-)X +555 4443(ing)N +686(systems.)X +1008(She)X +1157(spent)X +1355(several)X +1612(years)X +1811(working)X +2107(at)X +2194(startup)X +2441(companies)X +2813(designing)X +3153(and)X +3298(implementing)X +3771(\256le)X +3902(systems)X +4183(and)X +555 4533(transaction)N +929(processing)X +1294(software)X +1592(and)X +1729(designing)X +2061(microprocessors.)X +2648(Ms.)X +2791(Seltzer)X +3035(received)X +3329(her)X +3453(AB)X +3585(in)X +3668(Applied)X +3947(Mathemat-)X +555 4623(ics)N +664(from)X +840 0.1953(Harvard/Radcliffe)AX +1445(College)X +1714(in)X +1796(1983.)X +755 4746(In)N +845(her)X +971(spare)X +1163(time,)X +1347(Margo)X +1583(can)X +1717(usually)X +1970(be)X +2068(found)X +2277(preparing)X +2607(massive)X +2887(quantities)X +3220(of)X +3309(food)X +3478(for)X +3594(hungry)X +3843(hordes,)X +4099(study-)X +555 4836(ing)N +677(Japanese,)X +1003(or)X +1090(playing)X +1350(soccer)X +1576(with)X +1738(an)X +1834(exciting)X +2112(Bay)X +2261(Area)X +2438(Women's)X +2770(Soccer)X +3009(team,)X +3205(the)X +3323(Berkeley)X +3633(Bruisers.)X +3 f +755 5049(Michael)N +1056(A.)X +1159(Olson)X +1 f +1383(is)X +1461(a)X +1522(Master's)X +1828(student)X +2084(in)X +2170(the)X +2292(Department)X +2695(of)X +2786(Electrical)X +3118(Engineering)X +3534(and)X +3674(Computer)X +4018(Sciences)X +555 5139(at)N +645(the)X +774(University)X +1143(of)X +1241(California,)X +1617(Berkeley.)X +1978(His)X +2120(primary)X +2405(interests)X +2703(are)X +2833(database)X +3141(systems)X +3425(and)X +3572(mass)X +3763(storage)X +4026(systems.)X +555 5229(Mike)N +759(spent)X +963(two)X +1118(years)X +1323(working)X +1625(for)X +1754(a)X +1825(commercial)X +2239(database)X +2551(system)X +2808(vendor)X 
+3066(before)X +3307(joining)X +3567(the)X +3699(Postgres)X +4004(Research)X +555 5319(Group)N +780(at)X +858(Berkeley)X +1168(in)X +1250(1988.)X +1470(He)X +1584(received)X +1877(his)X +1990(B.A.)X +2161(in)X +2243(Computer)X +2583(Science)X +2853(from)X +3029(Berkeley)X +3339(in)X +3421(May)X +3588(1991.)X +755 5442(Mike)N +945(only)X +1108(recently)X +1388(transferred)X +1758(into)X +1903(Sin)X +2030(City,)X +2208(but)X +2330(is)X +2403(rapidly)X +2650(adopting)X +2950(local)X +3126(customs)X +3408(and)X +3544(coloration.)X +3929(In)X +4016(his)X +4129(spare)X +555 5532(time,)N +742(he)X +843(organizes)X +1176(informal)X +1477(Friday)X +1711(afternoon)X +2043(study)X +2240(groups)X +2482(to)X +2568(discuss)X +2823(recent)X +3044(technical)X +3358(and)X +3498(economic)X +3834(developments.)X +555 5622(Among)N +815(his)X +928(hobbies)X +1197(are)X +1316(Charles)X +1581(Dickens,)X +1884(Red)X +2033(Rock,)X +2242(and)X +2378(speaking)X +2683(Dutch)X +2899(to)X +2981(anyone)X +3233(who)X +3391(will)X +3535(permit)X +3764(it.)X + +17 p +%%Trailer +xt + +xs + diff --git a/db/docs/ref/refs/refs.html b/db/docs/ref/refs/refs.html new file mode 100644 index 000000000..9e321b938 --- /dev/null +++ b/db/docs/ref/refs/refs.html @@ -0,0 +1,75 @@ +<!--$Id: refs.so,v 10.24 2000/12/19 18:54:17 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Additional references</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Additional References</dl></h3></td> +<td width="1%"><a href="../../ref/distrib/layout.html"><img src="../../images/prev.gif" 
alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a> +</td></tr></table> +<p> +<h1 align=center>Additional references</h1> +<p>For more information on Berkeley DB, or on database systems theory in general, +we recommend the sources listed below. +<h3>Technical Papers on Berkeley DB</h3> +<p>These papers have appeared in refereed conference proceedings, and are +subject to copyrights held by the conference organizers and the authors +of the papers. Sleepycat Software makes them available here as a courtesy +with the permission of the copyright holders. +<p><dl compact> +<p><dt><i>Berkeley DB</i> (<a href="bdb_usenix.html">HTML</a>, <a href="bdb_usenix.ps">Postscript</a>)<dd>Michael Olson, Keith Bostic, and Margo Seltzer, Proceedings of the 1999 +Summer Usenix Technical Conference, Monterey, California, June 1999. This +paper describes recent commercial releases of Berkeley DB, its most important +features, the history of the software, and Sleepycat's Open Source +licensing policies. +<p><dt><i>Challenges in Embedded Database System Administration</i> +(<a href="embedded.html">HTML</a>)<dd>Margo Seltzer and Michael Olson, First Workshop on Embedded Systems, +Cambridge, Massachusetts, March 1999. This paper describes the challenges +that face embedded systems developers, and how Berkeley DB has been designed to +address them. +<p><dt><i>LIBTP: Portable Modular Transactions for UNIX</i> +(<a href="libtp_usenix.ps">Postscript</a>)<dd>Margo Seltzer and Michael Olson, USENIX Conference Proceedings, Winter +1992. This paper describes an early prototype of the transactional system +for Berkeley DB. +<p><dt><i>A New Hashing Package for UNIX</i> +(<a href="hash_usenix.ps">Postscript</a>)<dd>Margo Seltzer and Oz Yigit, USENIX Conference Proceedings, Winter 1991. +This paper describes the Extended Linear Hashing techniques used by Berkeley DB. 
+</dl> +<h3>Background on Berkeley DB Features</h3> +<p>These papers, while not specific to Berkeley DB, give a good overview of how +different Berkeley DB features were implemented. +<p><dl compact> +<p><dt><i>Operating System Support for Database Management</i><dd>Michael Stonebraker, Communications of the ACM 24(7), 1981, pp. 412-418. +<p><dt><i>Dynamic Hash Tables</i><dd>Per-Ake Larson, Communications of the ACM, April 1988. +<p><dt><i>Linear Hashing: A New Tool for File and Table Addressing</i><dd><a href="witold.html">Witold Litwin</a>, Proceedings of the 6th International +Conference on Very Large Databases (VLDB), 1980. +<p><dt><i>The Ubiquitous B-tree</i><dd>Douglas Comer, ACM Comput. Surv. 11, 2 (June 1979), pp. 121-138. +<p><dt><i>Prefix B-trees</i><dd>Bayer and Unterauer, ACM Transactions on Database Systems, Vol. 2, 1 +(March 1977), pp. 11-26. +<p><dt><i>The Art of Computer Programming Vol. 3: Sorting and Searching</i><dd>D.E. Knuth, 1973, pp. 471-480. +<p><dt><i>Document Processing in a Relational Database System</i><dd>Michael Stonebraker, Heidi Stettner, Joseph Kalash, Antonin Guttman, +Nadene Lynn, Memorandum No. UCB/ERL M82/32, May 1982. +</dl> +<h3>Database Systems Theory</h3> +<p>These publications are standard reference works on the design and +implementation of database systems. Berkeley DB uses many of the ideas they +describe. +<p><dl compact> +<p><dt><i>Transaction Processing Concepts and Techniques</i><dd>by Jim Gray and Andreas Reuter, Morgan Kaufmann Publishers. +We recommend chapters 1, 4 (skip 4.6, 4.7, 4.9, 4.10 and 4.11), +7, 9, 10.3, and 10.4. +<p><dt><i>An Introduction to Database Systems, Volume 1</i><dd>by C.J. Date, Addison Wesley Longman Publishers. +In the 5th Edition, we recommend chapters 1, 2, 3, 16 and 17. +<p><dt><i>Concurrency Control and Recovery in Database Systems</i><dd>by Bernstein, Goodman, Hadzilacos. 
Currently out of print, but available +from <a href="http://research.microsoft.com/pubs/ccontrol/">http://research.microsoft.com/pubs/ccontrol/</a>. +</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/distrib/layout.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/refs/witold.html b/db/docs/ref/refs/witold.html new file mode 100644 index 000000000..d81065e66 --- /dev/null +++ b/db/docs/ref/refs/witold.html @@ -0,0 +1,16 @@ +<!--$Id: witold.so,v 10.4 1999/11/19 17:21:03 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB: Witold Litwin</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<h1 align=center>Witold Litwin</h1> +Witold is a hell of a guy to take you on a late-night high-speed car +chase up the mountains of Austria in search of very green wine. 
+<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/rpc/client.html b/db/docs/ref/rpc/client.html new file mode 100644 index 000000000..e8eb90dcf --- /dev/null +++ b/db/docs/ref/rpc/client.html @@ -0,0 +1,75 @@ +<!--$Id: client.so,v 1.6 2000/03/18 21:43:16 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Client program</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>RPC Client/Server</dl></h3></td> +<td width="1%"><a href="../../ref/rpc/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/rpc/server.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Client program</h1> +<p>Changing a Berkeley DB application to remotely call a server program requires +only a few changes on the client side: +<p><ol> +<p><li>The client application must create and use a Berkeley DB environment, +that is, it cannot simply call the <a href="../../api_c/db_create.html">db_create</a> interface, but must +first call the <a href="../../api_c/env_create.html">db_env_create</a> interface to create an environment in +which the database will live. +<p><li>The client application must call <a href="../../api_c/env_create.html">db_env_create</a> using the +<a href="../../api_c/env_create.html#DB_CLIENT">DB_CLIENT</a> flag. 
+<p><li>The client application must call the additional DB_ENV +method <a href="../../api_c/env_set_server.html">DBENV->set_server</a> to specify the database server. This call +must be made before opening the environment with the <a href="../../api_c/env_open.html">DBENV->open</a> +call. +</ol> +<p>The client application provides three pieces of information to Berkeley DB as +part of the <a href="../../api_c/env_set_server.html">DBENV->set_server</a> call: +<p><ol> +<p><li>The hostname of the server. The hostname format is not +specified by Berkeley DB, but must be in a format acceptable to the local +network support, specifically, the RPC clnt_create interface. +<p><li>The client timeout. This is the number of seconds the client +will wait for the server to respond to its requests. A default is used +if this value is zero. +<p><li>The server timeout. This is the number of seconds the server +will allow client resources to remain idle before releasing those +resources. This timeout applies to transactions and cursors: +those objects hold locks, so if a client dies, the server needs to +release them in a timely manner. This value +is only a hint, as the server may choose to override it +with its own value. +</ol> +<p>The only other item of interest to the client is the home directory +that is given to the <a href="../../api_c/env_open.html">DBENV->open</a> call. +The server is started with a list of allowed home directories. +The client must use one of those names (where a name is the last +component of the home directory pathname). This allows the pathname structure +on the server to change without client applications needing to be +aware of it. +<p>Once the <a href="../../api_c/env_set_server.html">DBENV->set_server</a> call has been made, the client is +connected to the server and all subsequent Berkeley DB +operations will be forwarded to the server. 
The client does not need to +be otherwise aware that it is using a database server rather than +accessing the database locally. +<p>It is important to realize that the client portion of the Berkeley DB library +acts as a simple conduit, forwarding Berkeley DB interface arguments to the +server without interpretation. This has two important implications. +First, all pathnames must be specified relative to the server. For +example, the home directory and other configuration information passed by +the application when creating its environment or databases must be +pathnames for the server, not the client system. In addition, as there +is no logical bundling of operations at the server, performance is usually +significantly less than when Berkeley DB is embedded within the client's address +space, even if the RPC is to a local address. +<table><tr><td><br></td><td width="1%"><a href="../../ref/rpc/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/rpc/server.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/rpc/intro.html b/db/docs/ref/rpc/intro.html new file mode 100644 index 000000000..25e4f4aea --- /dev/null +++ b/db/docs/ref/rpc/intro.html @@ -0,0 +1,62 @@ +<!--$Id: intro.so,v 1.6 2000/12/04 21:51:04 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Introduction</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> 
+<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>RPC Client/Server</dl></h3></td> +<td width="1%"><a href="../../ref/txn/other.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/rpc/client.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Introduction</h1> +<p>Berkeley DB includes a basic implementation of a client-server protocol, using +Sun Microsystems' Remote Procedure Call Protocol. RPC support is only +available for UNIX systems, and is not included in the Berkeley DB library by +default, but must be enabled during configuration. See +<a href="../../ref/build_unix/conf.html">Configuring Berkeley DB</a> for more +information. For more information on RPC itself, see your UNIX system +documentation or <i>RPC: Remote Procedure Call Protocol +Specification, RFC1832, Sun Microsystems, Inc., USC-ISI</i>. +<p>Only a subset of the Berkeley DB functionality is available when using RPC. +The following functionality is available: +<p><ol> +<li>The <a href="../../api_c/env_create.html">db_env_create</a> interface and the DB_ENV +handle methods. +<li>The <a href="../../api_c/db_create.html">db_create</a> interface and the DB handle +methods. +<li>The <a href="../../api_c/txn_begin.html">txn_begin</a>, <a href="../../api_c/txn_commit.html">txn_commit</a> and +<a href="../../api_c/txn_abort.html">txn_abort</a> interfaces. +</ol> +<p>The RPC client/server code does not support any of the user-defined +comparison or allocation functions. For example, an application using the RPC +support may not specify its own Btree comparison function. If your +application only requires those portions of Berkeley DB, then using RPC is +fairly simple. If your application requires other Berkeley DB functionality, +such as direct access to locking, logging or shared memory buffer +pools, then your application cannot use the RPC support. 
+<p><b>The Berkeley DB RPC support does not provide any security or authentication of +any kind.</b> Sites needing any kind of data security measures must modify +the client and server code to provide whatever level of security they +require. +<p>One particularly interesting use of the RPC support is for debugging Berkeley DB +applications. The seamless nature of the interface means that with very +minor application code changes, an application can run outside of the +Berkeley DB address space, making it far easier to track down many types of +errors such as memory misuse. +<p>Using the RPC mechanisms in Berkeley DB involves two basic steps: +<p><ol> +<p><li>Modify your Berkeley DB application to act as a client and call the +RPC server. +<li>Run the <a href="../../utility/berkeley_db_svc.html">berkeley_db_svc</a> server program on the system +where the database resides. +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/txn/other.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/rpc/client.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/rpc/server.html b/db/docs/ref/rpc/server.html new file mode 100644 index 000000000..64572a90d --- /dev/null +++ b/db/docs/ref/rpc/server.html @@ -0,0 +1,54 @@ +<!--$Id: server.so,v 1.6 2000/03/18 21:43:16 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Server program</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> 
+</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>RPC Client/Server</dl></h3></td> +<td width="1%"><a href="../../ref/rpc/client.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/java/conf.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Server program</h1> +<p>The Berkeley DB server utility, <a href="../../utility/berkeley_db_svc.html">berkeley_db_svc</a>, handles all of the +client application requests. +<p>Currently, the <a href="../../utility/berkeley_db_svc.html">berkeley_db_svc</a> utility is single-threaded, +limiting the number of requests that it can handle. Modifying the server +implementation to run in multi-thread or multi-process mode will require +modification of the server code automatically generated by the rpcgen +program. +<p>There are two different types of timeouts used by <a href="../../utility/berkeley_db_svc.html">berkeley_db_svc</a>. +The first timeout (which can be modified within some constraints by the +client application) is the resource timeout. When clients use +transactions or cursors, those resources hold locks in Berkeley DB across calls +to the server. If a client application dies or loses its connection to +the server while holding those resources, it prevents any other client +from acquiring them. Therefore, it is important to detect that a client +has not used a resource for some period of time and release it. In the +case of transactions, the server aborts the transaction. In the case of +cursors, the server closes the cursor. +<p>The second timeout is an idle timeout. A client application may remain +idle with an open handle to an environment and a database. Doing so +simply consumes some memory; it does not hold locks. 
However, the Berkeley DB +server may want to eventually reclaim resources if a client dies or +remains disconnected for a long period of time, so there is a separate +idle timeout for open Berkeley DB handles. +<p>The home directories specified to <a href="../../utility/berkeley_db_svc.html">berkeley_db_svc</a> are the +only ones client applications are allowed to use. When +<a href="../../utility/berkeley_db_svc.html">berkeley_db_svc</a> is started, it is given a list of pathnames. +Clients are expected to specify the name of the home directory (defined +as the last component in the directory pathname) as the database +environment they are opening. In this manner, clients need only know the +name of their home environment, and not its full pathname on the server +machine. This means, of course, that only one environment of a particular +name is allowed on the server at any given time. +<table><tr><td><br></td><td width="1%"><a href="../../ref/rpc/client.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/java/conf.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/sendmail/intro.html b/db/docs/ref/sendmail/intro.html new file mode 100644 index 000000000..9dc1b4a14 --- /dev/null +++ b/db/docs/ref/sendmail/intro.html @@ -0,0 +1,51 @@ +<!--$Id: intro.so,v 10.20 2001/01/09 18:48:06 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Using Berkeley DB with Sendmail</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Sendmail</dl></h3></td> +<td width="1%"><a href="../../ref/tcl/faq.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/dumpload/utility.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Using Berkeley DB with Sendmail</h1> +<p>If you are attempting to use Berkeley DB with Sendmail 8.8.X, you must use +Berkeley DB version 1.85 (see the Sleepycat Software web site's +<a href="http://www.sleepycat.com/historic.html">historic releases</a> +of Berkeley DB page for more information). +<p>Berkeley DB versions 2.0 and later are only supported by Sendmail versions 8.9.X +and later. +<p>Berkeley DB versions 3.0 and later are only supported by Sendmail versions +8.10.X and later. +<p>We strongly recommend that you not use Berkeley DB version 1.85. It is no longer +maintained or supported and has known bugs that can cause Sendmail to +fail. Instead, please upgrade to Sendmail version 8.9.X or later and use +a later version of Berkeley DB. For more information on using Berkeley DB with +Sendmail, please review the README and src/README files in the Sendmail +distribution. +<p>To load Sendmail against Berkeley DB, add the following lines to +BuildTools/Site/site.config.m4: +<p><blockquote><pre>APPENDDEF(`confINCDIRS', `-I/usr/local/BerkeleyDB/include') +APPENDDEF(`confLIBDIRS', `-L/usr/local/BerkeleyDB/lib')</pre></blockquote> +<p>where those are the paths to the db.h include file and to libdb.a, respectively. +Then, run "Build -c" from the src directory. 
+<p>Note that this Build script will use -DNEWDB on the compiles +and -L/path/to/libdb/directory -ldb on the link if it can find libdb.a; +the search path is $LIBDIRS:/lib:/usr/lib:/usr/shlib. $LIBDIRS is +NULL by default for most systems, but some set it in BuildTools/OS/foo. +Anyone can append to it as above (confLIBDIRS is the m4 variable name; +LIBDIRS is the shell-script variable name). +<p>To download Sendmail, or to obtain more information on Sendmail, see the +<a href="http://www.sendmail.org">Sendmail home page</a>, which includes +FAQ pages and problem addresses. +<table><tr><td><br></td><td width="1%"><a href="../../ref/tcl/faq.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/dumpload/utility.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/simple_tut/close.html b/db/docs/ref/simple_tut/close.html new file mode 100644 index 000000000..a268a591c --- /dev/null +++ b/db/docs/ref/simple_tut/close.html @@ -0,0 +1,102 @@ +<!--$Id: close.so,v 10.22 2000/12/18 21:05:15 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Closing a database</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Simple Tutorial</dl></h3></td> +<td width="1%"><a href="../../ref/simple_tut/del.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img 
src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Closing a database</h1> +<p>The only other operation that we need for our simple example is closing +the database, and cleaning up the DB handle. +<p>It is necessary that the database be closed. The most important reason +for this is that Berkeley DB runs on top of an underlying buffer cache. If +the modified database pages are never explicitly flushed to disk and +the database is never closed, changes made to the database may never +make it out to disk, because they are held in the Berkeley DB cache. As the +default behavior of the close function is to flush the Berkeley DB cache, +closing the database will update the on-disk information. +<p>The <a href="../../api_c/db_close.html">DB->close</a> interface takes two arguments: +<p><dl compact> +<p><dt>db<dd>The database handle returned by <a href="../../api_c/db_create.html">db_create</a>. +<p><dt>flags<dd>Optional flags modifying the underlying behavior of the <a href="../../api_c/db_close.html">DB->close</a> +interface. 
+</dl> +<p>Here's what the code to call <a href="../../api_c/db_close.html">DB->close</a> looks like: +<p><blockquote><pre>#include <sys/types.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <db.h> +<p> +#define DATABASE "access.db" +<p> +int +main() +{ + DB *dbp; + DBT key, data; + <b>int ret, t_ret;</b> +<p> + if ((ret = db_create(&dbp, NULL, 0)) != 0) { + fprintf(stderr, "db_create: %s\n", db_strerror(ret)); + exit (1); + } + if ((ret = dbp->open( + dbp, DATABASE, NULL, DB_BTREE, DB_CREATE, 0664)) != 0) { + dbp->err(dbp, ret, "%s", DATABASE); + goto err; + } +<p> + memset(&key, 0, sizeof(key)); + memset(&data, 0, sizeof(data)); + key.data = "fruit"; + key.size = sizeof("fruit"); + data.data = "apple"; + data.size = sizeof("apple"); +<p> + if ((ret = dbp->put(dbp, NULL, &key, &data, 0)) == 0) + printf("db: %s: key stored.\n", (char *)key.data); + else { + dbp->err(dbp, ret, "DB->put"); + goto err; + } +<p> + if ((ret = dbp->get(dbp, NULL, &key, &data, 0)) == 0) + printf("db: %s: key retrieved: data was %s.\n", + (char *)key.data, (char *)data.data); + else { + dbp->err(dbp, ret, "DB->get"); + goto err; + } +<p> + if ((ret = dbp->del(dbp, NULL, &key, 0)) == 0) + printf("db: %s: key was deleted.\n", (char *)key.data); + else { + dbp->err(dbp, ret, "DB->del"); + goto err; + } +<p> + if ((ret = dbp->get(dbp, NULL, &key, &data, 0)) == 0) + printf("db: %s: key retrieved: data was %s.\n", + (char *)key.data, (char *)data.data); + else + dbp->err(dbp, ret, "DB->get"); +<p><b>err: if ((t_ret = dbp->close(dbp, 0)) != 0 && ret == 0) + ret = t_ret; </b> +<p> + exit(ret); +} +</pre></blockquote> +<p>Note that we do not necessarily overwrite the <b>ret</b> variable, as it +may contain error return information from a previous Berkeley DB call. 
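The "keep the first error" idiom used at the err: label can be looked at on its own. The following is a minimal sketch that does not use Berkeley DB at all: fake_close and finish are hypothetical names invented for this illustration, and fake_close merely stands in for a cleanup call such as DB->close.

```c
/*
 * Hypothetical stand-in for a cleanup call such as DB->close: it simply
 * returns the result it is told to simulate (0 for success, non-zero for
 * failure).  It is not part of any real library.
 */
static int
fake_close(int simulated_result)
{
	return (simulated_result);
}

/*
 * Combine the result of earlier work (ret) with the result of cleanup.
 * The cleanup error is reported only if no earlier error was recorded,
 * so the first failure is the one the caller sees.
 */
int
finish(int ret, int simulated_close_result)
{
	int t_ret;

	if ((t_ret = fake_close(simulated_close_result)) != 0 && ret == 0)
		ret = t_ret;
	return (ret);
}
```

With this sketch, finish(0, 0) reports success, finish(5, 0) and finish(5, 7) both report the earlier error 5, and finish(0, 7) reports the cleanup error 7.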
+<table><tr><td><br></td><td width="1%"><a href="../../ref/simple_tut/del.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/simple_tut/del.html b/db/docs/ref/simple_tut/del.html new file mode 100644 index 000000000..ac4d41260 --- /dev/null +++ b/db/docs/ref/simple_tut/del.html @@ -0,0 +1,93 @@ +<!--$Id: del.so,v 10.20 2000/03/18 21:43:17 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Removing elements from a database</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Simple Tutorial</dl></h3></td> +<td width="1%"><a href="../../ref/simple_tut/get.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/close.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Removing elements from a database</h1> +<p>The simplest way to remove elements from a database is the <a href="../../api_c/db_del.html">DB->del</a> +interface. +<p>The <a href="../../api_c/db_del.html">DB->del</a> interface takes four of the same five arguments that +the <a href="../../api_c/db_get.html">DB->get</a> and <a href="../../api_c/db_put.html">DB->put</a> interfaces take. 
The difference +is that there is no need to specify a data item, as the delete operation +is only interested in the key that you want to remove. +<p><dl compact> +<p><dt>db<dd>The database handle returned by <a href="../../api_c/db_create.html">db_create</a>. +<p><dt>txnid<dd>A transaction ID. +In our simple case, we aren't expecting to recover the database after +an application or system crash, so we aren't using transactions, and will +leave this argument NULL. +<p><dt>key<dd>The key item for the key/data pair that we want to delete from the +database. +<p><dt>flags<dd>Optional flags modifying the underlying behavior of the <a href="../../api_c/db_del.html">DB->del</a> +interface. There are currently no available flags for this interface, +so the flags argument should always be set to 0. +</dl> +<p>Here's what the code to call <a href="../../api_c/db_del.html">DB->del</a> looks like: +<p><blockquote><pre>#include <sys/types.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <db.h> +<p> +#define DATABASE "access.db" +<p> +int +main() +{ + DB *dbp; + DBT key, data; + int ret; +<p> + if ((ret = db_create(&dbp, NULL, 0)) != 0) { + fprintf(stderr, "db_create: %s\n", db_strerror(ret)); + exit (1); + } + if ((ret = dbp->open( + dbp, DATABASE, NULL, DB_BTREE, DB_CREATE, 0664)) != 0) { + dbp->err(dbp, ret, "%s", DATABASE); + goto err; + } +<p> + memset(&key, 0, sizeof(key)); + memset(&data, 0, sizeof(data)); + key.data = "fruit"; + key.size = sizeof("fruit"); + data.data = "apple"; + data.size = sizeof("apple"); +<p> + if ((ret = dbp->put(dbp, NULL, &key, &data, 0)) == 0) + printf("db: %s: key stored.\n", (char *)key.data); + else { + dbp->err(dbp, ret, "DB->put"); + goto err; + } +<p> + if ((ret = dbp->get(dbp, NULL, &key, &data, 0)) == 0) + printf("db: %s: key retrieved: data was %s.\n", + (char *)key.data, (char *)data.data); + else { + dbp->err(dbp, ret, "DB->get"); + goto err; + } +<p><b> if ((ret = dbp->del(dbp, NULL, &key, 0)) == 0) + printf("db: %s: key was deleted.\n", (char 
*)key.data); + else { + dbp->err(dbp, ret, "DB->del"); + goto err; + } +</b></pre></blockquote> +<p>After the <a href="../../api_c/db_del.html">DB->del</a> call returns, the entry referenced by the key +fruit has been removed from the database. +<table><tr><td><br></td><td width="1%"><a href="../../ref/simple_tut/get.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/close.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/simple_tut/errors.html b/db/docs/ref/simple_tut/errors.html new file mode 100644 index 000000000..bb7e8a671 --- /dev/null +++ b/db/docs/ref/simple_tut/errors.html @@ -0,0 +1,46 @@ +<!--$Id: errors.so,v 10.19 2000/12/14 21:42:18 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Error returns</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Simple Tutorial</dl></h3></td> +<td width="1%"><a href="../../ref/simple_tut/handles.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/open.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Error returns</h1> +<p>The Berkeley DB interfaces always return a value of 0 on success. 
If the +operation does not succeed for any reason, the return value will be +non-zero. +<p>If a system error occurred (e.g., Berkeley DB ran out of disk space, or +permission to access a file was denied, or an illegal argument was +specified to one of the interfaces), Berkeley DB returns an <b>errno</b> +value. All of the possible values of <b>errno</b> are greater than +0. +<p>If the operation didn't fail due to a system error, but wasn't +successful either, Berkeley DB returns a special error value. For example, +if you tried to retrieve the data item associated with the key +<b>fruit</b>, and there was no such key/data pair in the database, +Berkeley DB would return <a href="../../ref/program/errorret.html#DB_NOTFOUND">DB_NOTFOUND</a>, a special error value that means +the requested key does not appear in the database. All of the possible +special error values are less than 0. +<p>Berkeley DB also offers programmatic support for displaying error return values. +First, the <a href="../../api_c/env_strerror.html">db_strerror</a> interface returns a pointer to the error +message corresponding to any Berkeley DB error return, similar to the ANSI C +strerror interface, but is able to handle both system error returns and +Berkeley DB-specific return values. +<p>Second, there are two error functions, <a href="../../api_c/db_err.html">DB->err</a> and <a href="../../api_c/db_err.html">DB->errx</a>. +These functions work like the ANSI C printf interface, taking a +printf-style format string and argument list, and optionally appending +the standard error string to a message constructed from the format string +and other arguments. 
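The three classes of return value described above (0, a positive errno value, a negative special value) can be told apart with simple comparisons. The following is a minimal sketch under stated assumptions: EXAMPLE_NOTFOUND, classify_return and example_strerror are hypothetical names invented here, and the value -1 is a placeholder, not the real value of DB_NOTFOUND. A real application would call db_strerror instead.

```c
#include <string.h>

/* Hypothetical placeholder; NOT the real value of DB_NOTFOUND. */
#define	EXAMPLE_NOTFOUND	(-1)

/* Sort a return value into the three classes the text describes. */
const char *
classify_return(int ret)
{
	if (ret == 0)
		return ("success");
	if (ret > 0)
		return ("system error");	/* errno values are > 0 */
	return ("special error");		/* special values are < 0 */
}

/*
 * A strerror-like helper that handles both kinds of error, in the spirit
 * of db_strerror: positive values go to the C library's strerror, and
 * negative values are looked up in the caller's own table.
 */
const char *
example_strerror(int ret)
{
	if (ret == 0)
		return ("Successful return: 0");
	if (ret > 0)
		return (strerror(ret));
	if (ret == EXAMPLE_NOTFOUND)
		return ("No matching key/data pair found");
	return ("Unknown special error");
}
```

The sign test works because, as stated above, errno values are all greater than 0 and special error values are all less than 0, so the two ranges never overlap.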
+<table><tr><td><br></td><td width="1%"><a href="../../ref/simple_tut/handles.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/open.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/simple_tut/example.txt b/db/docs/ref/simple_tut/example.txt new file mode 100644 index 000000000..e610648d1 --- /dev/null +++ b/db/docs/ref/simple_tut/example.txt @@ -0,0 +1,73 @@ +#include <sys/types.h> + +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +#include "db.h" + +#define DATABASE "access.db" + +int +main() +{ + DB *dbp; + DBT key, data; + int ret, t_ret; + + /* Create the database handle and open the underlying database. */ + if ((ret = db_create(&dbp, NULL, 0)) != 0) { + fprintf(stderr, "db_create: %s\n", db_strerror(ret)); + exit (1); + } + if ((ret = + dbp->open(dbp, DATABASE, NULL, DB_BTREE, DB_CREATE, 0664)) != 0) { + dbp->err(dbp, ret, "%s", DATABASE); + goto err; + } + + /* Initialize key/data structures. */ + memset(&key, 0, sizeof(key)); + memset(&data, 0, sizeof(data)); + key.data = "fruit"; + key.size = sizeof("fruit"); + data.data = "apple"; + data.size = sizeof("apple"); + + /* Store a key/data pair. */ + if ((ret = dbp->put(dbp, NULL, &key, &data, 0)) == 0) + printf("db: %s: key stored.\n", (char *)key.data); + else { + dbp->err(dbp, ret, "DB->put"); + goto err; + } + + /* Retrieve a key/data pair. */ + if ((ret = dbp->get(dbp, NULL, &key, &data, 0)) == 0) + printf("db: %s: key retrieved: data was %s.\n", + (char *)key.data, (char *)data.data); + else { + dbp->err(dbp, ret, "DB->get"); + goto err; + } + + /* Delete a key/data pair. 
*/ + if ((ret = dbp->del(dbp, NULL, &key, 0)) == 0) + printf("db: %s: key was deleted.\n", (char *)key.data); + else { + dbp->err(dbp, ret, "DB->del"); + goto err; + } + + /* Retrieve a key/data pair. */ + if ((ret = dbp->get(dbp, NULL, &key, &data, 0)) == 0) + printf("db: %s: key retrieved: data was %s.\n", + (char *)key.data, (char *)data.data); + else + dbp->err(dbp, ret, "DB->get"); + +err: if ((t_ret = dbp->close(dbp, 0)) != 0 && ret == 0) + ret = t_ret; + + exit(ret); +} diff --git a/db/docs/ref/simple_tut/get.html b/db/docs/ref/simple_tut/get.html new file mode 100644 index 000000000..697aa8f51 --- /dev/null +++ b/db/docs/ref/simple_tut/get.html @@ -0,0 +1,97 @@ +<!--$Id: get.so,v 10.23 2000/12/14 21:42:18 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Retrieving elements from a database</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Simple Tutorial</dl></h3></td> +<td width="1%"><a href="../../ref/simple_tut/put.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/del.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Retrieving elements from a database</h1> +<p>The simplest way to retrieve elements from a database is the +<a href="../../api_c/db_get.html">DB->get</a> interface. 
+<p>The <a href="../../api_c/db_get.html">DB->get</a> interface takes the same five arguments that the +<a href="../../api_c/db_put.html">DB->put</a> interface takes: +<p><dl compact> +<p><dt>db<dd>The database handle returned by <a href="../../api_c/db_create.html">db_create</a>. +<p><dt>txnid<dd>A transaction ID. In our simple case, we aren't expecting to recover +the database after an application or system crash, so we aren't using +transactions, and will leave this argument NULL. +<p><dt>key<dd>The key item for the key/data pair that we want to retrieve from the +database. +<p><dt>data<dd>The data item for the key/data pair that we want to retrieve from the +database. +<p><dt>flags<dd>Optional flags modifying the underlying behavior of the <a href="../../api_c/db_get.html">DB->get</a> +interface. +</dl> +<p>Here's what the code to call <a href="../../api_c/db_get.html">DB->get</a> looks like: +<p><blockquote><pre>#include <sys/types.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <db.h> +<p> +#define DATABASE "access.db" +<p> +int +main() +{ + DB *dbp; + DBT key, data; + int ret; +<p> + if ((ret = db_create(&dbp, NULL, 0)) != 0) { + fprintf(stderr, "db_create: %s\n", db_strerror(ret)); + exit (1); + } + if ((ret = dbp->open( + dbp, DATABASE, NULL, DB_BTREE, DB_CREATE, 0664)) != 0) { + dbp->err(dbp, ret, "%s", DATABASE); + goto err; + } +<p> + memset(&key, 0, sizeof(key)); + memset(&data, 0, sizeof(data)); + key.data = "fruit"; + key.size = sizeof("fruit"); + data.data = "apple"; + data.size = sizeof("apple"); +<p> + if ((ret = dbp->put(dbp, NULL, &key, &data, 0)) == 0) + printf("db: %s: key stored.\n", (char *)key.data); + else { + dbp->err(dbp, ret, "DB->put"); + goto err; + } +<p><b> if ((ret = dbp->get(dbp, NULL, &key, &data, 0)) == 0) + printf("db: %s: key retrieved: data was %s.\n", + (char *)key.data, (char *)data.data); + else { + dbp->err(dbp, ret, "DB->get"); + goto err; + } +</b></pre></blockquote> +<p>It is not usually necessary to clear the <a 
href="../../api_c/dbt.html">DBT</a> structures passed +to the Berkeley DB functions between calls. The exception is when +some of the less commonly used flags for <a href="../../api_c/dbt.html">DBT</a> structures are +used; the <a href="../../api_c/dbt.html">DBT</a> manual page specifies the details of those cases. +<p>It is possible, of course, to distinguish between system errors and the +key/data pair simply not existing in the database. There are three +standard returns from <a href="../../api_c/db_get.html">DB->get</a>: +<p><ol> +<p><li>The call might be successful and the key found, in which case the return +value will be 0. +<li>The call might be successful, but the key not found, in which case the +return value will be <a href="../../ref/program/errorret.html#DB_NOTFOUND">DB_NOTFOUND</a>. +<li>The call might not be successful, in which case the return value will +be a system error. +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/simple_tut/put.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/del.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/simple_tut/handles.html b/db/docs/ref/simple_tut/handles.html new file mode 100644 index 000000000..2396a224e --- /dev/null +++ b/db/docs/ref/simple_tut/handles.html @@ -0,0 +1,29 @@ +<!--$Id: handles.so,v 10.8 2000/03/18 21:43:17 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Handles</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Simple Tutorial</dl></h3></td> +<td width="1%"><a href="../../ref/simple_tut/keydata.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/errors.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Handles</h1> +<p>With a few minor exceptions, Berkeley DB functionality is accessed by creating +a structure and then calling functions that are fields in that structure. +This is, of course, similar to object-oriented concepts, of instances and +methods on them. For simplicity, we will often refer to these structure +fields as methods of the handle. +<p>The manual pages will show these methods as C structure references. For +example, the open-a-database method for a database handle is represented +as <a href="../../api_c/db_open.html">DB->open</a>. 
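The "structure whose fields are functions" idea can be sketched without Berkeley DB. In the following illustration, EXAMPLE_COUNTER and example_counter_create are hypothetical names invented here; they only mirror the shape of a create function that fills in a handle, followed by calls made through the handle's fields.

```c
#include <errno.h>
#include <stdlib.h>

/* A hypothetical handle: data plus function-pointer fields ("methods"). */
typedef struct example_counter EXAMPLE_COUNTER;
struct example_counter {
	int total;
	int (*add)(EXAMPLE_COUNTER *, int);
	int (*get)(EXAMPLE_COUNTER *, int *);
	int (*close)(EXAMPLE_COUNTER *);
};

static int
counter_add(EXAMPLE_COUNTER *cp, int n)
{
	cp->total += n;
	return (0);
}

static int
counter_get(EXAMPLE_COUNTER *cp, int *valuep)
{
	*valuep = cp->total;
	return (0);
}

static int
counter_close(EXAMPLE_COUNTER *cp)
{
	free(cp);		/* The handle may not be used after this. */
	return (0);
}

/* Allocate a handle and fill in its methods; returns 0 or an errno value. */
int
example_counter_create(EXAMPLE_COUNTER **cpp)
{
	EXAMPLE_COUNTER *cp;

	if ((cp = calloc(1, sizeof(*cp))) == NULL)
		return (ENOMEM);
	cp->add = counter_add;
	cp->get = counter_get;
	cp->close = counter_close;
	*cpp = cp;
	return (0);
}
```

A caller would write cp->add(cp, 3) in the same way the tutorial writes dbp->open(dbp, ...): the handle appears twice, once to select the function and once as its first argument.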
+<table><tr><td><br></td><td width="1%"><a href="../../ref/simple_tut/keydata.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/errors.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/simple_tut/intro.html b/db/docs/ref/simple_tut/intro.html new file mode 100644 index 000000000..a9b6f648c --- /dev/null +++ b/db/docs/ref/simple_tut/intro.html @@ -0,0 +1,40 @@ +<!--$Id: intro.so,v 10.20 2000/12/04 18:05:44 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Introduction</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Simple Tutorial</dl></h3></td> +<td width="1%"><a href="../../ref/intro/products.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/keydata.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Introduction</h1> +<p>As an introduction to Berkeley DB, we will present a few Berkeley DB programming +concepts, and then a simple database application. 
+<p>The programming concepts are: +<ul type=disc> +<li><a href="keydata.html">Key/data pairs</a> +<li><a href="handles.html">Object handles</a> +<li><a href="errors.html">Error returns</a> +</ul> +<p>This database application will: +<ul type=disc> +<li><a href="open.html">Create a simple database</a> +<li><a href="put.html">Store items</a> +<li><a href="get.html">Retrieve items</a> +<li><a href="del.html">Remove items</a> +<li><a href="close.html">Close the database</a> +</ul> +<p>The introduction will be presented using the programming language C. The +<a href="example.txt">complete source</a> of the final version of the +example program is included in the Berkeley DB distribution. +<table><tr><td><br></td><td width="1%"><a href="../../ref/intro/products.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/keydata.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/simple_tut/keydata.html b/db/docs/ref/simple_tut/keydata.html new file mode 100644 index 000000000..38d34aebc --- /dev/null +++ b/db/docs/ref/simple_tut/keydata.html @@ -0,0 +1,48 @@ +<!--$Id: keydata.so,v 10.19 2000/12/14 21:42:18 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Key/data pairs</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Simple Tutorial</dl></h3></td> +<td width="1%"><a 
href="../../ref/simple_tut/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/handles.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Key/data pairs</h1> +<p>Berkeley DB uses key/data pairs to identify elements in the database. +That is, in the general case, whenever you call a Berkeley DB interface, +you present a key to identify the key/data pair on which you intend +to operate. +<p>For example, you might store some key/data pairs as follows: +<p><table border=1> +<tr><th>Key:</th><th>Data:</th></tr> +<tr><td>fruit</td><td>apple</td></tr> +<tr><td>sport</td><td>cricket</td></tr> +<tr><td>drink</td><td>water</td></tr> +</table> +<p>In each case, the first element of the pair is the key, and the second is +the data. To store the first of these key/data pairs into the database, +you would call the Berkeley DB interface to store items, with <b>fruit</b> as +the key, and <b>apple</b> as the data. At some future time, you could +then retrieve the data item associated with <b>fruit</b>, and the Berkeley DB +retrieval interface would return <b>apple</b> to you. While there are +many variations and some subtleties, all accesses to data in Berkeley DB come +down to key/data pairs. +<p>Both key and data items are stored in simple structures (called +<a href="../../api_c/dbt.html">DBT</a>s) that contain a reference to memory and a length, counted +in bytes. (The name <a href="../../api_c/dbt.html">DBT</a> is an acronym for <i>database +thang</i>, chosen because nobody could think of a sensible name that wasn't +already in use somewhere else.) Key and data items can be arbitrary +binary data of practically any length, including 0 bytes. There is a +single data item for each key item, by default, but databases can be +configured to support multiple data items for each key item. 
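A "reference to memory and a length, counted in bytes" can be sketched as a small structure. EXAMPLE_DBT below is a hypothetical stand-in invented for this illustration, not the real DBT (which has additional fields); only the data and size members are taken from the tutorial code, and set_string and same_item are likewise hypothetical helpers.

```c
#include <string.h>

/* Hypothetical stand-in for a DBT: a pointer to memory and a byte count. */
typedef struct {
	void *data;
	unsigned int size;
} EXAMPLE_DBT;

/*
 * Point an item at a C string, counting the trailing nul byte.  This is
 * the same length that sizeof("fruit") yields in the tutorial code.
 */
void
set_string(EXAMPLE_DBT *t, char *s)
{
	memset(t, 0, sizeof(*t));
	t->data = s;
	t->size = (unsigned int)strlen(s) + 1;
}

/*
 * Items are arbitrary bytes, so they are compared by length and content
 * rather than as strings; two zero-length items compare equal.
 */
int
same_item(const EXAMPLE_DBT *a, const EXAMPLE_DBT *b)
{
	if (a->size != b->size)
		return (0);
	return (a->size == 0 || memcmp(a->data, b->data, a->size) == 0);
}
```

Because the length is stored explicitly, nothing requires an item to be a nul-terminated string. Storing the nul byte, as the tutorial does, is a convenience that lets the retrieved data be printed with %s.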
+<table><tr><td><br></td><td width="1%"><a href="../../ref/simple_tut/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/handles.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/simple_tut/open.html b/db/docs/ref/simple_tut/open.html new file mode 100644 index 000000000..24df8a8e1 --- /dev/null +++ b/db/docs/ref/simple_tut/open.html @@ -0,0 +1,90 @@ +<!--$Id: open.so,v 10.27 2000/12/14 21:42:18 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Opening a database</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Simple Tutorial</dl></h3></td> +<td width="1%"><a href="../../ref/simple_tut/errors.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/put.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Opening a database</h1> +<p>Opening a database is done in two steps: first, a DB handle is +created using the Berkeley DB <a href="../../api_c/db_create.html">db_create</a> interface, and then the +actual database is opened using the <a href="../../api_c/db_open.html">DB->open</a> function. 
+<p>The <a href="../../api_c/db_create.html">db_create</a> interface takes three arguments: +<p><dl compact> +<p><dt>dbp<dd>A location to store a reference to the created structure. +<p><dt>environment<dd>A location to specify an enclosing Berkeley DB environment, not used in our +example. +<p><dt>flags<dd>A placeholder for flags, not used in our example. +</dl> +<p>The <a href="../../api_c/db_open.html">DB->open</a> interface takes five arguments in addition to the database handle: +<p><dl compact> +<p><dt>file<dd>The name of the database file to be opened. +<p><dt>database<dd>The optional database name, not used in this example. +<p><dt>type<dd>The type of database to open. This value will be one of the four access +methods Berkeley DB supports: DB_BTREE, DB_HASH, DB_QUEUE or DB_RECNO, or the +special value DB_UNKNOWN, which allows you to open an existing file +without knowing its type. +<p><dt>flags<dd>Various flags that modify the behavior of <a href="../../api_c/db_open.html">DB->open</a>. In our +simple case, the only interesting flag is <a href="../../api_c/env_open.html#DB_CREATE">DB_CREATE</a>. This flag +behaves similarly to the IEEE/ANSI Std 1003.1 (POSIX) O_CREAT flag to the open system +call, causing Berkeley DB to create the underlying database if it does not +yet exist. +<p><dt>mode<dd>The file mode of any underlying files that <a href="../../api_c/db_open.html">DB->open</a> will create. +The mode behaves as does the IEEE/ANSI Std 1003.1 (POSIX) mode argument to the open +system call, and specifies file read, write and execute permissions. +Of course, only the read and write permissions are relevant to Berkeley DB. 
+</dl> +<p>Here's what the code to create the handle and then call <a href="../../api_c/db_open.html">DB->open</a> +looks like: +<p><blockquote><pre><b>#include <sys/types.h> +#include <stdio.h> +#include <db.h> +<p> +#define DATABASE "access.db" +<p> +int +main() +{ + DB *dbp; + int ret; +<p> + if ((ret = db_create(&dbp, NULL, 0)) != 0) { + fprintf(stderr, "db_create: %s\n", db_strerror(ret)); + exit (1); + } + if ((ret = dbp->open( + dbp, DATABASE, NULL, DB_BTREE, DB_CREATE, 0664)) != 0) { + dbp->err(dbp, ret, "%s", DATABASE); + goto err; + }</b> +</pre></blockquote> +<p>If the call to <a href="../../api_c/db_create.html">db_create</a> is successful, the variable <b>dbp</b> +will contain a database handle that will be used to configure and access +an underlying database. +<p>As you see, the program opens a database named <b>access.db</b>. The +underlying database is a Btree. Because the <a href="../../api_c/env_open.html#DB_CREATE">DB_CREATE</a> flag was +specified, the file will be created if it does not already exist. The +mode of any created files will be 0664 (i.e., readable and writeable by +the owner and the group, and readable by everyone else). +<p>One additional function call is used in this code sample, <a href="../../api_c/db_err.html">DB->err</a>. +This method works like the ANSI C printf interface. The second argument +is the error return from a Berkeley DB function, and the rest of the arguments +are a printf-style format string and argument list. The error message +associated with the error return will be appended to a message constructed +from the format string and other arguments. 
In the above code, if the +<a href="../../api_c/db_open.html">DB->open</a> call were to fail, the message it would display would be +something like +<p><blockquote><pre>access.db: Operation not permitted</pre></blockquote> +<table><tr><td><br></td><td width="1%"><a href="../../ref/simple_tut/errors.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/put.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/simple_tut/put.html b/db/docs/ref/simple_tut/put.html new file mode 100644 index 000000000..8ecdfa6ca --- /dev/null +++ b/db/docs/ref/simple_tut/put.html @@ -0,0 +1,127 @@ +<!--$Id: put.so,v 10.31 2000/12/18 21:05:15 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Adding elements to a database</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Simple Tutorial</dl></h3></td> +<td width="1%"><a href="../../ref/simple_tut/open.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/get.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Adding elements to a database</h1> +<p>The simplest way to add elements to a database is the <a href="../../api_c/db_put.html">DB->put</a> +interface. 
+<p>The <a href="../../api_c/db_put.html">DB->put</a> interface takes five arguments: +<p><dl compact> +<p><dt>db<dd>The database handle returned by <a href="../../api_c/db_create.html">db_create</a>. +<p><dt>txnid<dd>A transaction handle. In our simple case, we aren't expecting to +recover the database after application or system crash, so we aren't +using transactions, and will leave this argument NULL. +<p><dt>key<dd>The key item for the key/data pair that we want to add to the database. +<p><dt>data<dd>The data item for the key/data pair that we want to add to the database. +<p><dt>flags<dd>Optional flags modifying the underlying behavior of the <a href="../../api_c/db_put.html">DB->put</a> +interface. +</dl> +<p>Here's what the code to call <a href="../../api_c/db_put.html">DB->put</a> looks like: +<p><blockquote><pre>#include <sys/types.h> +#include <stdio.h> +#include <db.h> +<p> +#define DATABASE "access.db" +<p> +int +main() +{ + DB *dbp; + <b>DBT key, data;</b> + int ret; +<p> + if ((ret = db_create(&dbp, NULL, 0)) != 0) { + fprintf(stderr, "db_create: %s\n", db_strerror(ret)); + exit (1); + } + if ((ret = dbp->open( + dbp, DATABASE, NULL, DB_BTREE, DB_CREATE, 0664)) != 0) { + dbp->err(dbp, ret, "%s", DATABASE); + goto err; + } +<p><b> memset(&key, 0, sizeof(key)); + memset(&data, 0, sizeof(data)); + key.data = "fruit"; + key.size = sizeof("fruit"); + data.data = "apple"; + data.size = sizeof("apple"); +<p> + if ((ret = dbp->put(dbp, NULL, &key, &data, 0)) == 0) + printf("db: %s: key stored.\n", (char *)key.data); + else { + dbp->err(dbp, ret, "DB->put"); + goto err; + } +</b></pre></blockquote> +<p>The first thing to notice about this new code is that we clear the +<a href="../../api_c/dbt.html">DBT</a> structures that we're about to pass as arguments to Berkeley DB +functions. This is very important, and being careful to do so will +result in fewer errors in your programs. 
All Berkeley DB structures +instantiated in the application and handed to Berkeley DB should be cleared +before use, without exception. This is necessary so that future +versions of Berkeley DB may add additional fields to the structures. If +applications clear the structures before use, it will be possible for +Berkeley DB to change those structures without requiring that the applications +be rewritten to be aware of the changes. +<p>Notice also that we're storing the trailing nul byte found in the C +strings <b>"fruit"</b> and <b>"apple"</b> in both the key and data +items, that is, the trailing nul byte is part of the stored key, and +therefore has to be specified in order to access the data item. There is +no requirement to store the trailing nul byte, it simply makes it easier +for us to display strings that we've stored in programming languages that +use nul bytes to terminate strings. +<p>In many applications, it is important not to overwrite existing +data. For example, we might not want to store the key/data pair +<b>fruit/apple</b> if it already existed, e.g., if someone had +previously stored the key/data pair <b>fruit/cherry</b> into the +database. +<p>This is easily accomplished by adding the <a href="../../api_c/db_put.html#DB_NOOVERWRITE">DB_NOOVERWRITE</a> flag to +the <a href="../../api_c/db_put.html">DB->put</a> call: +<p><blockquote><pre><b>if ((ret = + dbp->put(dbp, NULL, &key, &data, DB_NOOVERWRITE)) == 0) + printf("db: %s: key stored.\n", (char *)key.data); +else { + dbp->err(dbp, ret, "DB->put"); + goto err; +}</b></pre></blockquote> +<p>This flag causes the underlying database functions to not overwrite any +previously existing key/data pair. (Note that the value of the previously +existing data doesn't matter in this case. The only question is if a +key/data pair already exists where the key matches the key that we are +trying to store.) 
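<p>The two points above (the trailing nul byte is part of the key, and a no-overwrite store looks only at the key, never at the existing data) can be illustrated with a small stand-alone sketch. Everything in it is hypothetical: <b>toy_store</b>, <b>toy_item</b>, <b>toy_put</b> and <b>TOY_KEYEXIST</b> are invented names for a toy in-memory table, not part of the Berkeley DB API, and the real library may behave differently in details this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy table for illustration only; NOT the Berkeley DB API. */
#define TOY_MAX_PAIRS 8
#define TOY_KEYEXIST (-1)	/* plays the role of DB_KEYEXIST */
#define TOY_FULL (-2)

struct toy_item {
	const void *data;	/* bytes are borrowed, not copied */
	size_t size;		/* length in bytes, as with DBT.size */
};

struct toy_store {
	struct toy_item keys[TOY_MAX_PAIRS];
	struct toy_item vals[TOY_MAX_PAIRS];
	size_t count;
};

/* Keys match only if they have the same length and the same bytes. */
int
toy_key_equal(const struct toy_item *a, const struct toy_item *b)
{
	return (a->size == b->size && memcmp(a->data, b->data, a->size) == 0);
}

/* Store a pair; with no_overwrite set, an existing key is left alone. */
int
toy_put(struct toy_store *s, const struct toy_item *key,
    const struct toy_item *val, int no_overwrite)
{
	size_t i;

	for (i = 0; i < s->count; i++)
		if (toy_key_equal(&s->keys[i], key)) {
			if (no_overwrite)
				return (TOY_KEYEXIST);	/* data never examined */
			s->vals[i] = *val;
			return (0);
		}
	if (s->count == TOY_MAX_PAIRS)
		return (TOY_FULL);
	s->keys[s->count] = *key;
	s->vals[s->count] = *val;
	s->count++;
	return (0);
}

void
toy_demo(void)
{
	struct toy_store s;
	struct toy_item key, bare, apple, cherry;

	memset(&s, 0, sizeof(s));	/* clear before use */
	key.data = "fruit";
	key.size = sizeof("fruit");	/* 6 bytes: includes the nul */
	cherry.data = "cherry";
	cherry.size = sizeof("cherry");
	apple.data = "apple";
	apple.size = sizeof("apple");

	assert(toy_put(&s, &key, &cherry, 1) == 0);
	/* Same key, different data: refused because the key exists. */
	assert(toy_put(&s, &key, &apple, 1) == TOY_KEYEXIST);

	bare.data = "fruit";
	bare.size = strlen("fruit");	/* 5 bytes: no nul, a different key */
	assert(toy_put(&s, &bare, &apple, 1) == 0);
	assert(s.count == 2);
}
```

<p>Calling <b>toy_demo()</b> runs the scenario from the text: fruit/cherry is stored first, fruit/apple is then refused, and the five-byte key without the nul byte is treated as a separate key.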
+<p>Specifying <a href="../../api_c/db_put.html#DB_NOOVERWRITE">DB_NOOVERWRITE</a> opens up the possibility of a new +Berkeley DB return value from the <a href="../../api_c/db_put.html">DB->put</a> function, <a href="../../api_c/dbc_put.html#DB_KEYEXIST">DB_KEYEXIST</a>, +which means we were unable to add the key/data pair to the database +because the key already existed in the database. While the above sample +code simply displays a message in this case: +<p><blockquote><pre>DB->put: DB_KEYEXIST: Key/data pair already exists</pre></blockquote> +<p>The following code shows an explicit check for this possibility: +<p><blockquote><pre><b>switch (ret = + dbp->put(dbp, NULL, &key, &data, DB_NOOVERWRITE)) { +case 0: + printf("db: %s: key stored.\n", (char *)key.data); + break; +case DB_KEYEXIST: + printf("db: %s: key previously stored.\n", + (char *)key.data); + break; +default: + dbp->err(dbp, ret, "DB->put"); + goto err; +}</b></pre></blockquote> +<table><tr><td><br></td><td width="1%"><a href="../../ref/simple_tut/open.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/simple_tut/get.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/tcl/error.html b/db/docs/ref/tcl/error.html new file mode 100644 index 000000000..3d1de037d --- /dev/null +++ b/db/docs/ref/tcl/error.html @@ -0,0 +1,69 @@ +<!--$Id: error.so,v 11.13 2001/01/09 18:48:06 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Tcl error handling</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Tcl</dl></h3></td> +<td width="1%"><a href="../../ref/tcl/program.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/tcl/faq.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Tcl error handling</h1> +<p>The Tcl interfaces to Berkeley DB generally return TCL_OK on success and throw +a Tcl error on failure, using the appropriate Tcl interfaces to provide +the user with an informative error message. There are some "expected" +failures, however, for which no Tcl error will be thrown and for which +Tcl commands will return TCL_OK. These failures include when a +searched-for key is not found, a requested key/data pair was previously +deleted, or a key/data pair cannot be written because the key already +exists. +<p>These failures can be detected by searching the Berkeley DB error message that +is returned. For example, to detect that an attempt to put a record into +the database failed because the key already existed: +<p><blockquote><pre>% berkdb open -create -btree a.db +db0 +% db0 put dog cat +0 +% set ret [db0 put -nooverwrite dog newcat] +DB_KEYEXIST: Key/data pair already exists +% if { [string first DB_KEYEXIST $ret] != -1 } { + puts "This was an error; the key existed" +} +This was an error; the key existed +% db0 close +0 +% exit</pre></blockquote> +<p>To simplify parsing, it is recommended that the initial Berkeley DB error name +be checked, e.g., DB_KEYEXIST in the above example. These values will +not change in future releases of Berkeley DB to ensure that Tcl scripts are not +broken by upgrading to new releases of Berkeley DB. 
There are currently only +three such "expected" error returns. They are: +<p><blockquote><pre>DB_NOTFOUND: No matching key/data pair found +DB_KEYEMPTY: Non-existent key/data pair +DB_KEYEXIST: Key/data pair already exists</pre></blockquote> +<p>Finally, in some cases, when a Berkeley DB error occurs Berkeley DB will output +additional error information. By default, all Berkeley DB error messages will +be prefixed with the created command in whose context the error occurred +(e.g., "env0", "db2", etc.). There are several ways to capture and +access this information. +<p>First, if Berkeley DB invokes the error callback function, the additional +information will be placed in the error result returned from the +command and in the errorInfo backtrace variable in Tcl. +<p>Also the two calls to open an environment and +open a database take an option, <b>-errfile filename</b>, which sets an +output file to which these additional error messages should be written. +<p>Additionally the two calls to open an environment and +open a database take an option, <b>-errpfx string</b>, which sets the +error prefix to the given string. This option may be useful +in circumstances where a more descriptive prefix is desired or +where a constant prefix indicating an error is desired. 
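<p>The recommendation earlier on this page, checking for the Berkeley DB error name rather than the full message text, is shown there in Tcl. The same idea can be sketched in C, the language of the tutorial examples elsewhere in this guide, for a program that receives such a result string. The function name <b>expected_error_name</b> is hypothetical and not part of any Berkeley DB or Tcl interface; only the three error names and their message texts come from the list above.

```c
#include <stddef.h>
#include <string.h>

/* The three "expected" error names listed in the text above. */
static const char *const expected_names[] = {
	"DB_NOTFOUND", "DB_KEYEMPTY", "DB_KEYEXIST"
};

/*
 * Hypothetical helper: return the expected error name found in msg,
 * or NULL if none is present.  A name counts only when it is followed
 * directly by ':', as in "DB_KEYEXIST: Key/data pair already exists".
 * Only the first occurrence of each name is examined.
 */
const char *
expected_error_name(const char *msg)
{
	size_t i, len;
	const char *hit;

	for (i = 0; i < sizeof(expected_names) / sizeof(expected_names[0]); i++) {
		len = strlen(expected_names[i]);
		hit = strstr(msg, expected_names[i]);
		if (hit != NULL && hit[len] == ':')
			return (expected_names[i]);
	}
	return (NULL);
}
```

<p>Searching anywhere in the string, rather than only at its start, mirrors the <b>string first</b> test in the Tcl example and keeps working if a prefix precedes the error name.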
+<table><tr><td><br></td><td width="1%"><a href="../../ref/tcl/program.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/tcl/faq.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/tcl/faq.html b/db/docs/ref/tcl/faq.html new file mode 100644 index 000000000..29f63b423 --- /dev/null +++ b/db/docs/ref/tcl/faq.html @@ -0,0 +1,60 @@ +<!--$Id: faq.so,v 11.2 2001/01/15 17:50:48 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Frequently Asked Questions</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> <a name="3"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Tcl API</dl></h3></td> +<td width="1%"><a href="../../ref/tcl/error.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/sendmail/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Frequently Asked Questions</h1> +<p><ol> +<p><li><b>I have several versions of Tcl installed. How do I configure +Berkeley DB to use a particular version?</b> +<p>To compile the Tcl interface with a particular version of Tcl, use the +--with-tcl option to specify the Tcl installation directory that contains +the tclConfig.sh file. 
+<p>See <a href="../../ref/build_unix/flags.html">Changing compile or load options</a> +for more information. +<hr size=1 noshade> +<p><li><b>Berkeley DB was configured using --enable-tcl or --with-tcl and fails +to build.</b> +<p>The Berkeley DB Tcl interface requires Tcl version 8.1 or greater. You can +download a copy of Tcl from the +<a href="http://www.ajubasolutions.com/home.html">Ajuba Solutions</a> +corporate web site. +<hr size=1 noshade> +<p><li><b>Berkeley DB was configured using --enable-tcl or --with-tcl and fails +to build.</b> +<p>If the Tcl installation was moved after it was configured and installed, +try re-configuring and re-installing Tcl. +<p>Also, some systems do not search for shared libraries by default, or do +not search for shared libraries named the way the Tcl installation names +them, or are searching for a different kind of library than those in +your Tcl installation. For example, Linux systems often require linking +"libtcl.a" to "libtcl#.#.a", while AIX systems often require adding the +"-brtl" flag to the linker. A simpler solution that almost always works +on all systems is to create a link from "libtcl.#.#.a" or "libtcl.so" +(or whatever you happen to have) to "libtcl.a" and reconfigure. +<hr size=1 noshade> +<p><li><b>Loading the Berkeley DB library into Tcl on AIX causes a core dump.</b> +<p>In some versions of Tcl, the "tclConfig.sh" autoconfiguration script +created by the Tcl installation does not work properly under AIX. To +build a working Berkeley DB Tcl API when this happens, use the "--enable-tcl" +flag to configure Berkeley DB (rather than "--with-tcl"). In addition, you +will have to specify any necessary include and library paths and linker +flags needed to build with Tcl by setting the CPPFLAGS, LIBS and LDFLAGS +environment variables before running configure. 
+</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/tcl/error.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/sendmail/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/tcl/intro.html b/db/docs/ref/tcl/intro.html new file mode 100644 index 000000000..6484eaac6 --- /dev/null +++ b/db/docs/ref/tcl/intro.html @@ -0,0 +1,70 @@ +<!--$Id: intro.so,v 11.14 2000/12/04 20:49:18 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Loading Berkeley DB with Tcl</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Tcl</dl></h3></td> +<td width="1%"><a href="../../ref/perl/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/tcl/using.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Loading Berkeley DB with Tcl</h1> +<p>Berkeley DB includes a dynamically loadable Tcl API. The Tcl API requires that +Tcl/Tk 8.1 or later already be installed on your system. We recommend +that you install later releases of Tcl/Tk than 8.1, if possible, +especially on Windows platforms, as we found that we had to make local +fixes to the 8.1 release in a few cases. 
You can download a copy of +Tcl from the <a href="http://www.ajubasolutions.com/home.html">Ajuba +Solutions</a> corporate web site. +<p>This document assumes that you have already configured Berkeley DB for Tcl +support and you have built and installed everything where you want it +to be. If you have not done so, see +<a href="../../ref/build_unix/conf.html">Configuring Berkeley DB</a> or +<a href="../../ref/build_win/intro.html">Building for Win32</a> for more +information. +<h3>Installing as a Tcl Package</h3> +<p>Once enabled, the Berkeley DB shared library for Tcl is automatically installed +as part of the standard installation process. However, if you wish to be +able to dynamically load it as a Tcl package into your script there are +several steps that must be performed: +<p><ol> +<p><li>Run the Tcl shell in the install directory +<li>Append this directory to your auto_path variable +<li>Run the pkg_mkIndex proc giving the name of the Berkeley DB Tcl library +</ol> +<p>For example: +<p><blockquote><pre># tclsh8.1 +% lappend auto_path /usr/local/BerkeleyDB/lib +% pkg_mkIndex /usr/local/BerkeleyDB/lib libdb_tcl-3.2.so libdb-3.2.so</pre></blockquote> +<p>Note that your Tcl and Berkeley DB version numbers may differ from the example, +and so your tclsh and library names may be different. +<h3>Loading Berkeley DB with Tcl</h3> +<p>The Berkeley DB package may be loaded into the user's interactive Tcl script +(or wish session) via the "load" command. For example: +<p><blockquote><pre>load /usr/local/BerkeleyDB/lib/libdb_tcl-3.2.so</pre></blockquote> +<p>Note that your Berkeley DB version numbers may differ from the example, and so +the library name may be different. +<p>If you installed your library to run as a Tcl package, Tcl application +scripts should use the "package" command to indicate to the Tcl +interpreter that it needs the Berkeley DB package and where to find it. 
For +example: +<p><blockquote><pre>lappend auto_path "/usr/local/BerkeleyDB/lib" +package require Db_tcl</pre></blockquote> +<p>No matter which way the library gets loaded, it creates a command named +<b>berkdb</b>. All of the Berkeley DB functionality is accessed via this +command and additional commands it creates on behalf of the application. +A simple test to determine if everything is loaded and ready is to ask +for the version: +<p><blockquote><pre>berkdb version -string</pre></blockquote> +<p>This should return you the Berkeley DB version in a string format. +<table><tr><td><br></td><td width="1%"><a href="../../ref/perl/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/tcl/using.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/tcl/program.html b/db/docs/ref/tcl/program.html new file mode 100644 index 000000000..881c8848b --- /dev/null +++ b/db/docs/ref/tcl/program.html @@ -0,0 +1,33 @@ +<!--$Id: program.so,v 11.9 2000/12/04 18:05:44 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Tcl API programming notes</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Tcl</dl></h3></td> +<td width="1%"><a href="../../ref/tcl/using.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img 
src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/tcl/error.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Tcl API programming notes</h1> +<p>The Tcl API closely parallels the Berkeley DB programmatic interfaces. If you +are already familiar with one of those interfaces there will not be many +surprises in the Tcl API. +<p>Several pieces of Berkeley DB functionality are not available in the Tcl API. +Any of the functions that require a user-provided function are not +supported via the Tcl API. For example, there is no equivalent to the +<a href="../../api_c/db_set_dup_compare.html">DB->set_dup_compare</a> or the <a href="../../api_c/env_set_errcall.html">DBENV->set_errcall</a> +methods. +<p>The Berkeley DB Tcl API always turns on the DB_THREAD flag for environments and +databases making no assumptions about the existence or lack thereof of +threads support in current or future releases of Tcl. +<table><tr><td><br></td><td width="1%"><a href="../../ref/tcl/using.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/tcl/error.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/tcl/using.html b/db/docs/ref/tcl/using.html new file mode 100644 index 000000000..6c927477c --- /dev/null +++ b/db/docs/ref/tcl/using.html @@ -0,0 +1,53 @@ +<!--$Id: using.so,v 11.6 2000/03/18 21:43:17 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Using Berkeley DB with Tcl</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Tcl</dl></h3></td> +<td width="1%"><a href="../../ref/tcl/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/tcl/program.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Using Berkeley DB with Tcl</h1> +<p>All commands in the Berkeley DB Tcl interface are of the form: +<p><blockquote><pre>command_handle operation options</pre></blockquote> +<p>The <i>command handle</i> is <b>berkdb</b> or one of the additional +commands that may be created. The <i>operation</i> is what you want +to do to that handle and the <i>options</i> apply to the operation. +Commands that get created on behalf of the application have their own sets +of operations. Generally any calls in DB that result in new object +handles will translate into a new command handle in Tcl. Then the user +can access the operations of the handle via the new Tcl command handle. +<p>Newly created commands are named with an abbreviated form of their objects +followed by a number. Some created commands are subcommands of other +created commands and will be the first command, followed by a period, '.' +followed by the new subcommand. For example, suppose you have a database +already existing called my_data.db. 
The following example shows the +commands created when you open the database, and when you open a cursor: +<p><blockquote><pre># First open the database and get a database command handle +% berkdb open my_data.db +db0 +#Get some data from that database +% db0 get my_key +{{my_key my_data0}{my_key my_data1}} +#Open a cursor in this database, get a new cursor handle +% db0 cursor +db0.c0 +#Get the first data from the cursor +% db0.c0 get -first +{{first_key first_data}}</pre></blockquote> +<p>All commands in the library support a special option <b>-?</b> that will +list the correct operations for a command or the correct options. +<p>A list of commands and operations can be found in the +<a href="../../api_tcl/tcl_index.html">Tcl Interface</a> documentation. +<table><tr><td><br></td><td width="1%"><a href="../../ref/tcl/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/tcl/program.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/test/faq.html b/db/docs/ref/test/faq.html new file mode 100644 index 000000000..ec5d2d3f0 --- /dev/null +++ b/db/docs/ref/test/faq.html @@ -0,0 +1,32 @@ +<!--$Id: faq.so,v 10.2 2000/08/10 17:54:49 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Test suite FAQ</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Test Suite</dl></h3></td> 
+<td width="1%"><a href="../../ref/test/run.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/distrib/layout.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Test suite FAQ</h1> +<p><ol> +<p><li><b>The test suite has been running for over a day. What's wrong?</b> +<p>The test suite can take anywhere from several hours to several +days to run, depending on your hardware configuration. As long as the +run is making forward progress and new lines are being written to the +<b>ALL.OUT</b> file, everything is probably fine. +<p><li><b>The test suite hangs.</b> +<p>The test suite requires Tcl 8.1 or greater, preferably at least Tcl 8.3. +If you are using an earlier version of Tcl, the test suite may simply +hang at some point. +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/test/run.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/distrib/layout.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/test/run.html b/db/docs/ref/test/run.html new file mode 100644 index 000000000..078951a05 --- /dev/null +++ b/db/docs/ref/test/run.html @@ -0,0 +1,78 @@ +<!--$Id: run.so,v 10.34 2000/11/28 21:27:49 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Running the test suite</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> 
+</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Test Suite</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.2/disk.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/test/faq.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Running the test suite</h1> +<p>Once you have started tclsh and have loaded the test.tcl source file (see +<a href="../../ref/build_unix/test.html">Running the test suite under UNIX</a> +and <a href="../../ref/build_win/test.html">Running the test suite under +Windows</a> for more information), you are ready to run the test suite. At +the tclsh prompt, to run the entire test suite, enter: +<p><blockquote><pre>% run_all</pre></blockquote> +<p>Running all the tests can take from several hours to a few days to +complete, depending on your hardware. For this reason, the output from +this command is re-directed to a file in the current directory named +<b>ALL.OUT</b>. Periodically, a line will be written to the standard +output indicating what test is being run. When the suite has finished, +a single message indicating that the test suite completed successfully or +that it failed will be written. If the run failed, you should review the +file ALL.OUT to determine which tests failed. Any errors will appear in +that file as output lines beginning with the string: FAIL. 
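<p>Reviewing ALL.OUT for failures, as described above, amounts to finding the lines that begin with the string FAIL. A minimal sketch of that scan in C follows; the function name <b>count_fail_lines</b> is hypothetical and is not part of the test suite, and the sketch assumes only what the text states, namely that failure lines start with FAIL.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: count the lines in fp that begin with "FAIL".
 * Lines longer than the buffer are read in pieces; only the piece
 * that starts a line is tested.
 */
long
count_fail_lines(FILE *fp)
{
	char buf[4096];
	long count = 0;
	int at_line_start = 1;
	size_t len;

	while (fgets(buf, sizeof(buf), fp) != NULL) {
		if (at_line_start && strncmp(buf, "FAIL", 4) == 0)
			count++;
		len = strlen(buf);
		/* The next read starts a new line only if this one ended. */
		at_line_start = (len > 0 && buf[len - 1] == '\n');
	}
	return (count);
}
```

<p>A caller would open ALL.OUT with fopen, pass the stream to the function, and treat a non-zero count as a reason to read the file in detail.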
+<p>It is also possible to run specific tests or tests for a particular +subsystem: +<p><blockquote><pre>% r archive +% r btree +% r env +% r frecno +% r hash +% r join +% r lock +% r log +% r mpool +% r mutex +% r queue +% r rbtree +% r recno +% r rrecno +% r subdb +% r txn</pre></blockquote> +<p>Or to run a single, individual test: +<p><blockquote><pre>% test001 btree</pre></blockquote> +<p>It is also possible to modify the test run based on arguments on the +command line. For example, the command: +<p><blockquote><pre>% test001 btree 10</pre></blockquote> +<p>will run a greatly abbreviated form of test001, doing 10 operations +instead of 10,000. +<p>In all cases, when not running the entire test suite as described above, +a successful test run will return you to the tclsh prompt (%). On +failure, a message is displayed indicating what failed. +<p>Tests are run, by default, in the directory <b>TESTDIR</b>. However, +the test files are often very large. To use a different directory for +the test directory, edit the file include.tcl in your build directory, +and change the line: +<p><blockquote><pre>set testdir ./TESTDIR</pre></blockquote> +<p>to a more appropriate value for your system, e.g.: +<p><blockquote><pre>set testdir /var/tmp/db.test</pre></blockquote> +<p>Alternatively, you can create a symbolic link named TESTDIR in your build +directory to an appropriate location for running the tests. Regardless +of where you run the tests, the TESTDIR directory should be on a local +filesystem; using a remote filesystem (e.g., NFS) will almost certainly +cause spurious test failures. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.2/disk.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/test/faq.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/toc.html b/db/docs/ref/toc.html new file mode 100644 index 000000000..e56ee5d48 --- /dev/null +++ b/db/docs/ref/toc.html @@ -0,0 +1,310 @@ +<!--$Id: toc.so,v 10.166 2001/01/18 20:31:37 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB: Reference Guide Table of Contents</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<h1 align=center>Reference Guide Table of Contents</h1> +<ol> +<font size="+1"><li>Introduction</font> + <ol> + <li><a href="intro/data.html">An introduction to data management</a> + <li><a href="intro/terrain.html">Mapping the terrain: theory and practice</a> + <li><a href="intro/dbis.html">What is Berkeley DB?</a> + <li><a href="intro/dbisnot.html">What is Berkeley DB not?</a> + <li><a href="intro/need.html">Do you need Berkeley DB?</a> + <li><a href="intro/what.html">What other services does Berkeley DB provide?</a> + <li><a href="intro/distrib.html">What does the Berkeley DB distribution include?</a> + <li><a href="intro/where.html">Where does Berkeley DB run?</a> + <li><a href="intro/products.html">Sleepycat Software's Berkeley DB products</a> + </ol> +<font size="+1"><li>Getting Started: A Simple Tutorial</font> + <ol> + <li><a 
href="simple_tut/intro.html">Introduction</a> + <li><a href="simple_tut/keydata.html">Key/data pairs</a> + <li><a href="simple_tut/handles.html">Object handles</a> + <li><a href="simple_tut/errors.html">Error returns</a> + <li><a href="simple_tut/open.html">Opening a database</a> + <li><a href="simple_tut/put.html">Adding elements to a database</a> + <li><a href="simple_tut/get.html">Retrieving elements from a database</a> + <li><a href="simple_tut/del.html">Removing elements from a database</a> + <li><a href="simple_tut/close.html">Closing a database</a> + </ol> +<font size="+1"><li>Access Method Configuration</font> + <ol> + <li><a href="am_conf/intro.html">What are the available access methods?</a> + <li><a href="am_conf/select.html">Selecting an access method</a> + <li><a href="am_conf/logrec.html">Logical record numbers</a> + <li>General access method configuration + <ol> + <li><a href="am_conf/pagesize.html">Selecting a page size</a> + <li><a href="am_conf/cachesize.html">Selecting a cache size</a> + <li><a href="am_conf/byteorder.html">Selecting a byte order</a> + <li><a href="am_conf/dup.html">Duplicate data items</a> + <li><a href="am_conf/malloc.html">Non-local memory allocation</a> + </ol> + <li>Btree access method specific configuration + <ol> + <li><a href="am_conf/bt_compare.html">Btree comparison</a> + <li><a href="am_conf/bt_prefix.html">Btree prefix comparison</a> + <li><a href="am_conf/bt_minkey.html">Minimum keys per page</a> + <li><a href="am_conf/bt_recnum.html">Retrieving Btree records by logical record number</a> + </ol> + <li>Hash access method specific configuration + <ol> + <li><a href="am_conf/h_ffactor.html">Page fill factor</a> + <li><a href="am_conf/h_hash.html">Specifying a database hash</a> + <li><a href="am_conf/h_nelem.html">Hash table size</a> + </ol> + <li>Queue and Recno access method specific configuration + <ol> + <li><a href="am_conf/recno.html">Managing record-based databases</a> + <li><a 
href="am_conf/extentsize.html">Selecting a Queue extent size</a> + <li><a href="am_conf/re_source.html">Flat-text backing files</a> + <li><a href="am_conf/renumber.html">Logically renumbering records</a> + </ol> + </ol> +<font size="+1"><li>Access Method Operations</font> + <ol> + <li><a href="am/ops.html">Access method operations</a> + <li><a href="am/open.html">Opening a database</a> + <li><a href="am/opensub.html">Opening multiple databases in a single file</a> + <li><a href="am/upgrade.html">Upgrading databases</a> + <li><a href="am/get.html">Retrieving records</a> + <li><a href="am/put.html">Storing records</a> + <li><a href="am/delete.html">Deleting records</a> + <li><a href="am/sync.html">Flushing the database cache</a> + <li><a href="am/stat.html">Database statistics</a> + <li><a href="am/close.html">Closing a database</a> + <li><a href="am/cursor.html">Database cursors</a> + <ol> + <li><a href="am/curget.html">Retrieving records with a cursor</a> + <li><a href="am/curput.html">Storing records with a cursor</a> + <li><a href="am/curdel.html">Deleting records with a cursor</a> + <li><a href="am/curdup.html">Duplicating a cursor</a> + <li><a href="am/join.html">Logical join</a> + <li><a href="am/count.html">Data item count</a> + <li><a href="am/curclose.html">Closing a cursor</a> + <li><a href="am/stability.html">Cursor stability</a> + </ol> + <li><a href="am/partial.html">Partial record storage and retrieval</a> + <li><a href="am/verify.html">Database verification and salvage</a> + <li><a href="am/error.html">Error support</a> + </ol> +<font size="+1"><li>Berkeley DB Architecture</font> + <ol> + <li><a href="arch/bigpic.html">The big picture</a> + <li><a href="arch/progmodel.html">Programming model</a> + <li><a href="arch/apis.html">Programmatic APIs</a> + <li><a href="arch/script.html">Scripting languages</a> + <li><a href="arch/utilities.html">Supporting utilities</a> + </ol> +<font size="+1"><li>The Berkeley DB Environment</font> + <ol> + <li><a 
href="env/intro.html">Introduction</a> + <li><a href="env/create.html">Creating an environment</a> + <li><a href="env/naming.html">File naming</a> + <li><a href="env/security.html">Security</a> + <li><a href="env/region.html">Shared memory regions</a> + <li><a href="env/remote.html">Remote filesystems</a> + <li><a href="env/open.html">Opening databases within the environment</a> + <li><a href="env/error.html">Error support</a> + </ol> +<font size="+1"><li>Berkeley DB Concurrent Data Store Applications</font> + <ol> + <li><a href="cam/intro.html">Building Berkeley DB Concurrent Data Store applications</a> + </ol> +<font size="+1"><li>Berkeley DB Transactional Data Store Applications</font> + <ol> + <li><a href="transapp/intro.html">Building Berkeley DB Transactional Data Store applications</a> + <li><a href="transapp/why.html">Why transactions?</a> + <li><a href="transapp/term.html">Terminology</a> + <li><a href="transapp/app.html">Application structure</a> + <li><a href="transapp/env_open.html">Opening the environment</a> + <li><a href="transapp/data_open.html">Opening the databases</a> + <li><a href="transapp/put.html">Recoverability and deadlock avoidance</a> + <li><a href="transapp/inc.html">Atomicity</a> + <li><a href="transapp/read.html">Repeatable reads</a> + <li><a href="transapp/cursor.html">Transactional cursors</a> + <li><a href="transapp/admin.html">Environment infrastructure</a> + <li><a href="transapp/deadlock.html">Deadlock detection</a> + <li><a href="transapp/checkpoint.html">Checkpoints</a> + <li><a href="transapp/archival.html">Database and log file archival</a> + <li><a href="transapp/logfile.html">Log file removal</a> + <li><a href="transapp/recovery.html">Recovery procedures</a> + <li><a href="transapp/filesys.html">Recovery and filesystem operations</a> + <li><a href="transapp/reclimit.html">Berkeley DB recoverability</a> + <li><a href="transapp/throughput.html">Transaction throughput</a> + </ol> +<font size="+1"><li>XA Resource Manager</font> 
+ <ol> + <li><a href="xa/intro.html">Introduction</a> + <li><a href="xa/config.html">Configuring Berkeley DB with The Tuxedo System</a> + <li><a href="xa/faq.html">Frequently Asked Questions</a> + </ol> +<font size="+1"><li>Programmer Notes</font> + <ol> + <li><a href="program/appsignals.html">Application signal handling</a> + <li><a href="program/errorret.html">Error returns to applications</a> + <li><a href="program/environ.html">Environmental variables</a> + <li><a href="program/mt.html">Building multi-threaded applications</a> + <li><a href="program/scope.html">Berkeley DB handles</a> + <li><a href="program/namespace.html">Name spaces</a> + <li><a href="program/copy.html">Copying databases</a> + <li><a href="program/version.html">Library version information</a> + <li><a href="program/dbsizes.html">Database limits</a> + <li><a href="program/byteorder.html">Byte ordering</a> + <li><a href="program/diskspace.html">Disk space requirements</a> + <li><a href="program/compatible.html">Compatibility with historic interfaces</a> + <li><a href="program/recimp.html">Recovery implementation</a> + <li><a href="program/extending.html">Application-specific logging and recovery</a> + <li><a href="program/runtime.html">Run-time configuration</a> + </ol> +<font size="+1"><li>The Locking Subsystem</font> + <ol> + <li><a href="lock/intro.html">Berkeley DB and locking</a> + <li><a href="lock/page.html">Page locks</a> + <ol> + <li><a href="lock/stdmode.html">Standard lock modes</a> + <li><a href="lock/notxn.html">Locking without transactions</a> + <li><a href="lock/twopl.html">Locking with transactions: two-phase locking</a> + </ol> + <li><a href="lock/am_conv.html">Access method locking conventions</a> + <li><a href="lock/cam_conv.html">Berkeley DB Concurrent Data Store locking conventions</a> + <li><a href="lock/dead.html">Deadlocks and deadlock avoidance</a> + <li><a href="lock/config.html">Configuring locking</a> + <li><a href="lock/max.html">Configuring locking: sizing the 
system</a> + <li><a href="lock/nondb.html">Locking and non-Berkeley DB applications</a> + </ol> +<font size="+1"><li>The Logging Subsystem</font> + <ol> + <li><a href="log/intro.html">Berkeley DB and logging</a> + <li><a href="log/config.html">Configuring logging</a> + <li><a href="log/limits.html">Log file limits</a> + </ol> +<font size="+1"><li>The Memory Pool Subsystem</font> + <ol> + <li><a href="mp/intro.html">Berkeley DB and the memory pool</a> + <li><a href="mp/config.html">Configuring the memory pool</a> + </ol> +<font size="+1"><li>The Transaction Subsystem</font> + <ol> + <li><a href="txn/intro.html">Berkeley DB and transactions</a> + <li><a href="txn/nested.html">Nested transactions</a> + <li><a href="txn/limits.html">Transaction limits</a> + <li><a href="txn/config.html">Configuring transactions</a> + <li><a href="txn/other.html">Transactions and non-Berkeley DB applications</a> + </ol> +<font size="+1"><li>RPC Client/Server</font> + <ol> + <li><a href="rpc/intro.html">Introduction</a> + <li><a href="rpc/client.html">Client program</a> + <li><a href="rpc/server.html">Server program</a> + </ol> +<font size="+1"><li>Java API</font> + <ol> + <li><a href="java/conf.html">Configuration</a> + <li><a href="java/compat.html">Compatibility</a> + <li><a href="java/program.html">Programming notes</a> + <li><a href="java/faq.html">Java FAQ</a> + </ol> +<font size="+1"><li>Perl API</font> + <ol> + <li><a href="perl/intro.html">Using Berkeley DB with Perl</a> + </ol> +<font size="+1"><li>Tcl API</font> + <ol> + <li><a href="tcl/intro.html">Loading Berkeley DB with Tcl</a> + <li><a href="tcl/using.html">Using Berkeley DB with Tcl</a> + <li><a href="tcl/program.html">Tcl API programming notes</a> + <li><a href="tcl/error.html">Tcl error handling</a> + <li><a href="tcl/faq.html">Tcl FAQ</a> + </ol> +<font size="+1"><li>Sendmail</font> + <ol> + <li><a href="sendmail/intro.html">Using Berkeley DB with Sendmail</a> + </ol> +<font size="+1"><li>Dumping and Reloading 
Databases</font> + <ol> + <li><a href="dumpload/utility.html">The db_dump and db_load utilities</a> + <li><a href="dumpload/format.html">Dump output formats</a> + <li><a href="dumpload/text.html">Loading text into databases</a> + </ol> +<font size="+1"><li>System Installation Notes</font> + <ol> + <li><a href="install/file.html">File utility /etc/magic information</a> + </ol> +<font size="+1"><li>Debugging Applications</font> + <ol> + <li><a href="debug/intro.html">Introduction</a> + <li><a href="debug/compile.html">Compile-time configuration</a> + <li><a href="debug/runtime.html">Run-time error information</a> + <li><a href="debug/printlog.html">Reviewing Berkeley DB log files</a> + <li><a href="debug/common.html">Common errors</a> + </ol> +<font size="+1"><li>Building Berkeley DB for UNIX and QNX systems</font> + <ol> + <li><a href="build_unix/intro.html">Building for UNIX</a> + <li><a href="build_unix/conf.html">Configuring Berkeley DB</a> + <li><a href="build_unix/flags.html">Changing compile or load options</a> + <li><a href="build_unix/install.html">Installing Berkeley DB</a> + <li><a href="build_unix/shlib.html">Dynamic shared libraries</a> + <li><a href="build_unix/test.html">Running the test suite under UNIX</a> + <li><a href="build_unix/notes.html">Architecture independent FAQ</a> + <li>Architecture specific FAQs + <ol> + <li><a href="build_unix/aix.html">AIX</a> + <li><a href="build_unix/freebsd.html">FreeBSD</a> + <li><a href="build_unix/hpux.html">HP-UX</a> + <li><a href="build_unix/irix.html">IRIX</a> + <li><a href="build_unix/linux.html">Linux</a> + <li><a href="build_unix/osf1.html">OSF/1</a> + <li><a href="build_unix/qnx.html">QNX</a> + <li><a href="build_unix/sco.html">SCO</a> + <li><a href="build_unix/solaris.html">Solaris</a> + <li><a href="build_unix/sunos.html">SunOS</a> + <li><a href="build_unix/ultrix.html">Ultrix</a> + </ol> + </ol> +<font size="+1"><li>Building Berkeley DB for Win32 platforms</font> + <ol> + <li><a 
href="build_win/intro.html">Building for Win32</a> + <li><a href="build_win/test.html">Running the test suite under Windows</a> + <li><a href="build_win/notes.html">Windows notes</a> + <li><a href="build_win/faq.html">Windows FAQ</a> + </ol> +<font size="+1"><li>Building Berkeley DB for VxWorks systems</font> + <ol> + <li><a href="build_vxworks/intro.html">Building for VxWorks</a> + <li><a href="build_vxworks/notes.html">VxWorks notes</a> + <li><a href="build_vxworks/faq.html">VxWorks FAQ</a> + </ol> +<font size="+1"><li>Upgrading Berkeley DB Applications</font> + <ol> + <li><a href="upgrade/process.html">Upgrading Berkeley DB installations</a> + <li><a href="upgrade.2.0/toc.html">Upgrading Berkeley DB 1.XX applications to Berkeley DB 2.0</a> + <li><a href="upgrade.3.0/toc.html">Upgrading Berkeley DB 2.X.X applications to Berkeley DB 3.0</a> + <li><a href="upgrade.3.1/toc.html">Upgrading Berkeley DB 3.0.X applications to Berkeley DB 3.1</a> + <li><a href="upgrade.3.2/toc.html">Upgrading Berkeley DB 3.1.X applications to Berkeley DB 3.2</a> + </ol> +<font size="+1"><li>Test Suite</font> + <ol> + <li><a href="test/run.html">Running the test suite</a> + <li><a href="test/faq.html">Test suite FAQ</a> + </ol> +<font size="+1"><li>Distribution</font> + <ol> + <li><a href="distrib/layout.html">Source code layout</a> + </ol> +<font size="+1"><li>Additional References</font> + <ol> + <li><a href="refs/refs.html">Additional references</a> + </ol> +</ol> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/admin.html b/db/docs/ref/transapp/admin.html new file mode 100644 index 000000000..c908a7a33 --- /dev/null +++ b/db/docs/ref/transapp/admin.html @@ -0,0 +1,47 @@ +<!--$Id: admin.so,v 10.14 2000/08/16 17:50:39 margo Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Environment 
infrastructure</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/cursor.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/deadlock.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Environment infrastructure</h1> +<p>When building transactional applications, it is usually necessary to +build an administrative infrastructure around the database environment. +There are five components to this infrastructure, and each is +supported by the Berkeley DB package in two different ways: a standalone +utility and one or more library interfaces. 
+<ul type=disc> +<li>Deadlock detection: <a href="../../utility/db_deadlock.html">db_deadlock</a>, +<a href="../../api_c/lock_detect.html">lock_detect</a>, <a href="../../api_c/env_set_lk_detect.html">DBENV->set_lk_detect</a> +<li>Checkpoints: <a href="../../utility/db_checkpoint.html">db_checkpoint</a>, <a href="../../api_c/txn_checkpoint.html">txn_checkpoint</a> +<li>Database and log file archival: +<a href="../../utility/db_archive.html">db_archive</a>, <a href="../../api_c/log_archive.html">log_archive</a> +<li>Log file removal: <a href="../../utility/db_archive.html">db_archive</a>, <a href="../../api_c/log_archive.html">log_archive</a> +<li>Recovery procedures: <a href="../../utility/db_recover.html">db_recover</a>, <a href="../../api_c/env_open.html">DBENV->open</a> +</ul> +<p>When writing multi-threaded server applications and/or applications +intended for download from the web, it is usually simpler to create +local threads that are responsible for administration of the database +environment as scheduling is often simpler in a single-process model, +and only a single binary need be installed and run. However, the +supplied utilities can be generally useful tools even when the +application is responsible for doing its own administration, as +applications rarely offer external interfaces to database +administration. The utilities are required when programming to a Berkeley DB +scripting interface, as the scripting APIs do not always offer +interfaces to the administrative functionality. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/cursor.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/deadlock.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/app.html b/db/docs/ref/transapp/app.html new file mode 100644 index 000000000..3c946989b --- /dev/null +++ b/db/docs/ref/transapp/app.html @@ -0,0 +1,117 @@ +<!--$Id: app.so,v 10.4 2000/07/25 16:31:20 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Application structure</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/term.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/env_open.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Application structure</h1> +<p>When building transactionally protected applications, there are some +special issues that must be considered. 
The most important one is that, +if any thread of control exits for any reason while holding Berkeley DB +resources, recovery must be performed to: +<ul type=disc> +<li>recover the Berkeley DB resources, +<li>release any locks or mutexes that may have been held to avoid starvation +as the remaining threads of control convoy behind the failed thread's +locks, and +<li>clean up any partially completed operations that may have left a +database in an inconsistent or corrupted state. +</ul> +<p>Complicating this problem is the fact that the Berkeley DB library itself +cannot determine if recovery is required; the application itself +<b>must</b> make that decision. A further complication is that +recovery must be single-threaded; that is, one thread of control or +process must perform recovery before any other thread of control or +process attempts to create or join the Berkeley DB environment. +<p>There are two approaches to handling this problem: +<p><dl compact> +<p><dt>The hard way:<dd>An application can track its own state carefully enough that it knows +when recovery needs to be performed. Specifically, the rule to use is +that recovery must be performed before using a Berkeley DB environment any +time the threads of control previously using the Berkeley DB environment did +not shut the environment down cleanly before exiting the environment +for any reason (including application or system failure). +<p>Requirements for shutting down the environment cleanly differ depending +on the type of environment created. If the environment is public and +persistent (i.e., the <a href="../../api_c/env_open.html#DB_PRIVATE">DB_PRIVATE</a> flag was not specified to the +<a href="../../api_c/env_open.html">DBENV->open</a> function), recovery must be performed if any transaction was +not committed or aborted, or the <a href="../../api_c/env_close.html">DBENV->close</a> function was not called for +any open DB_ENV handle. 
+<p>If the environment is private and temporary (i.e., the <a href="../../api_c/env_open.html#DB_PRIVATE">DB_PRIVATE</a> +flag was specified to the <a href="../../api_c/env_open.html">DBENV->open</a> function), recovery must be performed +if any transaction was not committed or aborted, or the <a href="../../api_c/env_close.html">DBENV->close</a> function +was not called for any open DB_ENV handle. In addition, at least +one transaction checkpoint must be performed after all existing +transactions have been committed or aborted. +<p><dt>The easy way:<dd>It greatly simplifies matters that recovery may be performed regardless +of whether recovery strictly needs to be performed; that is, it is not +an error to run recovery on a database where no recovery is necessary. +Because of this fact, it is almost invariably simpler to ignore the +above rules about shutting an application down cleanly, and simply run +recovery each time a thread of control accessing a database environment +fails for any reason, as well as before accessing any database +environment after system reboot. +</dl> +<p>There are two common ways to build transactionally protected Berkeley DB +applications. The most common way is as a single, usually +multi-threaded, process. This architecture is simplest because it +requires no monitoring of other threads of control. When the +application starts, it opens and potentially creates the environment, +runs recovery (whether it was needed or not), and then opens its +databases. From then on, the application can create new threads of +control as it chooses. All threads of control share the open Berkeley DB +DB_ENV and DB handles. In this model, databases are +rarely opened or closed when more than a single thread of control is +running; that is, they are opened when only a single thread is running, +and closed after all threads but one have exited. The last thread of +control to exit closes the databases and the environment. 
+<p>An alternative way to build Berkeley DB applications is as a set of +cooperating processes, which may or may not be multi-threaded. This +architecture is more complicated. +<p>First, this architecture requires that the order in which threads of +control are created and subsequently access the Berkeley DB environment be +controlled, because recovery must be single-threaded. The first thread +of control to access the environment must run recovery, and no other +thread should attempt to access the environment until recovery is +complete. (Note that this ordering requirement does not apply to +environment creation without recovery. If multiple threads attempt to +create a Berkeley DB environment, only one will perform the creation and the +others will join the already existing environment.) +<p>Second, this architecture requires that threads of control be monitored. +If any thread of control that owns Berkeley DB resources exits, without first +cleanly discarding those resources, recovery is usually necessary. +Before running recovery, all threads using the Berkeley DB environment must +relinquish all of their Berkeley DB resources (it does not matter if they do +so gracefully or because they are forced to exit). Then recovery can +be run and the threads of control continued or re-started. +<p>We have found that the safest way to structure groups of cooperating +processes is to first create a single process (often a shell script) +that opens/creates the Berkeley DB environment and runs recovery, and which +then creates the processes or threads that will actually perform work. +The initial thread has no further responsibilities other than to monitor +the threads of control it has created, to ensure that none of them +unexpectedly exits. If one exits, the initial process then forces all +of the threads of control using the Berkeley DB environment to exit, runs +recovery, and restarts the working threads of control. 
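The monitoring structure described above can be sketched with POSIX process primitives. This is a minimal illustration under stated assumptions, not Berkeley DB code: run_generation, demo_recover and demo_work are hypothetical names, the recover callback merely stands in for opening the environment with recovery, and the sketch assumes the caller has no other child processes that wait() might reap.

```c
#include <sys/types.h>
#include <sys/wait.h>

#include <assert.h>
#include <signal.h>
#include <unistd.h>

#define	MAX_WORKERS	16

/*
 * run_generation --
 *	Run recovery, start nworkers child processes, and block until one
 *	of them exits.  Then force the remaining workers to exit and reap
 *	them, so that no process is left holding environment resources.
 *	Returns the wait status of the first child to exit, or -1 on error.
 *
 *	Both callbacks are hypothetical: recover stands in for running
 *	recovery on the environment, work for the code doing real work.
 */
int
run_generation(int (*recover)(void), void (*work)(int), int nworkers)
{
	pid_t first, pids[MAX_WORKERS];
	int i, started, status, tmp;

	assert(nworkers > 0 && nworkers <= MAX_WORKERS);

	/* Recovery is single-threaded: no worker is running yet. */
	if (recover() != 0)
		return (-1);

	for (started = 0; started < nworkers; ++started) {
		if ((pids[started] = fork()) == -1)
			break;
		if (pids[started] == 0) {	/* Child. */
			work(started);
			_exit(0);
		}
	}

	/* Wait for any worker to exit, for whatever reason. */
	status = -1;
	first = (started == nworkers) ? wait(&status) : -1;

	/* Force the remaining workers out and reap them. */
	for (i = 0; i < started; ++i)
		if (pids[i] != first) {
			(void)kill(pids[i], SIGKILL);
			(void)waitpid(pids[i], &tmp, 0);
		}
	return (first == -1 ? -1 : status);
}

/* Placeholder callbacks, for demonstration only. */
int demo_recover_calls;

int
demo_recover(void)
{
	++demo_recover_calls;		/* Real code would run recovery. */
	return (0);
}

void
demo_work(int id)
{
	if (id == 0)
		_exit(7);		/* Simulate an unexpected exit. */
	for (;;)
		(void)pause();		/* Idle until forced to exit. */
}
```

A real monitor would call run_generation in a loop, so that each unexpected exit is followed by a fresh recovery pass and a new set of workers. SIGKILL is used here because, as the text notes, it does not matter whether the remaining threads of control exit gracefully or are forced to.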
+<p>If it is not practical to have a single parent for the processes sharing +a Berkeley DB environment, each process sharing the environment should log +its connection to and exit from the environment in some fashion that +permits a monitoring process to detect if a thread of control may have +acquired Berkeley DB resources and never released them. +<p>Obviously, it is important that the monitoring process in either case +be as simple and well-tested as possible, because there is no recourse should +it fail. +<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/term.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/env_open.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/archival.html b/db/docs/ref/transapp/archival.html new file mode 100644 index 000000000..2e8815850 --- /dev/null +++ b/db/docs/ref/transapp/archival.html @@ -0,0 +1,149 @@ +<!--$Id: archival.so,v 10.41 2000/12/05 20:36:25 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Database and log file archival</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/checkpoint.html"><img src="../../images/prev.gif" alt="Prev"></a><a 
href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/logfile.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Database and log file archival</h1> + <a name="3"><!--meow--></a> +<p>The third component of the administrative infrastructure, archival for +catastrophic recovery, concerns the recoverability of the database in +the face of catastrophic failure. Recovery after catastrophic failure +is intended to minimize data loss when physical hardware has been +destroyed, for example, loss of a disk that contains databases or log +files. While the application may still experience data loss in this +case, it is possible to minimize it. +<p>First, you may want to periodically create snapshots (i.e., backups) of +your databases to make it possible to recover from catastrophic failure. +These snapshots are either a standard backup which creates a consistent +picture of the databases as of a single instant in time, or an on-line +backup (also known as a <i>hot</i> backup), which creates a +consistent picture of the databases as of an unspecified instant during +the period of time when the snapshot was made. The advantage of a hot +backup is that applications may continue to read and write the databases +while the snapshot is being taken. The disadvantage of a hot backup is +that more information must be archived, and recovery based on a hot +backup is to an unspecified time between the start of the backup and +when the backup is completed. +<p>Second, after taking a snapshot, you should periodically archive the +log files being created in the environment. It is often helpful to +think of database archival in terms of full and incremental filesystem +backups. A snapshot is a full backup, while the periodic archival of +the current log files is an incremental. 
For example, it might be +reasonable to take a full snapshot of a database environment weekly or +monthly, and then archive additional log files daily. Using both the +snapshot and the log files, a catastrophic crash at any time can be +recovered to the time of the most recent log archival, a time long after +the original snapshot. +<p>To create a standard backup of your database that can be used to recover +from catastrophic failure, take the following steps: +<p><ol> +<p><li>Commit or abort all on-going transactions. +<p><li>Force an environment checkpoint (see <a href="../../utility/db_checkpoint.html">db_checkpoint</a> for more +information). +<p><li>Stop writing to your databases until the backup has completed. Read-only +operations are permitted, but no write operations and no filesystem +operations may be performed (e.g., the <a href="../../api_c/env_remove.html">DBENV->remove</a> and +<a href="../../api_c/db_open.html">DB->open</a> functions may not be called). +<p><li>Run <a href="../../utility/db_archive.html">db_archive</a> <b>-s</b> to identify all of the database data +files, and copy them to a backup device, such as CDROM, alternate disk, +or tape. Obviously, the reliability of your archive media will affect +the safety of your data. +<p>If the database files are stored in a separate directory from the other +Berkeley DB files, it may be simpler to archive the directory itself instead +of the individual files (see <a href="../../api_c/env_set_data_dir.html">DBENV->set_data_dir</a> for additional +information). If you are performing a hot backup, the utility you use +to copy the files must read database pages atomically (as described by +<a href="../../ref/transapp/reclimit.html">Berkeley DB recoverability</a>). 
+<p><b>Note: if any of the database files did not have an open DB +handle during the lifetime of the current log files, <a href="../../utility/db_archive.html">db_archive</a> +will not list them in its output!</b> For this reason, it may be simpler +to use a separate database file directory, and archive the entire +directory instead of only the files listed by <a href="../../utility/db_archive.html">db_archive</a>. +</ol> +<p>To create a <i>hot</i> backup of your database that can be used to +recover from catastrophic failure, take the following steps: +<p><ol> +<p><li>Archive your databases as described in Step #4 above. You +do not have to halt on-going transactions or force a checkpoint. +<p><li>When performing a hot backup, you must additionally archive the active +log files. Note that the order of these two operations is required, +and the database files must be archived before the log files. This +means that if the database files and log files are in the same +directory, you cannot simply archive the directory, you must make sure +that the correct order of archival is maintained. +<p>To archive your log files, run the <a href="../../utility/db_archive.html">db_archive</a> utility, using +the <b>-l</b> option, to identify all of the database log files, and +copy them to your backup media. If the database log files are stored +in a separate directory from the other database files, it may be simpler +to archive the directory itself instead of the individual files (see +the <a href="../../api_c/env_set_lg_dir.html">DBENV->set_lg_dir</a> function for more information). +</ol> +<p>Once these steps are completed, your database can be recovered from +catastrophic failure (see <a href="recovery.html">Recovery procedures</a> for +more information). +<p>To update your snapshot so that recovery from catastrophic failure is +possible up to a new point in time, repeat step #2 under the hot backup +instructions, copying all existing log files to a backup device. 
This +is applicable to both standard and hot backups, that is, you can update +snapshots made in either way. Each time both the database and log files +are copied to backup media, you may discard all previous database +snapshots and saved log files. Archiving additional log files does not +allow you to discard either previous database snapshots or log files. +<p>The time to restore from catastrophic failure is a function of the +number of log records that have been written since the snapshot was +originally created. Perhaps more importantly, the more separate pieces +of backup media you use, the more likely that you will have a problem +reading from one of them. For these reasons, it is often best to make +snapshots on a regular basis. +<p><b>For archival safety, ensure that you have multiple copies of your +database backups, verify that your archival media is error-free and +readable, and that copies of your backups are stored off-site!</b> +<p>The functionality provided by the <a href="../../utility/db_archive.html">db_archive</a> utility is also +available directly from the Berkeley DB library. The following code fragment +prints out a list of log and database files that need to be archived. +<p><blockquote><pre>void +log_archlist(DB_ENV *dbenv) +{ + int ret; + char **begin, **list; +<p> + /* Get the list of database files. */ + if ((ret = log_archive(dbenv, + &list, DB_ARCH_ABS | DB_ARCH_DATA, NULL)) != 0) { + dbenv->err(dbenv, ret, "log_archive: DB_ARCH_DATA"); + exit (1); + } + if (list != NULL) { + for (begin = list; *list != NULL; ++list) + printf("database file: %s\n", *list); + free (begin); + } +<p> + /* Get the list of log files. 
*/ + if ((ret = log_archive(dbenv, + &list, DB_ARCH_ABS | DB_ARCH_LOG, NULL)) != 0) { + dbenv->err(dbenv, ret, "log_archive: DB_ARCH_LOG"); + exit (1); + } + if (list != NULL) { + for (begin = list; *list != NULL; ++list) + printf("log file: %s\n", *list); + free (begin); + } +}</pre></blockquote> +<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/checkpoint.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/logfile.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/checkpoint.html b/db/docs/ref/transapp/checkpoint.html new file mode 100644 index 000000000..b9bd81a3e --- /dev/null +++ b/db/docs/ref/transapp/checkpoint.html @@ -0,0 +1,127 @@ +<!--$Id: checkpoint.so,v 10.13 2000/08/16 17:50:40 margo Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Checkpoints</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/deadlock.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/archival.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Checkpoints</h1> +<p>The second component of 
the infrastructure is performing checkpoints of +the log files. As transactions commit, change records are written into +the log files, but the actual changes to the database are not +necessarily written to disk. When a checkpoint is performed, the +changes to the database that are part of committed transactions are +written into the backing database file. +<p>Performing checkpoints is necessary for two reasons. First, you can +only remove the Berkeley DB log files from your system after a checkpoint. +Second, the more frequently you checkpoint, the less time it takes +to run database recovery after a system or +application failure. +<p>Once the database pages are written, log files can be archived and removed +from the system because they will never be needed for anything other than +catastrophic failure. In addition, recovery after system or application +failure only has to redo or undo changes since the last checkpoint, since +changes before the checkpoint have all been flushed to the filesystem. +<p>Berkeley DB provides a separate utility, <a href="../../utility/db_checkpoint.html">db_checkpoint</a>, which can be +used to perform checkpoints. Alternatively, applications can write +their own checkpoint utility using the underlying <a href="../../api_c/txn_checkpoint.html">txn_checkpoint</a> +function. The following code fragment checkpoints the database +environment every 60 seconds: +<p><blockquote><pre>int +main(int argc, char *argv[]) +{ + extern char *optarg; + extern int optind; + DB *db_cats, *db_color, *db_fruit; + DB_ENV *dbenv; + pthread_t ptid; + int ch; +<p> + while ((ch = getopt(argc, argv, "")) != EOF) + switch (ch) { + case '?': + default: + usage(); + } + argc -= optind; + argv += optind; +<p> + env_dir_create(); + env_open(&dbenv); +<p> +<b> /* Start a checkpoint thread. 
*/ + if ((errno = pthread_create( + &ptid, NULL, checkpoint_thread, (void *)dbenv)) != 0) { + fprintf(stderr, + "txnapp: failed spawning checkpoint thread: %s\n", + strerror(errno)); + exit (1); + }</b> +<p> + /* Open database: Key is fruit class; Data is specific type. */ + db_open(dbenv, &db_fruit, "fruit", 0); +<p> + /* Open database: Key is a color; Data is an integer. */ + db_open(dbenv, &db_color, "color", 0); +<p> + /* + * Open database: + * Key is a name; Data is: company name, address, cat breeds. + */ + db_open(dbenv, &db_cats, "cats", 1); +<p> + add_fruit(dbenv, db_fruit, "apple", "yellow delicious"); +<p> + add_color(dbenv, db_color, "blue", 0); + add_color(dbenv, db_color, "blue", 3); +<p> + add_cat(dbenv, db_cats, + "Amy Adams", + "Sleepycat Software", + "394 E. Riding Dr., Carlisle, MA 01741, USA", + "abyssinian", + "bengal", + "chartreaux", + NULL); +<p> + return (0); +} +<p> +<b>void * +checkpoint_thread(void *arg) +{ + DB_ENV *dbenv; + int ret; +<p> + dbenv = arg; + dbenv->errx(dbenv, "Checkpoint thread: %lu", (u_long)pthread_self()); +<p> + /* Checkpoint once a minute. */ + for (;; sleep(60)) + switch (ret = txn_checkpoint(dbenv, 0, 0, 0)) { + case 0: + case DB_INCOMPLETE: + break; + default: + dbenv->err(dbenv, ret, "checkpoint thread"); + exit (1); + } +<p> + /* NOTREACHED */ +}</b></pre></blockquote> +<p>As checkpoints can be quite expensive, choosing how often to perform a +checkpoint is a common tuning parameter for Berkeley DB applications. 
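<p>To make the tuning trade-off concrete, the following is a minimal sketch of a policy check that a checkpoint thread could consult before calling txn_checkpoint. It is not part of the Berkeley DB API: the <code>ckp_policy</code> structure, the <code>ckp_due</code> function and both thresholds are hypothetical names chosen for illustration. The txn_checkpoint function itself takes two numeric arguments between the environment handle and the flags (both 0 in the fragment above); its reference page is the authority on what they mean.

```c
/*
 * Hypothetical checkpoint policy; these names are illustrative only
 * and are not Berkeley DB interfaces.
 */
struct ckp_policy {
	unsigned long max_log_kbytes;	/* Checkpoint after this much log; 0 disables. */
	unsigned long max_idle_secs;	/* Checkpoint after this much time; 0 disables. */
};

/*
 * ckp_due --
 *	Return 1 if either enabled threshold has been reached, else 0.
 */
int
ckp_due(const struct ckp_policy *p,
    unsigned long log_kbytes, unsigned long idle_secs)
{
	if (p->max_log_kbytes != 0 && log_kbytes >= p->max_log_kbytes)
		return (1);
	if (p->max_idle_secs != 0 && idle_secs >= p->max_idle_secs)
		return (1);
	return (0);
}
```

Lower thresholds shorten recovery and let log files be removed sooner, at the cost of more frequent flushing; higher thresholds do the reverse.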
+<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/deadlock.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/archival.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/cursor.html b/db/docs/ref/transapp/cursor.html new file mode 100644 index 000000000..bb1aff98a --- /dev/null +++ b/db/docs/ref/transapp/cursor.html @@ -0,0 +1,169 @@ +<!--$Id: cursor.so,v 1.2 2000/08/16 17:50:40 margo Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Transactional cursors</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/read.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/admin.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Transactional cursors</h1> +<p>Berkeley DB cursors may be used inside a transaction, exactly like any other +DB method. The enclosing transaction ID must be specified when +the cursor is created, but it does not then need to be further specified +on operations performed using the cursor. 
One important point to +remember is that a cursor <b>must be closed</b> before the enclosing +transaction is committed or aborted. +<p>The following code fragment uses a cursor to store a new key in the cats +database with five associated data items. The key is a name. The data +items are a company name, an address, and a list of the breeds of cat +owned. Each of the data entries is stored as a duplicate data item. +In this example, transactions are necessary to ensure that either all or none +of the data items appear in case of system or application failure. +<p><blockquote><pre>int +main(int argc, char *argv[]) +{ + extern char *optarg; + extern int optind; + DB *db_cats, *db_color, *db_fruit; + DB_ENV *dbenv; + pthread_t ptid; + int ch; +<p> + while ((ch = getopt(argc, argv, "")) != EOF) + switch (ch) { + case '?': + default: + usage(); + } + argc -= optind; + argv += optind; +<p> + env_dir_create(); + env_open(&dbenv); +<p> + /* Open database: Key is fruit class; Data is specific type. */ + db_open(dbenv, &db_fruit, "fruit", 0); +<p> + /* Open database: Key is a color; Data is an integer. */ + db_open(dbenv, &db_color, "color", 0); +<p> + /* + * Open database: + * Key is a name; Data is: company name, address, cat breeds. + */ + db_open(dbenv, &db_cats, "cats", 1); +<p> + add_fruit(dbenv, db_fruit, "apple", "yellow delicious"); +<p> + add_color(dbenv, db_color, "blue", 0); + add_color(dbenv, db_color, "blue", 3); +<p> +<b> add_cat(dbenv, db_cats, + "Amy Adams", + "Sleepycat Software", + "394 E. Riding Dr., Carlisle, MA 01741, USA", + "abyssinian", + "bengal", + "chartreaux", + NULL);</b> +<p> + return (0); +} +<p> +<b>void +add_cat(DB_ENV *dbenv, DB *db, char *name, ...) +{ + va_list ap; + DBC *dbc; + DBT key, data; + DB_TXN *tid; + int ret; + char *s; +<p> + /* Initialization. */ + memset(&key, 0, sizeof(key)); + memset(&data, 0, sizeof(data)); + key.data = name; + key.size = strlen(name); +<p> +retry: /* Begin the transaction. 
*/ + if ((ret = txn_begin(dbenv, NULL, &tid, 0)) != 0) { + dbenv->err(dbenv, ret, "txn_begin"); + exit (1); + } +<p> + /* Delete any previously existing item. */ + switch (ret = db->del(db, tid, &key, 0)) { + case 0: + case DB_NOTFOUND: + break; + case DB_LOCK_DEADLOCK: + /* Deadlock: retry the operation. */ + if ((ret = txn_abort(tid)) != 0) { + dbenv->err(dbenv, ret, "txn_abort"); + exit (1); + } + goto retry; + default: + dbenv->err(dbenv, ret, "db->del: %s", name); + exit (1); + } +<p> + /* Create a cursor. */ + if ((ret = db->cursor(db, tid, &dbc, 0)) != 0) { + dbenv->err(dbenv, ret, "db->cursor"); + exit (1); + } +<p> + /* Append the items, in order. */ + va_start(ap, name); + while ((s = va_arg(ap, char *)) != NULL) { + data.data = s; + data.size = strlen(s); + switch (ret = dbc->c_put(dbc, &key, &data, DB_KEYLAST)) { + case 0: + break; + case DB_LOCK_DEADLOCK: + va_end(ap); +<p> + /* Deadlock: retry the operation. */ + if ((ret = dbc->c_close(dbc)) != 0) { + dbenv->err( + dbenv, ret, "dbc->c_close"); + exit (1); + } + if ((ret = txn_abort(tid)) != 0) { + dbenv->err(dbenv, ret, "txn_abort"); + exit (1); + } + goto retry; + default: + /* Error: run recovery. */ + dbenv->err(dbenv, ret, "dbc->c_put: %s/%s", name, s); + exit (1); + } + } + va_end(ap); +<p> + /* Success: commit the change. 
*/ + if ((ret = dbc->c_close(dbc)) != 0) { + dbenv->err(dbenv, ret, "dbc->c_close"); + exit (1); + } + if ((ret = txn_commit(tid, 0)) != 0) { + dbenv->err(dbenv, ret, "txn_commit"); + exit (1); + } +}</b></pre></blockquote> +<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/read.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/admin.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/data_open.html b/db/docs/ref/transapp/data_open.html new file mode 100644 index 000000000..904778c35 --- /dev/null +++ b/db/docs/ref/transapp/data_open.html @@ -0,0 +1,119 @@ +<!--$Id: data_open.so,v 1.3 2000/08/16 17:50:40 margo Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Opening the databases</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/env_open.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/put.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Opening the databases</h1> +<p>Next, we open three databases ("color" and "fruit" and "cats"), in the +database environment. 
Again, our DB database handles are +declared to be free-threaded using the <a href="../../api_c/env_open.html#DB_THREAD">DB_THREAD</a> flag, and so +may be used by any number of threads we subsequently create. +<p><blockquote><pre>int +main(int argc, char *argv[]) +{ + extern char *optarg; + extern int optind; + DB *db_cats, *db_color, *db_fruit; + DB_ENV *dbenv; + pthread_t ptid; + int ch; +<p> + while ((ch = getopt(argc, argv, "")) != EOF) + switch (ch) { + case '?': + default: + usage(); + } + argc -= optind; + argv += optind; +<p> + env_dir_create(); + env_open(&dbenv); +<p> +<b> /* Open database: Key is fruit class; Data is specific type. */ + db_open(dbenv, &db_fruit, "fruit", 0); +<p> + /* Open database: Key is a color; Data is an integer. */ + db_open(dbenv, &db_color, "color", 0); +<p> + /* + * Open database: + * Key is a name; Data is: company name, address, cat breeds. + */ + db_open(dbenv, &db_cats, "cats", 1);</b> +<p> + return (0); +} +<p> +<b>void +db_open(DB_ENV *dbenv, DB **dbp, char *name, int dups) +{ + DB *db; + int ret; +<p> + /* Create the database handle. */ + if ((ret = db_create(&db, dbenv, 0)) != 0) { + dbenv->err(dbenv, ret, "db_create"); + exit (1); + } +<p> + /* Optionally, turn on duplicate data items. */ + if (dups && (ret = db->set_flags(db, DB_DUP)) != 0) { + dbenv->err(dbenv, ret, "db->set_flags: DB_DUP"); + exit (1); + } +<p> + /* + * Open a database in the environment: + * create if it doesn't exist + * free-threaded handle + * read/write owner only + */ + if ((ret = db->open(db, name, NULL, + DB_BTREE, DB_CREATE | DB_THREAD, S_IRUSR | S_IWUSR)) != 0) { + dbenv->err(dbenv, ret, "db->open: %s", name); + exit (1); + } +<p> + *dbp = db; +}</b></pre></blockquote> +<p>There is no reason to wrap database opens inside of transactions. 
All +database opens are transaction protected internally to Berkeley DB, and +applications using transaction-protected environments can simply rely on +files either being successfully re-created in a recovered environment, +or not appearing at all. +<p>After running this initial code, we can use the <a href="../../utility/db_stat.html">db_stat</a> utility +to display information about a database we have created: +<p><blockquote><pre>prompt> db_stat -h TXNAPP -d color +53162 Btree magic number. +8 Btree version number. +Flags: +2 Minimum keys per-page. +8192 Underlying database page size. +1 Number of levels in the tree. +0 Number of unique keys in the tree. +0 Number of data items in the tree. +0 Number of tree internal pages. +0 Number of bytes free in tree internal pages (0% ff). +1 Number of tree leaf pages. +8166 Number of bytes free in tree leaf pages (0.% ff). +0 Number of tree duplicate pages. +0 Number of bytes free in tree duplicate pages (0% ff). +0 Number of tree overflow pages. +0 Number of bytes free in tree overflow pages (0% ff). 
+0 Number of pages on the free list.</pre></blockquote> +<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/env_open.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/put.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/deadlock.html b/db/docs/ref/transapp/deadlock.html new file mode 100644 index 000000000..65765ec59 --- /dev/null +++ b/db/docs/ref/transapp/deadlock.html @@ -0,0 +1,92 @@ +<!--$Id: deadlock.so,v 10.15 2000/08/10 17:54:49 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Deadlock detection</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/admin.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/checkpoint.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Deadlock detection</h1> +<p>The first component of the infrastructure, deadlock detection, is not +so much a requirement specific to transaction protected applications, +but rather is necessary for almost all applications where more than a +single thread of control will be accessing the database 
at one time. +Although Berkeley DB automatically handles database locking, deadlock is +normally still possible. Deadlock detection is not required by every transactional +application, but exceptions are rare. +<p>When a deadlock occurs, two (or more) threads of control each request +additional locks that can never be granted, because one of the other waiting threads +of control holds the requested resource. +<p>For example, consider two processes A and B. Let's say that A obtains +an exclusive lock on item X, and B obtains an exclusive lock on item Y. +Then, A requests a lock on Y and B requests a lock on X. A will wait +until resource Y becomes available and B will wait until resource X +becomes available. Unfortunately, since both A and B are waiting, +neither will release the locks they hold and neither will ever obtain +the resource on which it is waiting. In order to detect that deadlock +has happened, a separate process or thread must review the locks +currently held in the database. If deadlock has occurred, a victim must +be selected, and that victim will then return the error +<a href="../../ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a> from whatever Berkeley DB call it was making. +<p>Berkeley DB provides a separate UNIX-style utility which can be used to +perform this deadlock detection, named <a href="../../utility/db_deadlock.html">db_deadlock</a>. +Alternatively, applications can create their own deadlock utility or +thread using the underlying <a href="../../api_c/lock_detect.html">lock_detect</a> function, or specify +that Berkeley DB run the deadlock detector internally whenever there is a +conflict over a lock (see <a href="../../api_c/env_set_lk_detect.html">DBENV->set_lk_detect</a> for more +information). The following code fragment does the latter: +<p><blockquote><pre>void +env_open(DB_ENV **dbenvp) +{ + DB_ENV *dbenv; + int ret; +<p> + /* Create the environment handle. 
*/ + if ((ret = db_env_create(&dbenv, 0)) != 0) { + fprintf(stderr, + "txnapp: db_env_create: %s\n", db_strerror(ret)); + exit (1); + } +<p> + /* Set up error handling. */ + dbenv->set_errpfx(dbenv, "txnapp"); +<p> +<b> /* Do deadlock detection internally. */ + if ((ret = dbenv->set_lk_detect(dbenv, DB_LOCK_DEFAULT)) != 0) { + dbenv->err(dbenv, ret, "set_lk_detect: DB_LOCK_DEFAULT"); + exit (1); + }</b> +<p> + /* + * Open a transactional environment: + * create if it doesn't exist + * free-threaded handle + * run recovery + * read/write owner only + */ + if ((ret = dbenv->open(dbenv, ENV_DIRECTORY, + DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG | + DB_INIT_MPOOL | DB_INIT_TXN | DB_RECOVER | DB_THREAD, + S_IRUSR | S_IWUSR)) != 0) { + dbenv->err(dbenv, ret, "dbenv->open: %s", ENV_DIRECTORY); + exit (1); + } +<p> + *dbenvp = dbenv; +}</pre></blockquote> +<p>Deciding how often to run the deadlock detector and which of the +deadlocked transactions will be forced to abort when the deadlock is +detected is a common tuning parameter for Berkeley DB applications. 
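<p>The A and B example above amounts to a cycle in a "waits-for" graph. The sketch below illustrates that idea only; it is not how Berkeley DB's detector is implemented, and <code>MAXLOCKERS</code>, <code>reaches</code> and <code>find_deadlocked</code> are hypothetical names. An entry <code>waits_for[i][j]</code> is non-zero when locker i is blocked on a lock held by locker j, and a locker that can reach itself by following such edges is deadlocked.

```c
#define	MAXLOCKERS	8		/* Illustrative upper bound on lockers. */

/*
 * reaches --
 *	Depth-first search: return 1 if "target" can be reached from
 *	"from" by following waits-for edges, else 0.
 */
static int
reaches(int waits_for[MAXLOCKERS][MAXLOCKERS],
    int n, int from, int target, int *seen)
{
	int j;

	for (j = 0; j < n; ++j) {
		if (!waits_for[from][j])
			continue;
		if (j == target)
			return (1);
		if (!seen[j]) {
			seen[j] = 1;
			if (reaches(waits_for, n, j, target, seen))
				return (1);
		}
	}
	return (0);
}

/*
 * find_deadlocked --
 *	Return the lowest-numbered locker that is part of a cycle,
 *	or -1 if the graph has no cycle.
 */
int
find_deadlocked(int waits_for[MAXLOCKERS][MAXLOCKERS], int n)
{
	int i, k, seen[MAXLOCKERS];

	for (i = 0; i < n; ++i) {
		for (k = 0; k < MAXLOCKERS; ++k)
			seen[k] = 0;
		if (reaches(waits_for, n, i, i, seen))
			return (i);
	}
	return (-1);
}
```

With A as locker 0 and B as locker 1, setting both <code>waits_for[0][1]</code> and <code>waits_for[1][0]</code> reports locker 0 as deadlocked. Which member of the cycle is actually chosen as the victim is the policy decision described above.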
+<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/admin.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/checkpoint.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/env_open.html b/db/docs/ref/transapp/env_open.html new file mode 100644 index 000000000..7209a3fef --- /dev/null +++ b/db/docs/ref/transapp/env_open.html @@ -0,0 +1,174 @@ +<!--$Id: env_open.so,v 1.1 2000/07/25 17:56:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Opening the environment</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/app.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/data_open.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Opening the environment</h1> +<p>Creating transaction-protected applications using the Berkeley DB library is +quite easy. Applications first use <a href="../../api_c/env_open.html">DBENV->open</a> to initialize +the database environment. 
Transaction-protected applications normally +require all four Berkeley DB subsystems, so the <a href="../../api_c/env_open.html#DB_INIT_MPOOL">DB_INIT_MPOOL</a>, +<a href="../../api_c/env_open.html#DB_INIT_LOCK">DB_INIT_LOCK</a>, <a href="../../api_c/env_open.html#DB_INIT_LOG">DB_INIT_LOG</a> and <a href="../../api_c/env_open.html#DB_INIT_TXN">DB_INIT_TXN</a> flags +should be specified. +<p>Once the application has called <a href="../../api_c/env_open.html">DBENV->open</a>, it opens its +databases within the environment. Once the databases are opened, the +application makes changes to the databases inside of transactions. Each +set of changes that entail a unit of work should be surrounded by the +appropriate <a href="../../api_c/txn_begin.html">txn_begin</a>, <a href="../../api_c/txn_commit.html">txn_commit</a> and <a href="../../api_c/txn_abort.html">txn_abort</a> +calls. The Berkeley DB access methods will make the appropriate calls into +the lock, log and memory pool subsystems in order to guarantee +transaction semantics. When the application is ready to exit, all +outstanding transactions should have been committed or aborted. +<p>Databases accessed by a transaction must not be closed during the +transaction. Once all outstanding transactions are finished, all open +Berkeley DB files should be closed. When the Berkeley DB database files have been +closed, the environment should be closed by calling <a href="../../api_c/env_close.html">DBENV->close</a>. +<p>The following code fragment creates the database environment directory, +then opens the environment, running recovery. Our DB_ENV +database environment handle is declared to be free-threaded using the +<a href="../../api_c/env_open.html#DB_THREAD">DB_THREAD</a> flag, and so may be used by any number of threads that +we may subsequently create. 
+<p><blockquote><pre>#include <sys/types.h> +#include <sys/stat.h> +<p> +#include <errno.h> +#include <pthread.h> +#include <stdarg.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +<p> +#include <db.h> +<p> +#define ENV_DIRECTORY "TXNAPP" +<p> +void env_dir_create(void); +void env_open(DB_ENV **); +<p> +int +main(int argc, char *argv[]) +{ + extern char *optarg; + extern int optind; + DB *db_cats, *db_color, *db_fruit; + DB_ENV *dbenv; + pthread_t ptid; + int ch; +<p> + while ((ch = getopt(argc, argv, "")) != EOF) + switch (ch) { + case '?': + default: + usage(); + } + argc -= optind; + argv += optind; +<p> + env_dir_create(); + env_open(&dbenv); +<p> + return (0); +} +<p> +void +env_dir_create() +{ + struct stat sb; +<p> + /* + * If the directory exists, we're done. We do not further check + * the type of the file, DB will fail appropriately if it's the + * wrong type. + */ + if (stat(ENV_DIRECTORY, &sb) == 0) + return; +<p> + /* Create the directory, read/write/access owner only. */ + if (mkdir(ENV_DIRECTORY, S_IRWXU) != 0) { + fprintf(stderr, + "txnapp: mkdir: %s: %s\n", ENV_DIRECTORY, strerror(errno)); + exit (1); + } +} +<p> +void +env_open(DB_ENV **dbenvp) +{ + DB_ENV *dbenv; + int ret; +<p> + /* Create the environment handle. */ + if ((ret = db_env_create(&dbenv, 0)) != 0) { + fprintf(stderr, + "txnapp: db_env_create: %s\n", db_strerror(ret)); + exit (1); + } +<p> + /* Set up error handling. 
*/ + dbenv->set_errpfx(dbenv, "txnapp"); +<p> + /* + * Open a transactional environment: + * create if it doesn't exist + * free-threaded handle + * run recovery + * read/write owner only + */ + if ((ret = dbenv->open(dbenv, ENV_DIRECTORY, + DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG | + DB_INIT_MPOOL | DB_INIT_TXN | DB_RECOVER | DB_THREAD, + S_IRUSR | S_IWUSR)) != 0) { + dbenv->err(dbenv, ret, "dbenv->open: %s", ENV_DIRECTORY); + exit (1); + } +<p> + *dbenvp = dbenv; +}</pre></blockquote> +<p>After running this initial program, we can use the <a href="../../utility/db_stat.html">db_stat</a> +utility to display the contents of the environment directory: +<p><blockquote><pre>prompt> db_stat -e -h TXNAPP +3.2.1 Environment version. +120897 Magic number. +0 Panic value. +1 References. +6 Locks granted without waiting. +0 Locks granted after waiting. +=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= +Mpool Region: 4. +264KB Size (270336 bytes). +-1 Segment ID. +1 Locks granted without waiting. +0 Locks granted after waiting. +=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= +Log Region: 3. +96KB Size (98304 bytes). +-1 Segment ID. +3 Locks granted without waiting. +0 Locks granted after waiting. +=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= +Lock Region: 2. +240KB Size (245760 bytes). +-1 Segment ID. +1 Locks granted without waiting. +0 Locks granted after waiting. +=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= +Txn Region: 5. +8KB Size (8192 bytes). +-1 Segment ID. +1 Locks granted without waiting. 
+0 Locks granted after waiting.</pre></blockquote> +<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/app.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/data_open.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/filesys.html b/db/docs/ref/transapp/filesys.html new file mode 100644 index 000000000..fc68089e9 --- /dev/null +++ b/db/docs/ref/transapp/filesys.html @@ -0,0 +1,62 @@ +<!--$Id: filesys.so,v 10.30 2000/07/25 16:31:20 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Recovery and filesystem operations</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/recovery.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/reclimit.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Recovery and filesystem operations</h1> +<p>When running in a transaction-protected environment, database creation +and deletion are logged as stand-alone transactions internal to Berkeley DB. 
+That is, for each such operation a new transaction is begun and aborted +or committed internally, so that the operation will be recovered during recovery. +<p>The Berkeley DB API supports removing and renaming files. Renaming files is +supported by the <a href="../../api_c/db_rename.html">DB->rename</a> method, and removing files by the +<a href="../../api_c/db_remove.html">DB->remove</a> method. Berkeley DB does not permit specifying the +<a href="../../api_c/db_open.html#DB_TRUNCATE">DB_TRUNCATE</a> flag when opening a file in a transaction protected +environment. Truncation is an implicit file deletion, but one that does not +always require the same operating system file permissions as does deleting +and creating a file. +<p>If you have changed the name of a file or deleted it outside of the Berkeley DB +library (e.g., you explicitly removed a file using your normal operating +system utilities), then it is possible that recovery will not be able to +find a database referenced in the log. In this case, <a href="../../utility/db_recover.html">db_recover</a> +will produce a warning message saying it was unable to locate a file it +expected to find. This message is only a warning, as the file may have +been subsequently deleted as part of normal database operations before +the failure occurred, and so is not necessarily a problem. +<p>Generally, any filesystem operations that are performed outside the Berkeley DB +interface should be performed at the same time as making a snapshot of +the database. To perform filesystem operations correctly: +<p><ol> +<p><li>Cleanly shut down database operations. +<p>To shut down database operations cleanly, all applications accessing the +database environment must be shut down and a transaction checkpoint must +be taken. 
If the applications are not implemented such that they can be +shut down gracefully (i.e., closing all references to the database +environment), recovery must be performed after all applications have been +killed to ensure that the underlying databases are consistent on disk. +<p><li>Perform the filesystem operations, e.g., remove or rename one +or more files. +<p><li>Make an archival snapshot of the database. +<p>While this step is not strictly necessary, it is strongly recommended. +If this step is not performed, recovery from catastrophic failure will +require that recovery first be performed up to the time of the +filesystem operations, the filesystem operations be redone, and then +recovery be performed from the filesystem operations forward. +<p><li>Restart the database applications. +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/recovery.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/reclimit.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/inc.html b/db/docs/ref/transapp/inc.html new file mode 100644 index 000000000..35cf67d7e --- /dev/null +++ b/db/docs/ref/transapp/inc.html @@ -0,0 +1,201 @@ +<!--$Id: inc.so,v 1.6 2000/08/08 19:58:20 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Atomicity</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> 
+<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/put.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/read.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Atomicity</h1> +<p>The third reason listed for using transactions was atomicity. Consider +an application suite where multiple threads of control (multiple +processes or threads in one or more processes) are changing the values +associated with a key in one or more databases. Specifically, they are +taking the current value, incrementing it, and then storing it back into +the database. +<p>Such an application requires atomicity. Since we want to change a value +in the database, we must make sure that once we read it, no other thread +of control modifies it. For example, assume that both thread #1 and +thread #2 are doing similar operations in the database, where thread #1 +is incrementing records by 3, and thread #2 is incrementing records by +5. We want to increment the record by a total of 8. If the operations +interleave in the right (well, wrong) order, that is not what will +happen: +<p><blockquote><pre>thread #1 <b>read</b> record: the value is 2 +thread #2 <b>read</b> record: the value is 2 +thread #2 <b>write</b> record + 5 back into the database (new value 7) +thread #1 <b>write</b> record + 3 back into the database (new value 5)</pre></blockquote> +<p>As you can see, instead of incrementing the record by a total of 8, +we've only incremented it by 3, because thread #1 overwrote thread #2's +change. By wrapping the operations in transactions, we ensure that this +cannot happen. 
In a transaction, when the first thread reads the +record, locks are acquired that will not be released until the +transaction finishes, guaranteeing that all other readers and writers +will block, waiting for the first thread's transaction to complete (or +to be aborted). +<p>Here is an example function that does transaction-protected increments +on database records to ensure atomicity. +<p><blockquote><pre>int +main(int argc, char *argv[]) +{ + extern char *optarg; + extern int optind; + DB *db_cats, *db_color, *db_fruit; + DB_ENV *dbenv; + pthread_t ptid; + int ch; +<p> + while ((ch = getopt(argc, argv, "")) != EOF) + switch (ch) { + case '?': + default: + usage(); + } + argc -= optind; + argv += optind; +<p> + env_dir_create(); + env_open(&dbenv); +<p> + /* Open database: Key is fruit class; Data is specific type. */ + db_open(dbenv, &db_fruit, "fruit", 0); +<p> + /* Open database: Key is a color; Data is an integer. */ + db_open(dbenv, &db_color, "color", 0); +<p> + /* + * Open database: + * Key is a name; Data is: company name, address, cat breeds. + */ + db_open(dbenv, &db_cats, "cats", 1); +<p> + add_fruit(dbenv, db_fruit, "apple", "yellow delicious"); +<p> +<b> add_color(dbenv, db_color, "blue", 0); + add_color(dbenv, db_color, "blue", 3);</b> +<p> + return (0); +} +<p> +<b>void +add_color(DB_ENV *dbenv, DB *dbp, char *color, int increment) +{ + DBT key, data; + DB_TXN *tid; + int original, ret; + char buf[64]; +<p> + /* Initialization. */ + memset(&key, 0, sizeof(key)); + key.data = color; + key.size = strlen(color); + memset(&data, 0, sizeof(data)); + data.flags = DB_DBT_MALLOC; +<p> + for (;;) { + /* Begin the transaction. */ + if ((ret = txn_begin(dbenv, NULL, &tid, 0)) != 0) { + dbenv->err(dbenv, ret, "txn_begin"); + exit (1); + } +<p> + /* + * Get the key. If it exists, we increment the value. If it + * doesn't exist, we create it. 
+ */ + switch (ret = dbp->get(dbp, tid, &key, &data, 0)) { + case 0: + original = atoi(data.data); + /* Free the buffer allocated for us by DB_DBT_MALLOC. */ + free(data.data); + break; + case DB_LOCK_DEADLOCK: + /* Deadlock: retry the operation. */ + if ((ret = txn_abort(tid)) != 0) { + dbenv->err(dbenv, ret, "txn_abort"); + exit (1); + } + continue; + case DB_NOTFOUND: + original = 0; + break; + default: + /* Error: run recovery. */ + dbenv->err( + dbenv, ret, "dbp->get: %s/%d", color, increment); + exit (1); + } +<p> + /* Create the new data item. */ + (void)snprintf(buf, sizeof(buf), "%d", original + increment); + data.data = buf; + data.size = strlen(buf) + 1; +<p> + /* Store the new value. */ + switch (ret = dbp->put(dbp, tid, &key, &data, 0)) { + case 0: + /* Success: commit the change. */ + if ((ret = txn_commit(tid, 0)) != 0) { + dbenv->err(dbenv, ret, "txn_commit"); + exit (1); + } + return; + case DB_LOCK_DEADLOCK: + /* Deadlock: retry the operation. */ + if ((ret = txn_abort(tid)) != 0) { + dbenv->err(dbenv, ret, "txn_abort"); + exit (1); + } + break; + default: + /* Error: run recovery. */ + dbenv->err( + dbenv, ret, "dbp->put: %s/%d", color, increment); + exit (1); + } + } +}</b></pre></blockquote> +<p>Any number of operations, on any number of databases, can be included +in a single transaction to ensure atomicity of the operations. There +is, however, a trade-off between the number of operations included in +a single transaction and both throughput and the possibility of +deadlock. The reason is that transactions acquire locks +throughout their lifetime, and do not release them until transaction +commit or abort. So, the more operations included in a transaction, +the more likely it is that a transaction will block other operations and that +deadlock will occur. However, each transaction commit requires a +synchronous disk I/O, so grouping multiple operations into a transaction +can increase overall throughput. (There is one exception to this. 
The +<a href="../../api_c/env_open.html#DB_TXN_NOSYNC">DB_TXN_NOSYNC</a> option causes transactions to exhibit the ACI +(atomicity, consistency and isolation) properties, but not D +(durability), avoiding the synchronous disk I/O on transaction commit +and greatly increasing transaction throughput for some applications.) +<p>When applications do create complex transactions, they often avoid +having more than one complex transaction at a time, as simple operations +like a single <a href="../../api_c/db_put.html">DB->put</a> are unlikely to deadlock with each other +or the complex transaction, while multiple complex transactions are +likely to deadlock with each other as each will acquire many locks +over its lifetime. Alternatively, complex transactions can be broken +up into smaller sets of operations, and each of those sets may be +encapsulated in a nested transaction. Because nested transactions may +be individually aborted and retried without causing the entire +transaction to be aborted, this allows complex transactions to proceed +even in the face of heavy contention, repeatedly trying the +sub-operations until they succeed. +<p>It is also helpful to order operations within a transaction, that is, +access the databases and items within the databases in the same order, +to the extent possible, in all transactions. Accessing databases and +items in different orders greatly increases the likelihood of operations +being blocked and failing due to deadlocks. 
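As a rough illustration of the ordering advice above, one simple convention is to sort the keys a transaction intends to touch before touching any of them, so that every transaction acquires its locks in the same sequence. The sketch below only shows the sorting step. The names order_keys and key_cmp are hypothetical helpers, not part of the Berkeley DB API, and ordering by strcmp is an assumption chosen for the example; any order works as long as all transactions agree on it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: compare two entries of an array of C strings. */
int
key_cmp(const void *a, const void *b)
{
	return (strcmp(*(const char *const *)a, *(const char *const *)b));
}

/*
 * Hypothetical helper: put the keys a transaction will access into one
 * canonical (here, lexicographic) order before any of them is read or
 * written.
 */
void
order_keys(const char **keys, size_t nkeys)
{
	qsort(keys, nkeys, sizeof(keys[0]), key_cmp);
}
```

Two transactions that start from the key lists {"red", "blue"} and {"blue", "red"} would both access "blue" first and "red" second after calling order_keys, which removes one common source of lock-order deadlocks.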
+<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/put.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/read.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/intro.html b/db/docs/ref/transapp/intro.html new file mode 100644 index 000000000..758169e85 --- /dev/null +++ b/db/docs/ref/transapp/intro.html @@ -0,0 +1,42 @@ +<!--$Id: intro.so,v 10.35 2000/12/04 18:05:44 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Building transaction protected applications</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/cam/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/why.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Building transaction protected applications</h1> +<p>It is difficult to write a useful transactional tutorial and still keep +within reasonable bounds of documentation, that is, without writing a +book on transactional programming. 
We have two goals in this section: +to familiarize readers with the transactional interfaces of Berkeley DB and +to provide code building blocks that will be useful in creating +applications. +<p>We have not attempted to present this information using a real-world +application. First, transactional applications are often complex and +time consuming to explain. Also, one of our goals is to give you an +understanding of the wide variety of tools Berkeley DB makes available to you, +and no single application would use most of the interfaces included in +the Berkeley DB library. For these reasons, we have chosen to simply present +the Berkeley DB data structures and programming solutions, using examples that +differ from page to page. All of the examples are included in a +standalone program you can examine, modify and run, and from which you +will be able to extract code blocks for your own applications. +Fragments of the program will be presented throughout this chapter, and +the complete text of the <a href="transapp.txt">example program</a> +for IEEE/ANSI Std 1003.1 (POSIX) standard systems is included in the Berkeley DB +distribution. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/cam/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/why.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/logfile.html b/db/docs/ref/transapp/logfile.html new file mode 100644 index 000000000..64d8a9647 --- /dev/null +++ b/db/docs/ref/transapp/logfile.html @@ -0,0 +1,104 @@ +<!--$Id: logfile.so,v 11.1 2000/07/25 16:31:20 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Log file removal</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/archival.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/recovery.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Log file removal</h1> +<p>The fourth component of the infrastructure, log file removal, concerns +the ongoing disk consumption of the database log files. Depending on +the rate at which the application writes to the databases and the +available disk space, the number of log files may increase quickly +enough that disk space will be a resource problem. 
For this reason, +you will periodically want to remove log files in order to conserve disk +space. This procedure is distinct from database and log file archival +for catastrophic recovery, and you cannot remove the current log files +simply because you have created a database snapshot or copied log files +to archival media. +<p>Log files may be removed at any time, as long as: +<ul type=disc> +<li>the log file is not involved in an active transaction, +<li>at least two checkpoints have been written subsequent to the +log file's creation, and +<li>the log file is not the only log file in the environment. +</ul> +<p>Obviously, if you are preparing for catastrophic failure, you will want +to copy the log files to archival media before you remove them. +<p>To remove log files, take the following steps: +<p><ol> +<p><li>If you are concerned with catastrophic failure, first copy the log files +to backup media as described in <a href="archival.html">Archival for +catastrophic recovery</a>. +<p><li>Run <a href="../../utility/db_archive.html">db_archive</a> without options to identify all of the log files +that are no longer in use (e.g., no longer involved in an active +transaction). +<p><li>Remove those log files from the system. +</ol> +<p>The functionality provided by the <a href="../../utility/db_archive.html">db_archive</a> utility is also +available directly from the Berkeley DB library. The following code fragment +removes log files that are no longer needed by the database +environment. +<p><blockquote><pre>int +main(int argc, char *argv[]) +{ + ... +<p> +<b> /* Start a logfile removal thread. */ + if ((errno = pthread_create( + &ptid, NULL, logfile_thread, (void *)dbenv)) != 0) { + fprintf(stderr, + "txnapp: failed spawning log file removal thread: %s\n", + strerror(errno)); + exit (1); + }</b> +<p> + ... 
+} +<p> +<b>void * +logfile_thread(void *arg) +{ + DB_ENV *dbenv; + int ret; + char **begin, **list; +<p> + dbenv = arg; + dbenv->errx(dbenv, + "Log file removal thread: %lu", (u_long)pthread_self()); +<p> + /* Check once every 5 minutes. */ + for (;; sleep(300)) { + /* Get the list of log files. */ + if ((ret = log_archive(dbenv, &list, DB_ARCH_ABS, NULL)) != 0) { + dbenv->err(dbenv, ret, "log_archive"); + exit (1); + } +<p> + /* Remove the log files. */ + if (list != NULL) { + for (begin = list; *list != NULL; ++list) + if ((ret = remove(*list)) != 0) { + /* remove() reports the cause in errno, not in its return value. */ + dbenv->err(dbenv, + errno, "remove %s", *list); + exit (1); + } + free (begin); + } + } + /* NOTREACHED */ +}</b></pre></blockquote> +<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/archival.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/recovery.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/put.html b/db/docs/ref/transapp/put.html new file mode 100644 index 000000000..e04a04f70 --- /dev/null +++ b/db/docs/ref/transapp/put.html @@ -0,0 +1,151 @@ +<!--$Id: put.so,v 1.3 2000/08/16 17:50:40 margo Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Recoverability and deadlock avoidance</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> 
+<td width="1%"><a href="../../ref/transapp/data_open.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/inc.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Recoverability and deadlock avoidance</h1> +<p>The first reason listed for using transactions was recoverability. Any +logical change to a database may require multiple changes to underlying +data structures. For example, modifying a record in a Btree may require +leaf and internal pages to split, and so a single <a href="../../api_c/db_put.html">DB->put</a> method +call can potentially require that multiple physical database pages be +written. If only some of those pages are written and then the system +or application fails, the database is left inconsistent and cannot be +used until it has been recovered, that is, until the partially completed +changes have been undone. +<p>Write-ahead-logging is the term that describes the underlying +implementation that Berkeley DB uses to ensure recoverability. What it means +is that before any change is made to a database, information about the +change is written to a database log. During recovery, the log is read, +and databases are checked to ensure that changes described in the log +for committed transactions appear in the database. Changes that appear +in the database but are related to aborted or unfinished transactions +in the log are undone from the database. +<p>For recoverability after application or system failure, operations that +modify the database must be protected by transactions. More +specifically, operations are not recoverable unless a transaction is +begun and each operation is associated with the transaction via the +Berkeley DB interfaces, and then the transaction successfully committed. This +is true even if logging is turned on in the database environment. 
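The redo and undo behavior described above can be pictured with a toy model. The sketch below is not how Berkeley DB implements recovery; it is a minimal illustration under loud assumptions: the "database" is an array of integers, each log record stores the before and after value of one slot, the set of committed transactions is given as a plain array, and no slot is touched by more than one transaction in the log. The names log_rec, is_committed and toy_recover are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy log record: transaction txn changed db[slot]. */
struct log_rec {
	int txn;	/* Transaction that made the change. */
	int slot;	/* Index of the changed database slot. */
	int before;	/* Value before the change (used for undo). */
	int after;	/* Value after the change (used for redo). */
};

/* Return 1 if txn appears in the list of committed transactions. */
int
is_committed(int txn, const int *committed, size_t ncommitted)
{
	size_t i;

	for (i = 0; i < ncommitted; ++i)
		if (committed[i] == txn)
			return (1);
	return (0);
}

/*
 * Toy recovery: reapply logged changes of committed transactions, then
 * back out logged changes of transactions that never committed.
 * Assumes no two transactions in the log touched the same slot.
 */
void
toy_recover(int *db, const struct log_rec *recs, size_t nrecs,
    const int *committed, size_t ncommitted)
{
	size_t i;

	/* Redo pass: walk the log forward. */
	for (i = 0; i < nrecs; ++i)
		if (is_committed(recs[i].txn, committed, ncommitted))
			db[recs[i].slot] = recs[i].after;

	/* Undo pass: walk the log backward. */
	for (i = nrecs; i-- > 0;)
		if (!is_committed(recs[i].txn, committed, ncommitted))
			db[recs[i].slot] = recs[i].before;
}
```

In this model, a change that was never associated with a transaction has no log record at all, so the recovery pass has nothing to redo or undo for it. That mirrors the point made above that enabling logging is not enough on its own.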
+<p>Here is an example function that updates a record in a database in a +transactionally protected manner. The function takes a key and a data +item as arguments, and then attempts to store them in the database. +<p><blockquote><pre>int +main(int argc, char *argv[]) +{ + extern char *optarg; + extern int optind; + DB *db_cats, *db_color, *db_fruit; + DB_ENV *dbenv; + pthread_t ptid; + int ch; +<p> + while ((ch = getopt(argc, argv, "")) != EOF) + switch (ch) { + case '?': + default: + usage(); + } + argc -= optind; + argv += optind; +<p> + env_dir_create(); + env_open(&dbenv); +<p> + /* Open database: Key is fruit class; Data is specific type. */ + db_open(dbenv, &db_fruit, "fruit", 0); +<p> + /* Open database: Key is a color; Data is an integer. */ + db_open(dbenv, &db_color, "color", 0); +<p> + /* + * Open database: + * Key is a name; Data is: company name, address, cat breeds. + */ + db_open(dbenv, &db_cats, "cats", 1); +<p> +<b> add_fruit(dbenv, db_fruit, "apple", "yellow delicious");</b> +<p> + return (0); +} +<p> +<b>void +add_fruit(DB_ENV *dbenv, DB *db, char *fruit, char *name) +{ + DBT key, data; + DB_TXN *tid; + int ret; +<p> + /* Initialization. */ + memset(&key, 0, sizeof(key)); + memset(&data, 0, sizeof(data)); + key.data = fruit; + key.size = strlen(fruit); + data.data = name; + data.size = strlen(name); +<p> + for (;;) { + /* Begin the transaction. */ + if ((ret = txn_begin(dbenv, NULL, &tid, 0)) != 0) { + dbenv->err(dbenv, ret, "txn_begin"); + exit (1); + } +<p> + /* Store the value. */ + switch (ret = db->put(db, tid, &key, &data, 0)) { + case 0: + /* Success: commit the change. */ + if ((ret = txn_commit(tid, 0)) != 0) { + dbenv->err(dbenv, ret, "txn_commit"); + exit (1); + } + return; + case DB_LOCK_DEADLOCK: + /* Deadlock: retry the operation. */ + if ((ret = txn_abort(tid)) != 0) { + dbenv->err(dbenv, ret, "txn_abort"); + exit (1); + } + break; + default: + /* Error: run recovery. 
*/ + dbenv->err(dbenv, ret, "db->put: %s/%s", fruit, name); + exit (1); + } + } +}</b></pre></blockquote> +<p>The second reason listed for using transactions was deadlock avoidance. +There is a new error return in this function that you may not have seen +before. In transactional (not Concurrent Data Store) applications +supporting both readers and writers or just multiple writers, Berkeley DB +functions have an additional possible error return: +<a href="../../ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>. This return means that our thread of control +deadlocked with another thread of control, and our thread was selected +to discard all of its Berkeley DB resources in order to resolve the problem. +In the sample code, any time the <a href="../../api_c/db_put.html">DB->put</a> function returns +<a href="../../ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>, the transaction is aborted (by calling +<a href="../../api_c/txn_abort.html">txn_abort</a>, which releases the transaction's Berkeley DB resources and +undoes any partial changes to the databases), and then the transaction +is retried from the beginning. +<p>There is no requirement that the transaction be attempted again, but +that is a common course of action for applications. Applications may +want to set an upper bound on the number of times an operation will +be retried, as some operations on some data sets may simply be unable +to succeed. For example, updating all of the pages on a large web site +during prime business hours may simply be impossible because of the high +access rate to the database. 
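One way to put an upper bound on retries is to wrap each attempt in a small driver loop. The sketch below is deliberately independent of Berkeley DB so that it can be compiled on its own: OP_DEADLOCK is a hypothetical stand-in for DB_LOCK_DEADLOCK, attempt_fn is a hypothetical callback that is assumed to begin a transaction, do the work, and commit or abort before returning, and flaky_attempt is a stub used only for demonstration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for DB_LOCK_DEADLOCK; not the real constant. */
#define	OP_DEADLOCK	(-1)

/*
 * One complete attempt. Assumed to return 0 on success, OP_DEADLOCK if
 * the attempt was chosen to resolve a deadlock (and has already been
 * aborted), or some other non-zero value on a hard error.
 */
typedef int (*attempt_fn)(void *ctx);

/* Run attempt until it stops deadlocking or max_attempts is reached. */
int
run_with_retry_limit(attempt_fn attempt, void *ctx, int max_attempts)
{
	int i, ret;

	ret = OP_DEADLOCK;
	for (i = 0; i < max_attempts; ++i) {
		ret = attempt(ctx);
		if (ret != OP_DEADLOCK)
			return (ret);	/* Success or a hard error. */
	}
	return (ret);	/* Still deadlocking: let the caller decide. */
}

/* Demonstration stub: deadlocks until the counter in ctx reaches zero. */
int
flaky_attempt(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		--*remaining;
		return (OP_DEADLOCK);
	}
	return (0);
}
```

Returning the last result rather than exiting leaves the policy with the caller, which might log the failure, back off and try again later, or report the problem to the user.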
+<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/data_open.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/inc.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/read.html b/db/docs/ref/transapp/read.html new file mode 100644 index 000000000..912401e87 --- /dev/null +++ b/db/docs/ref/transapp/read.html @@ -0,0 +1,40 @@ +<!--$Id: read.so,v 1.1 2000/07/25 17:56:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Repeatable reads</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/inc.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/cursor.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Repeatable reads</h1> +<p>The fourth reason listed for using transactions was repeatable reads. +Generally, most applications do not need to place reads inside a +transaction for performance reasons. 
The problem is that a +transactionally protected cursor, reading each key/data pair in a +database, will acquire a read lock on most of the pages in the database +and so will gradually block all write operations on the databases until +the transaction commits or aborts. Note, however, if there are update +transactions present in the application, the reading transactions must +still use locking, and should be prepared to repeat any operation +(possibly closing and reopening a cursor) which fails with a return +value of <a href="../../ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>. +<p>The exceptions to this rule are when the application is doing a +read-modify-write operation and so requires atomicity, and when an +application requires the ability to repeatedly access a data item +knowing that it will not have changed. A repeatable read simply means +that, for the life of the transaction, every time a request is made by +any thread of control to read a data item, it will be unchanged from +its previous value, that is, that the value will not change until the +transaction commits or aborts. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/inc.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/cursor.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/reclimit.html b/db/docs/ref/transapp/reclimit.html new file mode 100644 index 000000000..559f8ed11 --- /dev/null +++ b/db/docs/ref/transapp/reclimit.html @@ -0,0 +1,106 @@ +<!--$Id: reclimit.so,v 11.19 2000/08/16 17:50:40 margo Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Berkeley DB recoverability</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/filesys.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/throughput.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Berkeley DB recoverability</h1> +<p>Berkeley DB recovery is based on write-ahead logging. What this means is that, +when a change is made to a database page, a description of the change is +written into a log file. 
This description in the log file is guaranteed +to be written to stable storage before the database pages that were +changed are written to stable storage. This is the fundamental feature +of the logging system that makes durability and rollback work. +<p>If the application or system crashes, the log is reviewed during recovery. +Any database changes described in the log that were part of committed +transactions, and that were never written to the actual database itself, +are written to the database as part of recovery. Any database changes +described in the log that were never committed, and that were written to +the actual database itself, are backed out of the database as part of +recovery. This design allows the database to be written lazily, and only +blocks from the log file have to be forced to disk as part of transaction +commit. +<p>There are two interfaces that are a concern when considering Berkeley DB +recoverability: +<p><ol> +<p><li>The interface between Berkeley DB and the operating system/filesystem. +<li>The interface between the operating system/filesystem and the +underlying stable storage hardware. +</ol> +<p>Berkeley DB uses the operating system interfaces and its underlying filesystem +when writing its files. This means that Berkeley DB can fail if the underlying +filesystem fails in some unrecoverable way. Otherwise, the interface +requirements here are simple: the system call that Berkeley DB uses to flush +data to disk (normally <b>fsync</b>(2)) must guarantee that all the +information necessary for a file's recoverability has been written to +stable storage before it returns to Berkeley DB, and that no possible +application or system crash can cause that file to be unrecoverable. +<p>In addition, Berkeley DB implicitly uses the interface between the operating +system and the underlying hardware. The interface requirements here are +not as simple. 
+<p>First, it is necessary to consider the underlying page size of the Berkeley DB +databases. The Berkeley DB library performs all database writes using the page +size specified by the application. These pages are not checksummed and +Berkeley DB assumes that they are written atomically. This means that if the +operating system performs filesystem I/O in different sized blocks than +the database page size, it may increase the possibility for database +corruption. For example, assume that Berkeley DB is writing 32KB pages for a +database and the operating system does filesystem I/O in 16KB blocks. If +the operating system writes the first 16KB of the database page +successfully, but crashes before being able to write the second 16KB of +the database, the database has been corrupted and this corruption will +not be detected during recovery. For this reason, it may be important +to select database page sizes that will be written as single block +transfers by the underlying operating system. +<p>Second, it is necessary to consider the behavior of the system's underlying +stable storage hardware. For example, consider a SCSI controller that +has been configured to cache data and return to the operating system that +the data has been written to stable storage, when, in fact, it has only +been written into the controller RAM cache. If power is lost before the +controller is able to flush its cache to disk, and the controller cache +is not stable (i.e., the writes will not be flushed to disk when power +returns), the writes will be lost. If the writes include database blocks, +there is no loss as recovery will correctly update the database. If the +writes include log file blocks, it is possible that transactions that were +already committed may not appear in the recovered database, although the +recovered database will be coherent after a crash. 
+<p>If the underlying hardware can fail in any way such that only part of the +block was written, the failure conditions are the same as those described +above for an operating system failure that only writes part of a logical +database block. +<p>For these reasons, it is important to select hardware that does not do +partial writes and does not cache data writes (or does not return that +the data has been written to stable storage until it either has been +written to stable storage or the actual writing of all of the data is +guaranteed barring catastrophic hardware failure, e.g., your disk drive +exploding). You should also be aware that Berkeley DB does not protect against +all cases of stable storage hardware failure, nor does it protect against +hardware misbehavior. +<p>If the disk drive on which you are storing your databases explodes, you +can perform normal Berkeley DB catastrophic recovery, as that requires only a +snapshot of your databases plus all of the log files you have archived +since those snapshots were taken. In this case, you will lose no database +changes at all. If the disk drive on which you are storing your log files +explodes, you can still perform catastrophic recovery, but you will lose +any database changes that were part of transactions committed since your +last archival of the log files. For this reason, storing your databases +and log files on different disks should be considered a safety measure as +well as a performance enhancement. +<p>Finally, if your hardware misbehaves, for example, a SCSI controller +writes incorrect data to the disk, Berkeley DB will not detect this and your +data may be corrupted. 
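The page-size consideration discussed above can be sketched in C. This is a minimal sketch, not part of the Berkeley DB distribution: the helper names `choose_pagesize` and `pagesize_for_directory` are hypothetical, and it assumes that the `st_blksize` value reported by stat(2) is a reasonable stand-in for the block size the filesystem uses for I/O, which is not guaranteed on every system. The only Berkeley DB facts it relies on are that page sizes must be a power of two between 512 bytes and 64KB.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Berkeley DB page sizes must be a power of two from 512 bytes to 64KB. */
#define MIN_PAGESIZE 512UL
#define MAX_PAGESIZE (64UL * 1024)

/*
 * choose_pagesize (hypothetical helper) --
 *	Return the largest legal page size that does not exceed the
 *	filesystem block size, so that a page is never larger than one
 *	filesystem block.  Values below the minimum yield the minimum.
 */
unsigned long
choose_pagesize(unsigned long fs_blksize)
{
	unsigned long pagesize;

	pagesize = MIN_PAGESIZE;
	while (pagesize * 2 <= fs_blksize && pagesize * 2 <= MAX_PAGESIZE)
		pagesize *= 2;
	return (pagesize);
}

/*
 * pagesize_for_directory (hypothetical helper) --
 *	Look up the preferred I/O block size of the filesystem holding
 *	"dir" and derive a page size from it.  Returns 0 if stat fails.
 *	Assumption: st_blksize approximates the filesystem I/O block size.
 */
unsigned long
pagesize_for_directory(const char *dir)
{
	struct stat sb;

	if (stat(dir, &sb) != 0)
		return (0);
	return (choose_pagesize((unsigned long)sb.st_blksize));
}
```

An application could pass the result to DB->set_pagesize before calling DB->open. Whether a single filesystem block is actually written atomically still depends on the operating system and hardware, as described above.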
+<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/filesys.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/throughput.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/recovery.html b/db/docs/ref/transapp/recovery.html new file mode 100644 index 000000000..5be94bf41 --- /dev/null +++ b/db/docs/ref/transapp/recovery.html @@ -0,0 +1,91 @@ +<!--$Id: recovery.so,v 10.26 2000/08/16 17:50:40 margo Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Recovery procedures</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/logfile.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/filesys.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Recovery procedures</h1> +<p>The fifth component of the infrastructure, recovery procedures, concerns +the recoverability of the database. After any application or system +failure, there are two possible approaches to database recovery: +<p><ol> +<p><li>There is no need for recoverability and all databases can be recreated +from scratch. 
While these applications may still need transaction +protection for other reasons, recovery usually consists of removing the +Berkeley DB environment home directory and all files it contains, and then +restarting the application. +<p><li>It is necessary to recover information after system or application +failure. In this case, recovery processing must be performed on any +database environments that were active at the time of the failure. +Recovery processing involves running the <a href="../../utility/db_recover.html">db_recover</a> utility or +calling the <a href="../../api_c/env_open.html">DBENV->open</a> function with the <a href="../../api_c/env_open.html#DB_RECOVER">DB_RECOVER</a> or +<a href="../../api_c/env_open.html#DB_RECOVER_FATAL">DB_RECOVER_FATAL</a> flags. +<p>During recovery processing, all database changes made by aborted or +unfinished transactions are undone and all database changes made by +committed transactions are redone, as necessary. Database applications +must not be restarted until recovery completes. After recovery +finishes, the environment is properly initialized so that applications +may be restarted. +</ol> +<p>If you intend to do recovery, there are two possible types of recovery +processing: +<p><ol> +<p><li><i>catastrophic</i> recovery. A failure that requires catastrophic +recovery is a failure where either the database or log files have been +destroyed or corrupted. For example, catastrophic failure includes the +case where the disk drive on which either the database or logs are +stored has been physically destroyed, or when the system's normal +filesystem recovery on startup is unable to bring the database and log +files to a consistent state. This is often difficult to detect, and +perhaps the most common sign of the need for catastrophic recovery is +when the normal recovery procedures fail. 
+<p>To restore your database environment after catastrophic failure, take +the following steps: +<p><ol> +<p><li>Restore the most recent snapshots of the database and log files from +the backup media into the system directory where recovery will be +performed. +<p><li>If any log files were archived since the last snapshot was made, they +should be restored into the Berkeley DB environment directory where recovery +will be performed. Make sure you restore them in the order in which +they were written. The order is important because it's possible that +the same log file appears on multiple backups and you want to run +recovery using the most recent version of each log file. +<p><li>Run the <a href="../../utility/db_recover.html">db_recover</a> utility, specifying its <b>-c</b> option, +or call the <a href="../../api_c/env_open.html">DBENV->open</a> function specifying the <a href="../../api_c/env_open.html#DB_RECOVER_FATAL">DB_RECOVER_FATAL</a> +flag. The catastrophic recovery process will review the logs and +database files to bring the environment databases to a consistent state +as of the time of the last uncorrupted log file that is found. It is +important to realize that only transactions committed before that date +will appear in the databases. +<p>It is possible to recreate the database in a location different than +the original, by specifying appropriate pathnames to the <b>-h</b> +option of the <a href="../../utility/db_recover.html">db_recover</a> utility. In order for this to work +properly, it is important that your application reference files by +names relative to the database home directory or the pathname(s) specified +in calls to <a href="../../api_c/env_set_data_dir.html">DBENV->set_data_dir</a>, instead of using full path names. +</ol> +<p><li><i>non-catastrophic</i> or <i>normal</i> recovery. 
If the +failure is non-catastrophic and the database files and log are both +accessible on a stable filesystem, run the <a href="../../utility/db_recover.html">db_recover</a> utility +without the <b>-c</b> option or call the <a href="../../api_c/env_open.html">DBENV->open</a> function +specifying the <a href="../../api_c/env_open.html#DB_RECOVER">DB_RECOVER</a> flag. The normal recovery process +will review the logs and database files to ensure that all changes +associated with committed transactions appear in the databases and that +all uncommitted transactions do not. +</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/logfile.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/filesys.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/term.html b/db/docs/ref/transapp/term.html new file mode 100644 index 000000000..d6d54a44d --- /dev/null +++ b/db/docs/ref/transapp/term.html @@ -0,0 +1,60 @@ +<!--$Id: term.so,v 10.16 2000/08/16 17:50:40 margo Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Terminology</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/why.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img 
src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/app.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Terminology</h1> +<p>Here are some definitions that will be helpful in understanding +transactions: +<p><dl compact> +<p><dt>Thread of control<dd>Berkeley DB is indifferent to the type or style of threads being used by the +application, or, for that matter, if threads are being used at all, as +Berkeley DB supports multi-process access. In the Berkeley DB documentation, any +time we refer to a "thread of control", that can be read as a true +thread (one of many in an application's address space), or a process. +<p><dt>Free-threaded<dd>A Berkeley DB handle that can be used by multiple threads simultaneously +without any application-level synchronization is called free-threaded. +<p><dt>Transaction<dd>A transaction is a group of one or more operations on one or more databases that +should be treated as a single unit of work. For example, changes to a +set of databases, where either all of the changes must be applied to +the database(s) or none of them should. Applications specify when each +transaction starts, what database operations are included in it, and +when it ends. +<p><dt>Transaction abort/commit<dd>Every transaction ends by <i>committing</i> or <i>aborting</i>. +If a transaction commits, then Berkeley DB guarantees that any database +changes included in the transaction will never be lost, even after +system or application failure. If a transaction aborts, or is +uncommitted when the system or application fails, then the changes +involved will never appear in the database. +<p><dt>System or application failure<dd>This is the phrase that we will use to describe when something bad +happens near your data. It can be an application dumping core, being +interrupted by a signal, the disk filling up, or the entire system +crashing. 
In any case, for whatever reason, the application can no +longer make forward progress, and its databases were left in an unknown +state. +<p><dt>Recovery<dd>Whenever system or application failure occurs, the application must run +recovery. Recovery is what makes the database consistent, that is, the +recovery process includes review of log files and databases to ensure +that the changes from each committed transaction appear in the database, +and that no changes from an unfinished (or aborted) transaction do. +<p><dt>Deadlock<dd>Deadlock, in its simplest form, happens when one thread of control owns +resource A, but needs resource B, while another thread of control owns +resource B, but needs resource A. Neither thread of control can make +progress, and so one has to give up and release all of its resources, +at which time the remaining thread of control can make forward progress. +</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/why.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/app.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/throughput.html b/db/docs/ref/transapp/throughput.html new file mode 100644 index 000000000..734f3c7f9 --- /dev/null +++ b/db/docs/ref/transapp/throughput.html @@ -0,0 +1,117 @@ +<!--$Id: throughput.so,v 10.24 2000/12/04 18:05:44 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Transaction throughput</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/reclimit.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/xa/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Transaction throughput</h1> +<p>Generally, the speed of a database system is measured by the transaction +throughput, expressed as the number of transactions per second. The two +gating factors for Berkeley DB performance in a transactional system are usually +the underlying database files and the log file. Both are factors because +they require disk I/O, which is slow relative to other system resources +like CPU. +<p>In the worst case scenario: +<ul type=disc> +<li>Database access is truly random and the database is too large to fit into +the cache, resulting in a single I/O per requested key/data pair. +<li>Both the database and the log are on a single disk. +</ul> +<p>This means that for each transaction, Berkeley DB is potentially performing +several filesystem operations: +<ul type=disc> +<li>Disk seek to database file. +<li>Database file read. +<li>Disk seek to log file. +<li>Log file write. +<li>Flush log file information to disk. +<li>Disk seek to update log file metadata (e.g., inode). +<li>Log metadata write. +<li>Flush log file metadata to disk. +</ul> +<p>There are a number of ways to increase transactional throughput, all of +which attempt to decrease the number of filesystem operations per +transaction: +<ul type=disc> +<li>Tune the size of the database cache. 
If the Berkeley DB key/data pairs used +during the transaction are found in the database cache, the seek and read +from the database are no longer necessary, resulting in two fewer +filesystem operations per transaction. To determine if your cache size +is too small, see <a href="../../ref/am_conf/cachesize.html">Selecting a +cache size</a>. +<li>Put the database and the log files on different disks. This allows reads +and writes to the log files and the database files to be performed +concurrently. +<li>Set the filesystem configuration so that file access and modification +times are not updated. Note, although the file access and modification +times are not used by Berkeley DB, this may affect other programs, so be +careful. +<li>Upgrade your hardware. When considering the hardware on which to run your +application, however, it is important to consider the entire system. The +controller and bus can have as much to do with the disk performance as +the disk itself. It is also important to remember that throughput is +rarely the limiting factor, and that disk seek times are normally the true +performance issue for Berkeley DB. +<li>Turn on the <a href="../../api_c/env_open.html#DB_TXN_NOSYNC">DB_TXN_NOSYNC</a> flag. This changes the Berkeley DB behavior +so that the log files are not flushed when transactions are committed. +While this change will greatly increase your transaction throughput, it +means that transactions will exhibit the ACI (atomicity, consistency and +isolation) properties, but not D (durability). Database integrity will +be maintained but it is possible that some number of the most recently +committed transactions may be undone during recovery instead of being +redone. +</ul> +<p>If you are bottlenecked on logging, the following test will help you +confirm that the number of transactions per second that your application +does is reasonable for the hardware on which you're running. 
Your test +program should repeatedly perform the following operations: +<ul type=disc> +<li>Seek to the beginning of a file. +<li>Write to the file. +<li>Flush the file write to disk. +</ul> +<p>The number of times that you can perform these three operations per second +is a rough measure of the number of transactions per second of which the +hardware is capable. This test simulates the operations applied to the +log file. (As a simplifying assumption in this experiment, we assume that +the database files are either on a separate disk, or that they fit, with +some few exceptions, into the database cache.) We do not have to directly +simulate updating the log file directory information, as it will normally +be updated and flushed to disk as a result of flushing the log file write +to disk. +<p>Running this test program, where we write 256 bytes, for 1000 operations, +on reasonably standard commodity hardware (Pentium II CPU, SCSI disk), +returned the following results: +<p><blockquote><pre>% testfile -b256 -o1000 +running: 1000 ops +Elapsed time: 16.641934 seconds +1000 ops: 60.09 ops per second</pre></blockquote> +<p>Note that the number of bytes being written to the log as part of each +transaction can dramatically affect the transaction throughput. The +above test run used 256, which is a reasonable size log write. Your +log writes may be different. To determine your average log write size, +use the <a href="../../utility/db_stat.html">db_stat</a> utility to display your log statistics. +<p>As a quick sanity check, for this particular disk, the average seek time +is 9.4 msec, and the average latency is 4.17 msec. That results in a +minimum requirement for a data transfer to the disk of 13.57 msec, or a +maximum of 74 transfers per second. This is close enough to the above 60 +operations per second (which wasn't done on a quiescent disk) that the +number is believable. 
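The sanity check above is simple arithmetic, and it can be reproduced for other disks with a few lines of C. This is only an illustrative sketch: the function names are hypothetical, and the model (one seek plus one rotational latency per flushed write) is the same simplifying assumption the text makes.

```c
#include <stdio.h>

/*
 * max_transfers_per_second (hypothetical helper) --
 *	Upper bound on flushed writes per second, assuming each write
 *	costs one average seek plus one average rotational latency.
 */
double
max_transfers_per_second(double seek_msec, double latency_msec)
{
	return (1000.0 / (seek_msec + latency_msec));
}

/*
 * ops_per_second (hypothetical helper) --
 *	Measured rate from a test run: operations divided by elapsed time.
 */
double
ops_per_second(long ops, double elapsed_seconds)
{
	return ((double)ops / elapsed_seconds);
}

/* Print the comparison for the disk and test run described in the text. */
void
print_sanity_check(void)
{
	printf("predicted maximum: %.2f transfers per second\n",
	    max_transfers_per_second(9.4, 4.17));
	printf("measured: %.2f ops per second\n",
	    ops_per_second(1000, 16.641934));
}
```

With the figures quoted in the text (9.4 msec seek, 4.17 msec latency, 1000 operations in 16.641934 seconds), the predicted bound is just under 74 transfers per second and the measured rate is about 60 operations per second.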
+<p>An implementation of the above <a href="writetest.txt">example test +program</a> for IEEE/ANSI Std 1003.1 (POSIX) standard systems is included in the Berkeley DB +distribution. +<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/reclimit.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/xa/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/transapp.txt b/db/docs/ref/transapp/transapp.txt new file mode 100644 index 000000000..afd441c59 --- /dev/null +++ b/db/docs/ref/transapp/transapp.txt @@ -0,0 +1,492 @@ +#include <sys/types.h> +#include <sys/stat.h> + +#include <errno.h> +#include <pthread.h> +#include <stdarg.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> + +#include <db.h> + +#define ENV_DIRECTORY "TXNAPP" + +void add_cat(DB_ENV *, DB *, char *, ...); +void add_color(DB_ENV *, DB *, char *, int); +void add_fruit(DB_ENV *, DB *, char *, char *); +void *checkpoint_thread(void *); +void log_archlist(DB_ENV *); +void *logfile_thread(void *); +void db_open(DB_ENV *, DB **, char *, int); +void env_dir_create(void); +void env_open(DB_ENV **); +void usage(void); + +int +main(int argc, char *argv[]) +{ + extern char *optarg; + extern int optind; + DB *db_cats, *db_color, *db_fruit; + DB_ENV *dbenv; + pthread_t ptid; + int ch; + + while ((ch = getopt(argc, argv, "")) != EOF) + switch (ch) { + case '?': + default: + usage(); + } + argc -= optind; + argv += optind; + + env_dir_create(); + env_open(&dbenv); + + /* Start a checkpoint thread. 
*/ + if ((errno = pthread_create( + &ptid, NULL, checkpoint_thread, (void *)dbenv)) != 0) { + fprintf(stderr, + "txnapp: failed spawning checkpoint thread: %s\n", + strerror(errno)); + exit (1); + } + + /* Start a logfile removal thread. */ + if ((errno = pthread_create( + &ptid, NULL, logfile_thread, (void *)dbenv)) != 0) { + fprintf(stderr, + "txnapp: failed spawning log file removal thread: %s\n", + strerror(errno)); + exit (1); + } + + /* Open database: Key is fruit class; Data is specific type. */ + db_open(dbenv, &db_fruit, "fruit", 0); + + /* Open database: Key is a color; Data is an integer. */ + db_open(dbenv, &db_color, "color", 0); + + /* + * Open database: + * Key is a name; Data is: company name, address, cat breeds. + */ + db_open(dbenv, &db_cats, "cats", 1); + + add_fruit(dbenv, db_fruit, "apple", "yellow delicious"); + + add_color(dbenv, db_color, "blue", 0); + add_color(dbenv, db_color, "blue", 3); + + add_cat(dbenv, db_cats, + "Amy Adams", + "Sleepycat Software", + "394 E. Riding Dr., Carlisle, MA 01741, USA", + "abyssinian", + "bengal", + "chartreaux", + NULL); + + return (0); +} + +void +env_dir_create() +{ + struct stat sb; + + /* + * If the directory exists, we're done. We do not further check + * the type of the file, DB will fail appropriately if it's the + * wrong type. + */ + if (stat(ENV_DIRECTORY, &sb) == 0) + return; + + /* Create the directory, read/write/access owner only. */ + if (mkdir(ENV_DIRECTORY, S_IRWXU) != 0) { + fprintf(stderr, + "txnapp: mkdir: %s: %s\n", ENV_DIRECTORY, strerror(errno)); + exit (1); + } +} + +void +env_open(DB_ENV **dbenvp) +{ + DB_ENV *dbenv; + int ret; + + /* Create the environment handle. */ + if ((ret = db_env_create(&dbenv, 0)) != 0) { + fprintf(stderr, + "txnapp: db_env_create: %s\n", db_strerror(ret)); + exit (1); + } + + /* Set up error handling. */ + dbenv->set_errpfx(dbenv, "txnapp"); + + /* Do deadlock detection internally. 
*/ + if ((ret = dbenv->set_lk_detect(dbenv, DB_LOCK_DEFAULT)) != 0) { + dbenv->err(dbenv, ret, "set_lk_detect: DB_LOCK_DEFAULT"); + exit (1); + } + + /* + * Open a transactional environment: + * create if it doesn't exist + * free-threaded handle + * run recovery + * read/write owner only + */ + if ((ret = dbenv->open(dbenv, ENV_DIRECTORY, + DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG | + DB_INIT_MPOOL | DB_INIT_TXN | DB_RECOVER | DB_THREAD, + S_IRUSR | S_IWUSR)) != 0) { + dbenv->err(dbenv, ret, "dbenv->open: %s", ENV_DIRECTORY); + exit (1); + } + + *dbenvp = dbenv; +} + +void * +checkpoint_thread(void *arg) +{ + DB_ENV *dbenv; + int ret; + + dbenv = arg; + dbenv->errx(dbenv, "Checkpoint thread: %lu", (u_long)pthread_self()); + + /* Checkpoint once a minute. */ + for (;; sleep(60)) + switch (ret = txn_checkpoint(dbenv, 0, 0, 0)) { + case 0: + case DB_INCOMPLETE: + break; + default: + dbenv->err(dbenv, ret, "checkpoint thread"); + exit (1); + } + + /* NOTREACHED */ +} + +void * +logfile_thread(void *arg) +{ + DB_ENV *dbenv; + int ret; + char **begin, **list; + + dbenv = arg; + dbenv->errx(dbenv, + "Log file removal thread: %lu", (u_long)pthread_self()); + + /* Check once every 5 minutes. */ + for (;; sleep(300)) { + /* Get the list of log files. */ + if ((ret = log_archive(dbenv, &list, DB_ARCH_ABS, NULL)) != 0) { + dbenv->err(dbenv, ret, "log_archive"); + exit (1); + } + + /* Remove the log files. */ + if (list != NULL) { + for (begin = list; *list != NULL; ++list) + if ((ret = remove(*list)) != 0) { + dbenv->err(dbenv, + ret, "remove %s", *list); + exit (1); + } + free (begin); + } + } + /* NOTREACHED */ +} + +void +log_archlist(DB_ENV *dbenv) +{ + int ret; + char **begin, **list; + + /* Get the list of database files. 
*/ + if ((ret = log_archive(dbenv, + &list, DB_ARCH_ABS | DB_ARCH_DATA, NULL)) != 0) { + dbenv->err(dbenv, ret, "log_archive: DB_ARCH_DATA"); + exit (1); + } + if (list != NULL) { + for (begin = list; *list != NULL; ++list) + printf("database file: %s\n", *list); + free (begin); + } + + /* Get the list of log files. */ + if ((ret = log_archive(dbenv, + &list, DB_ARCH_ABS | DB_ARCH_LOG, NULL)) != 0) { + dbenv->err(dbenv, ret, "log_archive: DB_ARCH_LOG"); + exit (1); + } + if (list != NULL) { + for (begin = list; *list != NULL; ++list) + printf("log file: %s\n", *list); + free (begin); + } +} + +void +db_open(DB_ENV *dbenv, DB **dbp, char *name, int dups) +{ + DB *db; + int ret; + + /* Create the database handle. */ + if ((ret = db_create(&db, dbenv, 0)) != 0) { + dbenv->err(dbenv, ret, "db_create"); + exit (1); + } + + /* Optionally, turn on duplicate data items. */ + if (dups && (ret = db->set_flags(db, DB_DUP)) != 0) { + dbenv->err(dbenv, ret, "db->set_flags: DB_DUP"); + exit (1); + } + + /* + * Open a database in the environment: + * create if it doesn't exist + * free-threaded handle + * read/write owner only + */ + if ((ret = db->open(db, name, NULL, + DB_BTREE, DB_CREATE | DB_THREAD, S_IRUSR | S_IWUSR)) != 0) { + dbenv->err(dbenv, ret, "db->open: %s", name); + exit (1); + } + + *dbp = db; +} + +void +add_fruit(DB_ENV *dbenv, DB *db, char *fruit, char *name) +{ + DBT key, data; + DB_TXN *tid; + int ret; + + /* Initialization. */ + memset(&key, 0, sizeof(key)); + memset(&data, 0, sizeof(data)); + key.data = fruit; + key.size = strlen(fruit); + data.data = name; + data.size = strlen(name); + + for (;;) { + /* Begin the transaction. */ + if ((ret = txn_begin(dbenv, NULL, &tid, 0)) != 0) { + dbenv->err(dbenv, ret, "txn_begin"); + exit (1); + } + + /* Store the value. */ + switch (ret = db->put(db, tid, &key, &data, 0)) { + case 0: + /* Success: commit the change. 
*/ + if ((ret = txn_commit(tid, 0)) != 0) { + dbenv->err(dbenv, ret, "txn_commit"); + exit (1); + } + return; + case DB_LOCK_DEADLOCK: + /* Deadlock: retry the operation. */ + if ((ret = txn_abort(tid)) != 0) { + dbenv->err(dbenv, ret, "txn_abort"); + exit (1); + } + break; + default: + /* Error: run recovery. */ + dbenv->err(dbenv, ret, "dbc->put: %s/%s", fruit, name); + exit (1); + } + } +} + +void +add_color(DB_ENV *dbenv, DB *dbp, char *color, int increment) +{ + DBT key, data; + DB_TXN *tid; + int original, ret; + char buf[64]; + + /* Initialization. */ + memset(&key, 0, sizeof(key)); + key.data = color; + key.size = strlen(color); + memset(&data, 0, sizeof(data)); + data.flags = DB_DBT_MALLOC; + + for (;;) { + /* Begin the transaction. */ + if ((ret = txn_begin(dbenv, NULL, &tid, 0)) != 0) { + dbenv->err(dbenv, ret, "txn_begin"); + exit (1); + } + + /* + * Get the key. If it exists, we increment the value. If it + * doesn't exist, we create it. + */ + switch (ret = dbp->get(dbp, tid, &key, &data, 0)) { + case 0: + original = atoi(data.data); + break; + case DB_LOCK_DEADLOCK: + /* Deadlock: retry the operation. */ + if ((ret = txn_abort(tid)) != 0) { + dbenv->err(dbenv, ret, "txn_abort"); + exit (1); + } + continue; + case DB_NOTFOUND: + original = 0; + break; + default: + /* Error: run recovery. */ + dbenv->err( + dbenv, ret, "dbc->get: %s/%d", color, increment); + exit (1); + } + if (data.data != NULL) + free(data.data); + + /* Create the new data item. */ + (void)snprintf(buf, sizeof(buf), "%d", original + increment); + data.data = buf; + data.size = strlen(buf) + 1; + + /* Store the new value. */ + switch (ret = dbp->put(dbp, tid, &key, &data, 0)) { + case 0: + /* Success: commit the change. */ + if ((ret = txn_commit(tid, 0)) != 0) { + dbenv->err(dbenv, ret, "txn_commit"); + exit (1); + } + return; + case DB_LOCK_DEADLOCK: + /* Deadlock: retry the operation. 
*/ + if ((ret = txn_abort(tid)) != 0) { + dbenv->err(dbenv, ret, "txn_abort"); + exit (1); + } + break; + default: + /* Error: run recovery. */ + dbenv->err( + dbenv, ret, "dbc->put: %s/%d", color, increment); + exit (1); + } + } +} + +void +add_cat(DB_ENV *dbenv, DB *db, char *name, ...) +{ + va_list ap; + DBC *dbc; + DBT key, data; + DB_TXN *tid; + int ret; + char *s; + + /* Initialization. */ + memset(&key, 0, sizeof(key)); + memset(&data, 0, sizeof(data)); + key.data = name; + key.size = strlen(name); + +retry: /* Begin the transaction. */ + if ((ret = txn_begin(dbenv, NULL, &tid, 0)) != 0) { + dbenv->err(dbenv, ret, "txn_begin"); + exit (1); + } + + /* Delete any previously existing item. */ + switch (ret = db->del(db, tid, &key, 0)) { + case 0: + case DB_NOTFOUND: + break; + case DB_LOCK_DEADLOCK: + /* Deadlock: retry the operation. */ + if ((ret = txn_abort(tid)) != 0) { + dbenv->err(dbenv, ret, "txn_abort"); + exit (1); + } + goto retry; + default: + dbenv->err(dbenv, ret, "db->del: %s", name); + exit (1); + } + + /* Create a cursor. */ + if ((ret = db->cursor(db, tid, &dbc, 0)) != 0) { + dbenv->err(dbenv, ret, "db->cursor"); + exit (1); + } + + /* Append the items, in order. */ + va_start(ap, name); + while ((s = va_arg(ap, char *)) != NULL) { + data.data = s; + data.size = strlen(s); + switch (ret = dbc->c_put(dbc, &key, &data, DB_KEYLAST)) { + case 0: + break; + case DB_LOCK_DEADLOCK: + va_end(ap); + + /* Deadlock: retry the operation. */ + if ((ret = dbc->c_close(dbc)) != 0) { + dbenv->err( + dbenv, ret, "dbc->c_close"); + exit (1); + } + if ((ret = txn_abort(tid)) != 0) { + dbenv->err(dbenv, ret, "txn_abort"); + exit (1); + } + goto retry; + default: + /* Error: run recovery. */ + dbenv->err(dbenv, ret, "dbc->put: %s/%s", name, s); + exit (1); + } + } + va_end(ap); + + /* Success: commit the change. 
*/ + if ((ret = dbc->c_close(dbc)) != 0) { + dbenv->err(dbenv, ret, "dbc->c_close"); + exit (1); + } + if ((ret = txn_commit(tid, 0)) != 0) { + dbenv->err(dbenv, ret, "txn_commit"); + exit (1); + } +} + +void +usage() +{ + (void)fprintf(stderr, "usage: txnapp\n"); + exit(1); +} diff --git a/db/docs/ref/transapp/why.html b/db/docs/ref/transapp/why.html new file mode 100644 index 000000000..8fee13082 --- /dev/null +++ b/db/docs/ref/transapp/why.html @@ -0,0 +1,49 @@ +<!--$Id: why.so,v 1.1 2000/07/25 17:56:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Why transactions?</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Protected Applications</dl></h3></td> +<td width="1%"><a href="../../ref/transapp/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/term.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Why transactions?</h1> +<p>Perhaps the first question to answer is "Why transactions?" There are +a number of reasons for including transactional support in your +applications. The most common ones are: +<p><dl compact> +<p><dt>Recoverability<dd>Applications often need to ensure that, no matter how the system or +application fails, previously saved data is available the next time the +application runs. 
+<p><dt>Deadlock avoidance<dd>When multiple threads of control change the database at the same time, +there is usually the possibility of deadlock, that is, where each of +the threads of control owns a resource another thread wants, and so no +thread is able to make forward progress, all waiting for a resource. +Deadlocks are resolved by having one of the operations involved release +the resources it controls so the other operations can proceed. (The +operation releasing its resources usually just tries again later.) +Transactions are necessary so that any changes that were already made +to the database can be undone as part of releasing the held resources. +<p><dt>Atomicity<dd>Applications often need to make multiple changes to one or more +databases, but want to ensure that either all of the changes happen, or +none of them happen. Transactions guarantee that a group of changes +are atomic, that is, if the application or system fails, either all of +the changes to the databases will appear when the application next runs, +or none of them. +<p><dt>Repeatable reads<dd>Applications sometimes need to ensure that, while doing a group of +operations on a database, the value returned as a result of a database +retrieval doesn't change, that is, if you retrieve the same key more +than once, the data item will be the same each time. Transactions +guarantee this behavior. 
+</dl> +<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/term.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/transapp/writetest.txt b/db/docs/ref/transapp/writetest.txt new file mode 100644 index 000000000..b86c1b6ce --- /dev/null +++ b/db/docs/ref/transapp/writetest.txt @@ -0,0 +1,100 @@ +/* + * writetest -- + * + * $Id: writetest.txt,v 10.3 1999/11/19 17:21:06 bostic Exp $ + */ +#include <sys/types.h> + +#include <errno.h> +#include <fcntl.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/time.h> +#include <unistd.h> + +void usage __P((void)); + +int +main(argc, argv) + int argc; + char *argv[]; +{ + struct timeval start_time, end_time; + long usecs; + int bytes, ch, cnt, fd, ops; + char *fname, buf[100 * 1024]; + + bytes = 256; + fname = "testfile"; + ops = 1000; + while ((ch = getopt(argc, argv, "b:f:o:")) != EOF) + switch (ch) { + case 'b': + if ((bytes = atoi(optarg)) > sizeof(buf)) { + fprintf(stderr, + "max -b option %d\n", (int)sizeof(buf)); + exit (1); + } + break; + case 'f': + fname = optarg; + break; + case 'o': + if ((ops = atoi(optarg)) <= 0) { + fprintf(stderr, "illegal -o option value\n"); + exit (1); + } + break; + case '?': + default: + usage(); + } + argc -= optind; + argv += optind; + + (void)unlink(fname); + if ((fd = open(fname, O_RDWR | O_CREAT, 0666)) == -1) { + perror(fname); + exit (1); + } + + memset(buf, 0, bytes); + + printf("running: %d ops\n", ops); + + (void)gettimeofday(&start_time, NULL); + for (cnt = 0; cnt < ops; ++cnt) { + if (write(fd, buf, bytes) != bytes) { + fprintf(stderr, "write: %s\n", strerror(errno)); + exit (1); + } + if (lseek(fd, (off_t)0, SEEK_SET) == -1) { 
+ fprintf(stderr, "lseek: %s\n", strerror(errno)); + exit (1); + } + if (fsync(fd) != 0) { + fprintf(stderr, "fsync: %s\n", strerror(errno)); + exit (1); + } + } + (void)gettimeofday(&end_time, NULL); + + usecs = (end_time.tv_sec - start_time.tv_sec) * 1000000 + + end_time.tv_usec - start_time.tv_usec; + printf("Elapsed time: %ld.%06ld seconds\n", + usecs / 1000000, usecs % 1000000); + printf("%d ops: %7.2f ops per second\n", + ops, (float)1000000 * ops/usecs); + + (void)unlink(fname); + exit (0); +} + +void +usage() +{ + (void)fprintf(stderr, + "usage: testfile [-b bytes] [-f file] [-o ops]\n"); + exit(1); +} diff --git a/db/docs/ref/txn/config.html b/db/docs/ref/txn/config.html new file mode 100644 index 000000000..beb73859f --- /dev/null +++ b/db/docs/ref/txn/config.html @@ -0,0 +1,37 @@ +<!--$Id: config.so,v 10.14 2000/10/03 17:17:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Configuring transactions</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/txn/limits.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/txn/other.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Configuring transactions</h1> +<p>There is only a single parameter used in configuring transactions, the +<a href="../../api_c/env_open.html#DB_TXN_NOSYNC">DB_TXN_NOSYNC</a> flag. 
Passing the <a href="../../api_c/env_open.html#DB_TXN_NOSYNC">DB_TXN_NOSYNC</a> flag to +<a href="../../api_c/env_set_flags.html">DBENV->set_flags</a> when opening a transaction region means that the +log is no longer flushed synchronously during +transaction commit. +<p>This change will significantly increase application transactional +throughput. However, while transactions continue to +exhibit the ACI (atomicity, consistency and isolation) properties, they +no longer have D (durability). Database integrity is maintained, but +some number of the most recently committed +transactions may be undone during recovery instead of being redone. +<p>The application may also limit the number of simultaneous outstanding +transactions supported by the environment by calling the +<a href="../../api_c/env_set_tx_max.html">DBENV->set_tx_max</a> function. Once this limit is reached, additional calls to +<a href="../../api_c/txn_begin.html">txn_begin</a> fail until some active transactions complete. 
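<p>The durability trade-off of DB_TXN_NOSYNC can be made concrete with a toy model. The sketch below is not Berkeley DB code: <b>toy_log</b>, <b>toy_commit</b>, <b>toy_flush</b> and <b>toy_lost_on_crash</b> are hypothetical names invented for illustration, and the model assumes that flushing the log makes every earlier commit durable. It only counts how many acknowledged commits recovery could undo after a crash.

```c
/* Toy model only: none of these names are part of the Berkeley DB API. */
struct toy_log {
	int nosync;		/* non-zero models DB_TXN_NOSYNC being set */
	unsigned committed;	/* commits acknowledged to the application */
	unsigned durable;	/* commits whose log records reached disk */
};

/* Commit one transaction; flush the log unless nosync is set. */
void
toy_commit(struct toy_log *log)
{
	log->committed++;
	if (!log->nosync)
		log->durable = log->committed;
}

/* Model an explicit log flush: everything committed so far is durable. */
void
toy_flush(struct toy_log *log)
{
	log->durable = log->committed;
}

/* Number of acknowledged commits that recovery could undo after a crash. */
unsigned
toy_lost_on_crash(const struct toy_log *log)
{
	return (log->committed - log->durable);
}
```

<p>In this model, with <b>nosync</b> set to 0 the count stays at zero after every commit; with it set to 1 the count grows until the log is flushed by some other means.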
+<table><tr><td><br></td><td width="1%"><a href="../../ref/txn/limits.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/txn/other.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/txn/intro.html b/db/docs/ref/txn/intro.html new file mode 100644 index 000000000..557481509 --- /dev/null +++ b/db/docs/ref/txn/intro.html @@ -0,0 +1,86 @@ +<!--$Id: intro.so,v 10.14 2000/03/18 21:43:19 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Berkeley DB and transactions</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/mp/config.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/txn/nested.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Berkeley DB and transactions</h1> +<p>The transaction subsystem makes operations atomic, consistent, isolated, +and durable in the face of system and application failures. The subsystem +requires that the data be properly logged and locked in order to attain +these properties. 
Berkeley DB contains all the components necessary to +transaction-protect the Berkeley DB access methods and other forms of data may +be protected if they are logged and locked appropriately. +<p>The transaction subsystem is created, initialized, and opened by calls to +<a href="../../api_c/env_open.html">DBENV->open</a> with the <a href="../../api_c/env_open.html#DB_INIT_TXN">DB_INIT_TXN</a> flag specified. Note that +enabling transactions automatically enables logging, but does not enable +locking, as a single thread of control that needed atomicity and +recoverability would not require it. +<p>The <a href="../../api_c/txn_begin.html">txn_begin</a> function starts a transaction, returning an opaque +handle to a transaction. If the parent parameter to <a href="../../api_c/txn_begin.html">txn_begin</a> is +non-NULL, then the new transaction is a child of the designated parent +transaction. +<p>The <a href="../../api_c/txn_abort.html">txn_abort</a> function ends the designated transaction and causes +all updates performed by the transaction to be undone. The end result is +that the database is left in a state identical to the state that existed +prior to the <a href="../../api_c/txn_begin.html">txn_begin</a>. If the aborting transaction has any child +transactions associated with it (even ones that have already been +committed), they are also aborted. Any transactions that are unresolved +(i.e., neither committed nor aborted) when the application or system fails +are aborted during recovery. +<p>The <a href="../../api_c/txn_commit.html">txn_commit</a> function ends the designated transaction and makes +all the updates performed by the transaction permanent, even in the face +of application or system failure. If this is a parent transaction +committing, then all child transactions that individually committed or +had not been resolved are also committed. +<p>Transactions are identified by 32-bit unsigned integers. 
The ID +associated with any transaction can be obtained using the <a href="../../api_c/txn_id.html">txn_id</a> +function. If an application is maintaining information outside of Berkeley DB +that it wishes to transaction-protect, it should use this transaction ID +as the locking ID. +<p>The <a href="../../api_c/txn_checkpoint.html">txn_checkpoint</a> function causes a transaction checkpoint. A +checkpoint is performed relative to a specific log sequence number (LSN), +referred to as the checkpoint LSN. When a checkpoint completes +successfully, it means that all data buffers whose updates are described +by LSNs less than the checkpoint LSN have been written to disk. This, in +turn, means that the log records less than the checkpoint LSN are no +longer necessary for normal recovery (although they would be required for +catastrophic recovery should the database files be lost) and all log files +containing only records prior to the checkpoint LSN may be safely archived +and removed. +<p>It is possible that in order to complete a transaction checkpoint, it will +be necessary to write a buffer that is currently in use (i.e., is actively +being read or written by some transaction). In this case, +<a href="../../api_c/txn_checkpoint.html">txn_checkpoint</a> will not be able to write the buffer, as doing so +might cause an inconsistent version of the page to be written to disk, +and instead of completing successfully will return with an error code of +<a href="../../api_c/memp_fsync.html#DB_INCOMPLETE">DB_INCOMPLETE</a>. In such cases, the checkpoint can simply be +retried after a short delay. +<p>The interval between successive checkpoints is directly proportional to +the length of time required to run normal recovery. If the interval +between checkpoints is long, then a large number of updates that are +recorded in the log may not yet be written to disk and recovery may take +longer to run. 
If the interval is short, then data is being written to +disk more frequently, but the recovery time will be shorter. Often, the +checkpoint interval will be tuned for each specific application. +<p>The <a href="../../api_c/txn_stat.html">txn_stat</a> function returns information about the status of +the transaction subsystem. It is the programmatic interface used by the +<a href="../../utility/db_stat.html">db_stat</a> utility. +<p>The transaction system is closed by a call to <a href="../../api_c/env_close.html">DBENV->close</a>. +<p>Finally, the entire transaction system may be removed using the +<a href="../../api_c/env_remove.html">DBENV->remove</a> interface. +<table><tr><td><br></td><td width="1%"><a href="../../ref/mp/config.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/txn/nested.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/txn/limits.html b/db/docs/ref/txn/limits.html new file mode 100644 index 000000000..0ed978066 --- /dev/null +++ b/db/docs/ref/txn/limits.html @@ -0,0 +1,66 @@ +<!--$Id: limits.so,v 10.29 2001/01/10 17:33:53 margo Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Transaction limits</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Subsystem</dl></h3></td> +<td width="1%"><a 
href="../../ref/txn/nested.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/txn/config.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Transaction limits</h1> +<h3>Transaction IDs</h3> +<p>Transactions are identified uniquely by 32-bit unsigned integers. The +high-order bit of the transaction ID is reserved (and defined to be 1) +resulting in just over two billion unique transaction IDs. Each time +that recovery is run, the beginning transaction ID is reset with new +transactions being numbered starting from 1. This means that recovery +must be run at least once every two billion transactions. +<p>It is possible that some environments may need to be aware of this +limitation. Consider an application performing 600 transactions a second +for 15 hours a day. The transaction ID space will run out in roughly 66 +days: +<p><blockquote><pre>2^31 / (600 * 15 * 60 * 60) = 66</pre></blockquote> +<p>Doing only 100 transactions a second exhausts the transaction ID space +in roughly one year. +<p>The transaction ID name space is initialized each time +a database environment is created or recovered. If you +reach the end of the transaction ID name space, it must +be handled as if an application or system failure had +occurred. The most recently allocated transaction ID +is the <b>st_last_txnid</b> value in the transaction +statistics information, and is displayed by the +<a href="../../utility/db_stat.html">db_stat</a> utility. +<h3>Cursors</h3> +<p>When using transactions, cursors are localized to a single transaction. +That is, a cursor may not span transactions and must be opened and +closed within a single transaction. 
In addition, intermingling +transaction-protected cursor operations and non-transaction-protected +cursor operations on the same database in a single thread of control is +practically guaranteed to deadlock as the locks obtained for transactions +and non-transactions can conflict. +<h3>Multiple Threads of Control</h3> +<p>Since transactions must hold all their locks until commit, a single +transaction may accumulate a large number of long-term locks during its +lifetime. As a result, when two concurrently running transactions access +the same database, there is strong potential for conflict. While Berkeley +DB allows an application to have multiple outstanding transactions active +within a single thread of control, great care must be taken to ensure that +the transactions do not interfere with each other (e.g., attempt to obtain +conflicting locks on the same data). If two concurrently active +transactions in the same thread of control do encounter a lock conflict, +the thread of control will deadlock in such a manner that the deadlock +detector will be unable to resolve the problem. In this case, there is +no true deadlock, but because the transaction on which a transaction is +waiting is in the same thread of control, no forward progress can be made. 
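<p>Returning to the transaction ID arithmetic in the first section of this page: it is easy to rerun for other workloads. The following is a minimal sketch, assuming the 2^31 usable IDs stated above; the function name <b>days_until_txnid_exhaustion</b> is made up for this example.

```c
#include <stdint.h>

/* The high-order bit is reserved, leaving 2^31 usable transaction IDs. */
#define TXN_ID_SPACE ((uint64_t)1 << 31)

/*
 * Days of operation before the ID space is exhausted, given a steady
 * transaction rate and the number of hours the application runs per day.
 */
double
days_until_txnid_exhaustion(double txns_per_second, double hours_per_day)
{
	double txns_per_day;

	txns_per_day = txns_per_second * hours_per_day * 60.0 * 60.0;
	return ((double)TXN_ID_SPACE / txns_per_day);
}
```

<p>For 600 transactions a second over a 15-hour day this gives a little over 66 days, and for 100 a second just under 398 days, in line with the figures quoted above.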
+<table><tr><td><br></td><td width="1%"><a href="../../ref/txn/nested.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/txn/config.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/txn/nested.html b/db/docs/ref/txn/nested.html new file mode 100644 index 000000000..a635abf52 --- /dev/null +++ b/db/docs/ref/txn/nested.html @@ -0,0 +1,66 @@ +<!--$Id: nested.so,v 10.17 2000/12/31 19:26:22 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Nested transactions</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/txn/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/txn/limits.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Nested transactions</h1> +<p>Berkeley DB provides support for nested transactions. Nested transactions +allow an application to decompose a large or long-running transaction +into smaller units that may be independently aborted. +<p>Normally, when beginning a transaction, the application will pass a NULL +value for the parent argument to <a href="../../api_c/txn_begin.html">txn_begin</a>. 
If, however, the +parent argument is a DB_TXN handle, then the newly created +transaction will be treated as a nested transaction within the parent. +Transactions may nest arbitrarily deeply. For the purposes of this +discussion, transactions created with a parent identifier will be called +child transactions. +<p>Once a transaction becomes a parent, as long as any of its child +transactions are unresolved (i.e., they have neither committed nor +aborted), the parent may not issue any Berkeley DB calls except to begin more +child transactions or to commit or abort. That is, it may not issue +any access method or cursor calls. Once all of a parent's children have +committed or aborted, the parent may again request operations on its +own behalf. +<p>The semantics of nested transactions are as follows. When a child +transaction is begun, it inherits all the locks of its parent. This +means that the child will never block waiting on a lock held by its +parent. However, if a parent attempts to obtain locks after it has +begun a child, the parental locks can conflict with those held by a +child. Furthermore, locks held by two different children will also +conflict. To make this concrete, consider the following set of +transactions and lock acquisitions. +<p>Transaction T1 is the parent transaction. It acquires an exclusive lock +on item A and then begins two child transactions, C1 and C2. C1 also +wishes to acquire a write lock on A; this succeeds. Now, let's say that +C1 acquires a write lock on B. If C2 now attempts to obtain a lock on +B, it will block. However, let's now assume that C1 commits. Its locks +are anti-inherited, which means they are now given to T1. At this +point, either T1 or C2 is allowed to acquire a lock on B. If, however, +transaction C1 aborts instead of committing, then its locks are released. Future requests by +T1 or C2 will also succeed, but they will be obtaining new locks as +opposed to piggy-backing off a lock already held by T1. 
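<p>The lock rules in this walk-through can be modelled in a few lines. The sketch below is a toy, not the Berkeley DB lock manager: it handles exclusive locks only, and <b>struct txn</b>, <b>try_lock</b>, <b>commit_child</b> and <b>abort_txn</b> are hypothetical names. It assumes a request is granted only when every existing lock on the item is held by the requester or one of its ancestors, that a committing child hands its locks to its parent, and that an aborting transaction simply drops its locks.

```c
#include <stddef.h>

#define MAX_LOCKS 16

struct txn {
	struct txn *parent;	/* NULL for a top-level transaction */
};

struct lock {
	char item;		/* the locked item, e.g. 'A' or 'B' */
	struct txn *holder;
};

struct lock_table {
	struct lock locks[MAX_LOCKS];
	size_t n;
};

/* Return 1 if a is t itself or one of t's ancestors. */
static int
is_self_or_ancestor(const struct txn *a, const struct txn *t)
{
	for (; t != NULL; t = t->parent)
		if (t == a)
			return (1);
	return (0);
}

/* Return 1 if the lock is granted, 0 if the caller would block. */
int
try_lock(struct lock_table *lt, struct txn *t, char item)
{
	size_t i;

	for (i = 0; i < lt->n; i++)
		if (lt->locks[i].item == item &&
		    !is_self_or_ancestor(lt->locks[i].holder, t))
			return (0);
	if (lt->n == MAX_LOCKS)		/* toy table is full */
		return (0);
	lt->locks[lt->n].item = item;
	lt->locks[lt->n].holder = t;
	lt->n++;
	return (1);
}

/* Child commit: the child's locks are handed to its parent. */
void
commit_child(struct lock_table *lt, struct txn *child)
{
	size_t i;

	for (i = 0; i < lt->n; i++)
		if (lt->locks[i].holder == child)
			lt->locks[i].holder = child->parent;
}

/* Abort: every lock held by the transaction is dropped. */
void
abort_txn(struct lock_table *lt, struct txn *t)
{
	size_t i, kept;

	for (i = 0, kept = 0; i < lt->n; i++)
		if (lt->locks[i].holder != t)
			lt->locks[kept++] = lt->locks[i];
	lt->n = kept;
}
```

<p>Replaying the example with this model, C2 is refused a lock on B while C1 holds it and is granted one once C1 has committed and its locks have passed to T1.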
+<p>Child transactions are entirely subservient to their parent transaction. +They may abort, undoing their operations regardless of the eventual fate +of the parent. However, even if a child transaction commits, if its +parent transaction is eventually aborted, the child's changes are undone +and the child's transaction is effectively aborted. Any child +transactions that are not yet resolved when the parent commits or aborts +are resolved based on the parent's resolution, committing if the parent +commits and aborting if the parent aborts. Any child transactions that +are not yet resolved when the parent prepares are also prepared. +<table><tr><td><br></td><td width="1%"><a href="../../ref/txn/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/txn/limits.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/txn/other.html b/db/docs/ref/txn/other.html new file mode 100644 index 000000000..e4678c2cb --- /dev/null +++ b/db/docs/ref/txn/other.html @@ -0,0 +1,67 @@ +<!--$Id: other.so,v 10.16 2000/03/18 21:43:19 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Transactions and non-Berkeley DB applications</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Transaction Subsystem</dl></h3></td> +<td width="1%"><a href="../../ref/txn/config.html"><img 
src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/rpc/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Transactions and non-Berkeley DB applications</h1> +<p>It is possible to use the locking, logging and transaction subsystems +of Berkeley DB to provide transaction semantics on objects other than those +described by the Berkeley DB access methods. In these cases, the application +will need more explicit customization of the subsystems as well as the +development of appropriate data-structure-specific recovery functions. +<p>For example, consider an application that provides transaction semantics +on data stored in plain UNIX files accessed using the POSIX read and write +system calls. The operations for which transaction protection is desired +are bracketed by calls to <a href="../../api_c/txn_begin.html">txn_begin</a> and <a href="../../api_c/txn_commit.html">txn_commit</a>. +<p>Before data are referenced, the application must make a call to the lock +manager, <a href="../../api_c/lock_get.html">lock_get</a>, for a lock of the appropriate type (e.g., +read) on the object being locked. The object might be a page in the file, +a byte, a range of bytes, or some key. It is up to the application to +ensure that appropriate locks are acquired. Before a write is performed, +the application should acquire a write lock on the object, by making an +appropriate call to the lock manager, <a href="../../api_c/lock_get.html">lock_get</a>. Then, the +application should make a call to the log manager, <a href="../../api_c/log_put.html">log_put</a>, to +record enough information to redo the operation in case of failure after +commit and to undo the operation in case of abort. 
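<p>What "enough information to redo and to undo" might look like is sketched below for the simplest case, an overwrite of a byte range. This is an assumption-laden illustration rather than a Berkeley DB format: the record layout and the names <b>app_log_rec</b>, <b>app_log_build</b>, <b>app_redo</b> and <b>app_undo</b> are hypothetical, an in-memory buffer stands in for the file, and the calls to the lock and log managers are deliberately left out.

```c
#include <stddef.h>
#include <string.h>

#define APP_REC_MAX	32	/* largest update this sketch can log */
#define APP_OP_WRITE	1	/* the only operation code used here */

/* Hypothetical application-defined log record for a byte-range overwrite. */
struct app_log_rec {
	int opcode;			/* which operation was performed */
	int file_id;			/* which file was modified */
	size_t offset;			/* where in the file */
	size_t len;			/* how many bytes */
	unsigned char undo[APP_REC_MAX];	/* before-image, for abort */
	unsigned char redo[APP_REC_MAX];	/* after-image, for recovery */
};

/*
 * Build the record before the write is issued, capturing the current
 * contents as the undo image.  Returns 0 on success, -1 if too large.
 */
int
app_log_build(struct app_log_rec *rec, int file_id, const unsigned char *file,
    size_t offset, const unsigned char *newdata, size_t len)
{
	if (len > APP_REC_MAX)
		return (-1);
	rec->opcode = APP_OP_WRITE;
	rec->file_id = file_id;
	rec->offset = offset;
	rec->len = len;
	memcpy(rec->undo, file + offset, len);
	memcpy(rec->redo, newdata, len);
	return (0);
}

/* Re-apply the update described by the record. */
void
app_redo(const struct app_log_rec *rec, unsigned char *file)
{
	memcpy(file + rec->offset, rec->redo, rec->len);
}

/* Restore the bytes that were present before the update. */
void
app_undo(const struct app_log_rec *rec, unsigned char *file)
{
	memcpy(file + rec->offset, rec->undo, rec->len);
}
```

<p>In a real application the filled-in record would be handed to the log manager before the write system call is issued, and the redo and undo routines would be driven by the application's recovery function.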
+<p>It is important, when designing applications that will use the log +subsystem, to remember that the application is responsible for providing +any necessary structure to the log record. For example, the application +must understand what part of the log record is an operation code, what +part identifies the file being modified, what part is redo information, +and what part is undo information. +<p>After the log message is written, the application may issue the write +system call. After all requests are issued, the application may call +<a href="../../api_c/txn_commit.html">txn_commit</a>. When <a href="../../api_c/txn_commit.html">txn_commit</a> returns, the caller is +guaranteed that all necessary log writes have been written to disk. +<p>At any time, the application may call <a href="../../api_c/txn_abort.html">txn_abort</a>, which will result +in restoration of the database to a consistent pre-transaction state. +(The application may specify its own recovery function for this purpose +using the <a href="../../api_c/env_set_tx_recover.html">DBENV->set_tx_recover</a> function. The recovery function must be +able to either re-apply or undo the update depending on the context, for +each different type of log record.) +<p>If the application should crash, the recovery process uses the log to +restore the database to a consistent state. +<p>The <a href="../../api_c/txn_prepare.html">txn_prepare</a> function provides the core functionality to +implement distributed transactions, but it does not manage the +notification of distributed transaction managers. The caller is +responsible for issuing <a href="../../api_c/txn_prepare.html">txn_prepare</a> calls to all sites +participating in the transaction. If all responses are positive, the +caller can issue a <a href="../../api_c/txn_commit.html">txn_commit</a>. If any of the responses are +negative, the caller should issue a <a href="../../api_c/txn_abort.html">txn_abort</a>. 
In general, the +<a href="../../api_c/txn_prepare.html">txn_prepare</a> call requires that the transaction log be flushed to +disk. +<table><tr><td><br></td><td width="1%"><a href="../../ref/txn/config.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/rpc/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.2.0/convert.html b/db/docs/ref/upgrade.2.0/convert.html new file mode 100644 index 000000000..ad5685368 --- /dev/null +++ b/db/docs/ref/upgrade.2.0/convert.html @@ -0,0 +1,74 @@ +<!--$Id: convert.so,v 11.6 2000/03/18 21:43:19 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 2.0: converting applications</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.2.0/system.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.2.0/disk.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 2.0: converting applications</h1> +<p>Mapping the Berkeley DB 1.85 functionality into Berkeley DB version 2 is almost always +simple. 
The manual page <a href="../../api_c/db_open.html">DB->open</a> replaces the Berkeley DB 1.85 manual +pages <b>dbopen</b>(3), <b>btree</b>(3), <b>hash</b>(3) and +<b>recno</b>(3). You should be able to convert each 1.85 function +call into a Berkeley DB version 2 function call using just the <a href="../../api_c/db_open.html">DB->open</a> +documentation. +<p>Some guidelines and things to watch out for: +<p><ol> +<p><li>Most access method functions have exactly the same semantics as in Berkeley DB +1.85, although the arguments to the functions have changed in some cases. +To get your code to compile, the most common change is to add the +transaction ID as an argument (NULL, since Berkeley DB 1.85 did not support +transactions.) +<p><li>You must always initialize DBT structures to zero before using them with +any Berkeley DB version 2 function. (They do not normally have to be +reinitialized each time, only when they are first allocated. Do this by +declaring the DBT structure external or static, or by calling the C +library routine <b>bzero</b>(3) or <b>memset</b>(3).) +<p><li>The error returns are completely different in the two versions. In Berkeley DB +1.85, < 0 meant an error, and > 0 meant a minor Berkeley DB exception. +In Berkeley DB 2.0, > 0 means an error (the Berkeley DB version 2 functions +return <b>errno</b> on error) and < 0 means a Berkeley DB exception. +See <a href="../../ref/program/errorret.html">Error Returns to Applications</a> +for more information. +<p><li>The Berkeley DB 1.85 DB->seq function has been replaced by cursors in Berkeley DB +version 2. The semantics are approximately the same, but cursors require +the creation of an extra object (the DBC object), which is then used to +access the database. 
+<p>Specifically, the partial key match and range search functionality of the +R_CURSOR flag in DB->seq has been replaced by the +<a href="../../api_c/dbc_get.html#DB_SET_RANGE">DB_SET_RANGE</a> flag in <a href="../../api_c/dbc_get.html">DBcursor->c_get</a>. +<p><li>In version 2 of the Berkeley DB library, additions or deletions into Recno +(fixed and variable-length record) databases no longer automatically +logically renumber all records after the add/delete point, by default. +The default behavior is that deleting records does not cause subsequent +records to be renumbered, and it is an error to attempt to add new records +between records already in the database. Applications wanting the +historic Recno access method semantics should call the +<a href="../../api_c/db_set_flags.html">DB->set_flags</a> function with the <a href="../../api_c/db_set_flags.html#DB_RENUMBER">DB_RENUMBER</a> flag. +<p><li>Opening a database in Berkeley DB version 2 is a much heavier-weight operation +than it was in Berkeley DB 1.85. Therefore, if your historic applications were +written to open a database, perform a single operation, and close the +database, you may observe performance degradation. In most cases, this +is due to the expense of creating the environment upon each open. While +we encourage restructuring your application to avoid repeated opens and +closes, you can probably recover most of the lost performance by simply +using a persistent environment across invocations. +</ol> +<p>While simply converting Berkeley DB 1.85 function calls to Berkeley DB version 2 +function calls will work, we recommend that you eventually reconsider your +application's interface to the Berkeley DB database library in light of the +additional functionality supplied by Berkeley DB version 2, as it is likely to +result in enhanced application performance. 
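<p>The error-return guideline above is the one most likely to be misread during a conversion, so a small helper can make the version 2 convention explicit. This is only a sketch: <b>classify_v2_return</b>, <b>describe_v2_return</b> and the <b>ret_class</b> enumeration are hypothetical names, not part of the Berkeley DB interface, and the helper assumes nothing beyond the sign convention described above.

```c
#include <errno.h>
#include <string.h>

/* How a Berkeley DB version 2 return value should be interpreted. */
enum ret_class { RET_OK, RET_SYSTEM_ERROR, RET_DB_EXCEPTION };

/* 0 is success, > 0 is an errno value, < 0 is a Berkeley DB exception. */
enum ret_class
classify_v2_return(int ret)
{
	if (ret == 0)
		return (RET_OK);
	return (ret > 0 ? RET_SYSTEM_ERROR : RET_DB_EXCEPTION);
}

/* A short description suitable for a log message. */
const char *
describe_v2_return(int ret)
{
	switch (classify_v2_return(ret)) {
	case RET_OK:
		return ("success");
	case RET_SYSTEM_ERROR:
		return (strerror(ret));	/* positive values are errno codes */
	default:
		return ("Berkeley DB-specific condition");
	}
}
```

<p>Code converted from 1.85 that tests for a negative return to detect failure would, under this convention, miss every system error, which is why each call site deserves a second look.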
+<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.2.0/system.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.2.0/disk.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.2.0/disk.html b/db/docs/ref/upgrade.2.0/disk.html new file mode 100644 index 000000000..8e7aeabc7 --- /dev/null +++ b/db/docs/ref/upgrade.2.0/disk.html @@ -0,0 +1,27 @@ +<!--$Id: disk.so,v 11.6 2000/12/05 20:36:25 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 2.0: upgrade requirements</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.2.0/convert.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 2.0: upgrade requirements</h1> +<p>You will need to upgrade your on-disk databases, as all access method +database formats changed in the Berkeley DB 2.0 release. 
For information on +converting databases from Berkeley DB 1.85 to Berkeley DB 2.0, see the +<a href="../../utility/db_dump.html">db_dump185</a> and <a href="../../utility/db_load.html">db_load</a> documentation. As database +environments did not exist prior to the 2.0 release, there is no +question of upgrading existing database environments. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.2.0/convert.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.2.0/intro.html b/db/docs/ref/upgrade.2.0/intro.html new file mode 100644 index 000000000..1bebc81cb --- /dev/null +++ b/db/docs/ref/upgrade.2.0/intro.html @@ -0,0 +1,32 @@ +<!--$Id: intro.so,v 11.8 2000/12/21 18:33:44 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 2.0: introduction</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade/process.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.2.0/system.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 
align=center>Release 2.0: introduction</h1> +<p>The following pages describe how to upgrade applications coded against +the Berkeley DB 1.85 and 1.86 release interfaces to the Berkeley DB 2.0 release +interfaces. They do not describe how to upgrade to the current Berkeley DB +release interfaces. +<p>It is not difficult to upgrade Berkeley DB 1.85 applications to use the Berkeley DB +version 2 library. The Berkeley DB version 2 library has a Berkeley DB 1.85 +compatibility API, which you can use by either recompiling your +application's source code or by relinking its object files against the +version 2 library. The underlying databases must be converted, however, +as the Berkeley DB version 2 library has a different underlying database format. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade/process.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.2.0/system.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.2.0/system.html b/db/docs/ref/upgrade.2.0/system.html new file mode 100644 index 000000000..60a11c9bd --- /dev/null +++ b/db/docs/ref/upgrade.2.0/system.html @@ -0,0 +1,84 @@ +<!--$Id: system.so,v 11.5 2000/03/18 21:43:20 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 2.0: system integration</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB 
Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.2.0/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.2.0/convert.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 2.0: system integration</h1> +<p><ol> +<p><li>It is possible to maintain both the Berkeley DB 1.85 and Berkeley DB version 2 +libraries on your system. However, the <b>db.h</b> include file that +was distributed with Berkeley DB 1.85 is not compatible with the <b>db.h</b> +file distributed with Berkeley DB version 2, so you will have to install them +in different locations. In addition, both the Berkeley DB 1.85 and Berkeley DB +version 2 libraries are named <b>libdb.a</b>. +<p>As the Berkeley DB 1.85 library did not have an installation target in the +Makefile, there's no way to know exactly where it was installed on the +system. In addition, many vendors included it in the C library instead +of as a separate library, and so it may actually be part of libc and the +<b>db.h</b> include file may be installed in <b>/usr/include</b>. +<p>For these reasons, the simplest way to maintain both libraries is to +install Berkeley DB version 2 in a completely separate area of your system. +The Berkeley DB version 2 installation process allows you to install into a +standalone directory hierarchy on your system. See the +<a href="../../ref/build_unix/intro.html">Building for UNIX systems</a> +documentation for more information and instructions on how to install the +Berkeley DB version 2 library, include files and documentation into specific +locations. +<p><li>Alternatively, you can replace Berkeley DB 1.85 on your system with Berkeley DB +version 2. 
In this case, you'll probably want to install Berkeley DB version +2 in the normal place on your system, wherever that may be, and delete +the Berkeley DB 1.85 include files, manual pages and libraries. +<p>To replace 1.85 with version 2, you must either convert your 1.85 +applications to use the version 2 API or build the Berkeley DB version 2 library +to include Berkeley DB 1.85 interface compatibility code. Whether converting +your applications to use the version 2 interface or using the version 1.85 +compatibility API, you will need to recompile or relink your 1.85 +applications, and you must convert any persistent application databases +to the Berkeley DB version 2 database formats. +<p>If you want to recompile your Berkeley DB 1.85 applications, you will have to +change them to include the file <b>db_185.h</b> instead of +<b>db.h</b>. (The <b>db_185.h</b> file is automatically installed +during the Berkeley DB version 2 installation process.) You can then recompile +the applications, linking them against the Berkeley DB version 2 library. +<p>For more information on compiling the Berkeley DB 1.85 compatibility code into +the Berkeley DB version 2 library, see <a href="../../ref/build_unix/intro.html">Building for UNIX platforms</a>. +<p>For more information on converting databases from the Berkeley DB 1.85 formats +to the Berkeley DB version 2 formats, see the <a href="../../utility/db_dump.html">db_dump185</a> and +<a href="../../utility/db_load.html">db_load</a> documentation. +<p><li>Finally, although we certainly do not recommend it, it is possible to +load both Berkeley DB 1.85 and Berkeley DB version 2 into the same library. +Similarly, it is possible to use both Berkeley DB 1.85 and Berkeley DB version 2 +within a single application, although it is not possible to use them from +within the same file. 
+<p>The name space in Berkeley DB version 2 has been changed from that of previous +Berkeley DB versions, notably version 1.85, for portability and consistency +reasons. The only name collisions in the two libraries are the names used +by the historic <a href="../../api_c/dbm.html">dbm</a>, <a href="../../api_c/dbm.html">ndbm</a> and <a href="../../api_c/hsearch.html">hsearch</a> interfaces, +and the Berkeley DB 1.85 compatibility interfaces in the Berkeley DB version 2 +library. +<p>If you are loading both Berkeley DB 1.85 and Berkeley DB version 2 into a single +library, remove the historic interfaces from one of the two library +builds, and configure the Berkeley DB version 2 build to not include the Berkeley DB +1.85 compatibility API; otherwise, you could have name collisions and undefined +behavior. This can be done by editing the library Makefiles and +reconfiguring and rebuilding the Berkeley DB version 2 library. Obviously, if +you use the historic interfaces, you will get the version in the library +from which you did not remove them. Similarly, you will not be able to +access Berkeley DB version 2 files using the Berkeley DB 1.85 compatibility interface, +since you have removed that from the library as well. 
+</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.2.0/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.2.0/convert.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.2.0/toc.html b/db/docs/ref/upgrade.2.0/toc.html new file mode 100644 index 000000000..68502be59 --- /dev/null +++ b/db/docs/ref/upgrade.2.0/toc.html @@ -0,0 +1,20 @@ +<!--$Id: toc.so,v 11.2 2000/12/05 20:36:25 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB: Upgrading Berkeley DB 1.XX applications to Berkeley DB 2.0</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<h1 align=center>Upgrading Berkeley DB 1.XX applications to Berkeley DB 2.0</h1> +<ol> +<li><a href="intro.html">Release 2.0: introduction</a> +<li><a href="system.html">Release 2.0: system integration</a> +<li><a href="convert.html">Release 2.0: converting applications</a> +<li><a href="disk.html">Release 2.0: upgrade requirements</a> +</ol> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/close.html b/db/docs/ref/upgrade.3.0/close.html new file mode 100644 index 000000000..620e4babb --- /dev/null +++ b/db/docs/ref/upgrade.3.0/close.html @@ -0,0 +1,34 @@ +<!--$Id: close.so,v 11.9 2000/07/25 16:59:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> 
+<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: DB->sync and DB->close</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/stat.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/lock_put.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: DB->sync and DB->close</h1> +<p>In previous Berkeley DB releases, the <a href="../../api_c/db_close.html">DB->close</a> and <a href="../../api_c/db_sync.html">DB->sync</a> functions +discarded any return of <a href="../../api_c/memp_fsync.html#DB_INCOMPLETE">DB_INCOMPLETE</a> from the underlying buffer +pool interfaces, and returned success to their callers. (The +<a href="../../api_c/memp_fsync.html#DB_INCOMPLETE">DB_INCOMPLETE</a> error is returned if the buffer pool functions +are unable to flush all of the database's dirty blocks from the pool. +This often happens if another thread is reading or writing the database's +pages in the pool.) +<p>In the 3.X release, <a href="../../api_c/db_sync.html">DB->sync</a> and <a href="../../api_c/db_close.html">DB->close</a> will return +<a href="../../api_c/memp_fsync.html#DB_INCOMPLETE">DB_INCOMPLETE</a> to the application. 
The best solution is to not +call <a href="../../api_c/db_sync.html">DB->sync</a> and specify the <a href="../../api_c/db_close.html#DB_NOSYNC">DB_NOSYNC</a> flag to the +<a href="../../api_c/db_close.html">DB->close</a> function when multiple threads are expected to be accessing the +database. Alternatively, the caller can ignore any error return of +<a href="../../api_c/memp_fsync.html#DB_INCOMPLETE">DB_INCOMPLETE</a>. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/stat.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/lock_put.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/cxx.html b/db/docs/ref/upgrade.3.0/cxx.html new file mode 100644 index 000000000..7f6c1ab7e --- /dev/null +++ b/db/docs/ref/upgrade.3.0/cxx.html @@ -0,0 +1,31 @@ +<!--$Id: cxx.so,v 11.5 2000/03/18 21:43:20 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: additional C++ changes</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/db_cxx.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/java.html"><img src="../../images/next.gif" 
alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: additional C++ changes</h1> +<p>The Db::set_error_model method is gone. The way to change the C++ API to +return errors rather than throw exceptions is via a flag on the DbEnv or +Db constructor. For example: +<p><blockquote><pre>int dberr; +DbEnv *dbenv = new DbEnv(DB_CXX_NO_EXCEPTIONS);</pre></blockquote> +<p>creates an environment that will never throw exceptions, and method +returns should be checked instead. +<p>There are a number of smaller changes to the API that bring the C, C++ +and Java APIs much closer in terms of functionality and usage. Please +refer to the pages for upgrading C applications for further details. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/db_cxx.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/java.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/db.html b/db/docs/ref/upgrade.3.0/db.html new file mode 100644 index 000000000..a086b589e --- /dev/null +++ b/db/docs/ref/upgrade.3.0/db.html @@ -0,0 +1,48 @@ +<!--$Id: db.so,v 11.9 2000/12/01 17:57:34 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: the DB structure</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB 
Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/xa.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/dbinfo.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: the DB structure</h1> +<p>The DB structure is now opaque for applications in the Berkeley DB 3.0 +release. Accesses to any fields within that structure by the application +should be replaced with method calls. The following example illustrates +this using the historic type structure field. In the Berkeley DB 2.X releases, +applications could find the type of an underlying database using code +similar to the following: +<p><blockquote><pre>DB *db; +DBTYPE type; +<p> + type = db->type;</pre></blockquote> +<p>In the Berkeley DB 3.X releases, this should be done using the +<a href="../../api_c/db_get_type.html">DB->get_type</a> method, as follows: +<p><blockquote><pre>DB *db; +DBTYPE type; +<p> + type = db->get_type(db);</pre></blockquote> +<p>The following table lists the DB fields previously used by +applications and the methods that should now be used to get or set them. 
+<p><table border=1 align=center> +<tr><th>DB field</th><th>Berkeley DB 3.X method</th></tr> +<tr><td>byteswapped</td><td><a href="../../api_c/db_get_byteswapped.html">DB->get_byteswapped</a></td></tr> +<tr><td>db_errcall</td><td><a href="../../api_c/db_set_errcall.html">DB->set_errcall</a></td></tr> +<tr><td>db_errfile</td><td><a href="../../api_c/db_set_errfile.html">DB->set_errfile</a></td></tr> +<tr><td>db_errpfx</td><td><a href="../../api_c/db_set_errpfx.html">DB->set_errpfx</a></td></tr> +<tr><td>db_paniccall</td><td><a href="../../api_c/db_set_paniccall.html">DB->set_paniccall</a></td></tr> +<tr><td>type</td><td><a href="../../api_c/db_get_type.html">DB->get_type</a></td></tr> +</table> +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/xa.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/dbinfo.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/db_cxx.html b/db/docs/ref/upgrade.3.0/db_cxx.html new file mode 100644 index 000000000..e3a794e38 --- /dev/null +++ b/db/docs/ref/upgrade.3.0/db_cxx.html @@ -0,0 +1,47 @@ +<!--$Id: db_cxx.so,v 11.9 2000/03/22 22:02:14 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: the Db class for C++ and Java</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley 
DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/dbenv_cxx.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/cxx.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: the Db class for C++ and Java</h1> +<p>The static Db::open method and the DbInfo class have been removed in the +Berkeley DB 3.0 release. The way to open a database file is to use the new Db +constructor with two arguments, followed by set_XXX methods to configure +the Db object, and finally a call to the new (nonstatic) Db::open(). In +comparing the Berkeley DB 3.0 release open method with the 2.X static open +method, the second argument is new. It is a database name, which can +be null. The DbEnv argument has been removed, as the environment is now +specified in the constructor. The open method no longer returns a Db, +since it operates on one. 
+<p>Here's a C++ example opening a Berkeley DB database using the 2.X interface: +<p><blockquote><pre>// Note: by default, errors are thrown as exceptions +Db *table; +Db::open("lookup.db", DB_BTREE, DB_CREATE, 0644, dbenv, 0, &table);</pre></blockquote> +<p>In the Berkeley DB 3.0 release, this code would be written as: +<p><blockquote><pre>// Note: by default, errors are thrown as exceptions +Db *table = new Db(dbenv, 0); +table->open("lookup.db", NULL, DB_BTREE, DB_CREATE, 0644);</pre></blockquote> +<p>Here's a Java example opening a Berkeley DB database using the 2.X interface: +<p><blockquote><pre>// Note: errors are thrown as exceptions +Db table = Db.open("lookup.db", Db.DB_BTREE, Db.DB_CREATE, 0644, dbenv, 0);</pre></blockquote> +<p>In the Berkeley DB 3.0 release, this code would be written as: +<p><blockquote><pre>// Note: errors are thrown as exceptions +Db table = new Db(dbenv, 0); +table.open("lookup.db", null, Db.DB_BTREE, Db.DB_CREATE, 0644);</pre></blockquote> +<p>Note that if the dbenv argument is null, the database will not exist +within an environment. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/dbenv_cxx.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/cxx.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/dbenv.html b/db/docs/ref/upgrade.3.0/dbenv.html new file mode 100644 index 000000000..08b6ec149 --- /dev/null +++ b/db/docs/ref/upgrade.3.0/dbenv.html @@ -0,0 +1,68 @@ +<!--$Id: dbenv.so,v 11.9 2000/03/18 21:43:20 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: the DB_ENV structure</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/func.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/open.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: the DB_ENV structure</h1> +<p>The DB_ENV structure is now opaque for applications in the Berkeley DB +3.0 release. Accesses to any fields within that structure by the +application should be replaced with method calls. The following example +illustrates this using the historic db_errpfx structure field. 
In the Berkeley DB +2.X releases, applications set error prefixes using code similar to the +following: +<p><blockquote><pre>DB_ENV *dbenv; +<p> + dbenv->db_errpfx = "my prefix";</pre></blockquote> +<p>In the Berkeley DB 3.X releases, this should be done using the +<a href="../../api_c/env_set_errpfx.html">DBENV->set_errpfx</a> method, as follows: +<p><blockquote><pre>DB_ENV *dbenv; +<p> + dbenv->set_errpfx(dbenv, "my prefix");</pre></blockquote> +<p>The following table lists the DB_ENV fields previously used by +applications and the methods that should now be used to set them. +<p><table border=1 align=center> +<tr><th>DB_ENV field</th><th>Berkeley DB 3.X method</th></tr> +<tr><td>db_errcall</td><td><a href="../../api_c/env_set_errcall.html">DBENV->set_errcall</a></td></tr> +<tr><td>db_errfile</td><td><a href="../../api_c/env_set_errfile.html">DBENV->set_errfile</a></td></tr> +<tr><td>db_errpfx</td><td><a href="../../api_c/env_set_errpfx.html">DBENV->set_errpfx</a></td></tr> +<tr><td>db_lorder</td><td>This field was removed from the DB_ENV structure in the Berkeley DB +3.0 release as no application should have ever used it. 
Any code using +it should be evaluated for potential bugs.</td></tr> +<tr><td>db_paniccall</td><td><a href="../../api_c/env_set_paniccall.html">DBENV->set_paniccall</a></td></tr> +<tr><td>db_verbose</td><td><a href="../../api_c/env_set_verbose.html">DBENV->set_verbose</a> +<p>Note: the db_verbose field was a simple boolean toggle; the +<a href="../../api_c/env_set_verbose.html">DBENV->set_verbose</a> method takes arguments that specify exactly +which verbose messages are desired.</td></tr> +<tr><td>lg_max</td><td><a href="../../api_c/env_set_lg_max.html">DBENV->set_lg_max</a></td></tr> +<tr><td>lk_conflicts</td><td><a href="../../api_c/env_set_lk_conflicts.html">DBENV->set_lk_conflicts</a></td></tr> +<tr><td>lk_detect</td><td><a href="../../api_c/env_set_lk_detect.html">DBENV->set_lk_detect</a></td></tr> +<tr><td>lk_max</td><td><a href="../../api_c/env_set_lk_max.html">DBENV->set_lk_max</a></td></tr> +<tr><td>lk_modes</td><td><a href="../../api_c/env_set_lk_conflicts.html">DBENV->set_lk_conflicts</a></td></tr> +<tr><td>mp_mmapsize</td><td><a href="../../api_c/env_set_mp_mmapsize.html">DBENV->set_mp_mmapsize</a></td></tr> +<tr><td>mp_size</td><td><a href="../../api_c/env_set_cachesize.html">DBENV->set_cachesize</a> +<p>Note: the <a href="../../api_c/env_set_cachesize.html">DBENV->set_cachesize</a> function takes additional arguments. +Setting both the second argument (the number of GB in the pool) and the +last argument (the number of memory pools to create) to 0 will result in +behavior that is backward compatible with previous Berkeley DB releases.</td></tr> +<tr><td>tx_info</td><td>This field was used by applications as an argument to the transaction +subsystem functions. 
As those functions take references to a +DB_ENV structure as arguments in the Berkeley DB 3.0 release, it should +no longer be used by any application.</td></tr> +<tr><td>tx_max</td><td><a href="../../api_c/env_set_tx_max.html">DBENV->set_tx_max</a></td></tr> +<tr><td>tx_recover</td><td><a href="../../api_c/env_set_tx_recover.html">DBENV->set_tx_recover</a></td></tr> +</table> +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/func.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/open.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/dbenv_cxx.html b/db/docs/ref/upgrade.3.0/dbenv_cxx.html new file mode 100644 index 000000000..8839d6408 --- /dev/null +++ b/db/docs/ref/upgrade.3.0/dbenv_cxx.html @@ -0,0 +1,72 @@ +<!--$Id: dbenv_cxx.so,v 11.10 2000/12/01 17:59:32 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: the DbEnv class for C++ and Java</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/value_set.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/db_cxx.html"><img src="../../images/next.gif" 
alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: the DbEnv class for C++ and Java</h1> +<p>The DbEnv::appinit() method and two constructors for the DbEnv class are +gone. There is now a single way to create and initialize the environment. +The way to create an environment is to use the new DbEnv constructor with +one argument. After this call, the DbEnv can be configured with various +set_XXX methods. Finally, a call to DbEnv::open is made to initialize +the environment. +<p>Here's a C++ example creating a Berkeley DB environment using the 2.X interface: +<p><blockquote><pre>int dberr; +DbEnv *dbenv = new DbEnv(); +<p> +dbenv->set_error_stream(&cerr); +dbenv->set_errpfx("myprog"); +<p> +if ((dberr = dbenv->appinit("/database/home", + NULL, DB_CREATE | DB_INIT_LOCK | DB_INIT_MPOOL)) != 0) { + cerr << "failure: " << strerror(dberr); + exit (1); +}</pre></blockquote> +<p>In the Berkeley DB 3.0 release, this code would be written as: +<p><blockquote><pre>int dberr; +DbEnv *dbenv = new DbEnv(0); +<p> +dbenv->set_error_stream(&cerr); +dbenv->set_errpfx("myprog"); +<p> +if ((dberr = dbenv->open("/database/home", + NULL, DB_CREATE | DB_INIT_LOCK | DB_INIT_MPOOL, 0)) != 0) { + cerr << "failure: " << dbenv->strerror(dberr); + exit (1); +}</pre></blockquote> +<p>Here's a Java example creating a Berkeley DB environment using the 2.X interface: +<p><blockquote><pre>int dberr; +DbEnv dbenv = new DbEnv(); +<p> +dbenv.set_error_stream(System.err); +dbenv.set_errpfx("myprog"); +<p> +dbenv.appinit("/database/home", + null, Db.DB_CREATE | Db.DB_INIT_LOCK | Db.DB_INIT_MPOOL);</pre></blockquote> +<p>In the Berkeley DB 3.0 release, this code would be written as: +<p><blockquote><pre>int dberr; +DbEnv dbenv = new DbEnv(0); +<p> +dbenv.set_error_stream(System.err); +dbenv.set_errpfx("myprog"); +<p> +dbenv.open("/database/home", + null, Db.DB_CREATE | Db.DB_INIT_LOCK | Db.DB_INIT_MPOOL, 0);</pre></blockquote> +<p>In the Berkeley DB 2.X release, DbEnv had accessors to 
obtain "managers" of type +DbLockTab, DbMpool, DbLog, DbTxnMgr. If you used any of these managers, +all their methods are now found directly in the DbEnv class. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/value_set.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/db_cxx.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/dbinfo.html b/db/docs/ref/upgrade.3.0/dbinfo.html new file mode 100644 index 000000000..da1f8460d --- /dev/null +++ b/db/docs/ref/upgrade.3.0/dbinfo.html @@ -0,0 +1,72 @@ +<!--$Id: dbinfo.so,v 11.8 2000/03/18 21:43:20 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: the DBINFO structure</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/db.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/join.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: the DBINFO structure</h1> +<p>The DB_INFO structure has been removed from the Berkeley DB 3.0 release. 
+Accesses to any fields within that structure by the application should be +replaced with method calls on the DB handle. The following +example illustrates this using the historic db_cachesize structure field. +In the Berkeley DB 2.X releases, applications could set the size of an +underlying database cache using code similar to the following: +<p><blockquote><pre>DB_INFO dbinfo; +<p> + memset(&dbinfo, 0, sizeof(dbinfo)); + dbinfo.db_cachesize = 1024 * 1024;</pre></blockquote> +<p>In the Berkeley DB 3.X releases, this should be done using the +<a href="../../api_c/db_set_cachesize.html">DB->set_cachesize</a> method, as follows: +<p><blockquote><pre>DB *db; +int ret; +<p> + ret = db->set_cachesize(db, 0, 1024 * 1024, 0);</pre></blockquote> +<p>The DB_INFO structure is no longer used in any way by the Berkeley DB 3.0 +release, and should be removed from the application. +<p>The following table lists the DB_INFO fields previously used by +applications and the methods that should now be used to set +them. Because these calls provide configuration for the +database open, they must precede the call to <a href="../../api_c/db_open.html">DB->open</a>. +Calling them after the call to <a href="../../api_c/db_open.html">DB->open</a> will return an +error. +<p><table border=1 align=center> +<tr><th>DB_INFO field</th><th>Berkeley DB 3.X method</th></tr> +<tr><td>bt_compare</td><td><a href="../../api_c/db_set_bt_compare.html">DB->set_bt_compare</a></td></tr> +<tr><td>bt_minkey</td><td><a href="../../api_c/db_set_bt_minkey.html">DB->set_bt_minkey</a></td></tr> +<tr><td>bt_prefix</td><td><a href="../../api_c/db_set_bt_prefix.html">DB->set_bt_prefix</a></td></tr> +<tr><td>db_cachesize</td><td><a href="../../api_c/db_set_cachesize.html">DB->set_cachesize</a> +<p>Note: the <a href="../../api_c/db_set_cachesize.html">DB->set_cachesize</a> function takes additional arguments. 
+Setting both the second argument (the number of GB in the pool) and the +last argument (the number of memory pools to create) to 0 will result in +behavior that is backward compatible with previous Berkeley DB releases.</td></tr> +<tr><td>db_lorder</td><td><a href="../../api_c/db_set_lorder.html">DB->set_lorder</a></td></tr> +<tr><td>db_malloc</td><td><a href="../../api_c/db_set_malloc.html">DB->set_malloc</a></td></tr> +<tr><td>db_pagesize</td><td><a href="../../api_c/db_set_pagesize.html">DB->set_pagesize</a></td></tr> +<tr><td>dup_compare</td><td><a href="../../api_c/db_set_dup_compare.html">DB->set_dup_compare</a></td></tr> +<tr><td>flags</td><td><a href="../../api_c/db_set_flags.html">DB->set_flags</a> +<p>Note: the DB_DELIMITER, DB_FIXEDLEN and DB_PAD flags no longer need to be +set as there are specific methods off the DB handle that set the +file delimiter, the length of fixed-length records and the fixed-length +record pad character. They should simply be discarded from the application.</td></tr> +<tr><td>h_ffactor</td><td><a href="../../api_c/db_set_h_ffactor.html">DB->set_h_ffactor</a></td></tr> +<tr><td>h_hash</td><td><a href="../../api_c/db_set_h_hash.html">DB->set_h_hash</a></td></tr> +<tr><td>h_nelem</td><td><a href="../../api_c/db_set_h_nelem.html">DB->set_h_nelem</a></td></tr> +<tr><td>re_delim</td><td><a href="../../api_c/db_set_re_delim.html">DB->set_re_delim</a></td></tr> +<tr><td>re_len</td><td><a href="../../api_c/db_set_re_len.html">DB->set_re_len</a></td></tr> +<tr><td>re_pad</td><td><a href="../../api_c/db_set_re_pad.html">DB->set_re_pad</a></td></tr> +<tr><td>re_source</td><td><a href="../../api_c/db_set_re_source.html">DB->set_re_source</a></td></tr> +</table> +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/db.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/join.html"><img src="../../images/next.gif" 
alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/disk.html b/db/docs/ref/upgrade.3.0/disk.html new file mode 100644 index 000000000..f6ea2799b --- /dev/null +++ b/db/docs/ref/upgrade.3.0/disk.html @@ -0,0 +1,30 @@ +<!--$Id: disk.so,v 11.15 2000/12/21 18:37:09 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: upgrade requirements</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/java.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: upgrade requirements</h1> +<p>Log file formats and the Btree, Recno and Hash Access Method database +formats changed in the Berkeley DB 3.0 release. (The on-disk Btree/Recno +format changed from version 6 to version 7. The on-disk Hash format +changed from version 5 to version 6.) Until the underlying databases +are upgraded, the <a href="../../api_c/db_open.html">DB->open</a> function will return a <a href="../../api_c/db_open.html#DB_OLD_VERSION">DB_OLD_VERSION</a> +error. +<p>For further information on upgrading Berkeley DB installations, see +<a href="../../ref/upgrade/process.html">Upgrading Berkeley DB +installations</a>. 
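A minimal sketch of how an application might separate the DB_OLD_VERSION return described above from other open failures, so that old-format databases are routed to the upgrade procedure rather than treated as generic errors. It is written to compile without the Berkeley DB headers: the numeric value given to DB_OLD_VERSION here is a placeholder assumption (real code takes the constant from db.h), and the `open_outcome`, `classify_open` and `report_open` names are hypothetical helpers, not part of the Berkeley DB API.

```c
#include <stdio.h>

/*
 * Placeholder only: real applications get DB_OLD_VERSION from <db.h>.
 * The numeric value below is an assumption made so that this sketch
 * compiles without the Berkeley DB headers.
 */
#ifndef DB_OLD_VERSION
#define	DB_OLD_VERSION	(-30000)
#endif

/* Hypothetical outcome categories for an attempted database open. */
enum open_outcome { OPEN_OK, OPEN_NEEDS_UPGRADE, OPEN_FAILED };

/*
 * classify_open --
 *	Map the return value of an open call onto an outcome, so that
 *	old-format databases are reported separately from other errors.
 */
enum open_outcome
classify_open(int ret)
{
	if (ret == 0)
		return (OPEN_OK);
	if (ret == DB_OLD_VERSION)
		return (OPEN_NEEDS_UPGRADE);
	return (OPEN_FAILED);
}

/*
 * report_open --
 *	Print a message for each failing outcome and pass the outcome
 *	back to the caller.
 */
enum open_outcome
report_open(const char *file, int ret)
{
	enum open_outcome outcome = classify_open(ret);

	switch (outcome) {
	case OPEN_OK:
		break;
	case OPEN_NEEDS_UPGRADE:
		fprintf(stderr,
		    "%s: old on-disk format, upgrade before opening\n", file);
		break;
	case OPEN_FAILED:
		fprintf(stderr, "%s: open failed with error %d\n", file, ret);
		break;
	}
	return (outcome);
}
```

In an application, the value passed as `ret` would be whatever the DB->open call returned; the sketch deliberately leaves the upgrade step itself to the procedure on the page linked above.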
+<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/java.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/eacces.html b/db/docs/ref/upgrade.3.0/eacces.html new file mode 100644 index 000000000..b7fb3e859 --- /dev/null +++ b/db/docs/ref/upgrade.3.0/eacces.html @@ -0,0 +1,28 @@ +<!--$Id: eacces.so,v 11.7 2000/12/01 17:58:21 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: EACCES</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/eagain.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/jump_set.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: EACCES</h1> +<p>There was an error in previous releases of the Berkeley DB documentation that +said that the <a href="../../api_c/lock_put.html">lock_put</a> and <a href="../../api_c/lock_vec.html">lock_vec</a> interfaces could +return EACCES as an error to indicate that a lock could not be released +because it was held by another locker. 
The application should be +searched for any occurrences of EACCES. For each of these, any that are +checking for an error return from <a href="../../api_c/lock_put.html">lock_put</a> or <a href="../../api_c/lock_vec.html">lock_vec</a> +should have the test and any error handling removed. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/eagain.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/jump_set.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/eagain.html b/db/docs/ref/upgrade.3.0/eagain.html new file mode 100644 index 000000000..e998c1b43 --- /dev/null +++ b/db/docs/ref/upgrade.3.0/eagain.html @@ -0,0 +1,34 @@ +<!--$Id: eagain.so,v 11.5 2000/03/18 21:43:20 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: EAGAIN</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/txn_commit.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/eacces.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: EAGAIN</h1> +<p>Historically, the Berkeley DB interfaces have 
returned the POSIX error value +EAGAIN to indicate a deadlock. This has been removed from the Berkeley DB 3.0 +release in order to make it possible for applications to distinguish +between EAGAIN errors returned by the system and returns from Berkeley DB +indicating deadlock. +<p>The application should be searched for any occurrences of EAGAIN. For +each of these, any that are checking for a deadlock return from Berkeley DB +should be changed to check for the DB_LOCK_DEADLOCK return value. +<p>If, for any reason, this is a difficult change for the application to +make, the <b>include/db.src</b> distribution file should be modified to +translate all returns of DB_LOCK_DEADLOCK to EAGAIN. Search for the +string EAGAIN in that file; there is a comment that describes how to make +the change. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/txn_commit.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/eacces.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/envopen.html b/db/docs/ref/upgrade.3.0/envopen.html new file mode 100644 index 000000000..3c20a0e9e --- /dev/null +++ b/db/docs/ref/upgrade.3.0/envopen.html @@ -0,0 +1,156 @@ +<!--$Id: envopen.so,v 11.12 2000/03/18 21:43:20 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: environment open/close/unlink</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> 
+<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/func.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: environment open/close/unlink</h1> +<p>The hardest part of upgrading your application from a 2.X code base to +the 3.0 release is translating the Berkeley DB environment open, close and +remove calls. +<p>There were two logical changes in this part of the Berkeley DB interface. +First, in Berkeley DB 3.0, there are no longer separate structures that +represent each subsystem (e.g., DB_LOCKTAB or DB_TXNMGR) and an overall +DB_ENV environment structure. Instead there is only the +DB_ENV structure. This means that DB_ENV references should +be passed around by your application instead of passing around DB_LOCKTAB +or DB_TXNMGR references. This is likely to be a simple change for most +applications as few applications use the lock_XXX, log_XXX, +memp_XXX or txn_XXX interfaces to create Berkeley DB environments. +<p>The second change is that there are no longer separate open, close, and +unlink interfaces to the +Berkeley DB subsystems, e.g., in previous releases, it was possible to open a +lock subsystem either using db_appinit or using the lock_open call. In +the 3.0 release the XXX_open interfaces to the subsystems have been +removed, and subsystems must now be opened using the 3.0 replacement for the +db_appinit call. +<p>To upgrade your application, first find each place your application opens, +closes and/or removes a Berkeley DB environment. 
This will be code of the form: +<p><blockquote><pre>db_appinit, db_appexit +lock_open, lock_close, lock_unlink +log_open, log_close, log_unlink +memp_open, memp_close, memp_unlink +txn_open, txn_close, txn_unlink</pre></blockquote> +<p>Each of these groups of calls should be replaced with calls to: +<p><blockquote><pre><a href="../../api_c/env_create.html">db_env_create</a>, <a href="../../api_c/env_open.html">DBENV->open</a>, <a href="../../api_c/env_close.html">DBENV->close</a>, +<a href="../../api_c/env_remove.html">DBENV->remove</a></pre></blockquote> +<p>The <a href="../../api_c/env_create.html">db_env_create</a> call and the call to the <a href="../../api_c/env_open.html">DBENV->open</a> +method replace the db_appinit, lock_open, log_open, memp_open and txn_open +calls. The <a href="../../api_c/env_close.html">DBENV->close</a> method replaces the db_appexit, +lock_close, log_close, memp_close and txn_close calls. The +<a href="../../api_c/env_remove.html">DBENV->remove</a> call replaces the lock_unlink, log_unlink, +memp_unlink and txn_unlink calls. +<p>Here's an example creating a Berkeley DB environment using the 2.X interface: +<p><blockquote><pre>/* + * db_init -- + * Initialize the environment. + */ +DB_ENV * +db_init(home) + char *home; +{ + DB_ENV *dbenv; +<p> + if ((dbenv = (DB_ENV *)calloc(sizeof(DB_ENV), 1)) == NULL) + return (NULL); +<p> + if ((errno = db_appinit(home, NULL, dbenv, + DB_INIT_LOCK | DB_INIT_LOG | DB_INIT_MPOOL | DB_INIT_TXN | + DB_USE_ENVIRON)) == 0) + return (dbenv); +<p> + free(dbenv); + return (NULL); +}</pre></blockquote> +<p>In the Berkeley DB 3.0 release, this code would be written as: +<p><blockquote><pre>/* + * db_init -- + * Initialize the environment. 
+ */ +int +db_init(home, dbenvp) + char *home; + DB_ENV **dbenvp; +{ + int ret; + DB_ENV *dbenv; +<p> + if ((ret = db_env_create(&dbenv, 0)) != 0) + return (ret); +<p> + if ((ret = dbenv->open(dbenv, home, NULL, + DB_INIT_LOCK | DB_INIT_LOG | DB_INIT_MPOOL | DB_INIT_TXN | + DB_USE_ENVIRON, 0)) == 0) { + *dbenvp = dbenv; + return (0); + } +<p> + (void)dbenv->close(dbenv, 0); + return (ret); +}</pre></blockquote> +<p>As you can see, the arguments to db_appinit and to <a href="../../api_c/env_open.html">DBENV->open</a> are +largely the same. There is some minor re-organization: the mapping is +that arguments #1, 2, 3, and 4 to db_appinit become arguments #2, 3, 1 +and 4 to <a href="../../api_c/env_open.html">DBENV->open</a>. There is one additional argument to +<a href="../../api_c/env_open.html">DBENV->open</a>, argument #5. For backward compatibility with the 2.X +Berkeley DB releases, simply set that argument to 0. +<p>It is only slightly more complex to translate calls to XXX_open to the +<a href="../../api_c/env_open.html">DBENV->open</a> method. Here's an example of creating a lock region +using the 2.X interface: +<p><blockquote><pre>lock_open(dir, DB_CREATE, 0664, dbenv, &regionp);</pre></blockquote> +<p>In the Berkeley DB 3.0 release, this code would be written as: +<p><blockquote><pre>if ((ret = db_env_create(&dbenv, 0)) != 0) + return (ret); +<p> +if ((ret = dbenv->open(dbenv, + dir, NULL, DB_CREATE | DB_INIT_LOCK, 0664)) == 0) { + *dbenvp = dbenv; + return (0); +}</pre></blockquote> +<p>Note that in this example, you no longer need the DB_LOCKTAB structure +reference that was required in Berkeley DB 2.X releases. +<p>The final issue with upgrading the db_appinit call is the DB_MPOOL_PRIVATE +option previously provided for the db_appinit interface. 
If your +application is using this flag, it should almost certainly use the new +<a href="../../api_c/env_open.html#DB_PRIVATE">DB_PRIVATE</a> flag to the <a href="../../api_c/env_open.html">DBENV->open</a> interface. Regardless, +you should carefully consider this change before converting to use the +<a href="../../api_c/env_open.html#DB_PRIVATE">DB_PRIVATE</a> flag. +<p>Translating db_appexit or XXX_close calls to <a href="../../api_c/env_close.html">DBENV->close</a> is equally +simple. Instead of taking a reference to a per-subsystem structure such +as DB_LOCKTAB or DB_TXNMGR, all calls take a reference to a DB_ENV +structure. The calling sequence is otherwise unchanged. Note that as +the application no longer allocates the memory for the DB_ENV structure, +application code to discard it after the call to db_appexit() is no longer +needed. +<p>Translating XXX_unlink calls to <a href="../../api_c/env_remove.html">DBENV->remove</a> is slightly more complex. +As with <a href="../../api_c/env_close.html">DBENV->close</a>, the call takes a reference to a DB_ENV +structure instead of a per-subsystem structure. The calling sequence is +slightly different, however. Here is an example of removing a lock region +using the 2.X interface: +<p><blockquote><pre>DB_ENV *dbenv; +<p> +ret = lock_unlink(dir, 1, dbenv);</pre></blockquote> +<p>In the Berkeley DB 3.0 release, this code fragment would be written as: +<p><blockquote><pre>DB_ENV *dbenv; +<p> +ret = dbenv->remove(dbenv, dir, NULL, DB_FORCE);</pre></blockquote> +<p>The additional argument to the <a href="../../api_c/env_remove.html">DBENV->remove</a> function is a +configuration argument similar to that previously taken by db_appinit and +now taken by the <a href="../../api_c/env_open.html">DBENV->open</a> method. For backward compatibility +this new argument should simply be set to NULL. 
The force argument to +XXX_unlink is now a flag value that is set by bitwise inclusively <b>OR</b>'ing it into the +<a href="../../api_c/env_remove.html">DBENV->remove</a> flag argument. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/func.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/func.html b/db/docs/ref/upgrade.3.0/func.html new file mode 100644 index 000000000..b6f7d816b --- /dev/null +++ b/db/docs/ref/upgrade.3.0/func.html @@ -0,0 +1,69 @@ +<!--$Id: func.so,v 11.8 2000/03/18 21:43:20 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: function arguments</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/envopen.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/dbenv.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: function arguments</h1> +<p>In Berkeley DB 3.0, there are no longer separate structures that +represent each subsystem (e.g., DB_LOCKTAB or DB_TXNMGR), and an overall +DB_ENV 
environment structure. Instead there is only the +DB_ENV structure. This means that DB_ENV references should +be passed around by your application instead of passing around DB_LOCKTAB +or DB_TXNMGR references. +<p>Each of the following functions: +<p><blockquote><pre>lock_detect +lock_get +lock_id +lock_put +lock_stat +lock_vec</pre></blockquote> +<p>should have its first argument, a reference to the DB_LOCKTAB structure, +replaced with a reference to the enclosing DB_ENV structure. For +example, the following line of code from a Berkeley DB 2.X application: +<p><blockquote><pre>DB_LOCKTAB *lt; +DB_LOCK lock; + ret = lock_put(lt, lock);</pre></blockquote> +<p>should now be written as follows: +<p><blockquote><pre>DB_ENV *dbenv; +DB_LOCK *lock; + ret = lock_put(dbenv, lock);</pre></blockquote> +<p>Similarly, all of the functions: +<p><blockquote><pre>log_archive +log_compare +log_file +log_flush +log_get +log_put +log_register +log_stat +log_unregister</pre></blockquote> +<p>should have their DB_LOG argument replaced with a reference to a +DB_ENV structure, and the functions: +<p><blockquote><pre>memp_fopen +memp_register +memp_stat +memp_sync +memp_trickle</pre></blockquote> +<p>should have their DB_MPOOL argument replaced with a reference to a +DB_ENV structure. +<p>You should remove all references to DB_LOCKTAB, DB_LOG, DB_MPOOL, and +DB_TXNMGR structures from your application; they are no longer useful +in any way. In fact, a simple way to identify all of the places that +need to be upgraded is to remove all such structures and variables +they declare, and then compile. You will see a warning message from +your compiler in each case that needs to be upgraded. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/envopen.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/dbenv.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/intro.html b/db/docs/ref/upgrade.3.0/intro.html new file mode 100644 index 000000000..a74e40f4e --- /dev/null +++ b/db/docs/ref/upgrade.3.0/intro.html @@ -0,0 +1,26 @@ +<!--$Id: intro.so,v 11.6 2000/03/18 21:43:20 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: introduction</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.2.0/disk.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/envopen.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: introduction</h1> +<p>The following pages describe how to upgrade applications coded against +the Berkeley DB 2.X release interfaces to the Berkeley DB 3.0 release interfaces. +This information does not describe how to upgrade Berkeley DB 1.85 release +applications. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.2.0/disk.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/envopen.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/java.html b/db/docs/ref/upgrade.3.0/java.html new file mode 100644 index 000000000..3997095bc --- /dev/null +++ b/db/docs/ref/upgrade.3.0/java.html @@ -0,0 +1,34 @@ +<!--$Id: java.so,v 11.8 2000/12/01 18:33:56 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: additional Java changes</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/cxx.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/disk.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: additional Java changes</h1> +<p>There are several additional types of exceptions thrown in the Berkeley DB 3.0 +Java API. +<p>DbMemoryException and DbDeadlockException can be caught independently of +DbException if you want to do special handling for these kinds of errors. 
+Since they are subclassed from DbException, a try block that catches +DbException will catch these also, so code is not required to change. +The catch clause for these new exceptions should appear before the catch +clause for DbException. +<p>You will need to add a catch clause for java.io.FileNotFoundException, +since that can be thrown by the <a href="../../api_java/db_open.html">Db.open</a> and <a href="../../api_java/env_open.html">DbEnv.open</a> functions. +<p>There are a number of smaller changes to the API that bring the C, C++ +and Java APIs much closer in terms of functionality and usage. Please +refer to the pages for upgrading C applications for further details. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/cxx.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/disk.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/join.html b/db/docs/ref/upgrade.3.0/join.html new file mode 100644 index 000000000..82c9019fa --- /dev/null +++ b/db/docs/ref/upgrade.3.0/join.html @@ -0,0 +1,28 @@ +<!--$Id: join.so,v 11.9 2000/07/25 16:59:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: DB->join</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a 
href="../../ref/upgrade.3.0/dbinfo.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/stat.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: DB->join</h1> +<p>Historically, the last two arguments to the Berkeley DB <a href="../../api_c/db_join.html">DB->join</a> +interface were a flags value followed by a reference to a memory location +to store the returned cursor object. In the Berkeley DB 3.0 release, the +order of those two arguments has been swapped for consistency with other +Berkeley DB interfaces. +<p>The application should be searched for any occurrences of <a href="../../api_c/db_join.html">DB->join</a>. +For each of these, the order of the last two arguments should be swapped. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/dbinfo.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/stat.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/jump_set.html b/db/docs/ref/upgrade.3.0/jump_set.html new file mode 100644 index 000000000..c93e7270e --- /dev/null +++ b/db/docs/ref/upgrade.3.0/jump_set.html @@ -0,0 +1,48 @@ +<!--$Id: jump_set.so,v 11.6 2000/03/18 21:43:20 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: db_jump_set</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access 
method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/eacces.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/value_set.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: db_jump_set</h1> +<p>The db_jump_set interface has been removed from the Berkeley DB 3.0 release, +replaced by method calls on the DB_ENV handle. +<p>The following table lists the db_jump_set arguments previously used by +applications and the methods that should now be used instead. +<p><table border=1 align=center> +<tr><th>db_jump_set argument</th><th>Berkeley DB 3.X method</th></tr> +<tr><td>DB_FUNC_CLOSE</td><td><a href="../../api_c/set_func_close.html">db_env_set_func_close</a></td></tr> +<tr><td>DB_FUNC_DIRFREE</td><td><a href="../../api_c/set_func_dirfree.html">db_env_set_func_dirfree</a></td></tr> +<tr><td>DB_FUNC_DIRLIST</td><td><a href="../../api_c/set_func_dirlist.html">db_env_set_func_dirlist</a></td></tr> +<tr><td>DB_FUNC_EXISTS</td><td><a href="../../api_c/set_func_exists.html">db_env_set_func_exists</a></td></tr> +<tr><td>DB_FUNC_FREE</td><td><a href="../../api_c/set_func_free.html">db_env_set_func_free</a></td></tr> +<tr><td>DB_FUNC_FSYNC</td><td><a href="../../api_c/set_func_fsync.html">db_env_set_func_fsync</a></td></tr> +<tr><td>DB_FUNC_IOINFO</td><td><a href="../../api_c/set_func_ioinfo.html">db_env_set_func_ioinfo</a></td></tr> +<tr><td>DB_FUNC_MALLOC</td><td><a href="../../api_c/set_func_malloc.html">db_env_set_func_malloc</a></td></tr> +<tr><td>DB_FUNC_MAP</td><td><a href="../../api_c/set_func_map.html">db_env_set_func_map</a></td></tr> +<tr><td>DB_FUNC_OPEN</td><td><a 
href="../../api_c/set_func_open.html">db_env_set_func_open</a></td></tr> +<tr><td>DB_FUNC_READ</td><td><a href="../../api_c/set_func_read.html">db_env_set_func_read</a></td></tr> +<tr><td>DB_FUNC_REALLOC</td><td><a href="../../api_c/set_func_realloc.html">db_env_set_func_realloc</a></td></tr> +<tr><td>DB_FUNC_RUNLINK</td><td>The DB_FUNC_RUNLINK functionality has been removed from the Berkeley DB +3.0 release, and should be removed from the application.</td></tr> +<tr><td>DB_FUNC_SEEK</td><td><a href="../../api_c/set_func_seek.html">db_env_set_func_seek</a></td></tr> +<tr><td>DB_FUNC_SLEEP</td><td><a href="../../api_c/set_func_sleep.html">db_env_set_func_sleep</a></td></tr> +<tr><td>DB_FUNC_UNLINK</td><td><a href="../../api_c/set_func_unlink.html">db_env_set_func_unlink</a></td></tr> +<tr><td>DB_FUNC_UNMAP</td><td><a href="../../api_c/set_func_unmap.html">db_env_set_func_unmap</a></td></tr> +<tr><td>DB_FUNC_WRITE</td><td><a href="../../api_c/set_func_write.html">db_env_set_func_write</a></td></tr> +<tr><td>DB_FUNC_YIELD</td><td><a href="../../api_c/set_func_yield.html">db_env_set_func_yield</a></td></tr> +</table> +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/eacces.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/value_set.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/lock_detect.html b/db/docs/ref/upgrade.3.0/lock_detect.html new file mode 100644 index 000000000..4ff00a8a6 --- /dev/null +++ b/db/docs/ref/upgrade.3.0/lock_detect.html @@ -0,0 +1,24 @@ +<!--$Id: lock_detect.so,v 11.8 2000/07/25 16:59:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference 
Guide: Release 3.0: lock_detect</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/lock_put.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/lock_stat.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: lock_detect</h1> +<p>An additional argument has been added to the <a href="../../api_c/lock_detect.html">lock_detect</a> interface. +<p>The application should be searched for any occurrences of <a href="../../api_c/lock_detect.html">lock_detect</a>. +For each one, a NULL argument should be appended to the current arguments. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/lock_put.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/lock_stat.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/lock_notheld.html b/db/docs/ref/upgrade.3.0/lock_notheld.html new file mode 100644 index 000000000..3f1173856 --- /dev/null +++ b/db/docs/ref/upgrade.3.0/lock_notheld.html @@ -0,0 +1,27 @@ +<!--$Id: lock_notheld.so,v 11.7 2000/12/01 17:58:21 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: DB_LOCK_NOTHELD</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/rmw.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/eagain.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: DB_LOCK_NOTHELD</h1> +<p>Historically, the Berkeley DB <a href="../../api_c/lock_put.html">lock_put</a> and <a href="../../api_c/lock_vec.html">lock_vec</a> interfaces +could return the DB_LOCK_NOTHELD error to indicate that a lock could +not be released as it was held by another locker. 
This error can no +longer be returned under any circumstances. The application should be +searched for any occurrences of DB_LOCK_NOTHELD. For each of these, +the test and any error processing should be removed. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/rmw.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/eagain.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/lock_put.html b/db/docs/ref/upgrade.3.0/lock_put.html new file mode 100644 index 000000000..d6057f8e2 --- /dev/null +++ b/db/docs/ref/upgrade.3.0/lock_put.html @@ -0,0 +1,25 @@ +<!--$Id: lock_put.so,v 11.8 2000/07/25 16:59:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: lock_put</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/close.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/lock_detect.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: lock_put</h1> +<p>An argument change has been made in the <a href="../../api_c/lock_put.html">lock_put</a> interface. 
+<p>The application should be searched for any occurrences of <a href="../../api_c/lock_put.html">lock_put</a>. +For each one, instead of passing a DB_LOCK variable as the last argument +to the function, the address of the DB_LOCK variable should be passed. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/close.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/lock_detect.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/lock_stat.html b/db/docs/ref/upgrade.3.0/lock_stat.html new file mode 100644 index 000000000..80504db3b --- /dev/null +++ b/db/docs/ref/upgrade.3.0/lock_stat.html @@ -0,0 +1,24 @@ +<!--$Id: lock_stat.so,v 11.3 2000/07/25 16:59:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: lock_stat</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/lock_detect.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/log_register.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: lock_stat</h1> +<p>The <b>st_magic</b>, <b>st_version</b>, 
<b>st_numobjs</b> and +<b>st_refcnt</b> fields returned from the <a href="../../api_c/lock_stat.html">lock_stat</a> interface +have been removed, and this information is no longer available. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/lock_detect.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/log_register.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/log_register.html b/db/docs/ref/upgrade.3.0/log_register.html new file mode 100644 index 000000000..3a856275f --- /dev/null +++ b/db/docs/ref/upgrade.3.0/log_register.html @@ -0,0 +1,25 @@ +<!--$Id: log_register.so,v 11.8 2000/07/25 16:59:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: log_register</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/lock_stat.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/log_stat.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: log_register</h1> +<p>An argument has been removed from the <a href="../../api_c/log_register.html">log_register</a> 
interface. +The application should be searched for any occurrences of +<a href="../../api_c/log_register.html">log_register</a>. In each of these, the DBTYPE argument (it is the +fourth argument) should be removed. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/lock_stat.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/log_stat.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/log_stat.html b/db/docs/ref/upgrade.3.0/log_stat.html new file mode 100644 index 000000000..8c023bfe2 --- /dev/null +++ b/db/docs/ref/upgrade.3.0/log_stat.html @@ -0,0 +1,23 @@ +<!--$Id: log_stat.so,v 11.3 2000/07/25 16:59:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: log_stat</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/log_register.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/memp_stat.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: log_stat</h1> +<p>The <b>st_refcnt</b> field returned from the <a href="../../api_c/log_stat.html">log_stat</a> interface 
+has been removed, and this information is no longer available. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/log_register.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/memp_stat.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/memp_stat.html b/db/docs/ref/upgrade.3.0/memp_stat.html new file mode 100644 index 000000000..ff61fa745 --- /dev/null +++ b/db/docs/ref/upgrade.3.0/memp_stat.html @@ -0,0 +1,26 @@ +<!--$Id: memp_stat.so,v 11.3 2000/07/25 16:59:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: memp_stat</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/log_stat.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/txn_begin.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: memp_stat</h1> +<p>The <b>st_refcnt</b> field returned from the <a href="../../api_c/memp_stat.html">memp_stat</a> interface +has been removed, and this information is no longer available. 
+<p>The <b>st_cachesize</b> field returned from the <a href="../../api_c/memp_stat.html">memp_stat</a> +interface has been replaced with two new fields, <b>st_gbytes</b> and +<b>st_bytes</b>. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/log_stat.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/txn_begin.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/open.html b/db/docs/ref/upgrade.3.0/open.html new file mode 100644 index 000000000..3730ab474 --- /dev/null +++ b/db/docs/ref/upgrade.3.0/open.html @@ -0,0 +1,65 @@ +<!--$Id: open.so,v 11.10 2000/03/18 21:43:21 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: database open/close</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/dbenv.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/xa.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: database open/close</h1> +<p>Database opens were changed in the Berkeley DB 3.0 release in a similar way to +environment opens. 
+<p>To upgrade your application, first find each place your application opens +a database, that is, calls the db_open function. Each of these calls +should be replaced with calls to <a href="../../api_c/db_create.html">db_create</a> and <a href="../../api_c/db_open.html">DB->open</a>. +<p>Here's an example creating a Berkeley DB database using the 2.X interface: +<p><blockquote><pre>DB *dbp; +DB_ENV *dbenv; +int ret; +<p> +if ((ret = db_open(DATABASE, + DB_BTREE, DB_CREATE, 0664, dbenv, NULL, &dbp)) != 0) + return (ret);</pre></blockquote> +<p>In the Berkeley DB 3.0 release, this code would be written as: +<p><blockquote><pre>DB *dbp; +DB_ENV *dbenv; +int ret; +<p> +if ((ret = db_create(&dbp, dbenv, 0)) != 0) + return (ret); +<p> +if ((ret = dbp->open(dbp, + DATABASE, NULL, DB_BTREE, DB_CREATE, 0664)) != 0) { + (void)dbp->close(dbp, 0); + return (ret); +}</pre></blockquote> +<p>As you can see, the arguments to db_open and to <a href="../../api_c/db_open.html">DB->open</a> are +largely the same. There is some re-organization, and note that the +enclosing DB_ENV structure is specified when the DB object +is created using the <a href="../../api_c/db_create.html">db_create</a> interface. There is one +additional argument to <a href="../../api_c/db_open.html">DB->open</a>, argument #3. For backward +compatibility with the 2.X Berkeley DB releases, simply set that argument to +NULL. +<p>There are two additional issues with the db_open call. +<p>First, it was possible in the 2.X releases for an application to provide +an environment that did not contain a shared memory buffer pool as the +database environment, and Berkeley DB would create a private one automatically. +This functionality is no longer available; applications must specify the +<a href="../../api_c/env_open.html#DB_INIT_MPOOL">DB_INIT_MPOOL</a> flag if databases are going to be opened in the +environment. 
+<p>The final issue with upgrading the db_open call is that the DB_INFO +structure is no longer used, having been replaced by individual methods +on the DB handle. That change is discussed in detail later in +this chapter. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/dbenv.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/xa.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/rmw.html b/db/docs/ref/upgrade.3.0/rmw.html new file mode 100644 index 000000000..a1a30da5e --- /dev/null +++ b/db/docs/ref/upgrade.3.0/rmw.html @@ -0,0 +1,31 @@ +<!--$Id: rmw.so,v 11.9 2000/03/18 21:43:21 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: DB_RMW</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/txn_stat.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/lock_notheld.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: DB_RMW</h1> +<p>The following change applies only to applications using the +Berkeley DB Concurrent Data Store product. 
If your application is not using that product, +you can ignore this change. +<p>Historically, the Berkeley DB <a href="../../api_c/db_cursor.html">DB->cursor</a> interface took the DB_RMW flag +to indicate that the created cursor would be used for write operations on +the database. This flag has been renamed to the <a href="../../api_c/db_cursor.html#DB_WRITECURSOR">DB_WRITECURSOR</a> +flag. +<p>The application should be searched for any occurrences of DB_RMW. For +each of these, any that are arguments to the <a href="../../api_c/db_cursor.html">DB->cursor</a> function +should be changed to pass in the <a href="../../api_c/db_cursor.html#DB_WRITECURSOR">DB_WRITECURSOR</a> flag instead. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/txn_stat.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/lock_notheld.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/stat.html b/db/docs/ref/upgrade.3.0/stat.html new file mode 100644 index 000000000..735e235d9 --- /dev/null +++ b/db/docs/ref/upgrade.3.0/stat.html @@ -0,0 +1,24 @@ +<!--$Id: stat.so,v 11.3 2000/07/25 16:59:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: DB->stat</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB 
Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/join.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/close.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: DB->stat</h1> +<p>The <b>bt_flags</b> field returned from the <a href="../../api_c/db_stat.html">DB->stat</a> interface +for Btree and Recno databases has been removed, and this information is +no longer available. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/join.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/close.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/toc.html b/db/docs/ref/upgrade.3.0/toc.html new file mode 100644 index 000000000..189d7c0a6 --- /dev/null +++ b/db/docs/ref/upgrade.3.0/toc.html @@ -0,0 +1,47 @@ +<!--$Id: toc.so,v 11.2 2000/12/05 20:36:26 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB: Upgrading Berkeley DB 2.X.X applications to Berkeley DB 3.0</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<h1 align=center>Upgrading Berkeley DB 2.X.X applications to Berkeley DB 3.0</h1> +<ol> +<li><a href="intro.html">Release 3.0: introduction</a> +<li><a href="envopen.html">Release 3.0: environment open/close/unlink</a> +<li><a 
href="func.html">Release 3.0: function arguments</a> +<li><a href="dbenv.html">Release 3.0: the DB_ENV structure</a> +<li><a href="open.html">Release 3.0: database open/close</a> +<li><a href="xa.html">Release 3.0: db_xa_open</a> +<li><a href="db.html">Release 3.0: the DB structure</a> +<li><a href="dbinfo.html">Release 3.0: the DBINFO structure</a> +<li><a href="join.html">Release 3.0: DB->join</a> +<li><a href="stat.html">Release 3.0: DB->stat</a> +<li><a href="close.html">Release 3.0: DB->sync and DB->close</a> +<li><a href="lock_put.html">Release 3.0: lock_put</a> +<li><a href="lock_detect.html">Release 3.0: lock_detect</a> +<li><a href="lock_stat.html">Release 3.0: lock_stat</a> +<li><a href="log_register.html">Release 3.0: log_register</a> +<li><a href="log_stat.html">Release 3.0: log_stat</a> +<li><a href="memp_stat.html">Release 3.0: memp_stat</a> +<li><a href="txn_begin.html">Release 3.0: txn_begin</a> +<li><a href="txn_commit.html">Release 3.0: txn_commit</a> +<li><a href="txn_stat.html">Release 3.0: txn_stat</a> +<li><a href="rmw.html">Release 3.0: DB_RMW</a> +<li><a href="lock_notheld.html">Release 3.0: DB_LOCK_NOTHELD</a> +<li><a href="eagain.html">Release 3.0: EAGAIN</a> +<li><a href="eacces.html">Release 3.0: EACCES</a> +<li><a href="jump_set.html">Release 3.0: db_jump_set</a> +<li><a href="value_set.html">Release 3.0: db_value_set</a> +<li><a href="dbenv_cxx.html">Release 3.0: the DbEnv class for C++ and Java</a> +<li><a href="db_cxx.html">Release 3.0: the Db class for C++ and Java</a> +<li><a href="cxx.html">Release 3.0: additional C++ changes</a> +<li><a href="java.html">Release 3.0: additional Java changes</a> +<li><a href="disk.html">Release 3.0: upgrade requirements</a> +</ol> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/txn_begin.html b/db/docs/ref/upgrade.3.0/txn_begin.html new file mode 100644 index 000000000..3fb9a6527 --- /dev/null +++ 
b/db/docs/ref/upgrade.3.0/txn_begin.html @@ -0,0 +1,25 @@ +<!--$Id: txn_begin.so,v 11.7 2000/07/25 16:59:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: txn_begin</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/memp_stat.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/txn_commit.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: txn_begin</h1> +<p>An additional argument has been added to the <a href="../../api_c/txn_begin.html">txn_begin</a> interface. +<p>The application should be searched for any occurrences of +<a href="../../api_c/txn_begin.html">txn_begin</a>. For each one, an argument of 0 should be appended to +the current arguments. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/memp_stat.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/txn_commit.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/txn_commit.html b/db/docs/ref/upgrade.3.0/txn_commit.html new file mode 100644 index 000000000..8090b1e3b --- /dev/null +++ b/db/docs/ref/upgrade.3.0/txn_commit.html @@ -0,0 +1,25 @@ +<!--$Id: txn_commit.so,v 11.8 2000/07/25 16:59:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: txn_commit</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/txn_begin.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/txn_stat.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: txn_commit</h1> +<p>An additional argument has been added to the <a href="../../api_c/txn_commit.html">txn_commit</a> interface. +<p>The application should be searched for any occurrences of +<a href="../../api_c/txn_commit.html">txn_commit</a>. For each one, an argument of 0 should be appended to +the current arguments. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/txn_begin.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/txn_stat.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/txn_stat.html b/db/docs/ref/upgrade.3.0/txn_stat.html new file mode 100644 index 000000000..d965494d5 --- /dev/null +++ b/db/docs/ref/upgrade.3.0/txn_stat.html @@ -0,0 +1,23 @@ +<!--$Id: txn_stat.so,v 11.3 2000/07/25 16:59:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: txn_stat</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/txn_commit.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/rmw.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: txn_stat</h1> +<p>The <b>st_refcnt</b> field returned from the <a href="../../api_c/txn_stat.html">txn_stat</a> interface +has been removed, and this information is no longer available. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/txn_commit.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/rmw.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/value_set.html b/db/docs/ref/upgrade.3.0/value_set.html new file mode 100644 index 000000000..66070b09f --- /dev/null +++ b/db/docs/ref/upgrade.3.0/value_set.html @@ -0,0 +1,41 @@ +<!--$Id: value_set.so,v 11.6 2000/03/18 21:43:21 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: db_value_set</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/jump_set.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/dbenv_cxx.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: db_value_set</h1> +<p>The db_value_set interface has been removed from the Berkeley DB 3.0 release, +replaced by method calls on the DB_ENV handle. +<p>The following table lists the db_value_set arguments previously used by +applications and the methods that should now be used instead. 
+<p><table border=1 align=center> +<tr><th>db_value_set argument</th><th>Berkeley DB 3.X method</th></tr> +<tr><td>DB_MUTEX_LOCKS</td><td><a href="../../api_c/env_set_mutexlocks.html">DBENV->set_mutexlocks</a></td></tr> +<tr><td>DB_REGION_ANON</td><td>The DB_REGION_ANON functionality has +been replaced by the <a href="../../api_c/env_open.html#DB_SYSTEM_MEM">DB_SYSTEM_MEM</a> and <a href="../../api_c/env_open.html#DB_PRIVATE">DB_PRIVATE</a> flags +to the <a href="../../api_c/env_open.html">DBENV->open</a> function. A direct translation is not +available; please review the <a href="../../api_c/env_open.html">DBENV->open</a> manual page for more +information.</td></tr> +<tr><td>DB_REGION_INIT</td><td><a href="../../api_c/env_set_region_init.html">db_env_set_region_init</a></td></tr> +<tr><td>DB_REGION_NAME</td><td>The DB_REGION_NAME functionality has +been replaced by the <a href="../../api_c/env_open.html#DB_SYSTEM_MEM">DB_SYSTEM_MEM</a> and <a href="../../api_c/env_open.html#DB_PRIVATE">DB_PRIVATE</a> flags +to the <a href="../../api_c/env_open.html">DBENV->open</a> function. 
A direct translation is not +available; please review the <a href="../../api_c/env_open.html">DBENV->open</a> manual page for more +information.</td></tr> +<tr><td>DB_TSL_SPINS</td><td><a href="../../api_c/env_set_tas_spins.html">db_env_set_tas_spins</a></td></tr> +</table> +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/jump_set.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/dbenv_cxx.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.0/xa.html b/db/docs/ref/upgrade.3.0/xa.html new file mode 100644 index 000000000..41f5a993d --- /dev/null +++ b/db/docs/ref/upgrade.3.0/xa.html @@ -0,0 +1,33 @@ +<!--$Id: xa.so,v 11.7 2000/03/18 21:43:21 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.0: db_xa_open</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/open.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/db.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.0: db_xa_open</h1> +<p>The following change applies only to applications using Berkeley DB as an XA 
+Resource Manager. If your application is not using Berkeley DB in this way, +you can ignore this change. +<p>The db_xa_open function has been replaced with the <a href="../../api_c/db_create.html#DB_XA_CREATE">DB_XA_CREATE</a> +flag to the <a href="../../api_c/db_create.html">db_create</a> function. All calls to db_xa_open should +be replaced with calls to <a href="../../api_c/db_create.html">db_create</a> with the <a href="../../api_c/db_create.html#DB_XA_CREATE">DB_XA_CREATE</a> +flag set, followed by a call to the <a href="../../api_c/db_open.html">DB->open</a> function. +<p>A similar change has been made for the C++ API, where the +<a href="../../api_c/db_create.html#DB_XA_CREATE">DB_XA_CREATE</a> flag should be specified to the Db constructor. All +calls to the Db::xa_open method should be replaced with the +<a href="../../api_c/db_create.html#DB_XA_CREATE">DB_XA_CREATE</a> flag to the Db constructor, followed by a call to +the Db::open method. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/open.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.0/db.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.1/btstat.html b/db/docs/ref/upgrade.3.1/btstat.html new file mode 100644 index 000000000..e5d7c4bb5 --- /dev/null +++ b/db/docs/ref/upgrade.3.1/btstat.html @@ -0,0 +1,50 @@ +<!--$Id: btstat.so,v 1.11 2000/07/25 16:59:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: DB->stat</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.1/dup.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/sysmem.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: DB->stat</h1> +<p>For Btree database statistics, the <a href="../../api_c/db_stat.html">DB->stat</a> interface field +<b>bt_nrecs</b> has been removed, replaced by two fields: +<b>bt_nkeys</b> and <b>bt_ndata</b>. The <b>bt_nkeys</b> field returns +a count of the unique keys in the database. The <b>bt_ndata</b> field +returns a count of the key/data pairs in the database. Neither exactly +matches the previous value of the <b>bt_nrecs</b> field, which returned +a count of keys in the database, but, in the case of Btree databases, +could overcount as it sometimes counted duplicate data items as unique +keys. The application should be searched for any uses of the +<b>bt_nrecs</b> field and the field should be changed to be either +<b>bt_nkeys</b> or <b>bt_ndata</b>, whichever is more appropriate. +<p>For Hash database statistics, the <a href="../../api_c/db_stat.html">DB->stat</a> interface field +<b>hash_nrecs</b> has been removed, replaced by two fields: +<b>hash_nkeys</b> and <b>hash_ndata</b>. The <b>hash_nkeys</b> field +returns a count of the unique keys in the database. The +<b>hash_ndata</b> field returns a count of the key/data pairs in the +database. The new <b>hash_nkeys</b> field exactly matches the previous +value of the <b>hash_nrecs</b> field. 
The application should be searched +for any uses of the <b>hash_nrecs</b> field, and the field should be +changed to be <b>hash_nkeys</b>. +<p>For Queue database statistics, the <a href="../../api_c/db_stat.html">DB->stat</a> interface field +<b>qs_nrecs</b> has been removed, replaced by two fields: +<b>qs_nkeys</b> and <b>qs_ndata</b>. The <b>qs_nkeys</b> field returns +a count of the unique keys in the database. The <b>qs_ndata</b> field +returns a count of the key/data pairs in the database. The new +<b>qs_nkeys</b> field exactly matches the previous value of the +<b>qs_nrecs</b> field. The application should be searched for any uses +of the <b>qs_nrecs</b> field, and the field should be changed to be +<b>qs_nkeys</b>. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/dup.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/sysmem.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.1/config.html b/db/docs/ref/upgrade.3.1/config.html new file mode 100644 index 000000000..29a53363e --- /dev/null +++ b/db/docs/ref/upgrade.3.1/config.html @@ -0,0 +1,35 @@ +<!--$Id: config.so,v 1.3 2000/07/25 16:59:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: DBENV->open, DBENV->remove</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference 
Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.1/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/set_tx_recover.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: DBENV->open, DBENV->remove</h1> +<p>In the Berkeley DB 3.1 release, the <b>config</b> argument to the +<a href="../../api_c/env_open.html">DBENV->open</a> and <a href="../../api_c/env_remove.html">DBENV->remove</a> methods has been removed, +replaced by additional methods on the DB_ENV handle. If your +application calls <a href="../../api_c/env_open.html">DBENV->open</a> or <a href="../../api_c/env_remove.html">DBENV->remove</a> with a NULL +<b>config</b> argument, find those calls and remove the config +argument from each. If your application passes a non-NULL <b>config</b> +argument, the string values in that argument are replaced with calls to +DB_ENV methods as follows: +<p><table border=1 align=center> +<tr><th>Previous config string</th><th>Berkeley DB 3.1 version method</th></tr> +<tr><td>DB_DATA_DIR</td><td><a href="../../api_c/env_set_data_dir.html">DBENV->set_data_dir</a></td></tr> +<tr><td>DB_LOG_DIR</td><td><a href="../../api_c/env_set_lg_dir.html">DBENV->set_lg_dir</a></td></tr> +<tr><td>DB_TMP_DIR</td><td><a href="../../api_c/env_set_tmp_dir.html">DBENV->set_tmp_dir</a></td></tr> +</table> +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/set_tx_recover.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git 
a/db/docs/ref/upgrade.3.1/disk.html b/db/docs/ref/upgrade.3.1/disk.html new file mode 100644 index 000000000..cbaa3342b --- /dev/null +++ b/db/docs/ref/upgrade.3.1/disk.html @@ -0,0 +1,34 @@ +<!--$Id: disk.so,v 1.9 2000/12/21 18:37:09 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: upgrade requirements</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.1/logalloc.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: upgrade requirements</h1> +<p>Log file formats and the Btree, Queue, Recno and Hash Access Method +database formats changed in the Berkeley DB 3.1 release. (The on-disk +Btree/Recno format changed from version 7 to version 8. The on-disk +Hash format changed from version 6 to version 7. The on-disk Queue +format changed from version 1 to version 2.) Until the underlying +databases are upgraded, the <a href="../../api_c/db_open.html">DB->open</a> function will return a +<a href="../../api_c/db_open.html#DB_OLD_VERSION">DB_OLD_VERSION</a> error. +<p>An additional flag, <a href="../../api_c/db_set_flags.html#DB_DUPSORT">DB_DUPSORT</a>, has been added to the +<a href="../../api_c/db_upgrade.html">DB->upgrade</a> function for this upgrade. 
Please review the +<a href="../../api_c/db_upgrade.html">DB->upgrade</a> documentation for further information. +<p>For further information on upgrading Berkeley DB installations, see +<a href="../../ref/upgrade/process.html">Upgrading Berkeley DB +installations</a>. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/logalloc.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.1/dup.html b/db/docs/ref/upgrade.3.1/dup.html new file mode 100644 index 000000000..33f71ebb4 --- /dev/null +++ b/db/docs/ref/upgrade.3.1/dup.html @@ -0,0 +1,31 @@ +<!--$Id: dup.so,v 1.1 2000/05/31 18:53:28 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: identical duplicate data items</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.1/put.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/btstat.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: identical duplicate data items</h1> +<p>In previous releases of Berkeley DB, it was 
not an error to store identical +duplicate data items, or, for those that just like the way it sounds, +duplicate duplicates. However, there were implementation bugs where +storing duplicate duplicates could cause database corruption. +<p>In this release, applications may store identical duplicate data items +as long as the data items are unsorted. It is an error to attempt to +store identical duplicate data items when duplicates are being stored +in a sorted order. This restriction is expected to be lifted in a future +release. See <a href="../../ref/am_conf/dup.html">Duplicate data items</a> +for more information. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/put.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/btstat.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.1/env.html b/db/docs/ref/upgrade.3.1/env.html new file mode 100644 index 000000000..6e1b8ccde --- /dev/null +++ b/db/docs/ref/upgrade.3.1/env.html @@ -0,0 +1,53 @@ +<!--$Id: env.so,v 1.1 2000/05/31 15:10:03 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: environment configuration</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a 
href="../../ref/upgrade.3.1/txn_check.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/tcl.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: environment configuration</h1> +<p>A set of DB_ENV configuration methods which were not environment +specific, but which instead affected the entire application space, have +been removed from the DB_ENV object and replaced by static +functions. The following table lists the DB_ENV methods previously +available to applications and the static functions that should now be used +instead. +<p><table border=1 align=center> +<tr><th>DB_ENV method</th><th>Berkeley DB 3.1 function</th></tr> +<tr><td>DBENV->set_func_close</td><td><a href="../../api_c/set_func_close.html">db_env_set_func_close</a></td></tr> +<tr><td>DBENV->set_func_dirfree</td><td><a href="../../api_c/set_func_dirfree.html">db_env_set_func_dirfree</a></td></tr> +<tr><td>DBENV->set_func_dirlist</td><td><a href="../../api_c/set_func_dirlist.html">db_env_set_func_dirlist</a></td></tr> +<tr><td>DBENV->set_func_exists</td><td><a href="../../api_c/set_func_exists.html">db_env_set_func_exists</a></td></tr> +<tr><td>DBENV->set_func_free</td><td><a href="../../api_c/set_func_free.html">db_env_set_func_free</a></td></tr> +<tr><td>DBENV->set_func_fsync</td><td><a href="../../api_c/set_func_fsync.html">db_env_set_func_fsync</a></td></tr> +<tr><td>DBENV->set_func_ioinfo</td><td><a href="../../api_c/set_func_ioinfo.html">db_env_set_func_ioinfo</a></td></tr> +<tr><td>DBENV->set_func_malloc</td><td><a href="../../api_c/set_func_malloc.html">db_env_set_func_malloc</a></td></tr> +<tr><td>DBENV->set_func_map</td><td><a href="../../api_c/set_func_map.html">db_env_set_func_map</a></td></tr> +<tr><td>DBENV->set_func_open</td><td><a href="../../api_c/set_func_open.html">db_env_set_func_open</a></td></tr> 
+<tr><td>DBENV->set_func_read</td><td><a href="../../api_c/set_func_read.html">db_env_set_func_read</a></td></tr> +<tr><td>DBENV->set_func_realloc</td><td><a href="../../api_c/set_func_realloc.html">db_env_set_func_realloc</a></td></tr> +<tr><td>DBENV->set_func_rename</td><td><a href="../../api_c/set_func_rename.html">db_env_set_func_rename</a></td></tr> +<tr><td>DBENV->set_func_seek</td><td><a href="../../api_c/set_func_seek.html">db_env_set_func_seek</a></td></tr> +<tr><td>DBENV->set_func_sleep</td><td><a href="../../api_c/set_func_sleep.html">db_env_set_func_sleep</a></td></tr> +<tr><td>DBENV->set_func_unlink</td><td><a href="../../api_c/set_func_unlink.html">db_env_set_func_unlink</a></td></tr> +<tr><td>DBENV->set_func_unmap</td><td><a href="../../api_c/set_func_unmap.html">db_env_set_func_unmap</a></td></tr> +<tr><td>DBENV->set_func_write</td><td><a href="../../api_c/set_func_write.html">db_env_set_func_write</a></td></tr> +<tr><td>DBENV->set_func_yield</td><td><a href="../../api_c/set_func_yield.html">db_env_set_func_yield</a></td></tr> +<tr><td>DBENV->set_pageyield</td><td><a href="../../api_c/env_set_pageyield.html">db_env_set_pageyield</a></td></tr> +<tr><td>DBENV->set_region_init</td><td><a href="../../api_c/env_set_region_init.html">db_env_set_region_init</a></td></tr> +<tr><td>DBENV->set_mutexlocks</td><td><a href="../../api_c/env_set_mutexlocks.html">DBENV->set_mutexlocks</a></td></tr> +<tr><td>DBENV->set_tas_spins</td><td><a href="../../api_c/env_set_tas_spins.html">db_env_set_tas_spins</a></td></tr> +</table> +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/txn_check.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/tcl.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git 
a/db/docs/ref/upgrade.3.1/intro.html b/db/docs/ref/upgrade.3.1/intro.html new file mode 100644 index 000000000..9c5d95291 --- /dev/null +++ b/db/docs/ref/upgrade.3.1/intro.html @@ -0,0 +1,26 @@ +<!--$Id: intro.so,v 1.4 2000/03/18 21:43:21 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: introduction</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.0/disk.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/config.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: introduction</h1> +<p>The following pages describe how to upgrade applications coded against +the Berkeley DB 3.0 release interfaces to the Berkeley DB 3.1 release interfaces. +This information does not describe how to upgrade Berkeley DB 1.85 release +applications. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.0/disk.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/config.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.1/log_register.html b/db/docs/ref/upgrade.3.1/log_register.html new file mode 100644 index 000000000..8823d6439 --- /dev/null +++ b/db/docs/ref/upgrade.3.1/log_register.html @@ -0,0 +1,28 @@ +<!--$Id: log_register.so,v 1.3 2000/07/25 16:59:37 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: log_register</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.1/sysmem.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/memp_register.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: log_register</h1> +<p>The arguments to the <a href="../../api_c/log_register.html">log_register</a> and <a href="../../api_c/log_unregister.html">log_unregister</a> +interfaces have changed. 
Instead of returning (and passing in) a logging +file ID, a reference to the DB structure being registered (or +unregistered) is passed. The application should be searched for any +occurrences of <a href="../../api_c/log_register.html">log_register</a> and <a href="../../api_c/log_unregister.html">log_unregister</a>. For each +one, change the arguments to be a reference to the DB structure +being registered or unregistered. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/sysmem.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/memp_register.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.1/logalloc.html b/db/docs/ref/upgrade.3.1/logalloc.html new file mode 100644 index 000000000..acafbf6ee --- /dev/null +++ b/db/docs/ref/upgrade.3.1/logalloc.html @@ -0,0 +1,27 @@ +<!--$Id: logalloc.so,v 1.1 2000/06/02 23:32:48 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: log file pre-allocation</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.1/tmp.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/disk.html"><img 
src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: log file pre-allocation</h1> +<p>This change only affects Win/32 applications. +<p>On Win/32 platforms Berkeley DB no longer pre-allocates log files. The problem +was a noticeable performance spike as each log file was created. To turn +this feature back on, search for the flag DB_OSO_LOG in the source file +<b>log/log_put.c</b> and make the change described there, or contact +Sleepycat Software for assistance. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/tmp.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/disk.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.1/memp_register.html b/db/docs/ref/upgrade.3.1/memp_register.html new file mode 100644 index 000000000..e8a667031 --- /dev/null +++ b/db/docs/ref/upgrade.3.1/memp_register.html @@ -0,0 +1,30 @@ +<!--$Id: memp_register.so,v 1.3 2000/07/25 16:59:37 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: memp_register</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.1/log_register.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img 
src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/txn_check.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: memp_register</h1> +<p>An additional argument has been added to the <b>pgin</b> and +<b>pgout</b> functions provided to the <a href="../../api_c/memp_register.html">memp_register</a> interface. +The application should be searched for any occurrences of +<a href="../../api_c/memp_register.html">memp_register</a>. For each one, if <b>pgin</b> or <b>pgout</b> +functions are specified, the <b>pgin</b> and <b>pgout</b> functions +should be modified to take an initial argument of a <b>DB_ENV *</b>. +This argument is intended to support better error reporting for +applications, and may be entirely ignored by the <b>pgin</b> and +<b>pgout</b> functions themselves. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/log_register.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/txn_check.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.1/put.html b/db/docs/ref/upgrade.3.1/put.html new file mode 100644 index 000000000..5252b3ac0 --- /dev/null +++ b/db/docs/ref/upgrade.3.1/put.html @@ -0,0 +1,63 @@ +<!--$Id: put.so,v 1.8 2000/07/25 16:59:37 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: DB->put</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access 
methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.1/set_paniccall.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/dup.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: DB->put</h1> +<p>For the Queue and Recno access methods, when the <a href="../../api_c/db_put.html#DB_APPEND">DB_APPEND</a> flag +is specified to the <a href="../../api_c/db_put.html">DB->put</a> interface, the allocated record number +is returned to the application in the <b>key</b> <a href="../../api_c/dbt.html">DBT</a> argument. +In previous releases of Berkeley DB, this <a href="../../api_c/dbt.html">DBT</a> structure did not follow +the usual <a href="../../api_c/dbt.html">DBT</a> conventions, e.g., it was not possible to cause +Berkeley DB to allocate space for the returned record number. Rather, it was +always assumed that the <b>data</b> field of the <b>key</b> structure +referenced memory that could be used as storage for a db_recno_t type. +<p>As of the Berkeley DB 3.1.0 release, the <b>key</b> structure behaves as +described in the <a href="../../api_c/dbt.html">DBT</a> C++/Java class or C structure documentation. +<p>Applications which are using the <a href="../../api_c/db_put.html#DB_APPEND">DB_APPEND</a> flag for Queue and +Recno access method databases will require a change to upgrade to the +Berkeley DB 3.1 releases. The simplest change is likely to be to add the +<a href="../../api_c/dbt.html#DB_DBT_USERMEM">DB_DBT_USERMEM</a> flag to the <b>key</b> structure. 
For example, +code that appears as follows: +<p><blockquote><pre>DBT key; +db_recno_t recno; +<p> +memset(&key, 0, sizeof(DBT)); +key.data = &recno; +key.size = sizeof(recno); +DB->put(DB, NULL, &key, &data, DB_APPEND); +printf("new record number is %lu\n", (u_long)recno);</pre></blockquote> +<p>would be changed to: +<p><blockquote><pre>DBT key; +db_recno_t recno; +<p> +memset(&key, 0, sizeof(DBT)); +key.data = &recno; +key.ulen = sizeof(recno); +key.flags = DB_DBT_USERMEM; +DB->put(DB, NULL, &key, &data, DB_APPEND); +printf("new record number is %lu\n", (u_long)recno);</pre></blockquote> +<p>Note that the <b>ulen</b> field is now set as well as the flag value. +An alternative change would be: +<p><blockquote><pre>DBT key; +db_recno_t recno; +<p> +memset(&key, 0, sizeof(DBT)); +DB->put(DB, NULL, &key, &data, DB_APPEND); +recno = *(db_recno_t *)key.data; +printf("new record number is %lu\n", (u_long)recno);</pre></blockquote> +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/set_paniccall.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/dup.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.1/set_feedback.html b/db/docs/ref/upgrade.3.1/set_feedback.html new file mode 100644 index 000000000..c7b7864b9 --- /dev/null +++ b/db/docs/ref/upgrade.3.1/set_feedback.html @@ -0,0 +1,27 @@ +<!--$Id: set_feedback.so,v 1.3 2000/07/25 16:59:37 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: DBENV->set_feedback, DB->set_feedback</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.1/set_tx_recover.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/set_paniccall.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: DBENV->set_feedback, DB->set_feedback</h1> +<p>Starting with the 3.1 release of Berkeley DB, the <a href="../../api_c/env_set_feedback.html">DBENV->set_feedback</a> +and <a href="../../api_c/db_set_feedback.html">DB->set_feedback</a> functions may return an error value. They +are no longer declared as returning no value; instead, they return an int +or throw an exception, as appropriate, when an error occurs. +<p>If your application calls these functions, you may want to check for a +possible error on return. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/set_tx_recover.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/set_paniccall.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.1/set_paniccall.html b/db/docs/ref/upgrade.3.1/set_paniccall.html new file mode 100644 index 000000000..8aa554cf0 --- /dev/null +++ b/db/docs/ref/upgrade.3.1/set_paniccall.html @@ -0,0 +1,27 @@ +<!--$Id: set_paniccall.so,v 1.4 2000/07/25 16:59:37 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: DBENV->set_paniccall, DB->set_paniccall</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.1/set_feedback.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/put.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: DBENV->set_paniccall, DB->set_paniccall</h1> +<p>Starting with the 3.1 release of Berkeley DB, the <a href="../../api_c/env_set_paniccall.html">DBENV->set_paniccall</a> +and <a href="../../api_c/db_set_paniccall.html">DB->set_paniccall</a> functions may return an error value. 
They +are no longer declared as returning no value; instead, they return an int +or throw an exception, as appropriate, when an error occurs. +<p>If your application calls these functions, you may want to check for a +possible error on return. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/set_feedback.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/put.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.1/set_tx_recover.html b/db/docs/ref/upgrade.3.1/set_tx_recover.html new file mode 100644 index 000000000..9943845e8 --- /dev/null +++ b/db/docs/ref/upgrade.3.1/set_tx_recover.html @@ -0,0 +1,36 @@ +<!--$Id: set_tx_recover.so,v 1.9 2000/07/25 16:59:37 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: DBENV->set_tx_recover</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.1/config.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/set_feedback.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: DBENV->set_tx_recover</h1> +<p>The redo parameter of the 
function passed to <a href="../../api_c/env_set_tx_recover.html">DBENV->set_tx_recover</a> +used to be an integer set to one of several #defined values. In +the 3.1 release of Berkeley DB, the redo parameter has been replaced by the op +parameter, which is of the enumerated type db_recops. +<p>If your application calls <a href="../../api_c/env_set_tx_recover.html">DBENV->set_tx_recover</a>, then find the +function referenced in the call. Replace the flag values in that function +as follows: +<p><table border=1 align=center> +<tr><th>Previous flag</th><th>Berkeley DB 3.1 version flag</th></tr> +<tr><td>TXN_BACKWARD_ROLL</td><td>DB_TXN_BACKWARD_ROLL</td></tr> +<tr><td>TXN_FORWARD_ROLL</td><td>DB_TXN_FORWARD_ROLL</td></tr> +<tr><td>TXN_OPENFILES</td><td>DB_TXN_OPENFILES</td></tr> +<tr><td>TXN_REDO</td><td>DB_TXN_FORWARD_ROLL</td></tr> +<tr><td>TXN_UNDO</td><td>DB_TXN_ABORT</td></tr> +</table> +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/config.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/set_feedback.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.1/sysmem.html b/db/docs/ref/upgrade.3.1/sysmem.html new file mode 100644 index 000000000..7e21a565e --- /dev/null +++ b/db/docs/ref/upgrade.3.1/sysmem.html @@ -0,0 +1,25 @@ +<!--$Id: sysmem.so,v 1.3 2000/07/25 16:59:37 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: DB_SYSTEM_MEM</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.1/btstat.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/log_register.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: DB_SYSTEM_MEM</h1> +<p>Using the <a href="../../api_c/env_open.html#DB_SYSTEM_MEM">DB_SYSTEM_MEM</a> option on UNIX systems now requires the +specification of a base system memory segment ID, using the +<a href="../../api_c/env_set_shm_key.html">DBENV->set_shm_key</a> function. Any valid segment ID may be specified, for +example, one returned by the UNIX <b>ftok</b>(3) interface. 
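A minimal sketch of deriving such a segment ID with ftok(3), as mentioned above, might look like the following. The helper name derive_shm_key and its parameters env_home and project_id are illustrative assumptions and are not part of Berkeley DB; only ftok itself is a standard UNIX interface.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>

/*
 * derive_shm_key --
 *	Hypothetical helper (not a Berkeley DB function): derive a base
 *	System V IPC key from an existing path, such as the environment
 *	home directory, and a small project identifier.
 *
 *	Returns (key_t)-1 if ftok(3) fails, for example because the path
 *	does not exist.
 */
key_t
derive_shm_key(const char *env_home, int project_id)
{
	key_t key;

	key = ftok(env_home, project_id);
	if (key == (key_t)-1)
		perror("ftok");		/* Report why the key could not be derived. */
	return (key);
}
```

The resulting key would then be handed to DBENV->set_shm_key before the environment is opened with DB_SYSTEM_MEM; consult the DBENV->set_shm_key page for the exact calling convention in your release.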
+<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/btstat.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/log_register.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.1/tcl.html b/db/docs/ref/upgrade.3.1/tcl.html new file mode 100644 index 000000000..0f964abb3 --- /dev/null +++ b/db/docs/ref/upgrade.3.1/tcl.html @@ -0,0 +1,30 @@ +<!--$Id: tcl.so,v 1.5 2000/06/02 14:50:20 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: Tcl API</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.1/env.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/tmp.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: Tcl API</h1> +<p>The Berkeley DB Tcl API has been modified so that the <b>-mpool</b> option to +the <b>berkdb env</b> command is now the default behavior. The Tcl API +has also been modified so that the <b>-txn</b> option to the +<b>berkdb env</b> command implies the <b>-lock</b> and <b>-log</b> +options. 
Tcl scripts should be updated to remove the <b>-mpool</b>, +<b>-lock</b> and <b>-log</b> options. +<p>The Berkeley DB Tcl API has been modified to follow the Tcl standard rules for +integer conversion, e.g., if the first two characters of a record number +are "0x", the record number is expected to be in hexadecimal form. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/env.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/tmp.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.1/tmp.html b/db/docs/ref/upgrade.3.1/tmp.html new file mode 100644 index 000000000..72034803b --- /dev/null +++ b/db/docs/ref/upgrade.3.1/tmp.html @@ -0,0 +1,34 @@ +<!--$Id: tmp.so,v 1.7 2000/05/22 20:26:35 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: DB_TMP_DIR</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.1/tcl.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/logalloc.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: DB_TMP_DIR</h1> +<p>This change only affects Win/32 
applications that create in-memory +databases. +<p>On Win/32 platforms an additional test has been added when searching for +the appropriate directory in which to create the temporary files that are +used to back in-memory databases. Berkeley DB now uses any return value from +the GetTempPath interface as the temporary file directory name before +resorting to the static list of compiled-in pathnames. +<p>If the system registry does not return the same directory as Berkeley DB has +been using previously, this change could cause temporary backing files to +move to a new directory when applications are upgraded to the 3.1 release. +In extreme cases, this could create (or fix) security problems if the file +protection modes for the system registry directory are different from +those on the directory previously used by Berkeley DB. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/tcl.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/logalloc.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.1/toc.html b/db/docs/ref/upgrade.3.1/toc.html new file mode 100644 index 000000000..091318810 --- /dev/null +++ b/db/docs/ref/upgrade.3.1/toc.html @@ -0,0 +1,33 @@ +<!--$Id: toc.so,v 1.2 2000/12/05 20:36:27 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB: Upgrading Berkeley DB 3.0.X applications to Berkeley DB 3.1</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access 
methods,java,C,C++"> +</head> +<body bgcolor=white> +<h1 align=center>Upgrading Berkeley DB 3.0.X applications to Berkeley DB 3.1</h1> +<ol> +<li><a href="intro.html">Release 3.1: introduction</a> +<li><a href="config.html">Release 3.1: DBENV->open, DBENV->remove</a> +<li><a href="set_tx_recover.html">Release 3.1: DBENV->set_tx_recover</a> +<li><a href="set_feedback.html">Release 3.1: DBENV->set_feedback, DB->set_feedback</a> +<li><a href="set_paniccall.html">Release 3.1: DBENV->set_paniccall, DB->set_paniccall</a> +<li><a href="put.html">Release 3.1: DB->put</a> +<li><a href="dup.html">Release 3.1: identical duplicate data items</a> +<li><a href="btstat.html">Release 3.1: DB->stat</a> +<li><a href="sysmem.html">Release 3.1: DB_SYSTEM_MEM</a> +<li><a href="log_register.html">Release 3.1: log_register</a> +<li><a href="memp_register.html">Release 3.1: memp_register</a> +<li><a href="txn_check.html">Release 3.1: txn_checkpoint</a> +<li><a href="env.html">Release 3.1: environment configuration</a> +<li><a href="tcl.html">Release 3.1: Tcl API</a> +<li><a href="tmp.html">Release 3.1: DB_TMP_DIR</a> +<li><a href="logalloc.html">Release 3.1: log file pre-allocation</a> +<li><a href="disk.html">Release 3.1: upgrade requirements</a> +</ol> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.1/txn_check.html b/db/docs/ref/upgrade.3.1/txn_check.html new file mode 100644 index 000000000..27dc3851f --- /dev/null +++ b/db/docs/ref/upgrade.3.1/txn_check.html @@ -0,0 +1,26 @@ +<!--$Id: txn_check.so,v 1.6 2000/07/25 16:59:37 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.1: txn_checkpoint</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.1/memp_register.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/env.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.1: txn_checkpoint</h1> +<p>An additional argument has been added to the <a href="../../api_c/txn_checkpoint.html">txn_checkpoint</a> +interface. +<p>The application should be searched for any occurrences of +<a href="../../api_c/txn_checkpoint.html">txn_checkpoint</a>. For each one, an argument of 0 should be appended +to the current arguments. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/memp_register.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.1/env.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.2/callback.html b/db/docs/ref/upgrade.3.2/callback.html new file mode 100644 index 000000000..f60a81d5c --- /dev/null +++ b/db/docs/ref/upgrade.3.2/callback.html @@ -0,0 +1,39 @@ +<!--$Id: callback.so,v 1.5 2000/10/26 15:20:40 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.2: DB callback functions, app_private field</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic 
toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.2/set_flags.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/renumber.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.2: DB callback functions, app_private field</h1> +<p>In the Berkeley DB 3.2 release, four application callback functions (the +callback functions set by <a href="../../api_c/db_set_bt_compare.html">DB->set_bt_compare</a>, +<a href="../../api_c/db_set_bt_prefix.html">DB->set_bt_prefix</a>, <a href="../../api_c/db_set_dup_compare.html">DB->set_dup_compare</a> and +<a href="../../api_c/db_set_h_hash.html">DB->set_h_hash</a>) were modified to take a reference to a +DB object as their first argument. This change allows the Berkeley DB +Java API to reasonably support these interfaces. There is currently no +need for the callback functions to do anything with this additional +argument. +<p>C and C++ applications that specify their own Btree key comparison, +Btree prefix comparison, duplicate data item comparison or Hash +functions should modify these functions to take a reference to a +DB structure as their first argument. No further change is +required. +<p>The app_private field of the <a href="../../api_c/dbt.html">DBT</a> structure (accessible only from +the Berkeley DB C API) has been removed in the 3.2 release. It was replaced +with app_private fields in the DB_ENV and DB handles. +Applications using this field will have to convert to using one of the +replacement fields. 
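As a hedged illustration of the callback change described above, the sketch below shows a Btree key comparison function that takes a DB reference as its first argument and ignores it. The name compare_bytes is hypothetical, and the DB and DBT declarations are stand-ins (reduced to the data and size fields) so that the fragment compiles without db.h; a real application would include db.h and drop them.

```c
#include <stddef.h>
#include <string.h>

/*
 * Stand-in declarations, for illustration only. A real application
 * includes <db.h>, which supplies the actual DB and DBT types.
 */
typedef struct sketch_db DB;		/* Opaque handle; never dereferenced here. */
typedef struct {
	void *data;			/* Key or data bytes. */
	unsigned int size;		/* Length of data, in bytes. */
} DBT;

/*
 * compare_bytes --
 *	Hypothetical byte-wise key comparison using the 3.2-style
 *	signature: the DB reference comes first and may be ignored.
 *	Returns <0, 0 or >0 as a sorts before, equal to, or after b.
 */
int
compare_bytes(DB *dbp, const DBT *a, const DBT *b)
{
	size_t len;
	int cmp;

	(void)dbp;			/* Present only to match the new signature. */

	len = a->size < b->size ? a->size : b->size;
	cmp = len == 0 ? 0 : memcmp(a->data, b->data, len);
	if (cmp != 0)
		return (cmp);

	/* Common prefix is identical: the shorter key sorts first. */
	if (a->size < b->size)
		return (-1);
	return (a->size > b->size ? 1 : 0);
}
```

A function of this shape would be registered through DB->set_bt_compare before the database is opened. A pre-3.2 comparison function can usually be converted by adding the leading argument and leaving its body unchanged.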
+<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.2/set_flags.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/renumber.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.2/db_dump.html b/db/docs/ref/upgrade.3.2/db_dump.html new file mode 100644 index 000000000..87d909086 --- /dev/null +++ b/db/docs/ref/upgrade.3.2/db_dump.html @@ -0,0 +1,29 @@ +<!--$Id: db_dump.so,v 1.3 2000/11/28 21:27:49 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.2: db_dump</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.2/notfound.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/disk.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.2: db_dump</h1> +<p>In previous releases of Berkeley DB, the <a href="../../utility/db_dump.html">db_dump</a> utility dumped Recno +access method database keys as numeric strings. 
For consistency, the +<a href="../../utility/db_dump.html">db_dump</a> utility has been changed in the 3.2 release to dump +record numbers as hex pairs when the data items are being dumped as hex +pairs. (See the <b>-k</b> and <b>-p</b> options to the +<a href="../../utility/db_dump.html">db_dump</a> utility for more information.) Any applications or +scripts post-processing the <a href="../../utility/db_dump.html">db_dump</a> output of Recno databases +under these conditions may require modification. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.2/notfound.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/disk.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.2/disk.html b/db/docs/ref/upgrade.3.2/disk.html new file mode 100644 index 000000000..8cebb9319 --- /dev/null +++ b/db/docs/ref/upgrade.3.2/disk.html @@ -0,0 +1,28 @@ +<!--$Id: disk.so,v 1.4 2000/12/21 18:37:09 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.2: upgrade requirements</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.2/db_dump.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" 
alt="Ref"></a><a href="../../ref/test/run.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.2: upgrade requirements</h1> +<p>Log file formats and the Queue Access Method database formats changed +in the Berkeley DB 3.2 release. (The on-disk Queue format changed from +version 2 to version 3.) Until the underlying databases are upgraded, +the <a href="../../api_c/db_open.html">DB->open</a> function will return a <a href="../../api_c/db_open.html#DB_OLD_VERSION">DB_OLD_VERSION</a> error. +<p>For further information on upgrading Berkeley DB installations, see +<a href="../../ref/upgrade/process.html">Upgrading Berkeley DB +installations</a>. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.2/db_dump.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/test/run.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.2/handle.html b/db/docs/ref/upgrade.3.2/handle.html new file mode 100644 index 000000000..86f86a03a --- /dev/null +++ b/db/docs/ref/upgrade.3.2/handle.html @@ -0,0 +1,27 @@ +<!--$Id: handle.so,v 1.2 2000/11/17 19:56:16 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.2: Java and C++ object re-use</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB 
Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.2/mutexlock.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/notfound.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.2: Java and C++ object re-use</h1> +<p>In previous releases of Berkeley DB, Java <a href="../../api_java/dbenv_class.html">DbEnv</a> and <a href="../../api_java/db_class.html">Db</a> +objects, and C++ <a href="../../api_cxx/dbenv_class.html">DbEnv</a> and <a href="../../api_cxx/db_class.html">Db</a> objects could be +re-used after they were closed, by calling open on them again. This is +no longer permitted, and these objects no longer allow any operations +after a close. Applications re-using these objects should be modified +to create new objects instead. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.2/mutexlock.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/notfound.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.2/incomplete.html b/db/docs/ref/upgrade.3.2/incomplete.html new file mode 100644 index 000000000..5aeb77559 --- /dev/null +++ b/db/docs/ref/upgrade.3.2/incomplete.html @@ -0,0 +1,39 @@ +<!--$Id: incomplete.so,v 1.4 2000/12/07 15:59:23 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.2: DB_INCOMPLETE</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.2/renumber.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/tx_recover.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.2: DB_INCOMPLETE</h1> +<p>There are a number of functions that flush pages from the Berkeley DB shared +memory buffer pool to disk. Most of those functions can potentially +fail because a page that needs to be flushed is not currently available. +However, this is not a hard failure and is rarely cause for concern. +In the Berkeley DB 3.2 release, the C++ API (if that API is configured to +throw exceptions) and the Java API have been changed so that this +failure does not throw an exception, but rather returns a non-zero error +code of <a href="../../api_c/memp_fsync.html#DB_INCOMPLETE">DB_INCOMPLETE</a>. +<p>The following C++ methods will return <a href="../../api_c/memp_fsync.html#DB_INCOMPLETE">DB_INCOMPLETE</a> rather than throw +an exception: <a href="../../api_cxx/db_close.html">Db::close</a>, <a href="../../api_cxx/db_sync.html">Db::sync</a>, <a href="../../api_cxx/memp_sync.html">DbEnv::memp_sync</a>, +<a href="../../api_cxx/txn_checkpoint.html">DbEnv::txn_checkpoint</a>, <a href="../../api_cxx/memp_fsync.html">DbMpoolFile::sync</a>. 
+<p>The following Java methods are now declared "public int" rather than +"public void", and will return <a href="../../api_c/memp_fsync.html#DB_INCOMPLETE">Db.DB_INCOMPLETE</a> rather than +throw an exception: <a href="../../api_java/db_close.html">Db.close</a>, <a href="../../api_java/db_sync.html">Db.sync</a>, +<a href="../../api_java/txn_checkpoint.html">DbEnv.txn_checkpoint</a>. +<p>It is likely that the only applications requiring changes will be +those currently checking for a <a href="../../api_c/memp_fsync.html#DB_INCOMPLETE">DB_INCOMPLETE</a> return that has +been encapsulated in an exception. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.2/renumber.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/tx_recover.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.2/intro.html b/db/docs/ref/upgrade.3.2/intro.html new file mode 100644 index 000000000..df4d573a0 --- /dev/null +++ b/db/docs/ref/upgrade.3.2/intro.html @@ -0,0 +1,26 @@ +<!--$Id: intro.so,v 1.3 2000/10/03 17:17:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.2: introduction</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a 
href="../../ref/upgrade.3.1/disk.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/set_flags.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.2: introduction</h1> +<p>The following pages describe how to upgrade applications coded against +the Berkeley DB 3.1 release interfaces to the Berkeley DB 3.2 release interfaces. +This information does not describe how to upgrade Berkeley DB 1.85 release +applications. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.1/disk.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/set_flags.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.2/mutexlock.html b/db/docs/ref/upgrade.3.2/mutexlock.html new file mode 100644 index 000000000..fb1b87ca9 --- /dev/null +++ b/db/docs/ref/upgrade.3.2/mutexlock.html @@ -0,0 +1,28 @@ +<!--$Id: mutexlock.so,v 1.1 2000/11/17 19:56:16 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.2: DBENV->set_mutexlocks</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.2/tx_recover.html"><img 
src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/handle.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.2: DBENV->set_mutexlocks</h1> +<p>Previous Berkeley DB releases included the db_env_set_mutexlocks interface, +intended for debugging, that allows applications to always obtain +requested mutual exclusion mutexes without regard for their +availability. This interface has been replaced with +<a href="../../api_c/env_set_mutexlocks.html">DBENV->set_mutexlocks</a>, which provides the same functionality on +a per-database environment basis. Applications using the old interface +should be updated to use the new one. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.2/tx_recover.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/handle.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.2/notfound.html b/db/docs/ref/upgrade.3.2/notfound.html new file mode 100644 index 000000000..cb40beaae --- /dev/null +++ b/db/docs/ref/upgrade.3.2/notfound.html @@ -0,0 +1,25 @@ +<!--$Id: notfound.so,v 1.1 2000/10/25 14:27:30 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.2: Java java.io.FileNotFoundException</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body 
bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.2/handle.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/db_dump.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.2: Java java.io.FileNotFoundException</h1> +<p>The Java <a href="../../api_java/env_remove.html">DbEnv.remove</a>, <a href="../../api_java/db_remove.html">Db.remove</a> and +<a href="../../api_java/db_rename.html">Db.rename</a> methods now throw java.io.FileNotFoundException +in the case where the named file does not exist. Applications should +be modified to catch this exception where appropriate. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.2/handle.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/db_dump.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.2/renumber.html b/db/docs/ref/upgrade.3.2/renumber.html new file mode 100644 index 000000000..619fa07ff --- /dev/null +++ b/db/docs/ref/upgrade.3.2/renumber.html @@ -0,0 +1,39 @@ +<!--$Id: renumber.so,v 1.3 2000/12/01 18:33:57 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.2: Logically renumbering records</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" 
content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.2/callback.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/incomplete.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.2: Logically renumbering records</h1> +<p>In the Berkeley DB 3.2 release, cursor adjustment semantics changed for Recno +databases with mutable record numbers. Before the 3.2 release, cursors +were adjusted to point to the previous or next record at the time the +record referenced by the cursor was deleted. This could lead to +unexpected behaviors. For example, two cursors referencing sequential +records that were both deleted would lose their relationship to each +other and would reference the same position in the database instead of +their original sequential relationship. There were also command +sequences that would have unexpected results. For example, DB_AFTER +and DB_BEFORE cursor put operations, using a cursor previously used to +delete an item, would perform the put relative to the cursor's adjusted +position and not its original position. +<p>In the Berkeley DB 3.2 release, cursors maintain their position in the tree +regardless of deletion operations using the cursor. Applications that +perform database operations, using cursors previously used to delete +entries in Recno databases with mutable record numbers, should be +evaluated to ensure that the new semantics do not cause application +failure. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.2/callback.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/incomplete.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.2/set_flags.html b/db/docs/ref/upgrade.3.2/set_flags.html new file mode 100644 index 000000000..b1bbe906b --- /dev/null +++ b/db/docs/ref/upgrade.3.2/set_flags.html @@ -0,0 +1,35 @@ +<!--$Id: set_flags.so,v 1.1 2000/10/03 17:17:36 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.2: DBENV->set_flags</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.2/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/callback.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.2: DBENV->set_flags</h1> +<p>A new method has been added to the Berkeley DB environment handle, +<a href="../../api_c/env_set_flags.html">DBENV->set_flags</a>. 
This interface currently takes three flags: +<a href="../../api_c/env_set_flags.html#DB_CDB_ALLDB">DB_CDB_ALLDB</a>, <a href="../../api_c/env_open.html#DB_NOMMAP">DB_NOMMAP</a> and <a href="../../api_c/env_open.html#DB_TXN_NOSYNC">DB_TXN_NOSYNC</a>. The +first of these flags, <a href="../../api_c/env_set_flags.html#DB_CDB_ALLDB">DB_CDB_ALLDB</a>, provides new functionality, +allowing Berkeley DB Concurrent Data Store applications to do locking across multiple databases. +<p>The other two flags, <a href="../../api_c/env_open.html#DB_NOMMAP">DB_NOMMAP</a> and <a href="../../api_c/env_open.html#DB_TXN_NOSYNC">DB_TXN_NOSYNC</a>, were +specified to the <a href="../../api_c/env_open.html">DBENV->open</a> method in previous releases. In +the 3.2 release, they have been moved to the <a href="../../api_c/env_set_flags.html">DBENV->set_flags</a> function +because this allows the database environment's value to be toggled +during the life of the application as well as because it is a more +appropriate place for them. Applications specifying either the +<a href="../../api_c/env_open.html#DB_NOMMAP">DB_NOMMAP</a> or <a href="../../api_c/env_open.html#DB_TXN_NOSYNC">DB_TXN_NOSYNC</a> flags to the +<a href="../../api_c/env_open.html">DBENV->open</a> function should replace those flags with calls to the +<a href="../../api_c/env_set_flags.html">DBENV->set_flags</a> function. 
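The flag migration described above can be sketched as a small mask-splitting helper in C. This is an illustration under stated assumptions, not Berkeley DB code: the `EX_` constants are placeholder bit values (the real ones come from the library headers), and `split_env_flags` is a hypothetical helper name. In a real application the second mask would be passed to DBENV->set_flags and the first to DBENV->open; see the linked API pages for the exact signatures.

```c
#include <stdint.h>

/* Placeholder bit values for illustration only; the real flag values
 * are defined by the Berkeley DB headers. */
#define EX_DB_NOMMAP     0x01u
#define EX_DB_TXN_NOSYNC 0x02u
#define EX_DB_CREATE     0x04u
#define EX_DB_INIT_TXN   0x08u

/* Result of splitting a pre-3.2 open flag mask. */
struct split_flags {
    uint32_t open_flags; /* flags that still belong on the open call */
    uint32_t set_flags;  /* flags that moved to the set_flags call */
};

/* Separate the two flags that moved in 3.2 from everything else. */
struct split_flags split_env_flags(uint32_t old_open_flags)
{
    const uint32_t moved = EX_DB_NOMMAP | EX_DB_TXN_NOSYNC;
    struct split_flags out;

    out.set_flags = old_open_flags & moved;
    out.open_flags = old_open_flags & ~moved;
    return out;
}
```

Splitting the mask in one place keeps the upgrade mechanical: call sites that built a single flag word for the open call can keep doing so and hand each half to the appropriate function.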
+<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.2/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/callback.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.2/toc.html b/db/docs/ref/upgrade.3.2/toc.html new file mode 100644 index 000000000..8a466d1b4 --- /dev/null +++ b/db/docs/ref/upgrade.3.2/toc.html @@ -0,0 +1,27 @@ +<!--$Id: toc.so,v 1.7 2000/12/07 15:59:23 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB: Upgrading Berkeley DB 3.1.X applications to Berkeley DB 3.2</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<h1 align=center>Upgrading Berkeley DB 3.1.X applications to Berkeley DB 3.2</h1> +<ol> +<li><a href="intro.html">Release 3.2: introduction</a> +<li><a href="set_flags.html">Release 3.2: DBENV->set_flags</a> +<li><a href="callback.html">Release 3.2: DB callback functions, app_private field</a> +<li><a href="renumber.html">Release 3.2: logically renumbering records</a> +<li><a href="incomplete.html">Release 3.2: DB_INCOMPLETE</a> +<li><a href="tx_recover.html">Release 3.2: DBENV->set_tx_recover</a> +<li><a href="mutexlock.html">Release 3.2: DBENV->set_mutexlocks</a> +<li><a href="handle.html">Release 3.2: Java and C++ object re-use</a> +<li><a href="notfound.html">Release 3.2: Java java.io.FileNotFoundException</a> +<li><a href="db_dump.html">Release 3.2: db_dump</a> +<li><a 
href="disk.html">Release 3.2: upgrade requirements</a> +</ol> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade.3.2/tx_recover.html b/db/docs/ref/upgrade.3.2/tx_recover.html new file mode 100644 index 000000000..c5cf18ebc --- /dev/null +++ b/db/docs/ref/upgrade.3.2/tx_recover.html @@ -0,0 +1,32 @@ +<!--$Id: tx_recover.so,v 1.11 2000/12/07 15:59:23 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Release 3.2: DBENV->set_tx_recover</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/upgrade.3.2/incomplete.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/mutexlock.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Release 3.2: DBENV->set_tx_recover</h1> +<p>The <b>info</b> parameter of the function passed to +<a href="../../api_c/env_set_tx_recover.html">DBENV->set_tx_recover</a> is no longer needed. If your application +calls <a href="../../api_c/env_set_tx_recover.html">DBENV->set_tx_recover</a>, find the callback function referenced +in that call and remove the <b>info</b> parameter. +<p>In addition, the called function no longer needs to handle Berkeley DB log +records; Berkeley DB will handle them internally as well as call the +application-specified function. 
Any handling of Berkeley DB log records in the +application's callback function may be removed. +<p>In addition, the callback function will no longer be called with the +<a href="../../api_c/env_set_tx_recover.html#DB_TXN_FORWARD_ROLL">DB_TXN_FORWARD_ROLL</a> flag specified unless the transaction +enclosing the operation successfully committed. +<table><tr><td><br></td><td width="1%"><a href="../../ref/upgrade.3.2/incomplete.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.3.2/mutexlock.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/upgrade/process.html b/db/docs/ref/upgrade/process.html new file mode 100644 index 000000000..40be3c8e8 --- /dev/null +++ b/db/docs/ref/upgrade/process.html @@ -0,0 +1,108 @@ +<!--$Id: process.so,v 1.1 2000/12/05 20:39:10 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Upgrading Berkeley DB installations</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Upgrading Berkeley DB Applications</dl></h3></td> +<td width="1%"><a href="../../ref/build_vxworks/faq.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.2.0/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Upgrading 
Berkeley DB installations</h1> +<p>The following information describes the general process of upgrading +Berkeley DB installations. There are three issues to be considered when +upgrading Berkeley DB applications and database environments. They are the +application API, the underlying database formats, and, in the case of +transactional database environments, the log files. +<p>An application must always be re-compiled to use a new Berkeley DB release. +Internal Berkeley DB interfaces may change at any time and in any release, +without warning. This means the application and library must be entirely +recompiled and reinstalled when upgrading to new releases of the +library, as there is no guarantee that modules from one version of the +library will interact correctly with modules from another release. +<p>A Berkeley DB patch release will never modify the Berkeley DB API, log file or +database formats in non-backward compatible ways. Berkeley DB minor and major +releases may optionally include changes to the Berkeley DB application API, +log files and database formats that are not backward compatible. Note +that there are several underlying Berkeley DB database formats. Because not all of +them necessarily change at the same time, changes to one database +format in a release may not affect any particular application. +<p>Each Berkeley DB minor or major release has an upgrading section in this +chapter of the Berkeley DB Reference Guide. The section describes any API +changes that were made in the release. Application maintainers must +review the API changes, update their applications as necessary, and then +re-compile using the new release. In addition, each section includes +a page specifying whether the log file format or database formats changed in +non-backward compatible ways as part of the release. 
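The installation procedures listed on the rest of this page differ only in which steps they include, depending on whether the environment is transactional and which formats changed. As a reading aid, that branching can be sketched in C as below. The enum and function names are hypothetical and exist only for this illustration; the bitmask records which steps apply, and the order in which to perform them is the order of the enum.

```c
/* Steps named on this page, in the order they are performed.
 * Names are hypothetical and exist only for this sketch. */
enum upgrade_step {
    STEP_SHUTDOWN   = 1 << 0, /* shut down the old application */
    STEP_REMOVE_ENV = 1 << 1, /* remove the environment (non-transactional) */
    STEP_RECOVER    = 1 << 2, /* run recovery (transactional) */
    STEP_ARCHIVE    = 1 << 3, /* archive for catastrophic recovery */
    STEP_INSTALL    = 1 << 4, /* install the new application */
    STEP_UPGRADE_DB = 1 << 5, /* upgrade the databases */
    STEP_REARCHIVE  = 1 << 6, /* optional second archive after upgrading */
    STEP_RESTART    = 1 << 7  /* re-start the application */
};

/* Return the set of steps for one installation, given whether the
 * environment is transactional and which on-disk formats changed. */
unsigned upgrade_steps(int transactional, int log_changed, int db_changed)
{
    unsigned steps = STEP_SHUTDOWN | STEP_INSTALL | STEP_RESTART;

    if (!transactional) {
        steps |= STEP_REMOVE_ENV;
        if (db_changed)
            steps |= STEP_UPGRADE_DB;
        return steps;
    }

    steps |= STEP_RECOVER;
    if (log_changed || db_changed)
        steps |= STEP_ARCHIVE;
    if (db_changed)
        steps |= STEP_UPGRADE_DB | STEP_REARCHIVE;
    return steps;
}
```

For example, a transactional environment whose log format changed but whose database formats did not yields shutdown, recovery, archive, install, and restart.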
+<p>If the application does not have a Berkeley DB transactional environment, the +re-compiled application may be installed in the field using the +following steps: +<p><ol> +<p><li>Shut down the old version of the application. +<p><li>Remove any Berkeley DB environment, using the <a href="../../api_c/env_remove.html">DBENV->remove</a> function or an +appropriate system utility. +<p><li>Install the new version of the application. +<p><li>If the database format has changed, upgrade the application's databases. +See <a href="../../ref/am/upgrade.html">Upgrading databases</a> for more +information. +<p><li>Re-start the application. +</ol> +<p>If the application has a Berkeley DB transactional environment, but neither +the log file nor database formats have changed, the re-compiled +application may be installed in the field using the following steps: +<p><ol> +<p><li>Shut down the old version of the application. +<p><li>Run recovery on the database environment, using the <a href="../../api_c/env_open.html">DBENV->open</a> function +or the <a href="../../utility/db_recover.html">db_recover</a> utility. +<p><li>Install the new version of the application. +<p><li>Re-start the application. +</ol> +<p>If the application has a Berkeley DB transactional environment, and the log +file format has changed but the database formats have not, the +re-compiled application may be installed in the field using the +following steps: +<p><ol> +<p><li>Shut down the old version of the application. +<p><li>Run recovery on the database environment, using the <a href="../../api_c/env_open.html">DBENV->open</a> function +or the <a href="../../utility/db_recover.html">db_recover</a> utility. +<p><li>Archive the database environment for catastrophic recovery. See +<a href="../../ref/transapp/archival.html">Archival procedures</a> for more +information. +<p><li>Install the new version of the application. +<p><li>Re-start the application. 
+</ol> +<p>If the application has a Berkeley DB transactional environment and the +database format has changed, the re-compiled application may be +installed in the field using the following steps: +<p><ol> +<p><li>Shut down the old version of the application. +<p><li>Run recovery on the database environment, using the <a href="../../api_c/env_open.html">DBENV->open</a> function +or the <a href="../../utility/db_recover.html">db_recover</a> utility. +<p><li>Archive the database environment for catastrophic recovery. See +<a href="../../ref/transapp/archival.html">Archival procedures</a> for more +information. +<p><li>Install the new version of the application. +<p><li>Upgrade the application's databases. See +<a href="../../ref/am/upgrade.html">Upgrading databases</a> for more +information. +<p><li>Archive the database for catastrophic recovery again (using different +media than before, of course). +<p>This archival is not strictly necessary. However, if you have to perform +catastrophic recovery after restarting your applications, that recovery +must be done based on the last archive you have made. If you make this +archive, you can use it as the basis of your catastrophic recovery. If +you do not make this archive, you will have to use the archive you made +in step #3 as the basis of your recovery, and you will have to upgrade it +as described in step #5 before you can apply your log files to it. +<p><li>Re-start the application. 
+</ol> +<table><tr><td><br></td><td width="1%"><a href="../../ref/build_vxworks/faq.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/upgrade.2.0/intro.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/xa/config.html b/db/docs/ref/xa/config.html new file mode 100644 index 000000000..cfe31f372 --- /dev/null +++ b/db/docs/ref/xa/config.html @@ -0,0 +1,79 @@ +<!--$Id: config.so,v 10.18 2000/03/22 22:02:15 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Configuring Berkeley DB with the Tuxedo System</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> + <a name="2"><!--meow--></a> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>XA Resource Manager</dl></h3></td> +<td width="1%"><a href="../../ref/xa/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/xa/faq.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Configuring Berkeley DB with the Tuxedo System</h1> +<p>This information assumes that you have already installed the Berkeley DB +library. +<p>First, you must update the resource manager file in Tuxedo. 
For the +purposes of this discussion, assume the Tuxedo home directory is in: +<p><blockquote><pre>/home/tuxedo</pre></blockquote> +In that case, the resource manager file will be located in: +<p><blockquote><pre>/home/tuxedo/udataobj/RM</pre></blockquote> +Edit the resource manager file, adding the following line: +<p><blockquote><pre>BERKELEY-DB:db_xa_switch:-L${DB_INSTALL}/lib -ldb \ + -lsocket -ldl -lm</pre></blockquote> +<p>where ${DB_INSTALL} is the directory into which you installed the Berkeley DB +library. +<p><b>Note, the above load options are for a Sun Microsystems Solaris +5.6 Sparc installation of Tuxedo, and may not be correct for your system.</b> +<p>Next, you must build the transaction manager server. To do this, use the +Tuxedo <b>buildtms</b>(1) utility. The buildtms utility will create +the Berkeley-DB resource manager in the directory from which it was run. +The parameters to buildtms should be: +<p><blockquote><pre>buildtms -v -o DBRM -r BERKELEY-DB</pre></blockquote> +<p>This will create an executable transaction manager server, DBRM, that is +called by Tuxedo to process begins, commits, and aborts. +<p>Finally, you must make sure that your TUXCONFIG environment variable +identifies a ubbconfig file that properly identifies your resource +managers. In the GROUPS section of the ubb file, you should identify the +group's LMID and GRPNO as well as the transaction manager server name +"TMSNAME=DBRM." You must also specify the OPENINFO parameter, setting it +equal to the string: +<p><blockquote><pre>rm_name:dir</pre></blockquote> +<p>where rm_name is the resource name specified in the RM file (i.e., +BERKELEY-DB) and dir is the directory for the Berkeley DB home environment +(see <a href="../../api_c/env_open.html">DBENV->open</a> for a discussion of Berkeley DB environments). 
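To make the shape of the OPENINFO string concrete, here is a minimal C sketch that splits a value of the form rm_name:dir at the first colon. It is an illustration only: `split_openinfo` is a hypothetical helper, and neither Tuxedo nor Berkeley DB is claimed to parse the string this way.

```c
#include <stddef.h>
#include <string.h>

/* Split "rm_name:dir" at the first colon into two caller-supplied
 * buffers. Returns 0 on success, or -1 if there is no colon, either
 * part is empty, or a part does not fit in its buffer. */
int split_openinfo(const char *openinfo,
                   char *rm_name, size_t rm_len,
                   char *dir, size_t dir_len)
{
    const char *colon = strchr(openinfo, ':');
    size_t name_len, path_len;

    if (colon == NULL)
        return -1;

    name_len = (size_t)(colon - openinfo);
    path_len = strlen(colon + 1);
    if (name_len == 0 || path_len == 0)
        return -1;
    if (name_len >= rm_len || path_len >= dir_len)
        return -1;

    memcpy(rm_name, openinfo, name_len);
    rm_name[name_len] = '\0';
    memcpy(dir, colon + 1, path_len + 1); /* includes the terminator */
    return 0;
}
```

Given the string BERKELEY-DB:/home/dbhome, the helper would yield the resource name BERKELEY-DB and the environment directory /home/dbhome.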
+<p>As Tuxedo resource manager startup accepts only a single string for +configuration, any environment customization that might have been done +via the config parameter to <a href="../../api_c/env_open.html">DBENV->open</a> must instead be done by +placing a <a href="../../ref/env/naming.html#DB_CONFIG">DB_CONFIG</a> file in the Berkeley DB environment directory. See +<a href="../../ref/env/naming.html">Berkeley DB File Naming</a> for further +information. +<p>Consider the following configuration. We have built a transaction +manager server as described above. We want the Berkeley DB environment +to be <b>/home/dbhome</b>, our database files to be maintained +in <b>/home/datafiles</b>, our log files to be maintained in +<b>/home/log</b>, and we want a duplexed server. +<p>The GROUPS section of the ubb file might look like: +<p><blockquote><pre>group_tm LMID=myname GRPNO=1 TMSNAME=DBRM TMSCOUNT=2 \ + OPENINFO="BERKELEY-DB:/home/dbhome"</pre></blockquote> +<p>There would be a <a href="../../ref/env/naming.html#DB_CONFIG">DB_CONFIG</a> configuration file in the directory +<b>/home/dbhome</b> that contained the following two lines: +<p><blockquote><pre>DB_DATA_DIR /home/datafiles +DB_LOG_DIR /home/log +</pre></blockquote> +<p>Finally, the ubb file must be translated into a binary version, using +Tuxedo's <b>tmloadcf</b>(1) utility, and then the pathname of that +binary file must be specified as your TUXCONFIG environment variable. +<p>At this point, your system is properly initialized to use the Berkeley DB +resource manager. +<p>See <a href="../../api_c/db_create.html">db_create</a> for further information on accessing data files +using XA. 
+<table><tr><td><br></td><td width="1%"><a href="../../ref/xa/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/xa/faq.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font> +</body> +</html> diff --git a/db/docs/ref/xa/faq.html b/db/docs/ref/xa/faq.html new file mode 100644 index 000000000..db1e26a0b --- /dev/null +++ b/db/docs/ref/xa/faq.html @@ -0,0 +1,55 @@ +<!--$Id: faq.so,v 10.11 2000/03/18 21:43:21 bostic Exp $--> +<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.--> +<!--All rights reserved.--> +<html> +<head> +<title>Berkeley DB Reference Guide: Frequently Asked Questions</title> +<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit."> +<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++"> +</head> +<body bgcolor=white> +<table><tr valign=top> +<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>XA Resource Manager</dl></h3></td> +<td width="1%"><a href="../../ref/xa/config.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/appsignals.html"><img src="../../images/next.gif" alt="Next"></a> +</td></tr></table> +<p> +<h1 align=center>Frequently Asked Questions</h1> +<p><ol> +<p><li><b>Does converting an application to run within XA change any of +the already existing C/C++ API calls it does?</b> +<p>When converting an application to run under XA, the application's Berkeley DB +calls are unchanged, with two exceptions: +<p><ol> +<p><li>The application must specify the <a href="../../api_c/db_create.html#DB_XA_CREATE">DB_XA_CREATE</a> flag +to the <a 
href="../../api_c/db_create.html">db_create</a> interface. +<p><li>The application should never explicitly call <a href="../../api_c/txn_commit.html">txn_commit</a>, +<a href="../../api_c/txn_abort.html">txn_abort</a> or <a href="../../api_c/txn_begin.html">txn_begin</a>, as those calls are replaced by +calls into the Tuxedo transaction manager. For the same reason, the +application will always specify a transaction argument of NULL to the +Berkeley DB functions that take transaction arguments (e.g., <a href="../../api_c/db_put.html">DB->put</a> or +<a href="../../api_c/db_cursor.html">DB->cursor</a>). +</ol> +<p>Otherwise, your application should be unchanged. +<hr size=1 noshade> +<p><li><b>Is it possible to mix XA and non-XA transactions?</b> +<p>Yes. It is also possible for XA and non-XA transactions to co-exist in +the same Berkeley DB environment. To do this, specify the same environment to +the non-XA <a href="../../api_c/env_open.html">DBENV->open</a> calls as was specified in the Tuxedo +configuration file. +<hr size=1 noshade> +<p><li><b>How does Berkeley DB recovery interact with recovery by the transaction +manager?</b> +<p>When the Tuxedo recovery calls the Berkeley DB recovery functions, the standard +Berkeley DB recovery procedures occur, for all operations that are represented +in the Berkeley DB log files. This includes any non-XA transactions that were +performed in the environment. Of course, this means that you can't use +the standard Berkeley DB utilities (e.g., <a href="../../utility/db_recover.html">db_recover</a>) to perform +recovery. +<p>Also, standard log file archival and catastrophic recovery procedures +should occur independently of XA operation. 
+</ol>
+<table><tr><td><br></td><td width="1%"><a href="../../ref/xa/config.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/program/appsignals.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>
+</body>
+</html>
diff --git a/db/docs/ref/xa/intro.html b/db/docs/ref/xa/intro.html
new file mode 100644
index 000000000..7643ee420
--- /dev/null
+++ b/db/docs/ref/xa/intro.html
@@ -0,0 +1,61 @@
+<!--$Id: intro.so,v 10.19 2000/12/04 18:05:45 bostic Exp $-->
+<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.-->
+<!--All rights reserved.-->
+<html>
+<head>
+<title>Berkeley DB Reference Guide: Introduction</title>
+<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
+<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
+</head>
+<body bgcolor=white>
+ <a name="2"><!--meow--></a>
+<table><tr valign=top>
+<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>XA Resource Manager</dl></h3></td>
+<td width="1%"><a href="../../ref/transapp/throughput.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/xa/config.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p>
+<h1 align=center>Introduction</h1>
+<p>Berkeley DB can be used as an XA-compliant resource manager. The XA
+implementation is known to work with the Tuxedo(tm) transaction
+manager.
+<p>The XA support is encapsulated in the resource manager switch
+db_xa_switch, which defines the following functions:
+<p><blockquote><pre>__db_xa_close       Close the resource manager.
+__db_xa_commit      Commit the specified transaction.
+__db_xa_complete    Wait for asynchronous operations to
+                    complete.
+__db_xa_end         Disassociate the application from a
+                    transaction.
+__db_xa_forget      Forget about a transaction that was heuristically
+                    completed. (Berkeley DB does not support heuristic
+                    completion.)
+__db_xa_open        Open the resource manager.
+__db_xa_prepare     Prepare the specified transaction.
+__db_xa_recover     Return a list of prepared, but not yet
+                    committed transactions.
+__db_xa_rollback    Abort the specified transaction.
+__db_xa_start       Associate the application with a
+                    transaction.
+</pre></blockquote>
+<p>The Berkeley DB resource manager does not support the following optional
+XA features:
+<ul type=disc>
+<li>Asynchronous operations.
+<li>Transaction migration.
+</ul>
+<p>The Tuxedo System is available from <a href="http://www.beasys.com">BEA Systems, Inc.</a>
+<p>For additional information on Tuxedo, see:
+<p><blockquote><i>Building Client/Server Applications Using Tuxedo</i>,
+by Hall, John Wiley & Sons, Inc. Publishers.</blockquote>
+<p>For additional information on XA Resource Managers, see:
+<p><blockquote>X/Open CAE Specification
+<i>Distributed Transaction Processing: The XA Specification</i>,
+X/Open Document Number: XO/CAE/91/300.</blockquote>
+<p>For additional information on The Tuxedo System, see:
+<p><blockquote><i>The Tuxedo System</i>,
+by Andrade, Carges, Dwyer and Felts, Addison Wesley Longman Publishers.</blockquote>
+<table><tr><td><br></td><td width="1%"><a href="../../ref/transapp/throughput.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/xa/config.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>
+</body>
+</html>