1 files changed, 12394 insertions, 0 deletions
diff --git a/doc/nasmdoc.txt b/doc/nasmdoc.txt
new file mode 100644
index 0000000..5a7286b
--- /dev/null
+++ b/doc/nasmdoc.txt
@@ -0,0 +1,12394 @@
+                        The Netwide Assembler: NASM
+                        ===========================
+
+Chapter 1: Introduction
+-----------------------
+
+   1.1 What Is NASM?
+
+       The Netwide Assembler, NASM, is an 80x86 and x86-64 assembler
+       designed for portability and modularity. It supports a range of
+       object file formats, including Linux and `*BSD' `a.out', `ELF',
+       `COFF', `Mach-O', Microsoft 16-bit `OBJ', `Win32' and `Win64'. It
+       will also output plain binary files. Its syntax is designed to be
+       simple and easy to understand, similar to Intel's but less complex.
+       It supports all currently known x86 architectural extensions, and
+       has strong support for macros.
+
+ 1.1.1 Why Yet Another Assembler?
+
+       The Netwide Assembler grew out of an idea on `comp.lang.asm.x86' (or
+       possibly `alt.lang.asm' - I forget which), which was essentially
+       that there didn't seem to be a good _free_ x86-series assembler
+       around, and that maybe someone ought to write one.
+
+       (*) `a86' is good, but not free, and in particular you don't get any
+           32-bit capability until you pay. It's DOS only, too.
+
+       (*) `gas' is free, and ports over to DOS and Unix, but it's not very
+           good, since it's designed to be a back end to `gcc', which
+           always feeds it correct code. So its error checking is minimal.
+           Also, its syntax is horrible, from the point of view of anyone
+           trying to actually _write_ anything in it. Plus you can't write
+           16-bit code in it (properly.)
+
+       (*) `as86' is specific to Minix and Linux, and (my version at least)
+           doesn't seem to have much (or any) documentation.
+
+       (*) `MASM' isn't very good, and it's (was) expensive, and it runs
+           only under DOS.
+
+       (*) `TASM' is better, but still strives for MASM compatibility,
+           which means millions of directives and tons of red tape. And its
+           syntax is essentially MASM's, with the contradictions and quirks
+           that entails (although it sorts out some of those by means of
+           Ideal mode.) It's expensive too. And it's DOS-only.
+
+       So here, for your coding pleasure, is NASM. At present it's still in
+       prototype stage - we don't promise that it can outperform any of
+       these assemblers. But please, _please_ send us bug reports, fixes,
+       helpful information, and anything else you can get your hands on
+       (and thanks to the many people who've done this already! You all
+       know who you are), and we'll improve it out of all recognition.
+       Again.
+
+ 1.1.2 License Conditions
+
+       Please see the file `LICENSE', supplied as part of any NASM
+       distribution archive, for the license conditions under which you may
+       use NASM. NASM is now under the so-called 2-clause BSD license, also
+       known as the simplified BSD license.
+
+       Copyright 1996-2009 the NASM Authors - All rights reserved.
+
+       Redistribution and use in source and binary forms, with or without
+       modification, are permitted provided that the following conditions
+       are met:
+
+       (*) Redistributions of source code must retain the above copyright
+           notice, this list of conditions and the following disclaimer.
+
+       (*) Redistributions in binary form must reproduce the above
+           copyright notice, this list of conditions and the following
+           disclaimer in the documentation and/or other materials provided
+           with the distribution.
+
+       THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+       "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+       LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+       FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+       COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+       INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+       BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+       LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+       CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+       LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+       ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+       POSSIBILITY OF SUCH DAMAGE.
+
+   1.2 Contact Information
+
+       The current version of NASM (since about 0.98.08) is maintained by a
+       team of developers, accessible through the `nasm-devel' mailing list
+       (see below for the link). If you want to report a bug, please read
+       section 12.2 first.
+
+       NASM has a website at `http://www.nasm.us/'. If it's not there,
+       google for us!
+
+       New releases, release candidates, and daily development snapshots of
+       NASM are available from the official web site.
+
+       Announcements are posted to `comp.lang.asm.x86', and to the web site
+       `http://www.freshmeat.net/'.
+
+       If you want information about the current development status, please
+       subscribe to the `nasm-devel' email list; see link from the website.
+
+   1.3 Installation
+
+ 1.3.1 Installing NASM under MS-DOS or Windows
+
+       Once you've obtained the appropriate archive for NASM,
+       `nasm-XXX-dos.zip' or `nasm-XXX-win32.zip' (where `XXX' denotes the
+       version number of NASM contained in the archive), unpack it into its
+       own directory (for example `c:\nasm').
+
+       The archive will contain a set of executable files: the NASM
+       executable file `nasm.exe', the NDISASM executable file
+       `ndisasm.exe', and possibly additional utilities to handle the RDOFF
+       file format.
+
+       The only file NASM needs to run is its own executable, so copy
+       `nasm.exe' to a directory on your PATH, or alternatively edit
+       `autoexec.bat' to add the `nasm' directory to your `PATH' (to do
+       that under Windows XP, go to Start > Control Panel > System >
+       Advanced > Environment Variables; these instructions may work under
+       other versions of Windows as well.)
+
+       That's it - NASM is installed. You don't need the nasm directory to
+       be present to run NASM (unless you've added it to your `PATH'), so
+       you can delete it if you need to save space; however, you may want
+       to keep the documentation or test programs.
+
+       If you've downloaded the DOS source archive, `nasm-XXX.zip', the
+       `nasm' directory will also contain the full NASM source code, and a
+       selection of Makefiles you can (hopefully) use to rebuild your copy
+       of NASM from scratch. See the file `INSTALL' in the source archive.
+
+       Note that a number of files are generated from other files by Perl
+       scripts. Although the NASM source distribution includes these
+       generated files, you will need to rebuild them (and hence, will need
+       a Perl interpreter) if you change insns.dat, standard.mac or the
+       documentation. It is possible future source distributions may not
+       include these files at all. Ports of Perl for a variety of
+       platforms, including DOS and Windows, are available from
+       www.cpan.org.
+
+ 1.3.2 Installing NASM under Unix
+
+       Once you've obtained the Unix source archive for NASM,
+       `nasm-XXX.tar.gz' (where `XXX' denotes the version number of NASM
+       contained in the archive), unpack it into a directory such as
+       `/usr/local/src'. The archive, when unpacked, will create its own
+       subdirectory `nasm-XXX'.
+
+       NASM is an auto-configuring package: once you've unpacked it, `cd'
+       to the directory it's been unpacked into and type `./configure'.
+       This shell script will find the best C compiler to use for building
+       NASM and set up Makefiles accordingly.
+
+       Once NASM has auto-configured, you can type `make' to build the
+       `nasm' and `ndisasm' binaries, and then `make install' to install
+       them in `/usr/local/bin' and install the man pages `nasm.1' and
+       `ndisasm.1' in `/usr/local/man/man1'. Alternatively, you can give
+       options such as `--prefix' to the configure script (see the file
+       `INSTALL' for more details), or install the programs yourself.
+
+       NASM also comes with a set of utilities for handling the `RDOFF'
+       custom object-file format, which are in the `rdoff' subdirectory of
+       the NASM archive. You can build these with `make rdf' and install
+       them with `make rdf_install', if you want them.
+
+Chapter 2: Running NASM
+-----------------------
+
+   2.1 NASM Command-Line Syntax
+
+       To assemble a file, you issue a command of the form
+
+       nasm -f <format> <filename> [-o <output>]
+
+       For example,
+
+       nasm -f elf myfile.asm
+
+       will assemble `myfile.asm' into an `ELF' object file `myfile.o'. And
+
+       nasm -f bin myfile.asm -o myfile.com
+
+       will assemble `myfile.asm' into a raw binary file `myfile.com'.
+
+       To produce a listing file, with the hex codes output from NASM
+       displayed on the left of the original sources, use the `-l' option
+       to give a listing file name, for example:
+
+       nasm -f coff myfile.asm -l myfile.lst
+
+       To get further usage instructions from NASM, try typing
+
+       nasm -h
+
+       As `-hf', this will also list the available output file formats, and
+       what they are.
+
+       If you use Linux but aren't sure whether your system is `a.out' or
+       `ELF', type
+
+       file nasm
+
+       (in the directory in which you put the NASM binary when you
+       installed it). If it says something like
+
+       nasm: ELF 32-bit LSB executable i386 (386 and up) Version 1
+
+       then your system is `ELF', and you should use the option `-f elf'
+       when you want NASM to produce Linux object files. If it says
+
+       nasm: Linux/i386 demand-paged executable (QMAGIC)
+
+       or something similar, your system is `a.out', and you should use
+       `-f aout' instead (Linux `a.out' systems have long been obsolete,
+       and are rare these days.)
+
+       Like Unix compilers and assemblers, NASM is silent unless it goes
+       wrong: you won't see any output at all, unless it gives error
+       messages.
+
+ 2.1.1 The `-o' Option: Specifying the Output File Name
+
+       NASM will normally choose the name of your output file for you;
+       precisely how it does this is dependent on the object file format.
+       For Microsoft object file formats (`obj', `win32' and `win64'), it
+       will remove the `.asm' extension (or whatever extension you like to
+       use - NASM doesn't care) from your source file name and substitute
+       `.obj'. For Unix object file formats (`aout', `as86', `coff',
+       `elf32', `elf64', `ieee', `macho32' and `macho64') it will
+       substitute `.o'. For `dbg', `rdf', `ith' and `srec', it will use
+       `.dbg', `.rdf', `.ith' and `.srec', respectively, and for the `bin'
+       format it will simply remove the extension, so that `myfile.asm'
+       produces the output file `myfile'.
+
+       If the output file already exists, NASM will overwrite it, unless it
+       has the same name as the input file, in which case it will give a
+       warning and use `nasm.out' as the output file name instead.
+
+       For situations in which this behaviour is unacceptable, NASM
+       provides the `-o' command-line option, which allows you to specify
+       your desired output file name. You invoke `-o' by following it with
+       the name you wish for the output file, either with or without an
+       intervening space. For example:
+
+       nasm -f bin program.asm -o program.com 
+       nasm -f bin driver.asm -odriver.sys
+
+       Note that this is a small o, and is different from a capital O ,
+       which is used to specify the number of optimisation passes required.
+       See section 2.1.22.
+
+ 2.1.2 The `-f' Option: Specifying the Output File Format
+
+       If you do not supply the `-f' option to NASM, it will choose an
+       output file format for you itself. In the distribution versions of
+       NASM, the default is always `bin'; if you've compiled your own copy
+       of NASM, you can redefine `OF_DEFAULT' at compile time and choose
+       what you want the default to be.
+
+       Like `-o', the intervening space between `-f' and the output file
+       format is optional; so `-f elf' and `-felf' are both valid.
+
+       A complete list of the available output file formats can be given by
+       issuing the command `nasm -hf'.
+
+ 2.1.3 The `-l' Option: Generating a Listing File
+
+       If you supply the `-l' option to NASM, followed (with the usual
+       optional space) by a file name, NASM will generate a source-listing
+       file for you, in which addresses and generated code are listed on
+       the left, and the actual source code, with expansions of multi-line
+       macros (except those which specifically request no expansion in
+       source listings: see section 4.3.10) on the right. For example:
+
+       nasm -f elf myfile.asm -l myfile.lst
+
+       If a list file is selected, you may turn off listing for a section
+       of your source with `[list -]', and turn it back on with `[list +]',
+       (the default, obviously). There is no "user form" (without the
+       brackets). This can be used to list only sections of interest,
+       avoiding excessively long listings.
+
+ 2.1.4 The `-M' Option: Generate Makefile Dependencies
+
+       This option can be used to generate makefile dependencies on stdout.
+       This can be redirected to a file for further processing. For
+       example:
+
+       nasm -M myfile.asm > myfile.dep
+
+ 2.1.5 The `-MG' Option: Generate Makefile Dependencies
+
+       This option can be used to generate makefile dependencies on stdout.
+       This differs from the `-M' option in that if a nonexisting file is
+       encountered, it is assumed to be a generated file and is added to
+       the dependency list without a prefix.
+
+ 2.1.6 The `-MF' Option: Set Makefile Dependency File
+
+       This option can be used with the `-M' or `-MG' options to send the
+       output to a file, rather than to stdout. For example:
+
+       nasm -M -MF myfile.dep myfile.asm
+
+ 2.1.7 The `-MD' Option: Assemble and Generate Dependencies
+
+       The `-MD' option acts as the combination of the `-M' and `-MF'
+       options (i.e. a filename has to be specified.) However, unlike the
+       `-M' or `-MG' options, `-MD' does _not_ inhibit the normal operation
+       of the assembler. Use this to automatically generate updated
+       dependencies with every assembly session. For example:
+
+       nasm -f elf -o myfile.o -MD myfile.dep myfile.asm
+
+ 2.1.8 The `-MT' Option: Dependency Target Name
+
+       The `-MT' option can be used to override the default name of the
+       dependency target. This is normally the same as the output filename,
+       specified by the `-o' option.
+
+ 2.1.9 The `-MQ' Option: Dependency Target Name (Quoted)
+
+       The `-MQ' option acts as the `-MT' option, except it tries to quote
+       characters that have special meaning in Makefile syntax. This is not
+       foolproof, as not all characters with special meaning are quotable
+       in Make.
+
+2.1.10 The `-MP' Option: Emit phony targets
+
+       When used with any of the dependency generation options, the `-MP'
+       option causes NASM to emit a phony target without dependencies for
+       each header file. This prevents Make from complaining if a header
+       file has been removed.
+
+2.1.11 The `-F' Option: Selecting a Debug Information Format
+
+       This option is used to select the format of the debug information
+       emitted into the output file, to be used by a debugger (or _will_
+       be). Prior to version 2.03.01, the use of this switch did _not_
+       enable output of the selected debug info format. Use `-g', see
+       section 2.1.12, to enable output. Versions 2.03.01 and later
+       automatically enable `-g' if `-F' is specified.
+
+       A complete list of the available debug file formats for an output
+       format can be seen by issuing the command `nasm -f <format> -y'. Not
+       all output formats currently support debugging output. See section
+       2.1.26.
+
+       This should not be confused with the `-f dbg' output format option
+       which is not built into NASM by default. For information on how to
+       enable it when building from the sources, see section 7.14.
+
+2.1.12 The `-g' Option: Enabling Debug Information.
+
+       This option can be used to generate debugging information in the
+       specified format. See section 2.1.11. Using `-g' without `-F'
+       results in emitting debug info in the default format, if any, for
+       the selected output format. If no debug information is currently
+       implemented in the selected output format, `-g' is _silently
+       ignored_.
+
+2.1.13 The `-X' Option: Selecting an Error Reporting Format
+
+       This option can be used to select an error reporting format for any
+       error messages that might be produced by NASM.
+
+       Currently, two error reporting formats may be selected. They are the
+       `-Xvc' option and the `-Xgnu' option. The GNU format is the default
+       and looks like this:
+
+       filename.asm:65: error: specific error message
+
+       where `filename.asm' is the name of the source file in which the
+       error was detected, `65' is the source file line number on which the
+       error was detected, `error' is the severity of the error (this could
+       be `warning'), and `specific error message' is a more detailed text
+       message which should help pinpoint the exact problem.
+
+       The other format, specified by `-Xvc' is the style used by Microsoft
+       Visual C++ and some other programs. It looks like this:
+
+       filename.asm(65) : error: specific error message
+
+       where the only difference is that the line number is in parentheses
+       instead of being delimited by colons.
+
+       See also the `Visual C++' output format, section 7.5.
+
+2.1.14 The `-Z' Option: Send Errors to a File
+
+       Under `MS-DOS' it can be difficult (though there are ways) to
+       redirect the standard-error output of a program to a file. Since
+       NASM usually produces its warning and error messages on `stderr',
+       this can make it hard to capture the errors if (for example) you
+       want to load them into an editor.
+
+       NASM therefore provides the `-Z' option, taking a filename argument
+       which causes errors to be sent to the specified files rather than
+       standard error. Therefore you can redirect the errors into a file by
+       typing
+
+       nasm -Z myfile.err -f obj myfile.asm
+
+       In earlier versions of NASM, this option was called `-E', but it was
+       changed since `-E' is an option conventionally used for
+       preprocessing only, with disastrous results. See section 2.1.20.
+
+2.1.15 The `-s' Option: Send Errors to `stdout'
+
+       The `-s' option redirects error messages to `stdout' rather than
+       `stderr', so it can be redirected under `MS-DOS'. To assemble the
+       file `myfile.asm' and pipe its output to the `more' program, you can
+       type:
+
+       nasm -s -f obj myfile.asm | more
+
+       See also the `-Z' option, section 2.1.14.
+
+2.1.16 The `-i' Option: Include File Search Directories
+
+       When NASM sees the `%include' or `%pathsearch' directive in a source
+       file (see section 4.6.1, section 4.6.2 or section 3.2.3), it will
+       search for the given file not only in the current directory, but
+       also in any directories specified on the command line by the use of
+       the `-i' option. Therefore you can include files from a macro
+       library, for example, by typing
+
+       nasm -ic:\macrolib\ -f obj myfile.asm
+
+       (As usual, a space between `-i' and the path name is allowed, and
+       optional).
+
+       NASM, in the interests of complete source-code portability, does not
+       understand the file naming conventions of the OS it is running on;
+       the string you provide as an argument to the `-i' option will be
+       prepended exactly as written to the name of the include file.
+       Therefore the trailing backslash in the above example is necessary.
+       Under Unix, a trailing forward slash is similarly necessary.
+
+       (You can use this to your advantage, if you're really perverse, by
+       noting that the option `-ifoo' will cause `%include "bar.i"' to
+       search for the file `foobar.i'...)
+
+       If you want to define a _standard_ include search path, similar to
+       `/usr/include' on Unix systems, you should place one or more `-i'
+       directives in the `NASMENV' environment variable (see section
+       2.1.28).
+
+       For Makefile compatibility with many C compilers, this option can
+       also be specified as `-I'.
+
+2.1.17 The `-p' Option: Pre-Include a File
+
+       NASM allows you to specify files to be _pre-included_ into your
+       source file, by the use of the `-p' option. So running
+
+       nasm myfile.asm -p myinc.inc
+
+       is equivalent to running `nasm myfile.asm' and placing the directive
+       `%include "myinc.inc"' at the start of the file.
+
+       For consistency with the `-I', `-D' and `-U' options, this option
+       can also be specified as `-P'.
+
+2.1.18 The `-d' Option: Pre-Define a Macro
+
+       Just as the `-p' option gives an alternative to placing `%include'
+       directives at the start of a source file, the `-d' option gives an
+       alternative to placing a `%define' directive. You could code
+
+       nasm myfile.asm -dFOO=100
+
+       as an alternative to placing the directive
+
+       %define FOO 100
+
+       at the start of the file. You can miss off the macro value, as well:
+       the option `-dFOO' is equivalent to coding `%define FOO'. This form
+       of the directive may be useful for selecting assembly-time options
+       which are then tested using `%ifdef', for example `-dDEBUG'.
+
+       For Makefile compatibility with many C compilers, this option can
+       also be specified as `-D'.
+
+2.1.19 The `-u' Option: Undefine a Macro
+
+       The `-u' option undefines a macro that would otherwise have been
+       pre-defined, either automatically or by a `-p' or `-d' option
+       specified earlier on the command lines.
+
+       For example, the following command line:
+
+       nasm myfile.asm -dFOO=100 -uFOO
+
+       would result in `FOO' _not_ being a predefined macro in the program.
+       This is useful to override options specified at a different point in
+       a Makefile.
+
+       For Makefile compatibility with many C compilers, this option can
+       also be specified as `-U'.
+
+2.1.20 The `-E' Option: Preprocess Only
+
+       NASM allows the preprocessor to be run on its own, up to a point.
+       Using the `-E' option (which requires no arguments) will cause NASM
+       to preprocess its input file, expand all the macro references,
+       remove all the comments and preprocessor directives, and print the
+       resulting file on standard output (or save it to a file, if the `-o'
+       option is also used).
+
+       This option cannot be applied to programs which require the
+       preprocessor to evaluate expressions which depend on the values of
+       symbols: so code such as
+
+       %assign tablesize ($-tablestart)
+
+       will cause an error in preprocess-only mode.
+
+       For compatiblity with older version of NASM, this option can also be
+       written `-e'. `-E' in older versions of NASM was the equivalent of
+       the current `-Z' option, section 2.1.14.
+
+2.1.21 The `-a' Option: Don't Preprocess At All
+
+       If NASM is being used as the back end to a compiler, it might be
+       desirable to suppress preprocessing completely and assume the
+       compiler has already done it, to save time and increase compilation
+       speeds. The `-a' option, requiring no argument, instructs NASM to
+       replace its powerful preprocessor with a stub preprocessor which
+       does nothing.
+
+2.1.22 The `-O' Option: Specifying Multipass Optimization
+
+       NASM defaults to not optimizing operands which can fit into a signed
+       byte. This means that if you want the shortest possible object code,
+       you have to enable optimization.
+
+       Using the `-O' option, you can tell NASM to carry out different
+       levels of optimization. The syntax is:
+
+       (*) `-O0': No optimization. All operands take their long forms, if a
+           short form is not specified, except conditional jumps. This is
+           intended to match NASM 0.98 behavior.
+
+       (*) `-O1': Minimal optimization. As above, but immediate operands
+           which will fit in a signed byte are optimized, unless the long
+           form is specified. Conditional jumps default to the long form
+           unless otherwise specified.
+
+       (*) `-Ox' (where `x' is the actual letter `x'): Multipass
+           optimization. Minimize branch offsets and signed immediate
+           bytes, overriding size specification unless the `strict' keyword
+           has been used (see section 3.7). For compatability with earlier
+           releases, the letter `x' may also be any number greater than
+           one. This number has no effect on the actual number of passes.
+
+       The `-Ox' mode is recommended for most uses.
+
+       Note that this is a capital `O', and is different from a small `o',
+       which is used to specify the output file name. See section 2.1.1.
+
+2.1.23 The `-t' Option: Enable TASM Compatibility Mode
+
+       NASM includes a limited form of compatibility with Borland's `TASM'.
+       When NASM's `-t' option is used, the following changes are made:
+
+       (*) local labels may be prefixed with `@@' instead of `.'
+
+       (*) size override is supported within brackets. In TASM compatible
+           mode, a size override inside square brackets changes the size of
+           the operand, and not the address type of the operand as it does
+           in NASM syntax. E.g. `mov eax,[DWORD val]' is valid syntax in
+           TASM compatibility mode. Note that you lose the ability to
+           override the default address type for the instruction.
+
+       (*) unprefixed forms of some directives supported (`arg', `elif',
+           `else', `endif', `if', `ifdef', `ifdifi', `ifndef', `include',
+           `local')
+
+2.1.24 The `-w' and `-W' Options: Enable or Disable Assembly Warnings
+
+       NASM can observe many conditions during the course of assembly which
+       are worth mentioning to the user, but not a sufficiently severe
+       error to justify NASM refusing to generate an output file. These
+       conditions are reported like errors, but come up with the word
+       `warning' before the message. Warnings do not prevent NASM from
+       generating an output file and returning a success status to the
+       operating system.
+
+       Some conditions are even less severe than that: they are only
+       sometimes worth mentioning to the user. Therefore NASM supports the
+       `-w' command-line option, which enables or disables certain classes
+       of assembly warning. Such warning classes are described by a name,
+       for example `orphan-labels'; you can enable warnings of this class
+       by the command-line option `-w+orphan-labels' and disable it by
+       `-w-orphan-labels'.
+
+       The suppressible warning classes are:
+
+       (*) `macro-params' covers warnings about multi-line macros being
+           invoked with the wrong number of parameters. This warning class
+           is enabled by default; see section 4.3.2 for an example of why
+           you might want to disable it.
+
+       (*) `macro-selfref' warns if a macro references itself. This warning
+           class is disabled by default.
+
+       (*) `macro-defaults' warns when a macro has more default parameters
+           than optional parameters. This warning class is enabled by
+           default; see section 4.3.5 for why you might want to disable it.
+
+       (*) `orphan-labels' covers warnings about source lines which contain
+           no instruction but define a label without a trailing colon. NASM
+           warns about this somewhat obscure condition by default; see
+           section 3.1 for more information.
+
+       (*) `number-overflow' covers warnings about numeric constants which
+           don't fit in 64 bits. This warning class is enabled by default.
+
+       (*) `gnu-elf-extensions' warns if 8-bit or 16-bit relocations are
+           used in `-f elf' format. The GNU extensions allow this. This
+           warning class is disabled by default.
+
+       (*) `float-overflow' warns about floating point overflow. Enabled by
+           default.
+
+       (*) `float-denorm' warns about floating point denormals. Disabled by
+           default.
+
+       (*) `float-underflow' warns about floating point underflow. Disabled
+           by default.
+
+       (*) `float-toolong' warns about too many digits in floating-point
+           numbers. Enabled by default.
+
+       (*) `user' controls `%warning' directives (see section 4.9). Enabled
+           by default.
+
+       (*) `error' causes warnings to be treated as errors. Disabled by
+           default.
+
+       (*) `all' is an alias for _all_ suppressible warning classes (not
+           including `error'). Thus, `-w+all' enables all available
+           warnings.
+
+       In addition, you can set warning classes across sections. Warning
+       classes may be enabled with `[warning +warning-name]', disabled with
+       `[warning -warning-name]' or reset to their original value with
+       `[warning *warning-name]'. No "user form" (without the brackets)
+       exists.
+
+       Since version 2.00, NASM has also supported the gcc-like syntax
+       `-Wwarning' and `-Wno-warning' instead of `-w+warning' and
+       `-w-warning', respectively.
+
+2.1.25 The `-v' Option: Display Version Info
+
+       Typing `NASM -v' will display the version of NASM which you are
+       using, and the date on which it was compiled.
+
+       You will need the version number if you report a bug.
+
+2.1.26 The `-y' Option: Display Available Debug Info Formats
+
+       Typing `nasm -f <option> -y' will display a list of the available
+       debug info formats for the given output format. The default format
+       is indicated by an asterisk. For example:
+
+       nasm -f elf -y
+
+       valid debug formats for 'elf32' output format are 
+         ('*' denotes default): 
+         * stabs     ELF32 (i386) stabs debug format for Linux 
+           dwarf     elf32 (i386) dwarf debug format for Linux
+
+2.1.27 The `--prefix' and `--postfix' Options.
+
+       The `--prefix' and `--postfix' options prepend or append
+       (respectively) the given argument to all `global' or `extern'
+       variables. E.g. `--prefix _' will prepend the underscore to all
+       global and external variables, as C sometimes (but not always) likes
+       it.
+
+2.1.28 The `NASMENV' Environment Variable
+
+       If you define an environment variable called `NASMENV', the program
+       will interpret it as a list of extra command-line options, which are
+       processed before the real command line. You can use this to define
+       standard search directories for include files, by putting `-i'
+       options in the `NASMENV' variable.
+
+       The value of the variable is split up at white space, so that the
+       value `-s -ic:\nasmlib\' will be treated as two separate options.
+       However, that means that the value `-dNAME="my name"' won't do what
+       you might want, because it will be split at the space and the NASM
+       command-line processing will get confused by the two nonsensical
+       words `-dNAME="my' and `name"'.
+
+       To get round this, NASM provides a feature whereby, if you begin the
+       `NASMENV' environment variable with some character that isn't a
+       minus sign, then NASM will treat this character as the separator
+       character for options. So setting the `NASMENV' variable to the
+       value `!-s!-ic:\nasmlib\' is equivalent to setting it to
+       `-s -ic:\nasmlib\', but `!-dNAME="my name"' will work.
+
+       This environment variable was previously called `NASM'. This was
+       changed with version 0.98.31.
+
+   2.2 Quick Start for MASM Users
+
+       If you're used to writing programs with MASM, or with TASM in MASM-
+       compatible (non-Ideal) mode, or with `a86', this section attempts to
+       outline the major differences between MASM's syntax and NASM's. If
+       you're not already used to MASM, it's probably worth skipping this
+       section.
+
+ 2.2.1 NASM Is Case-Sensitive
+
+       One simple difference is that NASM is case-sensitive. It makes a
+       difference whether you call your label `foo', `Foo' or `FOO'. If
+       you're assembling to `DOS' or `OS/2' `.OBJ' files, you can invoke
+       the `UPPERCASE' directive (documented in section 7.4) to ensure that
+       all symbols exported to other code modules are forced to be upper
+       case; but even then, _within_ a single module, NASM will distinguish
+       between labels differing only in case.
+
+ 2.2.2 NASM Requires Square Brackets For Memory References
+
+       NASM was designed with simplicity of syntax in mind. One of the
+       design goals of NASM is that it should be possible, as far as is
+       practical, for the user to look at a single line of NASM code and
+       tell what opcode is generated by it. You can't do this in MASM: if
+       you declare, for example,
+
+       foo     equ     1 
+       bar     dw      2
+
+       then the two lines of code
+
+               mov     ax,foo 
+               mov     ax,bar
+
+       generate completely different opcodes, despite having identical-
+       looking syntaxes.
+
+       NASM avoids this undesirable situation by having a much simpler
+       syntax for memory references. The rule is simply that any access to
+       the _contents_ of a memory location requires square brackets around
+       the address, and any access to the _address_ of a variable doesn't.
+       So an instruction of the form `mov ax,foo' will _always_ refer to a
+       compile-time constant, whether it's an `EQU' or the address of a
+       variable; and to access the _contents_ of the variable `bar', you
+       must code `mov ax,[bar]'.
+
+       This also means that NASM has no need for MASM's `OFFSET' keyword,
+       since the MASM code `mov ax,offset bar' means exactly the same thing
+       as NASM's `mov ax,bar'. If you're trying to get large amounts of
+       MASM code to assemble sensibly under NASM, you can always code
+       `%idefine offset' to make the preprocessor treat the `OFFSET'
+       keyword as a no-op.
+
+       This issue is even more confusing in `a86', where declaring a label
+       with a trailing colon defines it to be a `label' as opposed to a
+       `variable' and causes `a86' to adopt NASM-style semantics; so in
+       `a86', `mov ax,var' has different behaviour depending on whether
+       `var' was declared as `var: dw 0' (a label) or `var dw 0' (a word-
+       size variable). NASM is very simple by comparison: _everything_ is a
+       label.
+
+       NASM, in the interests of simplicity, also does not support the
+       hybrid syntaxes supported by MASM and its clones, such as
+       `mov ax,table[bx]', where a memory reference is denoted by one
+       portion outside square brackets and another portion inside. The
+       correct syntax for the above is `mov ax,[table+bx]'. Likewise,
+       `mov ax,es:[di]' is wrong and `mov ax,[es:di]' is right.
+
+ 2.2.3 NASM Doesn't Store Variable Types
+
+       NASM, by design, chooses not to remember the types of variables you
+       declare. Whereas MASM will remember, on seeing `var dw 0', that you
+       declared `var' as a word-size variable, and will then be able to
+       fill in the ambiguity in the size of the instruction `mov var,2',
+       NASM will deliberately remember nothing about the symbol `var'
+       except where it begins, and so you must explicitly code
+       `mov word [var],2'.
+
+       For this reason, NASM doesn't support the `LODS', `MOVS', `STOS',
+       `SCAS', `CMPS', `INS', or `OUTS' instructions, but only supports the
+       forms such as `LODSB', `MOVSW', and `SCASD', which explicitly
+       specify the size of the components of the strings being manipulated.
+
+ 2.2.4 NASM Doesn't `ASSUME'
+
+       As part of NASM's drive for simplicity, it also does not support the
+       `ASSUME' directive. NASM will not keep track of what values you
+       choose to put in your segment registers, and will never
+       _automatically_ generate a segment override prefix.
+
+ 2.2.5 NASM Doesn't Support Memory Models
+
+       NASM also does not have any directives to support different 16-bit
+       memory models. The programmer has to keep track of which functions
+       are supposed to be called with a far call and which with a near
+       call, and is responsible for putting the correct form of `RET'
+       instruction (`RETN' or `RETF'; NASM accepts `RET' itself as an
+       alternate form for `RETN'); in addition, the programmer is
+       responsible for coding CALL FAR instructions where necessary when
+       calling _external_ functions, and must also keep track of which
+       external variable definitions are far and which are near.
+
+ 2.2.6 Floating-Point Differences
+
+       NASM uses different names to refer to floating-point registers from
+       MASM: where MASM would call them `ST(0)', `ST(1)' and so on, and
+       `a86' would call them simply `0', `1' and so on, NASM chooses to
+       call them `st0', `st1' etc.
+
+       As of version 0.96, NASM now treats the instructions with `nowait'
+       forms in the same way as MASM-compatible assemblers. The
+       idiosyncratic treatment employed by 0.95 and earlier was based on a
+       misunderstanding by the authors.
+
+ 2.2.7 Other Differences
+
+       For historical reasons, NASM uses the keyword `TWORD' where MASM and
+       compatible assemblers use `TBYTE'.
+
+       NASM does not declare uninitialized storage in the same way as MASM:
+       where a MASM programmer might use `stack db 64 dup (?)', NASM
+       requires `stack resb 64', intended to be read as `reserve 64 bytes'.
+       For a limited amount of compatibility, since NASM treats `?' as a
+       valid character in symbol names, you can code `? equ 0' and then
+       writing `dw ?' will at least do something vaguely useful. `DUP' is
+       still not a supported syntax, however.
+
+       In addition to all of this, macros and directives work completely
+       differently to MASM. See chapter 4 and chapter 6 for further
+       details.
+
+Chapter 3: The NASM Language
+----------------------------
+
+   3.1 Layout of a NASM Source Line
+
+       Like most assemblers, each NASM source line contains (unless it is a
+       macro, a preprocessor directive or an assembler directive: see
+       chapter 4 and chapter 6) some combination of the four fields
+
+       label:    instruction operands        ; comment
+
+       As usual, most of these fields are optional; the presence or absence
+       of any combination of a label, an instruction and a comment is
+       allowed. Of course, the operand field is either required or
+       forbidden by the presence and nature of the instruction field.
+
+       NASM uses backslash (\) as the line continuation character; if a
+       line ends with backslash, the next line is considered to be a part
+       of the backslash-ended line.
+
+       NASM places no restrictions on white space within a line: labels may
+       have white space before them, or instructions may have no space
+       before them, or anything. The colon after a label is also optional.
+       (Note that this means that if you intend to code `lodsb' alone on a
+       line, and type `lodab' by accident, then that's still a valid source
+       line which does nothing but define a label. Running NASM with the
+       command-line option `-w+orphan-labels' will cause it to warn you if
+       you define a label alone on a line without a trailing colon.)
+
+       Valid characters in labels are letters, numbers, `_', `$', `#', `@',
+       `~', `.', and `?'. The only characters which may be used as the
+       _first_ character of an identifier are letters, `.' (with special
+       meaning: see section 3.9), `_' and `?'. An identifier may also be
+       prefixed with a `$' to indicate that it is intended to be read as an
+       identifier and not a reserved word; thus, if some other module you
+       are linking with defines a symbol called `eax', you can refer to
+       `$eax' in NASM code to distinguish the symbol from the register.
+       Maximum length of an identifier is 4095 characters.
+
+       The instruction field may contain any machine instruction: Pentium
+       and P6 instructions, FPU instructions, MMX instructions and even
+       undocumented instructions are all supported. The instruction may be
+       prefixed by `LOCK', `REP', `REPE'/`REPZ' or `REPNE'/`REPNZ', in the
+       usual way. Explicit address-size and operand-size prefixes `A16',
+       `A32', `A64', `O16' and `O32', `O64' are provided - one example of
+       their use is given in chapter 10. You can also use the name of a
+       segment register as an instruction prefix: coding `es mov [bx],ax'
+       is equivalent to coding `mov [es:bx],ax'. We recommend the latter
+       syntax, since it is consistent with other syntactic features of the
+       language, but for instructions such as `LODSB', which has no
+       operands and yet can require a segment override, there is no clean
+       syntactic way to proceed apart from `es lodsb'.
+
+       An instruction is not required to use a prefix: prefixes such as
+       `CS', `A32', `LOCK' or `REPE' can appear on a line by themselves,
+       and NASM will just generate the prefix bytes.
+
+       In addition to actual machine instructions, NASM also supports a
+       number of pseudo-instructions, described in section 3.2.
+
+       Instruction operands may take a number of forms: they can be
+       registers, described simply by the register name (e.g. `ax', `bp',
+       `ebx', `cr0': NASM does not use the `gas'-style syntax in which
+       register names must be prefixed by a `%' sign), or they can be
+       effective addresses (see section 3.3), constants (section 3.4) or
+       expressions (section 3.5).
+
+       For x87 floating-point instructions, NASM accepts a wide range of
+       syntaxes: you can use two-operand forms like MASM supports, or you
+       can use NASM's native single-operand forms in most cases. For
+       example, you can code:
+
+               fadd    st1             ; this sets st0 := st0 + st1 
+               fadd    st0,st1         ; so does this 
+       
+               fadd    st1,st0         ; this sets st1 := st1 + st0 
+               fadd    to st1          ; so does this
+
+       Almost any x87 floating-point instruction that references memory
+       must use one of the prefixes `DWORD', `QWORD' or `TWORD' to indicate
+       what size of memory operand it refers to.
+
+   3.2 Pseudo-Instructions
+
+       Pseudo-instructions are things which, though not real x86 machine
+       instructions, are used in the instruction field anyway because
+       that's the most convenient place to put them. The current pseudo-
+       instructions are `DB', `DW', `DD', `DQ', `DT', `DO' and `DY'; their
+       uninitialized counterparts `RESB', `RESW', `RESD', `RESQ', `REST',
+       `RESO' and `RESY'; the `INCBIN' command, the `EQU' command, and the
+       `TIMES' prefix.
+
+ 3.2.1 `DB' and Friends: Declaring Initialized Data
+
+       `DB', `DW', `DD', `DQ', `DT', `DO' and `DY' are used, much as in
+       MASM, to declare initialized data in the output file. They can be
+       invoked in a wide range of ways:
+
+             db    0x55                ; just the byte 0x55 
+             db    0x55,0x56,0x57      ; three bytes in succession 
+             db    'a',0x55            ; character constants are OK 
+             db    'hello',13,10,'$'   ; so are string constants 
+             dw    0x1234              ; 0x34 0x12 
+             dw    'a'                 ; 0x61 0x00 (it's just a number) 
+             dw    'ab'                ; 0x61 0x62 (character constant) 
+             dw    'abc'               ; 0x61 0x62 0x63 0x00 (string) 
+             dd    0x12345678          ; 0x78 0x56 0x34 0x12 
+             dd    1.234567e20         ; floating-point constant 
+             dq    0x123456789abcdef0  ; eight byte constant 
+             dq    1.234567e20         ; double-precision float 
+             dt    1.234567e20         ; extended-precision float
+
+       `DT', `DO' and `DY' do not accept numeric constants as operands.
+
+ 3.2.2 `RESB' and Friends: Declaring Uninitialized Data
+
+       `RESB', `RESW', `RESD', `RESQ', `REST', `RESO' and `RESY' are
+       designed to be used in the BSS section of a module: they declare
+       _uninitialized_ storage space. Each takes a single operand, which is
+       the number of bytes, words, doublewords or whatever to reserve. As
+       stated in section 2.2.7, NASM does not support the MASM/TASM syntax
+       of reserving uninitialized space by writing `DW ?' or similar
+       things: this is what it does instead. The operand to a `RESB'-type
+       pseudo-instruction is a _critical expression_: see section 3.8.
+
+       For example:
+
+       buffer:         resb    64              ; reserve 64 bytes 
+       wordvar:        resw    1               ; reserve a word 
+       realarray       resq    10              ; array of ten reals 
+       ymmval:         resy    1               ; one YMM register
+
+ 3.2.3 `INCBIN': Including External Binary Files
+
+       `INCBIN' is borrowed from the old Amiga assembler DevPac: it
+       includes a binary file verbatim into the output file. This can be
+       handy for (for example) including graphics and sound data directly
+       into a game executable file. It can be called in one of these three
+       ways:
+
+           incbin  "file.dat"             ; include the whole file 
+           incbin  "file.dat",1024        ; skip the first 1024 bytes 
+           incbin  "file.dat",1024,512    ; skip the first 1024, and 
+                                          ; actually include at most 512
+
+       `INCBIN' is both a directive and a standard macro; the standard
+       macro version searches for the file in the include file search path
+       and adds the file to the dependency lists. This macro can be
+       overridden if desired.
+
+ 3.2.4 `EQU': Defining Constants
+
+       `EQU' defines a symbol to a given constant value: when `EQU' is
+       used, the source line must contain a label. The action of `EQU' is
+       to define the given label name to the value of its (only) operand.
+       This definition is absolute, and cannot change later. So, for
+       example,
+
+       message         db      'hello, world' 
+       msglen          equ     $-message
+
+       defines `msglen' to be the constant 12. `msglen' may not then be
+       redefined later. This is not a preprocessor definition either: the
+       value of `msglen' is evaluated _once_, using the value of `$' (see
+       section 3.5 for an explanation of `$') at the point of definition,
+       rather than being evaluated wherever it is referenced and using the
+       value of `$' at the point of reference.
+
+ 3.2.5 `TIMES': Repeating Instructions or Data
+
+       The `TIMES' prefix causes the instruction to be assembled multiple
+       times. This is partly present as NASM's equivalent of the `DUP'
+       syntax supported by MASM-compatible assemblers, in that you can code
+
+       zerobuf:        times 64 db 0
+
+       or similar things; but `TIMES' is more versatile than that. The
+       argument to `TIMES' is not just a numeric constant, but a numeric
+       _expression_, so you can do things like
+
+       buffer: db      'hello, world' 
+               times 64-$+buffer db ' '
+
+       which will store exactly enough spaces to make the total length of
+       `buffer' up to 64. Finally, `TIMES' can be applied to ordinary
+       instructions, so you can code trivial unrolled loops in it:
+
+               times 100 movsb
+
+       Note that there is no effective difference between
+       `times 100 resb 1' and `resb 100', except that the latter will be
+       assembled about 100 times faster due to the internal structure of
+       the assembler.
+
+       The operand to `TIMES' is a critical expression (section 3.8).
+
+       Note also that `TIMES' can't be applied to macros: the reason for
+       this is that `TIMES' is processed after the macro phase, which
+       allows the argument to `TIMES' to contain expressions such as
+       `64-$+buffer' as above. To repeat more than one line of code, or a
+       complex macro, use the preprocessor `%rep' directive.
+
+   3.3 Effective Addresses
+
+       An effective address is any operand to an instruction which
+       references memory. Effective addresses, in NASM, have a very simple
+       syntax: they consist of an expression evaluating to the desired
+       address, enclosed in square brackets. For example:
+
+       wordvar dw      123 
+               mov     ax,[wordvar] 
+               mov     ax,[wordvar+1] 
+               mov     ax,[es:wordvar+bx]
+
+       Anything not conforming to this simple system is not a valid memory
+       reference in NASM, for example `es:wordvar[bx]'.
+
+       More complicated effective addresses, such as those involving more
+       than one register, work in exactly the same way:
+
+               mov     eax,[ebx*2+ecx+offset] 
+               mov     ax,[bp+di+8]
+
+       NASM is capable of doing algebra on these effective addresses, so
+       that things which don't necessarily _look_ legal are perfectly all
+       right:
+
+           mov     eax,[ebx*5]             ; assembles as [ebx*4+ebx] 
+           mov     eax,[label1*2-label2]   ; ie [label1+(label1-label2)]
+
+       Some forms of effective address have more than one assembled form;
+       in most such cases NASM will generate the smallest form it can. For
+       example, there are distinct assembled forms for the 32-bit effective
+       addresses `[eax*2+0]' and `[eax+eax]', and NASM will generally
+       generate the latter on the grounds that the former requires four
+       bytes to store a zero offset.
+
+       NASM has a hinting mechanism which will cause `[eax+ebx]' and
+       `[ebx+eax]' to generate different opcodes; this is occasionally
+       useful because `[esi+ebp]' and `[ebp+esi]' have different default
+       segment registers.
+
+       However, you can force NASM to generate an effective address in a
+       particular form by the use of the keywords `BYTE', `WORD', `DWORD'
+       and `NOSPLIT'. If you need `[eax+3]' to be assembled using a double-
+       word offset field instead of the one byte NASM will normally
+       generate, you can code `[dword eax+3]'. Similarly, you can force
+       NASM to use a byte offset for a small value which it hasn't seen on
+       the first pass (see section 3.8 for an example of such a code
+       fragment) by using `[byte eax+offset]'. As special cases,
+       `[byte eax]' will code `[eax+0]' with a byte offset of zero, and
+       `[dword eax]' will code it with a double-word offset of zero. The
+       normal form, `[eax]', will be coded with no offset field.
+
+       The form described in the previous paragraph is also useful if you
+       are trying to access data in a 32-bit segment from within 16 bit
+       code. For more information on this see the section on mixed-size
+       addressing (section 10.2). In particular, if you need to access data
+       with a known offset that is larger than will fit in a 16-bit value,
+       if you don't specify that it is a dword offset, nasm will cause the
+       high word of the offset to be lost.
+
+       Similarly, NASM will split `[eax*2]' into `[eax+eax]' because that
+       allows the offset field to be absent and space to be saved; in fact,
+       it will also split `[eax*2+offset]' into `[eax+eax+offset]'. You can
+       combat this behaviour by the use of the `NOSPLIT' keyword:
+       `[nosplit eax*2]' will force `[eax*2+0]' to be generated literally.
+
+       In 64-bit mode, NASM will by default generate absolute addresses.
+       The `REL' keyword makes it produce `RIP'-relative addresses. Since
+       this is frequently the normally desired behaviour, see the `DEFAULT'
+       directive (section 6.2). The keyword `ABS' overrides `REL'.
+
+   3.4 Constants
+
+       NASM understands four different types of constant: numeric,
+       character, string and floating-point.
+
+ 3.4.1 Numeric Constants
+
+       A numeric constant is simply a number. NASM allows you to specify
+       numbers in a variety of number bases, in a variety of ways: you can
+       suffix `H' or `X', `Q' or `O', and `B' for hexadecimal, octal and
+       binary respectively, or you can prefix `0x' for hexadecimal in the
+       style of C, or you can prefix `$' for hexadecimal in the style of
+       Borland Pascal. Note, though, that the `$' prefix does double duty
+       as a prefix on identifiers (see section 3.1), so a hex number
+       prefixed with a `$' sign must have a digit after the `$' rather than
+       a letter. In addition, current versions of NASM accept the prefix
+       `0h' for hexadecimal, `0o' or `0q' for octal, and `0b' for binary.
+       Please note that unlike C, a `0' prefix by itself does _not_ imply
+       an octal constant!
+
+       Numeric constants can have underscores (`_') interspersed to break
+       up long strings.
+
+       Some examples (all producing exactly the same code):
+
+               mov     ax,200          ; decimal 
+               mov     ax,0200         ; still decimal 
+               mov     ax,0200d        ; explicitly decimal 
+               mov     ax,0d200        ; also decimal 
+               mov     ax,0c8h         ; hex 
+               mov     ax,$0c8         ; hex again: the 0 is required 
+               mov     ax,0xc8         ; hex yet again 
+               mov     ax,0hc8         ; still hex 
+               mov     ax,310q         ; octal 
+               mov     ax,310o         ; octal again 
+               mov     ax,0o310        ; octal yet again 
+               mov     ax,0q310        ; hex yet again 
+               mov     ax,11001000b    ; binary 
+               mov     ax,1100_1000b   ; same binary constant 
+               mov     ax,0b1100_1000  ; same binary constant yet again
+
+ 3.4.2 Character Strings
+
+       A character string consists of up to eight characters enclosed in
+       either single quotes (`'...''), double quotes (`"..."') or
+       backquotes (``...`'). Single or double quotes are equivalent to NASM
+       (except of course that surrounding the constant with single quotes
+       allows double quotes to appear within it and vice versa); the
+       contents of those are represented verbatim. Strings enclosed in
+       backquotes support C-style `\'-escapes for special characters.
+
+       The following escape sequences are recognized by backquoted strings:
+
+             \'          single quote (') 
+             \"          double quote (") 
+             \`          backquote (`) 
+             \\          backslash (\) 
+             \?          question mark (?) 
+             \a          BEL (ASCII 7) 
+             \b          BS  (ASCII 8) 
+             \t          TAB (ASCII 9) 
+             \n          LF  (ASCII 10) 
+             \v          VT  (ASCII 11) 
+             \f          FF  (ASCII 12) 
+             \r          CR  (ASCII 13) 
+             \e          ESC (ASCII 27) 
+             \377        Up to 3 octal digits - literal byte 
+             \xFF        Up to 2 hexadecimal digits - literal byte 
+             \u1234      4 hexadecimal digits - Unicode character 
+             \U12345678  8 hexadecimal digits - Unicode character
+
+       All other escape sequences are reserved. Note that `\0', meaning a
+       `NUL' character (ASCII 0), is a special case of the octal escape
+       sequence.
+
+       Unicode characters specified with `\u' or `\U' are converted to
+       UTF-8. For example, the following lines are all equivalent:
+
+             db `\u263a`            ; UTF-8 smiley face 
+             db `\xe2\x98\xba`      ; UTF-8 smiley face 
+             db 0E2h, 098h, 0BAh    ; UTF-8 smiley face
+
+ 3.4.3 Character Constants
+
+       A character constant consists of a string up to eight bytes long,
+       used in an expression context. It is treated as if it was an
+       integer.
+
+       A character constant with more than one byte will be arranged with
+       little-endian order in mind: if you code
+
+                 mov eax,'abcd'
+
+       then the constant generated is not `0x61626364', but `0x64636261',
+       so that if you were then to store the value into memory, it would
+       read `abcd' rather than `dcba'. This is also the sense of character
+       constants understood by the Pentium's `CPUID' instruction.
+
+ 3.4.4 String Constants
+
+       String constants are character strings used in the context of some
+       pseudo-instructions, namely the `DB' family and `INCBIN' (where it
+       represents a filename.) They are also used in certain preprocessor
+       directives.
+
+       A string constant looks like a character constant, only longer. It
+       is treated as a concatenation of maximum-size character constants
+       for the conditions. So the following are equivalent:
+
+             db    'hello'               ; string constant 
+             db    'h','e','l','l','o'   ; equivalent character constants
+
+       And the following are also equivalent:
+
+             dd    'ninechars'           ; doubleword string constant 
+             dd    'nine','char','s'     ; becomes three doublewords 
+             db    'ninechars',0,0,0     ; and really looks like this
+
+       Note that when used in a string-supporting context, quoted strings
+       are treated as a string constants even if they are short enough to
+       be a character constant, because otherwise `db 'ab'' would have the
+       same effect as `db 'a'', which would be silly. Similarly, three-
+       character or four-character constants are treated as strings when
+       they are operands to `DW', and so forth.
+
+ 3.4.5 Unicode Strings
+
+       The special operators `__utf16__' and `__utf32__' allows definition
+       of Unicode strings. They take a string in UTF-8 format and converts
+       it to (littleendian) UTF-16 or UTF-32, respectively.
+
+       For example:
+
+       %define u(x) __utf16__(x) 
+       %define w(x) __utf32__(x) 
+       
+             dw u('C:\WINDOWS'), 0       ; Pathname in UTF-16 
+             dd w(`A + B = \u206a`), 0   ; String in UTF-32
+
+       `__utf16__' and `__utf32__' can be applied either to strings passed
+       to the `DB' family instructions, or to character constants in an
+       expression context.
+
+ 3.4.6 Floating-Point Constants
+
+       Floating-point constants are acceptable only as arguments to `DB',
+       `DW', `DD', `DQ', `DT', and `DO', or as arguments to the special
+       operators `__float8__', `__float16__', `__float32__', `__float64__',
+       `__float80m__', `__float80e__', `__float128l__', and
+       `__float128h__'.
+
+       Floating-point constants are expressed in the traditional form:
+       digits, then a period, then optionally more digits, then optionally
+       an `E' followed by an exponent. The period is mandatory, so that
+       NASM can distinguish between `dd 1', which declares an integer
+       constant, and `dd 1.0' which declares a floating-point constant.
+       NASM also support C99-style hexadecimal floating-point: `0x',
+       hexadecimal digits, period, optionally more hexadeximal digits, then
+       optionally a `P' followed by a _binary_ (not hexadecimal) exponent
+       in decimal notation.
+
+       Underscores to break up groups of digits are permitted in floating-
+       point constants as well.
+
+       Some examples:
+
+             db    -0.2                    ; "Quarter precision" 
+             dw    -0.5                    ; IEEE 754r/SSE5 half precision 
+             dd    1.2                     ; an easy one 
+             dd    1.222_222_222           ; underscores are permitted 
+             dd    0x1p+2                  ; 1.0x2^2 = 4.0 
+             dq    0x1p+32                 ; 1.0x2^32 = 4 294 967 296.0 
+             dq    1.e10                   ; 10 000 000 000.0 
+             dq    1.e+10                  ; synonymous with 1.e10 
+             dq    1.e-10                  ; 0.000 000 000 1 
+             dt    3.141592653589793238462 ; pi 
+             do    1.e+4000                ; IEEE 754r quad precision
+
+       The 8-bit "quarter-precision" floating-point format is
+       sign:exponent:mantissa = 1:4:3 with an exponent bias of 7. This
+       appears to be the most frequently used 8-bit floating-point format,
+       although it is not covered by any formal standard. This is sometimes
+       called a "minifloat."
+
+       The special operators are used to produce floating-point numbers in
+       other contexts. They produce the binary representation of a specific
+       floating-point number as an integer, and can use anywhere integer
+       constants are used in an expression. `__float80m__' and
+       `__float80e__' produce the 64-bit mantissa and 16-bit exponent of an
+       80-bit floating-point number, and `__float128l__' and
+       `__float128h__' produce the lower and upper 64-bit halves of a 128-
+       bit floating-point number, respectively.
+
+       For example:
+
+             mov    rax,__float64__(3.141592653589793238462)
+
+       ... would assign the binary representation of pi as a 64-bit
+       floating point number into `RAX'. This is exactly equivalent to:
+
+             mov    rax,0x400921fb54442d18
+
+       NASM cannot do compile-time arithmetic on floating-point constants.
+       This is because NASM is designed to be portable - although it always
+       generates code to run on x86 processors, the assembler itself can
+       run on any system with an ANSI C compiler. Therefore, the assembler
+       cannot guarantee the presence of a floating-point unit capable of
+       handling the Intel number formats, and so for NASM to be able to do
+       floating arithmetic it would have to include its own complete set of
+       floating-point routines, which would significantly increase the size
+       of the assembler for very little benefit.
+
+       The special tokens `__Infinity__', `__QNaN__' (or `__NaN__') and
+       `__SNaN__' can be used to generate infinities, quiet NaNs, and
+       signalling NaNs, respectively. These are normally used as macros:
+
+       %define Inf __Infinity__ 
+       %define NaN __QNaN__ 
+       
+             dq    +1.5, -Inf, NaN         ; Double-precision constants
+
+ 3.4.7 Packed BCD Constants
+
+       x87-style packed BCD constants can be used in the same contexts as
+       80-bit floating-point numbers. They are suffixed with `p' or
+       prefixed with `0p', and can include up to 18 decimal digits.
+
+       As with other numeric constants, underscores can be used to separate
+       digits.
+
+       For example:
+
+             dt 12_345_678_901_245_678p 
+             dt -12_345_678_901_245_678p 
+             dt +0p33 
+             dt 33p
+
+   3.5 Expressions
+
+       Expressions in NASM are similar in syntax to those in C. Expressions
+       are evaluated as 64-bit integers which are then adjusted to the
+       appropriate size.
+
+       NASM supports two special tokens in expressions, allowing
+       calculations to involve the current assembly position: the `$' and
+       `$$' tokens. `$' evaluates to the assembly position at the beginning
+       of the line containing the expression; so you can code an infinite
+       loop using `JMP $'. `$$' evaluates to the beginning of the current
+       section; so you can tell how far into the section you are by using
+       `($-$$)'.
+
+       The arithmetic operators provided by NASM are listed here, in
+       increasing order of precedence.
+
+ 3.5.1 `|': Bitwise OR Operator
+
+       The `|' operator gives a bitwise OR, exactly as performed by the
+       `OR' machine instruction. Bitwise OR is the lowest-priority
+       arithmetic operator supported by NASM.
+
+ 3.5.2 `^': Bitwise XOR Operator
+
+       `^' provides the bitwise XOR operation.
+
+ 3.5.3 `&': Bitwise AND Operator
+
+       `&' provides the bitwise AND operation.
+
+ 3.5.4 `<<' and `>>': Bit Shift Operators
+
+       `<<' gives a bit-shift to the left, just as it does in C. So `5<<3'
+       evaluates to 5 times 8, or 40. `>>' gives a bit-shift to the right;
+       in NASM, such a shift is _always_ unsigned, so that the bits shifted
+       in from the left-hand end are filled with zero rather than a sign-
+       extension of the previous highest bit.
+
+ 3.5.5 `+' and `-': Addition and Subtraction Operators
+
+       The `+' and `-' operators do perfectly ordinary addition and
+       subtraction.
+
+ 3.5.6 `*', `/', `//', `%' and `%%': Multiplication and Division
+
+       `*' is the multiplication operator. `/' and `//' are both division
+       operators: `/' is unsigned division and `//' is signed division.
+       Similarly, `%' and `%%' provide unsigned and signed modulo operators
+       respectively.
+
+       NASM, like ANSI C, provides no guarantees about the sensible
+       operation of the signed modulo operator.
+
+       Since the `%' character is used extensively by the macro
+       preprocessor, you should ensure that both the signed and unsigned
+       modulo operators are followed by white space wherever they appear.
+
+ 3.5.7 Unary Operators: `+', `-', `~', `!' and `SEG'
+
+       The highest-priority operators in NASM's expression grammar are
+       those which only apply to one argument. `-' negates its operand, `+'
+       does nothing (it's provided for symmetry with `-'), `~' computes the
+       one's complement of its operand, `!' is the logical negation
+       operator, and `SEG' provides the segment address of its operand
+       (explained in more detail in section 3.6).
+
+   3.6 `SEG' and `WRT'
+
+       When writing large 16-bit programs, which must be split into
+       multiple segments, it is often necessary to be able to refer to the
+       segment part of the address of a symbol. NASM supports the `SEG'
+       operator to perform this function.
+
+       The `SEG' operator returns the _preferred_ segment base of a symbol,
+       defined as the segment base relative to which the offset of the
+       symbol makes sense. So the code
+
+               mov     ax,seg symbol 
+               mov     es,ax 
+               mov     bx,symbol
+
+       will load `ES:BX' with a valid pointer to the symbol `symbol'.
+
+       Things can be more complex than this: since 16-bit segments and
+       groups may overlap, you might occasionally want to refer to some
+       symbol using a different segment base from the preferred one. NASM
+       lets you do this, by the use of the `WRT' (With Reference To)
+       keyword. So you can do things like
+
+               mov     ax,weird_seg        ; weird_seg is a segment base 
+               mov     es,ax 
+               mov     bx,symbol wrt weird_seg
+
+       to load `ES:BX' with a different, but functionally equivalent,
+       pointer to the symbol `symbol'.
+
+       NASM supports far (inter-segment) calls and jumps by means of the
+       syntax `call segment:offset', where `segment' and `offset' both
+       represent immediate values. So to call a far procedure, you could
+       code either of
+
+               call    (seg procedure):procedure 
+               call    weird_seg:(procedure wrt weird_seg)
+
+       (The parentheses are included for clarity, to show the intended
+       parsing of the above instructions. They are not necessary in
+       practice.)
+
+       NASM supports the syntax `call far procedure' as a synonym for the
+       first of the above usages. `JMP' works identically to `CALL' in
+       these examples.
+
+       To declare a far pointer to a data item in a data segment, you must
+       code
+
+               dw      symbol, seg symbol
+
+       NASM supports no convenient synonym for this, though you can always
+       invent one using the macro processor.
+
+   3.7 `STRICT': Inhibiting Optimization
+
+       When assembling with the optimizer set to level 2 or higher (see
+       section 2.1.22), NASM will use size specifiers (`BYTE', `WORD',
+       `DWORD', `QWORD', `TWORD', `OWORD' or `YWORD'), but will give them
+       the smallest possible size. The keyword `STRICT' can be used to
+       inhibit optimization and force a particular operand to be emitted in
+       the specified size. For example, with the optimizer on, and in
+       `BITS 16' mode,
+
+               push dword 33
+
+       is encoded in three bytes `66 6A 21', whereas
+
+               push strict dword 33
+
+       is encoded in six bytes, with a full dword immediate operand
+       `66 68 21 00 00 00'.
+
+       With the optimizer off, the same code (six bytes) is generated
+       whether the `STRICT' keyword was used or not.
+
+   3.8 Critical Expressions
+
+       Although NASM has an optional multi-pass optimizer, there are some
+       expressions which must be resolvable on the first pass. These are
+       called _Critical Expressions_.
+
+       The first pass is used to determine the size of all the assembled
+       code and data, so that the second pass, when generating all the
+       code, knows all the symbol addresses the code refers to. So one
+       thing NASM can't handle is code whose size depends on the value of a
+       symbol declared after the code in question. For example,
+
+               times (label-$) db 0 
+       label:  db      'Where am I?'
+
+       The argument to `TIMES' in this case could equally legally evaluate
+       to anything at all; NASM will reject this example because it cannot
+       tell the size of the `TIMES' line when it first sees it. It will
+       just as firmly reject the slightly paradoxical code
+
+               times (label-$+1) db 0 
+       label:  db      'NOW where am I?'
+
+       in which _any_ value for the `TIMES' argument is by definition
+       wrong!
+
+       NASM rejects these examples by means of a concept called a _critical
+       expression_, which is defined to be an expression whose value is
+       required to be computable in the first pass, and which must
+       therefore depend only on symbols defined before it. The argument to
+       the `TIMES' prefix is a critical expression.
+
+   3.9 Local Labels
+
+       NASM gives special treatment to symbols beginning with a period. A
+       label beginning with a single period is treated as a _local_ label,
+       which means that it is associated with the previous non-local label.
+       So, for example:
+
+       label1  ; some code 
+       
+       .loop 
+               ; some more code 
+       
+               jne     .loop 
+               ret 
+       
+       label2  ; some code 
+       
+       .loop 
+               ; some more code 
+       
+               jne     .loop 
+               ret
+
+       In the above code fragment, each `JNE' instruction jumps to the line
+       immediately before it, because the two definitions of `.loop' are
+       kept separate by virtue of each being associated with the previous
+       non-local label.
+
+       This form of local label handling is borrowed from the old Amiga
+       assembler DevPac; however, NASM goes one step further, in allowing
+       access to local labels from other parts of the code. This is
+       achieved by means of _defining_ a local label in terms of the
+       previous non-local label: the first definition of `.loop' above is
+       really defining a symbol called `label1.loop', and the second
+       defines a symbol called `label2.loop'. So, if you really needed to,
+       you could write
+
+       label3  ; some more code 
+               ; and some more 
+       
+               jmp label1.loop
+
+       Sometimes it is useful - in a macro, for instance - to be able to
+       define a label which can be referenced from anywhere but which
+       doesn't interfere with the normal local-label mechanism. Such a
+       label can't be non-local because it would interfere with subsequent
+       definitions of, and references to, local labels; and it can't be
+       local because the macro that defined it wouldn't know the label's
+       full name. NASM therefore introduces a third type of label, which is
+       probably only useful in macro definitions: if a label begins with
+       the special prefix `..@', then it does nothing to the local label
+       mechanism. So you could code
+
+       label1:                         ; a non-local label 
+       .local:                         ; this is really label1.local 
+       ..@foo:                         ; this is a special symbol 
+       label2:                         ; another non-local label 
+       .local:                         ; this is really label2.local 
+       
+               jmp     ..@foo          ; this will jump three lines up
+
+       NASM has the capacity to define other special symbols beginning with
+       a double period: for example, `..start' is used to specify the entry
+       point in the `obj' output format (see section 7.4.6).
+
+Chapter 4: The NASM Preprocessor
+--------------------------------
+
+       NASM contains a powerful macro processor, which supports conditional
+       assembly, multi-level file inclusion, two forms of macro (single-
+       line and multi-line), and a `context stack' mechanism for extra
+       macro power. Preprocessor directives all begin with a `%' sign.
+
+       The preprocessor collapses all lines which end with a backslash (\)
+       character into a single line. Thus:
+
+       %define THIS_VERY_LONG_MACRO_NAME_IS_DEFINED_TO \ 
+               THIS_VALUE
+
+       will work like a single-line macro without the backslash-newline
+       sequence.
+
+   4.1 Single-Line Macros
+
+ 4.1.1 The Normal Way: `%define'
+
+       Single-line macros are defined using the `%define' preprocessor
+       directive. The definitions work in a similar way to C; so you can do
+       things like
+
+       %define ctrl    0x1F & 
+       %define param(a,b) ((a)+(a)*(b)) 
+       
+               mov     byte [param(2,ebx)], ctrl 'D'
+
+       which will expand to
+
+               mov     byte [(2)+(2)*(ebx)], 0x1F & 'D'
+
+       When the expansion of a single-line macro contains tokens which
+       invoke another macro, the expansion is performed at invocation time,
+       not at definition time. Thus the code
+
+       %define a(x)    1+b(x) 
+       %define b(x)    2*x 
+       
+               mov     ax,a(8)
+
+       will evaluate in the expected way to `mov ax,1+2*8', even though the
+       macro `b' wasn't defined at the time of definition of `a'.
+
+       Macros defined with `%define' are case sensitive: after
+       `%define foo bar', only `foo' will expand to `bar': `Foo' or `FOO'
+       will not. By using `%idefine' instead of `%define' (the `i' stands
+       for `insensitive') you can define all the case variants of a macro
+       at once, so that `%idefine foo bar' would cause `foo', `Foo', `FOO',
+       `fOO' and so on all to expand to `bar'.
+
+       There is a mechanism which detects when a macro call has occurred as
+       a result of a previous expansion of the same macro, to guard against
+       circular references and infinite loops. If this happens, the
+       preprocessor will only expand the first occurrence of the macro.
+       Hence, if you code
+
+       %define a(x)    1+a(x) 
+       
+               mov     ax,a(3)
+
+       the macro `a(3)' will expand once, becoming `1+a(3)', and will then
+       expand no further. This behaviour can be useful: see section 9.1 for
+       an example of its use.
+
+       You can overload single-line macros: if you write
+
+       %define foo(x)   1+x 
+       %define foo(x,y) 1+x*y
+
+       the preprocessor will be able to handle both types of macro call, by
+       counting the parameters you pass; so `foo(3)' will become `1+3'
+       whereas `foo(ebx,2)' will become `1+ebx*2'. However, if you define
+
+       %define foo bar
+
+       then no other definition of `foo' will be accepted: a macro with no
+       parameters prohibits the definition of the same name as a macro
+       _with_ parameters, and vice versa.
+
+       This doesn't prevent single-line macros being _redefined_: you can
+       perfectly well define a macro with
+
+       %define foo bar
+
+       and then re-define it later in the same source file with
+
+       %define foo baz
+
+       Then everywhere the macro `foo' is invoked, it will be expanded
+       according to the most recent definition. This is particularly useful
+       when defining single-line macros with `%assign' (see section 4.1.7).
+
+       You can pre-define single-line macros using the `-d' option on the
+       NASM command line: see section 2.1.18.
+
+ 4.1.2 Resolving `%define': `%xdefine'
+
+       To have a reference to an embedded single-line macro resolved at the
+       time that the embedding macro is _defined_, as opposed to when the
+       embedding macro is _expanded_, you need a different mechanism to the
+       one offered by `%define'. The solution is to use `%xdefine', or it's
+       case-insensitive counterpart `%ixdefine'.
+
+       Suppose you have the following code:
+
+       %define  isTrue  1 
+       %define  isFalse isTrue 
+       %define  isTrue  0 
+       
+       val1:    db      isFalse 
+       
+       %define  isTrue  1 
+       
+       val2:    db      isFalse
+
+       In this case, `val1' is equal to 0, and `val2' is equal to 1. This
+       is because, when a single-line macro is defined using `%define', it
+       is expanded only when it is called. As `isFalse' expands to
+       `isTrue', the expansion will be the current value of `isTrue'. The
+       first time it is called that is 0, and the second time it is 1.
+
+       If you wanted `isFalse' to expand to the value assigned to the
+       embedded macro `isTrue' at the time that `isFalse' was defined, you
+       need to change the above code to use `%xdefine'.
+
+       %xdefine isTrue  1 
+       %xdefine isFalse isTrue 
+       %xdefine isTrue  0 
+       
+       val1:    db      isFalse 
+       
+       %xdefine isTrue  1 
+       
+       val2:    db      isFalse
+
+       Now, each time that `isFalse' is called, it expands to 1, as that is
+       what the embedded macro `isTrue' expanded to at the time that
+       `isFalse' was defined.
+
+ 4.1.3 Macro Indirection: `%[...]'
+
+       The `%[...]' construct can be used to expand macros in contexts
+       where macro expansion would otherwise not occur, including in the
+       names other macros. For example, if you have a set of macros named
+       `Foo16', `Foo32' and `Foo64', you could write:
+
+            mov ax,Foo%[__BITS__]   ; The Foo value
+
+       to use the builtin macro `__BITS__' (see section 4.11.5) to
+       automatically select between them. Similarly, the two statements:
+
+       %xdefine Bar         Quux    ; Expands due to %xdefine 
+       %define  Bar         %[Quux] ; Expands due to %[...]
+
+       have, in fact, exactly the same effect.
+
+       `%[...]' concatenates to adjacent tokens in the same way that multi-
+       line macro parameters do, see section 4.3.8 for details.
+
+ 4.1.4 Concatenating Single Line Macro Tokens: `%+'
+
+       Individual tokens in single line macros can be concatenated, to
+       produce longer tokens for later processing. This can be useful if
+       there are several similar macros that perform similar functions.
+
+       Please note that a space is required after `%+', in order to
+       disambiguate it from the syntax `%+1' used in multiline macros.
+
+       As an example, consider the following:
+
+       %define BDASTART 400h                ; Start of BIOS data area
+
+       struc   tBIOSDA                      ; its structure 
+               .COM1addr       RESW    1 
+               .COM2addr       RESW    1 
+               ; ..and so on 
+       endstruc
+
+       Now, if we need to access the elements of tBIOSDA in different
+       places, we can end up with:
+
+               mov     ax,BDASTART + tBIOSDA.COM1addr 
+               mov     bx,BDASTART + tBIOSDA.COM2addr
+
+       This will become pretty ugly (and tedious) if used in many places,
+       and can be reduced in size significantly by using the following
+       macro:
+
+       ; Macro to access BIOS variables by their names (from tBDA):
+
+       %define BDA(x)  BDASTART + tBIOSDA. %+ x
+
+       Now the above code can be written as:
+
+               mov     ax,BDA(COM1addr) 
+               mov     bx,BDA(COM2addr)
+
+       Using this feature, we can simplify references to a lot of macros
+       (and, in turn, reduce typing errors).
+
+ 4.1.5 The Macro Name Itself: `%?' and `%??'
+
+       The special symbols `%?' and `%??' can be used to reference the
+       macro name itself inside a macro expansion, this is supported for
+       both single-and multi-line macros. `%?' refers to the macro name as
+       _invoked_, whereas `%??' refers to the macro name as _declared_. The
+       two are always the same for case-sensitive macros, but for case-
+       insensitive macros, they can differ.
+
+       For example:
+
+       %idefine Foo mov %?,%?? 
+       
+               foo 
+               FOO
+
+       will expand to:
+
+               mov foo,Foo 
+               mov FOO,Foo
+
+       The sequence:
+
+       %idefine keyword $%?
+
+       can be used to make a keyword "disappear", for example in case a new
+       instruction has been used as a label in older code. For example:
+
+       %idefine pause $%?                  ; Hide the PAUSE instruction
+
+ 4.1.6 Undefining Single-Line Macros: `%undef'
+
+       Single-line macros can be removed with the `%undef' directive. For
+       example, the following sequence:
+
+       %define foo bar 
+       %undef  foo 
+       
+               mov     eax, foo
+
+       will expand to the instruction `mov eax, foo', since after `%undef'
+       the macro `foo' is no longer defined.
+
+       Macros that would otherwise be pre-defined can be undefined on the
+       command-line using the `-u' option on the NASM command line: see
+       section 2.1.19.
+
+ 4.1.7 Preprocessor Variables: `%assign'
+
+       An alternative way to define single-line macros is by means of the
+       `%assign' command (and its case-insensitive counterpart `%iassign',
+       which differs from `%assign' in exactly the same way that `%idefine'
+       differs from `%define').
+
+       `%assign' is used to define single-line macros which take no
+       parameters and have a numeric value. This value can be specified in
+       the form of an expression, and it will be evaluated once, when the
+       `%assign' directive is processed.
+
+       Like `%define', macros defined using `%assign' can be re-defined
+       later, so you can do things like
+
+       %assign i i+1
+
+       to increment the numeric value of a macro.
+
+       `%assign' is useful for controlling the termination of `%rep'
+       preprocessor loops: see section 4.5 for an example of this. Another
+       use for `%assign' is given in section 8.4 and section 9.1.
+
+       The expression passed to `%assign' is a critical expression (see
+       section 3.8), and must also evaluate to a pure number (rather than a
+       relocatable reference such as a code or data address, or anything
+       involving a register).
+
+ 4.1.8 Defining Strings: `%defstr'
+
+       `%defstr', and its case-insensitive counterpart `%idefstr', define
+       or redefine a single-line macro without parameters but converts the
+       entire right-hand side, after macro expansion, to a quoted string
+       before definition.
+
+       For example:
+
+       %defstr test TEST
+
+       is equivalent to
+
+       %define test 'TEST'
+
+       This can be used, for example, with the `%!' construct (see section
+       4.10.2):
+
+       %defstr PATH %!PATH          ; The operating system PATH variable
+
+ 4.1.9 Defining Tokens: `%deftok'
+
+       `%deftok', and its case-insensitive counterpart `%ideftok', define
+       or redefine a single-line macro without parameters but converts the
+       second parameter, after string conversion, to a sequence of tokens.
+
+       For example:
+
+       %deftok test 'TEST'
+
+       is equivalent to
+
+       %define test TEST
+
+   4.2 String Manipulation in Macros
+
+       It's often useful to be able to handle strings in macros. NASM
+       supports a few simple string handling macro operators from which
+       more complex operations can be constructed.
+
+       All the string operators define or redefine a value (either a string
+       or a numeric value) to a single-line macro. When producing a string
+       value, it may change the style of quoting of the input string or
+       strings, and possibly use `\'-escapes inside ``'-quoted strings.
+
+ 4.2.1 Concatenating Strings: `%strcat'
+
+       The `%strcat' operator concatenates quoted strings and assign them
+       to a single-line macro.
+
+       For example:
+
+       %strcat alpha "Alpha: ", '12" screen'
+
+       ... would assign the value `'Alpha: 12" screen'' to `alpha'.
+       Similarly:
+
+       %strcat beta '"foo"\', "'bar'"
+
+       ... would assign the value ``"foo"\\'bar'`' to `beta'.
+
+       The use of commas to separate strings is permitted but optional.
+
+ 4.2.2 String Length: `%strlen'
+
+       The `%strlen' operator assigns the length of a string to a macro.
+       For example:
+
+       %strlen charcnt 'my string'
+
+       In this example, `charcnt' would receive the value 9, just as if an
+       `%assign' had been used. In this example, `'my string'' was a
+       literal string but it could also have been a single-line macro that
+       expands to a string, as in the following example:
+
+       %define sometext 'my string' 
+       %strlen charcnt sometext
+
+       As in the first case, this would result in `charcnt' being assigned
+       the value of 9.
+
+ 4.2.3 Extracting Substrings: `%substr'
+
+       Individual letters or substrings in strings can be extracted using
+       the `%substr' operator. An example of its use is probably more
+       useful than the description:
+
+       %substr mychar 'xyzw' 1       ; equivalent to %define mychar 'x' 
+       %substr mychar 'xyzw' 2       ; equivalent to %define mychar 'y' 
+       %substr mychar 'xyzw' 3       ; equivalent to %define mychar 'z' 
+       %substr mychar 'xyzw' 2,2     ; equivalent to %define mychar 'yz' 
+       %substr mychar 'xyzw' 2,-1    ; equivalent to %define mychar 'yzw' 
+       %substr mychar 'xyzw' 2,-2    ; equivalent to %define mychar 'yz'
+
+       As with `%strlen' (see section 4.2.2), the first parameter is the
+       single-line macro to be created and the second is the string. The
+       third parameter specifies the first character to be selected, and
+       the optional fourth parameter preceeded by comma) is the length.
+       Note that the first index is 1, not 0 and the last index is equal to
+       the value that `%strlen' would assign given the same string. Index
+       values out of range result in an empty string. A negative length
+       means "until N-1 characters before the end of string", i.e. `-1'
+       means until end of string, `-2' until one character before, etc.
+
+   4.3 Multi-Line Macros: `%macro'
+
+       Multi-line macros are much more like the type of macro seen in MASM
+       and TASM: a multi-line macro definition in NASM looks something like
+       this.
+
+       %macro  prologue 1 
+       
+               push    ebp 
+               mov     ebp,esp 
+               sub     esp,%1 
+       
+       %endmacro
+
+       This defines a C-like function prologue as a macro: so you would
+       invoke the macro with a call such as
+
+       myfunc:   prologue 12
+
+       which would expand to the three lines of code
+
+       myfunc: push    ebp 
+               mov     ebp,esp 
+               sub     esp,12
+
+       The number `1' after the macro name in the `%macro' line defines the
+       number of parameters the macro `prologue' expects to receive. The
+       use of `%1' inside the macro definition refers to the first
+       parameter to the macro call. With a macro taking more than one
+       parameter, subsequent parameters would be referred to as `%2', `%3'
+       and so on.
+
+       Multi-line macros, like single-line macros, are case-sensitive,
+       unless you define them using the alternative directive `%imacro'.
+
+       If you need to pass a comma as _part_ of a parameter to a multi-line
+       macro, you can do that by enclosing the entire parameter in braces.
+       So you could code things like
+
+       %macro  silly 2 
+       
+           %2: db      %1 
+       
+       %endmacro 
+       
+               silly 'a', letter_a             ; letter_a:  db 'a' 
+               silly 'ab', string_ab           ; string_ab: db 'ab' 
+               silly {13,10}, crlf             ; crlf:      db 13,10
+
+ 4.3.1 Recursive Multi-Line Macros: `%rmacro'
+
+       A multi-line macro cannot be referenced within itself, in order to
+       prevent accidental infinite recursion.
+
+       Recursive multi-line macros allow for self-referencing, with the
+       caveat that the user is aware of the existence, use and purpose of
+       recursive multi-line macros. There is also a generous, but sane,
+       upper limit to the number of recursions, in order to prevent run-
+       away memory consumption in case of accidental infinite recursion.
+
+       As with non-recursive multi-line macros, recursive multi-line macros
+       are case-sensitive, unless you define them using the alternative
+       directive `%irmacro'.
+
+ 4.3.2 Overloading Multi-Line Macros
+
+       As with single-line macros, multi-line macros can be overloaded by
+       defining the same macro name several times with different numbers of
+       parameters. This time, no exception is made for macros with no
+       parameters at all. So you could define
+
+       %macro  prologue 0 
+       
+               push    ebp 
+               mov     ebp,esp 
+       
+       %endmacro
+
+       to define an alternative form of the function prologue which
+       allocates no local stack space.
+
+       Sometimes, however, you might want to `overload' a machine
+       instruction; for example, you might want to define
+
+       %macro  push 2 
+       
+               push    %1 
+               push    %2 
+       
+       %endmacro
+
+       so that you could code
+
+               push    ebx             ; this line is not a macro call 
+               push    eax,ecx         ; but this one is
+
+       Ordinarily, NASM will give a warning for the first of the above two
+       lines, since `push' is now defined to be a macro, and is being
+       invoked with a number of parameters for which no definition has been
+       given. The correct code will still be generated, but the assembler
+       will give a warning. This warning can be disabled by the use of the
+       `-w-macro-params' command-line option (see section 2.1.24).
+
+ 4.3.3 Macro-Local Labels
+
+       NASM allows you to define labels within a multi-line macro
+       definition in such a way as to make them local to the macro call: so
+       calling the same macro multiple times will use a different label
+       each time. You do this by prefixing `%%' to the label name. So you
+       can invent an instruction which executes a `RET' if the `Z' flag is
+       set by doing this:
+
+       %macro  retz 0 
+       
+               jnz     %%skip 
+               ret 
+           %%skip: 
+       
+       %endmacro
+
+       You can call this macro as many times as you want, and every time
+       you call it NASM will make up a different `real' name to substitute
+       for the label `%%skip'. The names NASM invents are of the form
+       `..@2345.skip', where the number 2345 changes with every macro call.
+       The `..@' prefix prevents macro-local labels from interfering with
+       the local label mechanism, as described in section 3.9. You should
+       avoid defining your own labels in this form (the `..@' prefix, then
+       a number, then another period) in case they interfere with macro-
+       local labels.
+
+ 4.3.4 Greedy Macro Parameters
+
+       Occasionally it is useful to define a macro which lumps its entire
+       command line into one parameter definition, possibly after
+       extracting one or two smaller parameters from the front. An example
+       might be a macro to write a text string to a file in MS-DOS, where
+       you might want to be able to write
+
+               writefile [filehandle],"hello, world",13,10
+
+       NASM allows you to define the last parameter of a macro to be
+       _greedy_, meaning that if you invoke the macro with more parameters
+       than it expects, all the spare parameters get lumped into the last
+       defined one along with the separating commas. So if you code:
+
+       %macro  writefile 2+ 
+       
+               jmp     %%endstr 
+         %%str:        db      %2 
+         %%endstr: 
+               mov     dx,%%str 
+               mov     cx,%%endstr-%%str 
+               mov     bx,%1 
+               mov     ah,0x40 
+               int     0x21 
+       
+       %endmacro
+
+       then the example call to `writefile' above will work as expected:
+       the text before the first comma, `[filehandle]', is used as the
+       first macro parameter and expanded when `%1' is referred to, and all
+       the subsequent text is lumped into `%2' and placed after the `db'.
+
+       The greedy nature of the macro is indicated to NASM by the use of
+       the `+' sign after the parameter count on the `%macro' line.
+
+       If you define a greedy macro, you are effectively telling NASM how
+       it should expand the macro given _any_ number of parameters from the
+       actual number specified up to infinity; in this case, for example,
+       NASM now knows what to do when it sees a call to `writefile' with 2,
+       3, 4 or more parameters. NASM will take this into account when
+       overloading macros, and will not allow you to define another form of
+       `writefile' taking 4 parameters (for example).
+
+       Of course, the above macro could have been implemented as a non-
+       greedy macro, in which case the call to it would have had to look
+       like
+
+                 writefile [filehandle], {"hello, world",13,10}
+
+       NASM provides both mechanisms for putting commas in macro
+       parameters, and you choose which one you prefer for each macro
+       definition.
+
+       See section 6.3.1 for a better way to write the above macro.
+
+ 4.3.5 Default Macro Parameters
+
+       NASM also allows you to define a multi-line macro with a _range_ of
+       allowable parameter counts. If you do this, you can specify defaults
+       for omitted parameters. So, for example:
+
+       %macro  die 0-1 "Painful program death has occurred." 
+       
+               writefile 2,%1 
+               mov     ax,0x4c01 
+               int     0x21 
+       
+       %endmacro
+
+       This macro (which makes use of the `writefile' macro defined in
+       section 4.3.4) can be called with an explicit error message, which
+       it will display on the error output stream before exiting, or it can
+       be called with no parameters, in which case it will use the default
+       error message supplied in the macro definition.
+
+       In general, you supply a minimum and maximum number of parameters
+       for a macro of this type; the minimum number of parameters are then
+       required in the macro call, and then you provide defaults for the
+       optional ones. So if a macro definition began with the line
+
+       %macro foobar 1-3 eax,[ebx+2]
+
+       then it could be called with between one and three parameters, and
+       `%1' would always be taken from the macro call. `%2', if not
+       specified by the macro call, would default to `eax', and `%3' if not
+       specified would default to `[ebx+2]'.
+
+       You can provide extra information to a macro by providing too many
+       default parameters:
+
+       %macro quux 1 something
+
+       This will trigger a warning by default; see section 2.1.24 for more
+       information. When `quux' is invoked, it receives not one but two
+       parameters. `something' can be referred to as `%2'. The difference
+       between passing `something' this way and writing `something' in the
+       macro body is that with this way `something' is evaluated when the
+       macro is defined, not when it is expanded.
+
+       You may omit parameter defaults from the macro definition, in which
+       case the parameter default is taken to be blank. This can be useful
+       for macros which can take a variable number of parameters, since the
+       `%0' token (see section 4.3.6) allows you to determine how many
+       parameters were really passed to the macro call.
+
+       This defaulting mechanism can be combined with the greedy-parameter
+       mechanism; so the `die' macro above could be made more powerful, and
+       more useful, by changing the first line of the definition to
+
+       %macro die 0-1+ "Painful program death has occurred.",13,10
+
+       The maximum parameter count can be infinite, denoted by `*'. In this
+       case, of course, it is impossible to provide a _full_ set of default
+       parameters. Examples of this usage are shown in section 4.3.7.
+
+ 4.3.6 `%0': Macro Parameter Counter
+
+       The parameter reference `%0' will return a numeric constant giving
+       the number of parameters received, that is, if `%0' is n then `%'n
+       is the last parameter. `%0' is mostly useful for macros that can
+       take a variable number of parameters. It can be used as an argument
+       to `%rep' (see section 4.5) in order to iterate through all the
+       parameters of a macro. Examples are given in section 4.3.7.
+
+ 4.3.7 `%rotate': Rotating Macro Parameters
+
+       Unix shell programmers will be familiar with the `shift' shell
+       command, which allows the arguments passed to a shell script
+       (referenced as `$1', `$2' and so on) to be moved left by one place,
+       so that the argument previously referenced as `$2' becomes available
+       as `$1', and the argument previously referenced as `$1' is no longer
+       available at all.
+
+       NASM provides a similar mechanism, in the form of `%rotate'. As its
+       name suggests, it differs from the Unix `shift' in that no
+       parameters are lost: parameters rotated off the left end of the
+       argument list reappear on the right, and vice versa.
+
+       `%rotate' is invoked with a single numeric argument (which may be an
+       expression). The macro parameters are rotated to the left by that
+       many places. If the argument to `%rotate' is negative, the macro
+       parameters are rotated to the right.
+
+       So a pair of macros to save and restore a set of registers might
+       work as follows:
+
+       %macro  multipush 1-* 
+       
+         %rep  %0 
+               push    %1 
+         %rotate 1 
+         %endrep 
+       
+       %endmacro
+
+       This macro invokes the `PUSH' instruction on each of its arguments
+       in turn, from left to right. It begins by pushing its first
+       argument, `%1', then invokes `%rotate' to move all the arguments one
+       place to the left, so that the original second argument is now
+       available as `%1'. Repeating this procedure as many times as there
+       were arguments (achieved by supplying `%0' as the argument to
+       `%rep') causes each argument in turn to be pushed.
+
+       Note also the use of `*' as the maximum parameter count, indicating
+       that there is no upper limit on the number of parameters you may
+       supply to the `multipush' macro.
+
+       It would be convenient, when using this macro, to have a `POP'
+       equivalent, which _didn't_ require the arguments to be given in
+       reverse order. Ideally, you would write the `multipush' macro call,
+       then cut-and-paste the line to where the pop needed to be done, and
+       change the name of the called macro to `multipop', and the macro
+       would take care of popping the registers in the opposite order from
+       the one in which they were pushed.
+
+       This can be done by the following definition:
+
+       %macro  multipop 1-* 
+       
+         %rep %0 
+         %rotate -1 
+               pop     %1 
+         %endrep 
+       
+       %endmacro
+
+       This macro begins by rotating its arguments one place to the
+       _right_, so that the original _last_ argument appears as `%1'. This
+       is then popped, and the arguments are rotated right again, so the
+       second-to-last argument becomes `%1'. Thus the arguments are
+       iterated through in reverse order.
+
+ 4.3.8 Concatenating Macro Parameters
+
+       NASM can concatenate macro parameters and macro indirection
+       constructs on to other text surrounding them. This allows you to
+       declare a family of symbols, for example, in a macro definition. If,
+       for example, you wanted to generate a table of key codes along with
+       offsets into the table, you could code something like
+
+       %macro keytab_entry 2 
+       
+           keypos%1    equ     $-keytab 
+                       db      %2 
+       
+       %endmacro 
+       
+       keytab: 
+                 keytab_entry F1,128+1 
+                 keytab_entry F2,128+2 
+                 keytab_entry Return,13
+
+       which would expand to
+
+       keytab: 
+       keyposF1        equ     $-keytab 
+                       db     128+1 
+       keyposF2        equ     $-keytab 
+                       db      128+2 
+       keyposReturn    equ     $-keytab 
+                       db      13
+
+       You can just as easily concatenate text on to the other end of a
+       macro parameter, by writing `%1foo'.
+
+       If you need to append a _digit_ to a macro parameter, for example
+       defining labels `foo1' and `foo2' when passed the parameter `foo',
+       you can't code `%11' because that would be taken as the eleventh
+       macro parameter. Instead, you must code `%{1}1', which will separate
+       the first `1' (giving the number of the macro parameter) from the
+       second (literal text to be concatenated to the parameter).
+
+       This concatenation can also be applied to other preprocessor in-line
+       objects, such as macro-local labels (section 4.3.3) and context-
+       local labels (section 4.7.2). In all cases, ambiguities in syntax
+       can be resolved by enclosing everything after the `%' sign and
+       before the literal text in braces: so `%{%foo}bar' concatenates the
+       text `bar' to the end of the real name of the macro-local label
+       `%%foo'. (This is unnecessary, since the form NASM uses for the real
+       names of macro-local labels means that the two usages `%{%foo}bar'
+       and `%%foobar' would both expand to the same thing anyway;
+       nevertheless, the capability is there.)
+
+       The single-line macro indirection construct, `%[...]' (section
+       4.1.3), behaves the same way as macro parameters for the purpose of
+       concatenation.
+
+       See also the `%+' operator, section 4.1.4.
+
+ 4.3.9 Condition Codes as Macro Parameters
+
+       NASM can give special treatment to a macro parameter which contains
+       a condition code. For a start, you can refer to the macro parameter
+       `%1' by means of the alternative syntax `%+1', which informs NASM
+       that this macro parameter is supposed to contain a condition code,
+       and will cause the preprocessor to report an error message if the
+       macro is called with a parameter which is _not_ a valid condition
+       code.
+
+       Far more usefully, though, you can refer to the macro parameter by
+       means of `%-1', which NASM will expand as the _inverse_ condition
+       code. So the `retz' macro defined in section 4.3.3 can be replaced
+       by a general conditional-return macro like this:
+
+       %macro  retc 1 
+       
+               j%-1    %%skip 
+               ret 
+         %%skip: 
+       
+       %endmacro
+
+       This macro can now be invoked using calls like `retc ne', which will
+       cause the conditional-jump instruction in the macro expansion to
+       come out as `JE', or `retc po' which will make the jump a `JPE'.
+
+       The `%+1' macro-parameter reference is quite happy to interpret the
+       arguments `CXZ' and `ECXZ' as valid condition codes; however, `%-1'
+       will report an error if passed either of these, because no inverse
+       condition code exists.
+
+4.3.10 Disabling Listing Expansion
+
+       When NASM is generating a listing file from your program, it will
+       generally expand multi-line macros by means of writing the macro
+       call and then listing each line of the expansion. This allows you to
+       see which instructions in the macro expansion are generating what
+       code; however, for some macros this clutters the listing up
+       unnecessarily.
+
+       NASM therefore provides the `.nolist' qualifier, which you can
+       include in a macro definition to inhibit the expansion of the macro
+       in the listing file. The `.nolist' qualifier comes directly after
+       the number of parameters, like this:
+
+       %macro foo 1.nolist
+
+       Or like this:
+
+       %macro bar 1-5+.nolist a,b,c,d,e,f,g,h
+
+4.3.11 Undefining Multi-Line Macros: `%unmacro'
+
+       Multi-line macros can be removed with the `%unmacro' directive.
+       Unlike the `%undef' directive, however, `%unmacro' takes an argument
+       specification, and will only remove exact matches with that argument
+       specification.
+
+       For example:
+
+       %macro foo 1-3 
+               ; Do something 
+       %endmacro 
+       %unmacro foo 1-3
+
+       removes the previously defined macro `foo', but
+
+       %macro bar 1-3 
+               ; Do something 
+       %endmacro 
+       %unmacro bar 1
+
+       does _not_ remove the macro `bar', since the argument specification
+       does not match exactly.
+
+4.3.12 Exiting Multi-Line Macros: `%exitmacro'
+
+       Multi-line macro expansions can be arbitrarily terminated with the
+       `%exitmacro' directive.
+
+       For example:
+
+       %macro foo 1-3 
+               ; Do something 
+           %if<condition> 
+               %exitmacro 
+           %endif 
+               ; Do something 
+       %endmacro
+
+   4.4 Conditional Assembly
+
+       Similarly to the C preprocessor, NASM allows sections of a source
+       file to be assembled only if certain conditions are met. The general
+       syntax of this feature looks like this:
+
+       %if<condition> 
+           ; some code which only appears if <condition> is met 
+       %elif<condition2> 
+           ; only appears if <condition> is not met but <condition2> is 
+       %else 
+           ; this appears if neither <condition> nor <condition2> was met 
+       %endif
+
+       The inverse forms `%ifn' and `%elifn' are also supported.
+
+       The `%else' clause is optional, as is the `%elif' clause. You can
+       have more than one `%elif' clause as well.
+
+       There are a number of variants of the `%if' directive. Each has its
+       corresponding `%elif', `%ifn', and `%elifn' directives; for example,
+       the equivalents to the `%ifdef' directive are `%elifdef', `%ifndef',
+       and `%elifndef'.
+
+ 4.4.1 `%ifdef': Testing Single-Line Macro Existence
+
+       Beginning a conditional-assembly block with the line `%ifdef MACRO'
+       will assemble the subsequent code if, and only if, a single-line
+       macro called `MACRO' is defined. If not, then the `%elif' and
+       `%else' blocks (if any) will be processed instead.
+
+       For example, when debugging a program, you might want to write code
+       such as
+
+                 ; perform some function 
+       %ifdef DEBUG 
+                 writefile 2,"Function performed successfully",13,10 
+       %endif 
+                 ; go and do something else
+
+       Then you could use the command-line option `-dDEBUG' to create a
+       version of the program which produced debugging messages, and remove
+       the option to generate the final release version of the program.
+
+       You can test for a macro _not_ being defined by using `%ifndef'
+       instead of `%ifdef'. You can also test for macro definitions in
+       `%elif' blocks by using `%elifdef' and `%elifndef'.
+
+ 4.4.2 `%ifmacro': Testing Multi-Line Macro Existence
+
+       The `%ifmacro' directive operates in the same way as the `%ifdef'
+       directive, except that it checks for the existence of a multi-line
+       macro.
+
+       For example, you may be working with a large project and not have
+       control over the macros in a library. You may want to create a macro
+       with one name if it doesn't already exist, and another name if one
+       with that name does exist.
+
+       The `%ifmacro' is considered true if defining a macro with the given
+       name and number of arguments would cause a definitions conflict. For
+       example:
+
+       %ifmacro MyMacro 1-3 
+       
+            %error "MyMacro 1-3" causes a conflict with an existing macro. 
+       
+       %else 
+       
+            %macro MyMacro 1-3 
+       
+                    ; insert code to define the macro 
+       
+            %endmacro 
+       
+       %endif
+
+       This will create the macro "MyMacro 1-3" if no macro already exists
+       which would conflict with it, and emits a warning if there would be
+       a definition conflict.
+
+       You can test for the macro not existing by using the `%ifnmacro'
+       instead of `%ifmacro'. Additional tests can be performed in `%elif'
+       blocks by using `%elifmacro' and `%elifnmacro'.
+
+ 4.4.3 `%ifctx': Testing the Context Stack
+
+       The conditional-assembly construct `%ifctx' will cause the
+       subsequent code to be assembled if and only if the top context on
+       the preprocessor's context stack has the same name as one of the
+       arguments. As with `%ifdef', the inverse and `%elif' forms
+       `%ifnctx', `%elifctx' and `%elifnctx' are also supported.
+
+       For more details of the context stack, see section 4.7. For a sample
+       use of `%ifctx', see section 4.7.5.
+
+ 4.4.4 `%if': Testing Arbitrary Numeric Expressions
+
+       The conditional-assembly construct `%if expr' will cause the
+       subsequent code to be assembled if and only if the value of the
+       numeric expression `expr' is non-zero. An example of the use of this
+       feature is in deciding when to break out of a `%rep' preprocessor
+       loop: see section 4.5 for a detailed example.
+
+       The expression given to `%if', and its counterpart `%elif', is a
+       critical expression (see section 3.8).
+
+       `%if' extends the normal NASM expression syntax, by providing a set
+       of relational operators which are not normally available in
+       expressions. The operators `=', `<', `>', `<=', `>=' and `<>' test
+       equality, less-than, greater-than, less-or-equal, greater-or-equal
+       and not-equal respectively. The C-like forms `==' and `!=' are
+       supported as alternative forms of `=' and `<>'. In addition, low-
+       priority logical operators `&&', `^^' and `||' are provided,
+       supplying logical AND, logical XOR and logical OR. These work like
+       the C logical operators (although C has no logical XOR), in that
+       they always return either 0 or 1, and treat any non-zero input as 1
+       (so that `^^', for example, returns 1 if exactly one of its inputs
+       is zero, and 0 otherwise). The relational operators also return 1
+       for true and 0 for false.
+
+       Like other `%if' constructs, `%if' has a counterpart `%elif', and
+       negative forms `%ifn' and `%elifn'.
+
+ 4.4.5 `%ifidn' and `%ifidni': Testing Exact Text Identity
+
+       The construct `%ifidn text1,text2' will cause the subsequent code to
+       be assembled if and only if `text1' and `text2', after expanding
+       single-line macros, are identical pieces of text. Differences in
+       white space are not counted.
+
+       `%ifidni' is similar to `%ifidn', but is case-insensitive.
+
+       For example, the following macro pushes a register or number on the
+       stack, and allows you to treat `IP' as a real register:
+
+       %macro  pushparam 1 
+       
+         %ifidni %1,ip 
+               call    %%label 
+         %%label: 
+         %else 
+               push    %1 
+         %endif 
+       
+       %endmacro
+
+       Like other `%if' constructs, `%ifidn' has a counterpart `%elifidn',
+       and negative forms `%ifnidn' and `%elifnidn'. Similarly, `%ifidni'
+       has counterparts `%elifidni', `%ifnidni' and `%elifnidni'.
+
+ 4.4.6 `%ifid', `%ifnum', `%ifstr': Testing Token Types
+
+       Some macros will want to perform different tasks depending on
+       whether they are passed a number, a string, or an identifier. For
+       example, a string output macro might want to be able to cope with
+       being passed either a string constant or a pointer to an existing
+       string.
+
+       The conditional assembly construct `%ifid', taking one parameter
+       (which may be blank), assembles the subsequent code if and only if
+       the first token in the parameter exists and is an identifier.
+       `%ifnum' works similarly, but tests for the token being a numeric
+       constant; `%ifstr' tests for it being a string.
+
+       For example, the `writefile' macro defined in section 4.3.4 can be
+       extended to take advantage of `%ifstr' in the following fashion:
+
+       %macro writefile 2-3+ 
+       
+         %ifstr %2 
+               jmp     %%endstr 
+           %if %0 = 3 
+             %%str:    db      %2,%3 
+           %else 
+             %%str:    db      %2 
+           %endif 
+             %%endstr: mov     dx,%%str 
+                       mov     cx,%%endstr-%%str 
+         %else 
+                       mov     dx,%2 
+                       mov     cx,%3 
+         %endif 
+                       mov     bx,%1 
+                       mov     ah,0x40 
+                       int     0x21 
+       
+       %endmacro
+
+       Then the `writefile' macro can cope with being called in either of
+       the following two ways:
+
+               writefile [file], strpointer, length 
+               writefile [file], "hello", 13, 10
+
+       In the first, `strpointer' is used as the address of an already-
+       declared string, and `length' is used as its length; in the second,
+       a string is given to the macro, which therefore declares it itself
+       and works out the address and length for itself.
+
+       Note the use of `%if' inside the `%ifstr': this is to detect whether
+       the macro was passed two arguments (so the string would be a single
+       string constant, and `db %2' would be adequate) or more (in which
+       case, all but the first two would be lumped together into `%3', and
+       `db %2,%3' would be required).
+
+       The usual `%elif'..., `%ifn'..., and `%elifn'... versions exist for
+       each of `%ifid', `%ifnum' and `%ifstr'.
+
+ 4.4.7 `%iftoken': Test for a Single Token
+
+       Some macros will want to do different things depending on if it is
+       passed a single token (e.g. paste it to something else using `%+')
+       versus a multi-token sequence.
+
+       The conditional assembly construct `%iftoken' assembles the
+       subsequent code if and only if the expanded parameters consist of
+       exactly one token, possibly surrounded by whitespace.
+
+       For example:
+
+       %iftoken 1
+
+       will assemble the subsequent code, but
+
+       %iftoken -1
+
+       will not, since `-1' contains two tokens: the unary minus operator
+       `-', and the number `1'.
+
+       The usual `%eliftoken', `%ifntoken', and `%elifntoken' variants are
+       also provided.
+
+ 4.4.8 `%ifempty': Test for Empty Expansion
+
+       The conditional assembly construct `%ifempty' assembles the
+       subsequent code if and only if the expanded parameters do not
+       contain any tokens at all, whitespace excepted.
+
+       The usual `%elifempty', `%ifnempty', and `%elifnempty' variants are
+       also provided.
+
+   4.5 Preprocessor Loops: `%rep'
+
+       NASM's `TIMES' prefix, though useful, cannot be used to invoke a
+       multi-line macro multiple times, because it is processed by NASM
+       after macros have already been expanded. Therefore NASM provides
+       another form of loop, this time at the preprocessor level: `%rep'.
+
+       The directives `%rep' and `%endrep' (`%rep' takes a numeric
+       argument, which can be an expression; `%endrep' takes no arguments)
+       can be used to enclose a chunk of code, which is then replicated as
+       many times as specified by the preprocessor:
+
+       %assign i 0 
+       %rep    64 
+               inc     word [table+2*i] 
+       %assign i i+1 
+       %endrep
+
+       This will generate a sequence of 64 `INC' instructions, incrementing
+       every word of memory from `[table]' to `[table+126]'.
+
+       For more complex termination conditions, or to break out of a repeat
+       loop part way along, you can use the `%exitrep' directive to
+       terminate the loop, like this:
+
+       fibonacci: 
+       %assign i 0 
+       %assign j 1 
+       %rep 100 
+       %if j > 65535 
+           %exitrep 
+       %endif 
+               dw j 
+       %assign k j+i 
+       %assign i j 
+       %assign j k 
+       %endrep 
+       
+       fib_number equ ($-fibonacci)/2
+
+       This produces a list of all the Fibonacci numbers that will fit in
+       16 bits. Note that a maximum repeat count must still be given to
+       `%rep'. This is to prevent the possibility of NASM getting into an
+       infinite loop in the preprocessor, which (on multitasking or multi-
+       user systems) would typically cause all the system memory to be
+       gradually used up and other applications to start crashing.
+
+   4.6 Source Files and Dependencies
+
+       These commands allow you to split your sources into multiple files.
+
+ 4.6.1 `%include': Including Other Files
+
+       Using, once again, a very similar syntax to the C preprocessor,
+       NASM's preprocessor lets you include other source files into your
+       code. This is done by the use of the `%include' directive:
+
+       %include "macros.mac"
+
+       will include the contents of the file `macros.mac' into the source
+       file containing the `%include' directive.
+
+       Include files are searched for in the current directory (the
+       directory you're in when you run NASM, as opposed to the location of
+       the NASM executable or the location of the source file), plus any
+       directories specified on the NASM command line using the `-i'
+       option.
+
+       The standard C idiom for preventing a file being included more than
+       once is just as applicable in NASM: if the file `macros.mac' has the
+       form
+
+       %ifndef MACROS_MAC 
+           %define MACROS_MAC 
+           ; now define some macros 
+       %endif
+
+       then including the file more than once will not cause errors,
+       because the second time the file is included nothing will happen
+       because the macro `MACROS_MAC' will already be defined.
+
+       You can force a file to be included even if there is no `%include'
+       directive that explicitly includes it, by using the `-p' option on
+       the NASM command line (see section 2.1.17).
+
+ 4.6.2 `%pathsearch': Search the Include Path
+
+       The `%pathsearch' directive takes a single-line macro name and a
+       filename, and declare or redefines the specified single-line macro
+       to be the include-path-resolved version of the filename, if the file
+       exists (otherwise, it is passed unchanged.)
+
+       For example,
+
+       %pathsearch MyFoo "foo.bin"
+
+       ... with `-Ibins/' in the include path may end up defining the macro
+       `MyFoo' to be `"bins/foo.bin"'.
+
+ 4.6.3 `%depend': Add Dependent Files
+
+       The `%depend' directive takes a filename and adds it to the list of
+       files to be emitted as dependency generation when the `-M' options
+       and its relatives (see section 2.1.4) are used. It produces no
+       output.
+
+       This is generally used in conjunction with `%pathsearch'. For
+       example, a simplified version of the standard macro wrapper for the
+       `INCBIN' directive looks like:
+
+       %imacro incbin 1-2+ 0 
+       %pathsearch dep %1 
+       %depend dep 
+               incbin dep,%2 
+       %endmacro
+
+       This first resolves the location of the file into the macro `dep',
+       then adds it to the dependency lists, and finally issues the
+       assembler-level `INCBIN' directive.
+
+ 4.6.4 `%use': Include Standard Macro Package
+
+       The `%use' directive is similar to `%include', but rather than
+       including the contents of a file, it includes a named standard macro
+       package. The standard macro packages are part of NASM, and are
+       described in chapter 5.
+
+       Unlike the `%include' directive, package names for the `%use'
+       directive do not require quotes, but quotes are permitted. In NASM
+       2.04 and 2.05 the unquoted form would be macro-expanded; this is no
+       longer true. Thus, the following lines are equivalent:
+
+       %use altreg 
+       %use 'altreg'
+
+       Standard macro packages are protected from multiple inclusion. When
+       a standard macro package is used, a testable single-line macro of
+       the form `__USE_'_package_`__' is also defined, see section 4.11.8.
+
+   4.7 The Context Stack
+
+       Having labels that are local to a macro definition is sometimes not
+       quite powerful enough: sometimes you want to be able to share labels
+       between several macro calls. An example might be a `REPEAT' ...
+       `UNTIL' loop, in which the expansion of the `REPEAT' macro would
+       need to be able to refer to a label which the `UNTIL' macro had
+       defined. However, for such a macro you would also want to be able to
+       nest these loops.
+
+       NASM provides this level of power by means of a _context stack_. The
+       preprocessor maintains a stack of _contexts_, each of which is
+       characterized by a name. You add a new context to the stack using
+       the `%push' directive, and remove one using `%pop'. You can define
+       labels that are local to a particular context on the stack.
+
+ 4.7.1 `%push' and `%pop': Creating and Removing Contexts
+
+       The `%push' directive is used to create a new context and place it
+       on the top of the context stack. `%push' takes an optional argument,
+       which is the name of the context. For example:
+
+       %push    foobar
+
+       This pushes a new context called `foobar' on the stack. You can have
+       several contexts on the stack with the same name: they can still be
+       distinguished. If no name is given, the context is unnamed (this is
+       normally used when both the `%push' and the `%pop' are inside a
+       single macro definition.)
+
+       The directive `%pop', taking one optional argument, removes the top
+       context from the context stack and destroys it, along with any
+       labels associated with it. If an argument is given, it must match
+       the name of the current context, otherwise it will issue an error.
+
+ 4.7.2 Context-Local Labels
+
+       Just as the usage `%%foo' defines a label which is local to the
+       particular macro call in which it is used, the usage `%$foo' is used
+       to define a label which is local to the context on the top of the
+       context stack. So the `REPEAT' and `UNTIL' example given above could
+       be implemented by means of:
+
+       %macro repeat 0 
+       
+           %push   repeat 
+           %$begin: 
+       
+       %endmacro 
+       
+       %macro until 1 
+       
+               j%-1    %$begin 
+           %pop 
+       
+       %endmacro
+
+       and invoked by means of, for example,
+
+               mov     cx,string 
+               repeat 
+               add     cx,3 
+               scasb 
+               until   e
+
+       which would scan every fourth byte of a string in search of the byte
+       in `AL'.
+
+       If you need to define, or access, labels local to the context
+       _below_ the top one on the stack, you can use `%$$foo', or `%$$$foo'
+       for the context below that, and so on.
+
+ 4.7.3 Context-Local Single-Line Macros
+
+       NASM also allows you to define single-line macros which are local to
+       a particular context, in just the same way:
+
+       %define %$localmac 3
+
+       will define the single-line macro `%$localmac' to be local to the
+       top context on the stack. Of course, after a subsequent `%push', it
+       can then still be accessed by the name `%$$localmac'.
+
+ 4.7.4 `%repl': Renaming a Context
+
+       If you need to change the name of the top context on the stack (in
+       order, for example, to have it respond differently to `%ifctx'), you
+       can execute a `%pop' followed by a `%push'; but this will have the
+       side effect of destroying all context-local labels and macros
+       associated with the context that was just popped.
+
+       NASM provides the directive `%repl', which _replaces_ a context with
+       a different name, without touching the associated macros and labels.
+       So you could replace the destructive code
+
+       %pop 
+       %push   newname
+
+       with the non-destructive version `%repl newname'.
+
+ 4.7.5 Example Use of the Context Stack: Block IFs
+
+       This example makes use of almost all the context-stack features,
+       including the conditional-assembly construct `%ifctx', to implement
+       a block IF statement as a set of macros.
+
+       %macro if 1 
+       
+           %push if 
+           j%-1  %$ifnot 
+       
+       %endmacro 
+       
+       %macro else 0 
+       
+         %ifctx if 
+               %repl   else 
+               jmp     %$ifend 
+               %$ifnot: 
+         %else 
+               %error  "expected `if' before `else'" 
+         %endif 
+       
+       %endmacro 
+       
+       %macro endif 0 
+       
+         %ifctx if 
+               %$ifnot: 
+               %pop 
+         %elifctx      else 
+               %$ifend: 
+               %pop 
+         %else 
+               %error  "expected `if' or `else' before `endif'" 
+         %endif 
+       
+       %endmacro
+
+       This code is more robust than the `REPEAT' and `UNTIL' macros given
+       in section 4.7.2, because it uses conditional assembly to check that
+       the macros are issued in the right order (for example, not calling
+       `endif' before `if') and issues a `%error' if they're not.
+
+       In addition, the `endif' macro has to be able to cope with the two
+       distinct cases of either directly following an `if', or following an
+       `else'. It achieves this, again, by using conditional assembly to do
+       different things depending on whether the context on top of the
+       stack is `if' or `else'.
+
+       The `else' macro has to preserve the context on the stack, in order
+       to have the `%$ifnot' referred to by the `if' macro be the same as
+       the one defined by the `endif' macro, but has to change the
+       context's name so that `endif' will know there was an intervening
+       `else'. It does this by the use of `%repl'.
+
+       A sample usage of these macros might look like:
+
+               cmp     ax,bx 
+       
+               if ae 
+                      cmp     bx,cx 
+       
+                      if ae 
+                              mov     ax,cx 
+                      else 
+                              mov     ax,bx 
+                      endif 
+       
+               else 
+                      cmp     ax,cx 
+       
+                      if ae 
+                              mov     ax,cx 
+                      endif 
+       
+               endif
+
+       The block-`IF' macros handle nesting quite happily, by means of
+       pushing another context, describing the inner `if', on top of the
+       one describing the outer `if'; thus `else' and `endif' always refer
+       to the last unmatched `if' or `else'.
+
+   4.8 Stack Relative Preprocessor Directives
+
+       The following preprocessor directives provide a way to use labels to
+       refer to local variables allocated on the stack.
+
+       (*) `%arg' (see section 4.8.1)
+
+       (*) `%stacksize' (see section 4.8.2)
+
+       (*) `%local' (see section 4.8.3)
+
+ 4.8.1 `%arg' Directive
+
+       The `%arg' directive is used to simplify the handling of parameters
+       passed on the stack. Stack based parameter passing is used by many
+       high level languages, including C, C++ and Pascal.
+
+       While NASM has macros which attempt to duplicate this functionality
+       (see section 8.4.5), the syntax is not particularly convenient to
+       use. and is not TASM compatible. Here is an example which shows the
+       use of `%arg' without any external macros:
+
+       some_function: 
+       
+           %push     mycontext        ; save the current context 
+           %stacksize large           ; tell NASM to use bp 
+           %arg      i:word, j_ptr:word 
+       
+               mov     ax,[i] 
+               mov     bx,[j_ptr] 
+               add     ax,[bx] 
+               ret 
+       
+           %pop                       ; restore original context
+
+       This is similar to the procedure defined in section 8.4.5 and adds
+       the value in i to the value pointed to by j_ptr and returns the sum
+       in the ax register. See section 4.7.1 for an explanation of `push'
+       and `pop' and the use of context stacks.
+
+ 4.8.2 `%stacksize' Directive
+
+       The `%stacksize' directive is used in conjunction with the `%arg'
+       (see section 4.8.1) and the `%local' (see section 4.8.3) directives.
+       It tells NASM the default size to use for subsequent `%arg' and
+       `%local' directives. The `%stacksize' directive takes one required
+       argument which is one of `flat', `flat64', `large' or `small'.
+
+       %stacksize flat
+
+       This form causes NASM to use stack-based parameter addressing
+       relative to `ebp' and it assumes that a near form of call was used
+       to get to this label (i.e. that `eip' is on the stack).
+
+       %stacksize flat64
+
+       This form causes NASM to use stack-based parameter addressing
+       relative to `rbp' and it assumes that a near form of call was used
+       to get to this label (i.e. that `rip' is on the stack).
+
+       %stacksize large
+
+       This form uses `bp' to do stack-based parameter addressing and
+       assumes that a far form of call was used to get to this address
+       (i.e. that `ip' and `cs' are on the stack).
+
+       %stacksize small
+
+       This form also uses `bp' to address stack parameters, but it is
+       different from `large' because it also assumes that the old value of
+       bp is pushed onto the stack (i.e. it expects an `ENTER'
+       instruction). In other words, it expects that `bp', `ip' and `cs'
+       are on the top of the stack, underneath any local space which may
+       have been allocated by `ENTER'. This form is probably most useful
+       when used in combination with the `%local' directive (see section
+       4.8.3).
+
+ 4.8.3 `%local' Directive
+
+       The `%local' directive is used to simplify the use of local
+       temporary stack variables allocated in a stack frame. Automatic
+       local variables in C are an example of this kind of variable. The
+       `%local' directive is most useful when used with the `%stacksize'
+       (see section 4.8.2 and is also compatible with the `%arg' directive
+       (see section 4.8.1). It allows simplified reference to variables on
+       the stack which have been allocated typically by using the `ENTER'
+       instruction. An example of its use is the following:
+
+       silly_swap: 
+       
+           %push mycontext             ; save the current context 
+           %stacksize small            ; tell NASM to use bp 
+           %assign %$localsize 0       ; see text for explanation 
+           %local old_ax:word, old_dx:word 
+       
+               enter   %$localsize,0   ; see text for explanation 
+               mov     [old_ax],ax     ; swap ax & bx 
+               mov     [old_dx],dx     ; and swap dx & cx 
+               mov     ax,bx 
+               mov     dx,cx 
+               mov     bx,[old_ax] 
+               mov     cx,[old_dx] 
+               leave                   ; restore old bp 
+               ret                     ; 
+       
+           %pop                        ; restore original context
+
+       The `%$localsize' variable is used internally by the `%local'
+       directive and _must_ be defined within the current context before
+       the `%local' directive may be used. Failure to do so will result in
+       one expression syntax error for each `%local' variable declared. It
+       then may be used in the construction of an appropriately sized ENTER
+       instruction as shown in the example.
+
+   4.9 Reporting User-Defined Errors: `%error', `%warning', `%fatal'
+
+       The preprocessor directive `%error' will cause NASM to report an
+       error if it occurs in assembled code. So if other users are going to
+       try to assemble your source files, you can ensure that they define
+       the right macros by means of code like this:
+
+       %ifdef F1 
+           ; do some setup 
+       %elifdef F2 
+           ; do some different setup 
+       %else 
+           %error "Neither F1 nor F2 was defined." 
+       %endif
+
+       Then any user who fails to understand the way your code is supposed
+       to be assembled will be quickly warned of their mistake, rather than
+       having to wait until the program crashes on being run and then not
+       knowing what went wrong.
+
+       Similarly, `%warning' issues a warning, but allows assembly to
+       continue:
+
+       %ifdef F1 
+           ; do some setup 
+       %elifdef F2 
+           ; do some different setup 
+       %else 
+           %warning "Neither F1 nor F2 was defined, assuming F1." 
+           %define F1 
+       %endif
+
+       `%error' and `%warning' are issued only on the final assembly pass.
+       This makes them safe to use in conjunction with tests that depend on
+       symbol values.
+
+       `%fatal' terminates assembly immediately, regardless of pass. This
+       is useful when there is no point in continuing the assembly further,
+       and doing so is likely just going to cause a spew of confusing error
+       messages.
+
+       It is optional for the message string after `%error', `%warning' or
+       `%fatal' to be quoted. If it is _not_, then single-line macros are
+       expanded in it, which can be used to display more information to the
+       user. For example:
+
+       %if foo > 64 
+           %assign foo_over foo-64 
+           %error foo is foo_over bytes too large 
+       %endif
+
+  4.10 Other Preprocessor Directives
+
+       NASM also has preprocessor directives which allow access to
+       information from external sources. Currently they include:
+
+       (*) `%line' enables NASM to correctly handle the output of another
+           preprocessor (see section 4.10.1).
+
+       (*) `%!' enables NASM to read in the value of an environment
+           variable, which can then be used in your program (see section
+           4.10.2).
+
+4.10.1 `%line' Directive
+
+       The `%line' directive is used to notify NASM that the input line
+       corresponds to a specific line number in another file. Typically
+       this other file would be an original source file, with the current
+       NASM input being the output of a pre-processor. The `%line'
+       directive allows NASM to output messages which indicate the line
+       number of the original source file, instead of the file that is
+       being read by NASM.
+
+       This preprocessor directive is not generally of use to programmers,
+       by may be of interest to preprocessor authors. The usage of the
+       `%line' preprocessor directive is as follows:
+
+       %line nnn[+mmm] [filename]
+
+       In this directive, `nnn' identifies the line of the original source
+       file which this line corresponds to. `mmm' is an optional parameter
+       which specifies a line increment value; each line of the input file
+       read in is considered to correspond to `mmm' lines of the original
+       source file. Finally, `filename' is an optional parameter which
+       specifies the file name of the original source file.
+
+       After reading a `%line' preprocessor directive, NASM will report all
+       file name and line numbers relative to the values specified therein.
+
+4.10.2 `%!'`<env>': Read an environment variable.
+
+       The `%!<env>' directive makes it possible to read the value of an
+       environment variable at assembly time. This could, for example, be
+       used to store the contents of an environment variable into a string,
+       which could be used at some other point in your code.
+
+       For example, suppose that you have an environment variable `FOO',
+       and you want the contents of `FOO' to be embedded in your program.
+       You could do that as follows:
+
+       %defstr FOO    %!FOO
+
+       See section 4.1.8 for notes on the `%defstr' directive.
+
+  4.11 Standard Macros
+
+       NASM defines a set of standard macros, which are already defined
+       when it starts to process any source file. If you really need a
+       program to be assembled with no pre-defined macros, you can use the
+       `%clear' directive to empty the preprocessor of everything but
+       context-local preprocessor variables and single-line macros.
+
+       Most user-level assembler directives (see chapter 6) are implemented
+       as macros which invoke primitive directives; these are described in
+       chapter 6. The rest of the standard macro set is described here.
+
+4.11.1 NASM Version Macros
+
+       The single-line macros `__NASM_MAJOR__', `__NASM_MINOR__',
+       `__NASM_SUBMINOR__' and `___NASM_PATCHLEVEL__' expand to the major,
+       minor, subminor and patch level parts of the version number of NASM
+       being used. So, under NASM 0.98.32p1 for example, `__NASM_MAJOR__'
+       would be defined to be 0, `__NASM_MINOR__' would be defined as 98,
+       `__NASM_SUBMINOR__' would be defined to 32, and
+       `___NASM_PATCHLEVEL__' would be defined as 1.
+
+       Additionally, the macro `__NASM_SNAPSHOT__' is defined for
+       automatically generated snapshot releases _only_.
+
+4.11.2 `__NASM_VERSION_ID__': NASM Version ID
+
+       The single-line macro `__NASM_VERSION_ID__' expands to a dword
+       integer representing the full version number of the version of nasm
+       being used. The value is the equivalent to `__NASM_MAJOR__',
+       `__NASM_MINOR__', `__NASM_SUBMINOR__' and `___NASM_PATCHLEVEL__'
+       concatenated to produce a single doubleword. Hence, for 0.98.32p1,
+       the returned number would be equivalent to:
+
+               dd      0x00622001
+
+       or
+
+               db      1,32,98,0
+
+       Note that the above lines are generate exactly the same code, the
+       second line is used just to give an indication of the order that the
+       separate values will be present in memory.
+
+4.11.3 `__NASM_VER__': NASM Version string
+
+       The single-line macro `__NASM_VER__' expands to a string which
+       defines the version number of nasm being used. So, under NASM
+       0.98.32 for example,
+
+               db      __NASM_VER__
+
+       would expand to
+
+               db      "0.98.32"
+
+4.11.4 `__FILE__' and `__LINE__': File Name and Line Number
+
+       Like the C preprocessor, NASM allows the user to find out the file
+       name and line number containing the current instruction. The macro
+       `__FILE__' expands to a string constant giving the name of the
+       current input file (which may change through the course of assembly
+       if `%include' directives are used), and `__LINE__' expands to a
+       numeric constant giving the current line number in the input file.
+
+       These macros could be used, for example, to communicate debugging
+       information to a macro, since invoking `__LINE__' inside a macro
+       definition (either single-line or multi-line) will return the line
+       number of the macro _call_, rather than _definition_. So to
+       determine where in a piece of code a crash is occurring, for
+       example, one could write a routine `stillhere', which is passed a
+       line number in `EAX' and outputs something like `line 155: still
+       here'. You could then write a macro
+
+       %macro  notdeadyet 0 
+       
+               push    eax 
+               mov     eax,__LINE__ 
+               call    stillhere 
+               pop     eax 
+       
+       %endmacro
+
+       and then pepper your code with calls to `notdeadyet' until you find
+       the crash point.
+
+4.11.5 `__BITS__': Current BITS Mode
+
+       The `__BITS__' standard macro is updated every time that the BITS
+       mode is set using the `BITS XX' or `[BITS XX]' directive, where XX
+       is a valid mode number of 16, 32 or 64. `__BITS__' receives the
+       specified mode number and makes it globally available. This can be
+       very useful for those who utilize mode-dependent macros.
+
+4.11.6 `__OUTPUT_FORMAT__': Current Output Format
+
+       The `__OUTPUT_FORMAT__' standard macro holds the current Output
+       Format, as given by the `-f' option or NASM's default. Type
+       `nasm -hf' for a list.
+
+       %ifidn __OUTPUT_FORMAT__, win32 
+        %define NEWLINE 13, 10 
+       %elifidn __OUTPUT_FORMAT__, elf32 
+        %define NEWLINE 10 
+       %endif
+
+4.11.7 Assembly Date and Time Macros
+
+       NASM provides a variety of macros that represent the timestamp of
+       the assembly session.
+
+       (*) The `__DATE__' and `__TIME__' macros give the assembly date and
+           time as strings, in ISO 8601 format (`"YYYY-MM-DD"' and
+           `"HH:MM:SS"', respectively.)
+
+       (*) The `__DATE_NUM__' and `__TIME_NUM__' macros give the assembly
+           date and time in numeric form; in the format `YYYYMMDD' and
+           `HHMMSS' respectively.
+
+       (*) The `__UTC_DATE__' and `__UTC_TIME__' macros give the assembly
+           date and time in universal time (UTC) as strings, in ISO 8601
+           format (`"YYYY-MM-DD"' and `"HH:MM:SS"', respectively.) If the
+           host platform doesn't provide UTC time, these macros are
+           undefined.
+
+       (*) The `__UTC_DATE_NUM__' and `__UTC_TIME_NUM__' macros give the
+           assembly date and time universal time (UTC) in numeric form; in
+           the format `YYYYMMDD' and `HHMMSS' respectively. If the host
+           platform doesn't provide UTC time, these macros are undefined.
+
+       (*) The `__POSIX_TIME__' macro is defined as a number containing the
+           number of seconds since the POSIX epoch, 1 January 1970 00:00:00
+           UTC; excluding any leap seconds. This is computed using UTC time
+           if available on the host platform, otherwise it is computed
+           using the local time as if it was UTC.
+
+       All instances of time and date macros in the same assembly session
+       produce consistent output. For example, in an assembly session
+       started at 42 seconds after midnight on January 1, 2010 in Moscow
+       (timezone UTC+3) these macros would have the following values,
+       assuming, of course, a properly configured environment with a
+       correct clock:
+
+             __DATE__             "2010-01-01" 
+             __TIME__             "00:00:42" 
+             __DATE_NUM__         20100101 
+             __TIME_NUM__         000042 
+             __UTC_DATE__         "2009-12-31" 
+             __UTC_TIME__         "21:00:42" 
+             __UTC_DATE_NUM__     20091231 
+             __UTC_TIME_NUM__     210042 
+             __POSIX_TIME__       1262293242
+
+4.11.8 `__USE_'_package_`__': Package Include Test
+
+       When a standard macro package (see chapter 5) is included with the
+       `%use' directive (see section 4.6.4), a single-line macro of the
+       form `__USE_'_package_`__' is automatically defined. This allows
+       testing if a particular package is invoked or not.
+
+       For example, if the `altreg' package is included (see section 5.1),
+       then the macro `__USE_ALTREG__' is defined.
+
+4.11.9 `__PASS__': Assembly Pass
+
+       The macro `__PASS__' is defined to be `1' on preparatory passes, and
+       `2' on the final pass. In preprocess-only mode, it is set to `3',
+       and when running only to generate dependencies (due to the `-M' or
+       `-MG' option, see section 2.1.4) it is set to `0'.
+
+       _Avoid using this macro if at all possible. It is tremendously easy
+       to generate very strange errors by misusing it, and the semantics
+       may change in future versions of NASM._
+
+4.11.10 `STRUC' and `ENDSTRUC': Declaring Structure Data Types
+
+       The core of NASM contains no intrinsic means of defining data
+       structures; instead, the preprocessor is sufficiently powerful that
+       data structures can be implemented as a set of macros. The macros
+       `STRUC' and `ENDSTRUC' are used to define a structure data type.
+
+       `STRUC' takes one or two parameters. The first parameter is the name
+       of the data type. The second, optional parameter is the base offset
+       of the structure. The name of the data type is defined as a symbol
+       with the value of the base offset, and the name of the data type
+       with the suffix `_size' appended to it is defined as an `EQU' giving
+       the size of the structure. Once `STRUC' has been issued, you are
+       defining the structure, and should define fields using the `RESB'
+       family of pseudo-instructions, and then invoke `ENDSTRUC' to finish
+       the definition.
+
+       For example, to define a structure called `mytype' containing a
+       longword, a word, a byte and a string of bytes, you might code
+
+       struc   mytype 
+       
+         mt_long:      resd    1 
+         mt_word:      resw    1 
+         mt_byte:      resb    1 
+         mt_str:       resb    32 
+       
+       endstruc
+
+       The above code defines six symbols: `mt_long' as 0 (the offset from
+       the beginning of a `mytype' structure to the longword field),
+       `mt_word' as 4, `mt_byte' as 6, `mt_str' as 7, `mytype_size' as 39,
+       and `mytype' itself as zero.
+
+       The reason why the structure type name is defined at zero by default
+       is a side effect of allowing structures to work with the local label
+       mechanism: if your structure members tend to have the same names in
+       more than one structure, you can define the above structure like
+       this:
+
+       struc mytype 
+       
+         .long:        resd    1 
+         .word:        resw    1 
+         .byte:        resb    1 
+         .str:         resb    32 
+       
+       endstruc
+
+       This defines the offsets to the structure fields as `mytype.long',
+       `mytype.word', `mytype.byte' and `mytype.str'.
+
+       NASM, since it has no _intrinsic_ structure support, does not
+       support any form of period notation to refer to the elements of a
+       structure once you have one (except the above local-label notation),
+       so code such as `mov ax,[mystruc.mt_word]' is not valid. `mt_word'
+       is a constant just like any other constant, so the correct syntax is
+       `mov ax,[mystruc+mt_word]' or `mov ax,[mystruc+mytype.word]'.
+
+       Sometimes you only have the address of the structure displaced by an
+       offset. For example, consider this standard stack frame setup:
+
+       push ebp 
+       mov ebp, esp 
+       sub esp, 40
+
+       In this case, you could access an element by subtracting the offset:
+
+       mov [ebp - 40 + mytype.word], ax
+
+       However, if you do not want to repeat this offset, you can use -40
+       as a base offset:
+
+       struc mytype, -40
+
+       And access an element this way:
+
+       mov [ebp + mytype.word], ax
+
+4.11.11 `ISTRUC', `AT' and `IEND': Declaring Instances of Structures
+
+       Having defined a structure type, the next thing you typically want
+       to do is to declare instances of that structure in your data
+       segment. NASM provides an easy way to do this in the `ISTRUC'
+       mechanism. To declare a structure of type `mytype' in a program, you
+       code something like this:
+
+       mystruc: 
+           istruc mytype 
+       
+               at mt_long, dd      123456 
+               at mt_word, dw      1024 
+               at mt_byte, db      'x' 
+               at mt_str,  db      'hello, world', 13, 10, 0 
+       
+           iend
+
+       The function of the `AT' macro is to make use of the `TIMES' prefix
+       to advance the assembly position to the correct point for the
+       specified structure field, and then to declare the specified data.
+       Therefore the structure fields must be declared in the same order as
+       they were specified in the structure definition.
+
+       If the data to go in a structure field requires more than one source
+       line to specify, the remaining source lines can easily come after
+       the `AT' line. For example:
+
+               at mt_str,  db      123,134,145,156,167,178,189 
+                           db      190,100,0
+
+       Depending on personal taste, you can also omit the code part of the
+       `AT' line completely, and start the structure field on the next
+       line:
+
+               at mt_str 
+                       db      'hello, world' 
+                       db      13,10,0
+
+4.11.12 `ALIGN' and `ALIGNB': Data Alignment
+
+       The `ALIGN' and `ALIGNB' macros provides a convenient way to align
+       code or data on a word, longword, paragraph or other boundary. (Some
+       assemblers call this directive `EVEN'.) The syntax of the `ALIGN'
+       and `ALIGNB' macros is
+
+               align   4               ; align on 4-byte boundary 
+               align   16              ; align on 16-byte boundary 
+               align   8,db 0          ; pad with 0s rather than NOPs 
+               align   4,resb 1        ; align to 4 in the BSS 
+               alignb  4               ; equivalent to previous line
+
+       Both macros require their first argument to be a power of two; they
+       both compute the number of additional bytes required to bring the
+       length of the current section up to a multiple of that power of two,
+       and then apply the `TIMES' prefix to their second argument to
+       perform the alignment.
+
+       If the second argument is not specified, the default for `ALIGN' is
+       `NOP', and the default for `ALIGNB' is `RESB 1'. So if the second
+       argument is specified, the two macros are equivalent. Normally, you
+       can just use `ALIGN' in code and data sections and `ALIGNB' in BSS
+       sections, and never need the second argument except for special
+       purposes.
+
+       `ALIGN' and `ALIGNB', being simple macros, perform no error
+       checking: they cannot warn you if their first argument fails to be a
+       power of two, or if their second argument generates more than one
+       byte of code. In each of these cases they will silently do the wrong
+       thing.
+
+       `ALIGNB' (or `ALIGN' with a second argument of `RESB 1') can be used
+       within structure definitions:
+
+       struc mytype2 
+       
+         mt_byte: 
+               resb 1 
+               alignb 2 
+         mt_word: 
+               resw 1 
+               alignb 4 
+         mt_long: 
+               resd 1 
+         mt_str: 
+               resb 32 
+       
+       endstruc
+
+       This will ensure that the structure members are sensibly aligned
+       relative to the base of the structure.
+
+       A final caveat: `ALIGN' and `ALIGNB' work relative to the beginning
+       of the _section_, not the beginning of the address space in the
+       final executable. Aligning to a 16-byte boundary when the section
+       you're in is only guaranteed to be aligned to a 4-byte boundary, for
+       example, is a waste of effort. Again, NASM does not check that the
+       section's alignment characteristics are sensible for the use of
+       `ALIGN' or `ALIGNB'.
+
+       See also the `smartalign' standard macro package, section 5.2.
+
+Chapter 5: Standard Macro Packages
+----------------------------------
+
+       The `%use' directive (see section 4.6.4) includes one of the
+       standard macro packages included with the NASM distribution and
+       compiled into the NASM binary. It operates like the `%include'
+       directive (see section 4.6.1), but the included contents is provided
+       by NASM itself.
+
+       The names of standard macro packages are case insensitive, and can
+       be quoted or not.
+
+   5.1 `altreg': Alternate Register Names
+
+       The `altreg' standard macro package provides alternate register
+       names. It provides numeric register names for all registers (not
+       just `R8'-`R15'), the Intel-defined aliases `R8L'-`R15L' for the low
+       bytes of register (as opposed to the NASM/AMD standard names `R8B'-
+       `R15B'), and the names `R0H'-`R3H' (by analogy with `R0L'-`R3L') for
+       `AH', `CH', `DH', and `BH'.
+
+       Example use:
+
+       %use altreg 
+       
+       proc: 
+             mov r0l,r3h                    ; mov al,bh 
+             ret
+
+       See also section 11.1.
+
+   5.2 `smartalign': Smart `ALIGN' Macro
+
+       The `smartalign' standard macro package provides for an `ALIGN'
+       macro which is more powerful than the default (and backwards-
+       compatible) one (see section 4.11.12). When the `smartalign' package
+       is enabled, when `ALIGN' is used without a second argument, NASM
+       will generate a sequence of instructions more efficient than a
+       series of `NOP'. Furthermore, if the padding exceeds a specific
+       threshold, then NASM will generate a jump over the entire padding
+       sequence.
+
+       The specific instructions generated can be controlled with the new
+       `ALIGNMODE' macro. This macro takes two parameters: one mode, and an
+       optional jump threshold override. The modes are as follows:
+
+       (*) `generic': Works on all x86 CPUs and should have reasonable
+           performance. The default jump threshold is 8. This is the
+           default.
+
+       (*) `nop': Pad out with `NOP' instructions. The only difference
+           compared to the standard `ALIGN' macro is that NASM can still
+           jump over a large padding area. The default jump threshold is
+           16.
+
+       (*) `k7': Optimize for the AMD K7 (Athlon/Althon XP). These
+           instructions should still work on all x86 CPUs. The default jump
+           threshold is 16.
+
+       (*) `k8': Optimize for the AMD K8 (Opteron/Althon 64). These
+           instructions should still work on all x86 CPUs. The default jump
+           threshold is 16.
+
+       (*) `p6': Optimize for Intel CPUs. This uses the long `NOP'
+           instructions first introduced in Pentium Pro. This is
+           incompatible with all CPUs of family 5 or lower, as well as some
+           VIA CPUs and several virtualization solutions. The default jump
+           threshold is 16.
+
+       The macro `__ALIGNMODE__' is defined to contain the current
+       alignment mode. A number of other macros beginning with `__ALIGN_'
+       are used internally by this macro package.
+
+Chapter 6: Assembler Directives
+-------------------------------
+
+       NASM, though it attempts to avoid the bureaucracy of assemblers like
+       MASM and TASM, is nevertheless forced to support a _few_ directives.
+       These are described in this chapter.
+
+       NASM's directives come in two types: _user-level_ directives and
+       _primitive_ directives. Typically, each directive has a user-level
+       form and a primitive form. In almost all cases, we recommend that
+       users use the user-level forms of the directives, which are
+       implemented as macros which call the primitive forms.
+
+       Primitive directives are enclosed in square brackets; user-level
+       directives are not.
+
+       In addition to the universal directives described in this chapter,
+       each object file format can optionally supply extra directives in
+       order to control particular features of that file format. These
+       _format-specific_ directives are documented along with the formats
+       that implement them, in chapter 7.
+
+   6.1 `BITS': Specifying Target Processor Mode
+
+       The `BITS' directive specifies whether NASM should generate code
+       designed to run on a processor operating in 16-bit mode, 32-bit mode
+       or 64-bit mode. The syntax is `BITS XX', where XX is 16, 32 or 64.
+
+       In most cases, you should not need to use `BITS' explicitly. The
+       `aout', `coff', `elf', `macho', `win32' and `win64' object formats,
+       which are designed for use in 32-bit or 64-bit operating systems,
+       all cause NASM to select 32-bit or 64-bit mode, respectively, by
+       default. The `obj' object format allows you to specify each segment
+       you define as either `USE16' or `USE32', and NASM will set its
+       operating mode accordingly, so the use of the `BITS' directive is
+       once again unnecessary.
+
+       The most likely reason for using the `BITS' directive is to write
+       32-bit or 64-bit code in a flat binary file; this is because the
+       `bin' output format defaults to 16-bit mode in anticipation of it
+       being used most frequently to write DOS `.COM' programs, DOS `.SYS'
+       device drivers and boot loader software.
+
+       You do _not_ need to specify `BITS 32' merely in order to use 32-bit
+       instructions in a 16-bit DOS program; if you do, the assembler will
+       generate incorrect code because it will be writing code targeted at
+       a 32-bit platform, to be run on a 16-bit one.
+
+       When NASM is in `BITS 16' mode, instructions which use 32-bit data
+       are prefixed with an 0x66 byte, and those referring to 32-bit
+       addresses have an 0x67 prefix. In `BITS 32' mode, the reverse is
+       true: 32-bit instructions require no prefixes, whereas instructions
+       using 16-bit data need an 0x66 and those working on 16-bit addresses
+       need an 0x67.
+
+       When NASM is in `BITS 64' mode, most instructions operate the same
+       as they do for `BITS 32' mode. However, there are 8 more general and
+       SSE registers, and 16-bit addressing is no longer supported.
+
+       The default address size is 64 bits; 32-bit addressing can be
+       selected with the 0x67 prefix. The default operand size is still 32
+       bits, however, and the 0x66 prefix selects 16-bit operand size. The
+       `REX' prefix is used both to select 64-bit operand size, and to
+       access the new registers. NASM automatically inserts REX prefixes
+       when necessary.
+
+       When the `REX' prefix is used, the processor does not know how to
+       address the AH, BH, CH or DH (high 8-bit legacy) registers. Instead,
+       it is possible to access the the low 8-bits of the SP, BP SI and DI
+       registers as SPL, BPL, SIL and DIL, respectively; but only when the
+       REX prefix is used.
+
+       The `BITS' directive has an exactly equivalent primitive form,
+       `[BITS 16]', `[BITS 32]' and `[BITS 64]'. The user-level form is a
+       macro which has no function other than to call the primitive form.
+
+       Note that the space is neccessary, e.g. `BITS32' will _not_ work!
+
+ 6.1.1 `USE16' & `USE32': Aliases for BITS
+
+       The ``USE16'' and ``USE32'' directives can be used in place of
+       ``BITS 16'' and ``BITS 32'', for compatibility with other
+       assemblers.
+
+   6.2 `DEFAULT': Change the assembler defaults
+
+       The `DEFAULT' directive changes the assembler defaults. Normally,
+       NASM defaults to a mode where the programmer is expected to
+       explicitly specify most features directly. However, this is
+       occationally obnoxious, as the explicit form is pretty much the only
+       one one wishes to use.
+
+       Currently, the only `DEFAULT' that is settable is whether or not
+       registerless instructions in 64-bit mode are `RIP'-relative or not.
+       By default, they are absolute unless overridden with the `REL'
+       specifier (see section 3.3). However, if `DEFAULT REL' is specified,
+       `REL' is default, unless overridden with the `ABS' specifier,
+       _except when used with an FS or GS segment override_.
+
+       The special handling of `FS' and `GS' overrides are due to the fact
+       that these registers are generally used as thread pointers or other
+       special functions in 64-bit mode, and generating `RIP'-relative
+       addresses would be extremely confusing.
+
+       `DEFAULT REL' is disabled with `DEFAULT ABS'.
+
+   6.3 `SECTION' or `SEGMENT': Changing and Defining Sections
+
+       The `SECTION' directive (`SEGMENT' is an exactly equivalent synonym)
+       changes which section of the output file the code you write will be
+       assembled into. In some object file formats, the number and names of
+       sections are fixed; in others, the user may make up as many as they
+       wish. Hence `SECTION' may sometimes give an error message, or may
+       define a new section, if you try to switch to a section that does
+       not (yet) exist.
+
+       The Unix object formats, and the `bin' object format (but see
+       section 7.1.3, all support the standardized section names `.text',
+       `.data' and `.bss' for the code, data and uninitialized-data
+       sections. The `obj' format, by contrast, does not recognize these
+       section names as being special, and indeed will strip off the
+       leading period of any section name that has one.
+
+ 6.3.1 The `__SECT__' Macro
+
+       The `SECTION' directive is unusual in that its user-level form
+       functions differently from its primitive form. The primitive form,
+       `[SECTION xyz]', simply switches the current target section to the
+       one given. The user-level form, `SECTION xyz', however, first
+       defines the single-line macro `__SECT__' to be the primitive
+       `[SECTION]' directive which it is about to issue, and then issues
+       it. So the user-level directive
+
+               SECTION .text
+
+       expands to the two lines
+
+       %define __SECT__        [SECTION .text] 
+               [SECTION .text]
+
+       Users may find it useful to make use of this in their own macros.
+       For example, the `writefile' macro defined in section 4.3.4 can be
+       usefully rewritten in the following more sophisticated form:
+
+       %macro  writefile 2+ 
+       
+               [section .data] 
+       
+         %%str:        db      %2 
+         %%endstr: 
+       
+               __SECT__ 
+       
+               mov     dx,%%str 
+               mov     cx,%%endstr-%%str 
+               mov     bx,%1 
+               mov     ah,0x40 
+               int     0x21 
+       
+       %endmacro
+
+       This form of the macro, once passed a string to output, first
+       switches temporarily to the data section of the file, using the
+       primitive form of the `SECTION' directive so as not to modify
+       `__SECT__'. It then declares its string in the data section, and
+       then invokes `__SECT__' to switch back to _whichever_ section the
+       user was previously working in. It thus avoids the need, in the
+       previous version of the macro, to include a `JMP' instruction to
+       jump over the data, and also does not fail if, in a complicated
+       `OBJ' format module, the user could potentially be assembling the
+       code in any of several separate code sections.
+
+   6.4 `ABSOLUTE': Defining Absolute Labels
+
+       The `ABSOLUTE' directive can be thought of as an alternative form of
+       `SECTION': it causes the subsequent code to be directed at no
+       physical section, but at the hypothetical section starting at the
+       given absolute address. The only instructions you can use in this
+       mode are the `RESB' family.
+
+       `ABSOLUTE' is used as follows:
+
+       absolute 0x1A 
+       
+           kbuf_chr    resw    1 
+           kbuf_free   resw    1 
+           kbuf        resw    16
+
+       This example describes a section of the PC BIOS data area, at
+       segment address 0x40: the above code defines `kbuf_chr' to be 0x1A,
+       `kbuf_free' to be 0x1C, and `kbuf' to be 0x1E.
+
+       The user-level form of `ABSOLUTE', like that of `SECTION', redefines
+       the `__SECT__' macro when it is invoked.
+
+       `STRUC' and `ENDSTRUC' are defined as macros which use `ABSOLUTE'
+       (and also `__SECT__').
+
+       `ABSOLUTE' doesn't have to take an absolute constant as an argument:
+       it can take an expression (actually, a critical expression: see
+       section 3.8) and it can be a value in a segment. For example, a TSR
+       can re-use its setup code as run-time BSS like this:
+
+               org     100h               ; it's a .COM program 
+       
+               jmp     setup              ; setup code comes last 
+       
+               ; the resident part of the TSR goes here 
+       setup: 
+               ; now write the code that installs the TSR here 
+       
+       absolute setup 
+       
+       runtimevar1     resw    1 
+       runtimevar2     resd    20 
+       
+       tsr_end:
+
+       This defines some variables `on top of' the setup code, so that
+       after the setup has finished running, the space it took up can be
+       re-used as data storage for the running TSR. The symbol `tsr_end'
+       can be used to calculate the total size of the part of the TSR that
+       needs to be made resident.
+
+   6.5 `EXTERN': Importing Symbols from Other Modules
+
+       `EXTERN' is similar to the MASM directive `EXTRN' and the C keyword
+       `extern': it is used to declare a symbol which is not defined
+       anywhere in the module being assembled, but is assumed to be defined
+       in some other module and needs to be referred to by this one. Not
+       every object-file format can support external variables: the `bin'
+       format cannot.
+
+       The `EXTERN' directive takes as many arguments as you like. Each
+       argument is the name of a symbol:
+
+       extern  _printf 
+       extern  _sscanf,_fscanf
+
+       Some object-file formats provide extra features to the `EXTERN'
+       directive. In all cases, the extra features are used by suffixing a
+       colon to the symbol name followed by object-format specific text.
+       For example, the `obj' format allows you to declare that the default
+       segment base of an external should be the group `dgroup' by means of
+       the directive
+
+       extern  _variable:wrt dgroup
+
+       The primitive form of `EXTERN' differs from the user-level form only
+       in that it can take only one argument at a time: the support for
+       multiple arguments is implemented at the preprocessor level.
+
+       You can declare the same variable as `EXTERN' more than once: NASM
+       will quietly ignore the second and later redeclarations. You can't
+       declare a variable as `EXTERN' as well as something else, though.
+
+   6.6 `GLOBAL': Exporting Symbols to Other Modules
+
+       `GLOBAL' is the other end of `EXTERN': if one module declares a
+       symbol as `EXTERN' and refers to it, then in order to prevent linker
+       errors, some other module must actually _define_ the symbol and
+       declare it as `GLOBAL'. Some assemblers use the name `PUBLIC' for
+       this purpose.
+
+       The `GLOBAL' directive applying to a symbol must appear _before_ the
+       definition of the symbol.
+
+       `GLOBAL' uses the same syntax as `EXTERN', except that it must refer
+       to symbols which _are_ defined in the same module as the `GLOBAL'
+       directive. For example:
+
+       global _main 
+       _main: 
+               ; some code
+
+       `GLOBAL', like `EXTERN', allows object formats to define private
+       extensions by means of a colon. The `elf' object format, for
+       example, lets you specify whether global data items are functions or
+       data:
+
+       global  hashlookup:function, hashtable:data
+
+       Like `EXTERN', the primitive form of `GLOBAL' differs from the user-
+       level form only in that it can take only one argument at a time.
+
+   6.7 `COMMON': Defining Common Data Areas
+
+       The `COMMON' directive is used to declare _common variables_. A
+       common variable is much like a global variable declared in the
+       uninitialized data section, so that
+
+       common  intvar  4
+
+       is similar in function to
+
+       global  intvar 
+       section .bss 
+       
+       intvar  resd    1
+
+       The difference is that if more than one module defines the same
+       common variable, then at link time those variables will be _merged_,
+       and references to `intvar' in all modules will point at the same
+       piece of memory.
+
+       Like `GLOBAL' and `EXTERN', `COMMON' supports object-format specific
+       extensions. For example, the `obj' format allows common variables to
+       be NEAR or FAR, and the `elf' format allows you to specify the
+       alignment requirements of a common variable:
+
+       common  commvar  4:near  ; works in OBJ 
+       common  intarray 100:4   ; works in ELF: 4 byte aligned
+
+       Once again, like `EXTERN' and `GLOBAL', the primitive form of
+       `COMMON' differs from the user-level form only in that it can take
+       only one argument at a time.
+
+   6.8 `CPU': Defining CPU Dependencies
+
+       The `CPU' directive restricts assembly to those instructions which
+       are available on the specified CPU.
+
+       Options are:
+
+       (*) `CPU 8086' Assemble only 8086 instruction set
+
+       (*) `CPU 186' Assemble instructions up to the 80186 instruction set
+
+       (*) `CPU 286' Assemble instructions up to the 286 instruction set
+
+       (*) `CPU 386' Assemble instructions up to the 386 instruction set
+
+       (*) `CPU 486' 486 instruction set
+
+       (*) `CPU 586' Pentium instruction set
+
+       (*) `CPU PENTIUM' Same as 586
+
+       (*) `CPU 686' P6 instruction set
+
+       (*) `CPU PPRO' Same as 686
+
+       (*) `CPU P2' Same as 686
+
+       (*) `CPU P3' Pentium III (Katmai) instruction sets
+
+       (*) `CPU KATMAI' Same as P3
+
+       (*) `CPU P4' Pentium 4 (Willamette) instruction set
+
+       (*) `CPU WILLAMETTE' Same as P4
+
+       (*) `CPU PRESCOTT' Prescott instruction set
+
+       (*) `CPU X64' x86-64 (x64/AMD64/Intel 64) instruction set
+
+       (*) `CPU IA64' IA64 CPU (in x86 mode) instruction set
+
+       All options are case insensitive. All instructions will be selected
+       only if they apply to the selected CPU or lower. By default, all
+       instructions are available.
+
+   6.9 `FLOAT': Handling of floating-point constants
+
+       By default, floating-point constants are rounded to nearest, and
+       IEEE denormals are supported. The following options can be set to
+       alter this behaviour:
+
+       (*) `FLOAT DAZ' Flush denormals to zero
+
+       (*) `FLOAT NODAZ' Do not flush denormals to zero (default)
+
+       (*) `FLOAT NEAR' Round to nearest (default)
+
+       (*) `FLOAT UP' Round up (toward +Infinity)
+
+       (*) `FLOAT DOWN' Round down (toward -Infinity)
+
+       (*) `FLOAT ZERO' Round toward zero
+
+       (*) `FLOAT DEFAULT' Restore default settings
+
+       The standard macros `__FLOAT_DAZ__', `__FLOAT_ROUND__', and
+       `__FLOAT__' contain the current state, as long as the programmer has
+       avoided the use of the brackeded primitive form, (`[FLOAT]').
+
+       `__FLOAT__' contains the full set of floating-point settings; this
+       value can be saved away and invoked later to restore the setting.
+
+Chapter 7: Output Formats
+-------------------------
+
+       NASM is a portable assembler, designed to be able to compile on any
+       ANSI C-supporting platform and produce output to run on a variety of
+       Intel x86 operating systems. For this reason, it has a large number
+       of available output formats, selected using the `-f' option on the
+       NASM command line. Each of these formats, along with its extensions
+       to the base NASM syntax, is detailed in this chapter.
+
+       As stated in section 2.1.1, NASM chooses a default name for your
+       output file based on the input file name and the chosen output
+       format. This will be generated by removing the extension (`.asm',
+       `.s', or whatever you like to use) from the input file name, and
+       substituting an extension defined by the output format. The
+       extensions are given with each format below.
+
+   7.1 `bin': Flat-Form Binary Output
+
+       The `bin' format does not produce object files: it generates nothing
+       in the output file except the code you wrote. Such `pure binary'
+       files are used by MS-DOS: `.COM' executables and `.SYS' device
+       drivers are pure binary files. Pure binary output is also useful for
+       operating system and boot loader development.
+
+       The `bin' format supports multiple section names. For details of how
+       NASM handles sections in the `bin' format, see section 7.1.3.
+
+       Using the `bin' format puts NASM by default into 16-bit mode (see
+       section 6.1). In order to use `bin' to write 32-bit or 64-bit code,
+       such as an OS kernel, you need to explicitly issue the `BITS 32' or
+       `BITS 64' directive.
+
+       `bin' has no default output file name extension: instead, it leaves
+       your file name as it is once the original extension has been
+       removed. Thus, the default is for NASM to assemble `binprog.asm'
+       into a binary file called `binprog'.
+
+ 7.1.1 `ORG': Binary File Program Origin
+
+       The `bin' format provides an additional directive to the list given
+       in chapter 6: `ORG'. The function of the `ORG' directive is to
+       specify the origin address which NASM will assume the program begins
+       at when it is loaded into memory.
+
+       For example, the following code will generate the longword
+       `0x00000104':
+
+               org     0x100 
+               dd      label 
+       label:
+
+       Unlike the `ORG' directive provided by MASM-compatible assemblers,
+       which allows you to jump around in the object file and overwrite
+       code you have already generated, NASM's `ORG' does exactly what the
+       directive says: _origin_. Its sole function is to specify one offset
+       which is added to all internal address references within the
+       section; it does not permit any of the trickery that MASM's version
+       does. See section 12.1.3 for further comments.
+
+ 7.1.2 `bin' Extensions to the `SECTION' Directive
+
+       The `bin' output format extends the `SECTION' (or `SEGMENT')
+       directive to allow you to specify the alignment requirements of
+       segments. This is done by appending the `ALIGN' qualifier to the end
+       of the section-definition line. For example,
+
+       section .data   align=16
+
+       switches to the section `.data' and also specifies that it must be
+       aligned on a 16-byte boundary.
+
+       The parameter to `ALIGN' specifies how many low bits of the section
+       start address must be forced to zero. The alignment value given may
+       be any power of two.
+
+ 7.1.3 Multisection Support for the `bin' Format
+
+       The `bin' format allows the use of multiple sections, of arbitrary
+       names, besides the "known" `.text', `.data', and `.bss' names.
+
+       (*) Sections may be designated `progbits' or `nobits'. Default is
+           `progbits' (except `.bss', which defaults to `nobits', of
+           course).
+
+       (*) Sections can be aligned at a specified boundary following the
+           previous section with `align=', or at an arbitrary byte-granular
+           position with `start='.
+
+       (*) Sections can be given a virtual start address, which will be
+           used for the calculation of all memory references within that
+           section with `vstart='.
+
+       (*) Sections can be ordered using `follows='`<section>' or
+           `vfollows='`<section>' as an alternative to specifying an
+           explicit start address.
+
+       (*) Arguments to `org', `start', `vstart', and `align=' are critical
+           expressions. See section 3.8. E.g. `align=(1 << ALIGN_SHIFT)' -
+           `ALIGN_SHIFT' must be defined before it is used here.
+
+       (*) Any code which comes before an explicit `SECTION' directive is
+           directed by default into the `.text' section.
+
+       (*) If an `ORG' statement is not given, `ORG 0' is used by default.
+
+       (*) The `.bss' section will be placed after the last `progbits'
+           section, unless `start=', `vstart=', `follows=', or `vfollows='
+           has been specified.
+
+       (*) All sections are aligned on dword boundaries, unless a different
+           alignment has been specified.
+
+       (*) Sections may not overlap.
+
+       (*) NASM creates the `section.<secname>.start' for each section,
+           which may be used in your code.
+
+ 7.1.4 Map Files
+
+       Map files can be generated in `-f bin' format by means of the
+       `[map]' option. Map types of `all' (default), `brief', `sections',
+       `segments', or `symbols' may be specified. Output may be directed to
+       `stdout' (default), `stderr', or a specified file. E.g.
+       `[map symbols myfile.map]'. No "user form" exists, the square
+       brackets must be used.
+
+   7.2 `ith': Intel Hex Output
+
+       The `ith' file format produces Intel hex-format files. Just as the
+       `bin' format, this is a flat memory image format with no support for
+       relocation or linking. It is usually used with ROM programmers and
+       similar utilities.
+
+       All extensions supported by the `bin' file format is also supported
+       by the `ith' file format.
+
+       `ith' provides a default output file-name extension of `.ith'.
+
+   7.3 `srec': Motorola S-Records Output
+
+       The `srec' file format produces Motorola S-records files. Just as
+       the `bin' format, this is a flat memory image format with no support
+       for relocation or linking. It is usually used with ROM programmers
+       and similar utilities.
+
+       All extensions supported by the `bin' file format is also supported
+       by the `srec' file format.
+
+       `srec' provides a default output file-name extension of `.srec'.
+
+   7.4 `obj': Microsoft OMF Object Files
+
+       The `obj' file format (NASM calls it `obj' rather than `omf' for
+       historical reasons) is the one produced by MASM and TASM, which is
+       typically fed to 16-bit DOS linkers to produce `.EXE' files. It is
+       also the format used by OS/2.
+
+       `obj' provides a default output file-name extension of `.obj'.
+
+       `obj' is not exclusively a 16-bit format, though: NASM has full
+       support for the 32-bit extensions to the format. In particular, 32-
+       bit `obj' format files are used by Borland's Win32 compilers,
+       instead of using Microsoft's newer `win32' object file format.
+
+       The `obj' format does not define any special segment names: you can
+       call your segments anything you like. Typical names for segments in
+       `obj' format files are `CODE', `DATA' and `BSS'.
+
+       If your source file contains code before specifying an explicit
+       `SEGMENT' directive, then NASM will invent its own segment called
+       `__NASMDEFSEG' for you.
+
+       When you define a segment in an `obj' file, NASM defines the segment
+       name as a symbol as well, so that you can access the segment address
+       of the segment. So, for example:
+
+       segment data 
+       
+       dvar:   dw      1234 
+       
+       segment code 
+       
+       function: 
+               mov     ax,data         ; get segment address of data 
+               mov     ds,ax           ; and move it into DS 
+               inc     word [dvar]     ; now this reference will work 
+               ret
+
+       The `obj' format also enables the use of the `SEG' and `WRT'
+       operators, so that you can write code which does things like
+
+       extern  foo 
+       
+             mov   ax,seg foo            ; get preferred segment of foo 
+             mov   ds,ax 
+             mov   ax,data               ; a different segment 
+             mov   es,ax 
+             mov   ax,[ds:foo]           ; this accesses `foo' 
+             mov   [es:foo wrt data],bx  ; so does this
+
+ 7.4.1 `obj' Extensions to the `SEGMENT' Directive
+
+       The `obj' output format extends the `SEGMENT' (or `SECTION')
+       directive to allow you to specify various properties of the segment
+       you are defining. This is done by appending extra qualifiers to the
+       end of the segment-definition line. For example,
+
+       segment code private align=16
+
+       defines the segment `code', but also declares it to be a private
+       segment, and requires that the portion of it described in this code
+       module must be aligned on a 16-byte boundary.
+
+       The available qualifiers are:
+
+       (*) `PRIVATE', `PUBLIC', `COMMON' and `STACK' specify the
+           combination characteristics of the segment. `PRIVATE' segments
+           do not get combined with any others by the linker; `PUBLIC' and
+           `STACK' segments get concatenated together at link time; and
+           `COMMON' segments all get overlaid on top of each other rather
+           than stuck end-to-end.
+
+       (*) `ALIGN' is used, as shown above, to specify how many low bits of
+           the segment start address must be forced to zero. The alignment
+           value given may be any power of two from 1 to 4096; in reality,
+           the only values supported are 1, 2, 4, 16, 256 and 4096, so if 8
+           is specified it will be rounded up to 16, and 32, 64 and 128
+           will all be rounded up to 256, and so on. Note that alignment to
+           4096-byte boundaries is a PharLap extension to the format and
+           may not be supported by all linkers.
+
+       (*) `CLASS' can be used to specify the segment class; this feature
+           indicates to the linker that segments of the same class should
+           be placed near each other in the output file. The class name can
+           be any word, e.g. `CLASS=CODE'.
+
+       (*) `OVERLAY', like `CLASS', is specified with an arbitrary word as
+           an argument, and provides overlay information to an overlay-
+           capable linker.
+
+       (*) Segments can be declared as `USE16' or `USE32', which has the
+           effect of recording the choice in the object file and also
+           ensuring that NASM's default assembly mode when assembling in
+           that segment is 16-bit or 32-bit respectively.
+
+       (*) When writing OS/2 object files, you should declare 32-bit
+           segments as `FLAT', which causes the default segment base for
+           anything in the segment to be the special group `FLAT', and also
+           defines the group if it is not already defined.
+
+       (*) The `obj' file format also allows segments to be declared as
+           having a pre-defined absolute segment address, although no
+           linkers are currently known to make sensible use of this
+           feature; nevertheless, NASM allows you to declare a segment such
+           as `SEGMENT SCREEN ABSOLUTE=0xB800' if you need to. The
+           `ABSOLUTE' and `ALIGN' keywords are mutually exclusive.
+
+       NASM's default segment attributes are `PUBLIC', `ALIGN=1', no class,
+       no overlay, and `USE16'.
+
+ 7.4.2 `GROUP': Defining Groups of Segments
+
+       The `obj' format also allows segments to be grouped, so that a
+       single segment register can be used to refer to all the segments in
+       a group. NASM therefore supplies the `GROUP' directive, whereby you
+       can code
+
+       segment data 
+       
+               ; some data 
+       
+       segment bss 
+       
+               ; some uninitialized data 
+       
+       group dgroup data bss
+
+       which will define a group called `dgroup' to contain the segments
+       `data' and `bss'. Like `SEGMENT', `GROUP' causes the group name to
+       be defined as a symbol, so that you can refer to a variable `var' in
+       the `data' segment as `var wrt data' or as `var wrt dgroup',
+       depending on which segment value is currently in your segment
+       register.
+
+       If you just refer to `var', however, and `var' is declared in a
+       segment which is part of a group, then NASM will default to giving
+       you the offset of `var' from the beginning of the _group_, not the
+       _segment_. Therefore `SEG var', also, will return the group base
+       rather than the segment base.
+
+       NASM will allow a segment to be part of more than one group, but
+       will generate a warning if you do this. Variables declared in a
+       segment which is part of more than one group will default to being
+       relative to the first group that was defined to contain the segment.
+
+       A group does not have to contain any segments; you can still make
+       `WRT' references to a group which does not contain the variable you
+       are referring to. OS/2, for example, defines the special group
+       `FLAT' with no segments in it.
+
+ 7.4.3 `UPPERCASE': Disabling Case Sensitivity in Output
+
+       Although NASM itself is case sensitive, some OMF linkers are not;
+       therefore it can be useful for NASM to output single-case object
+       files. The `UPPERCASE' format-specific directive causes all segment,
+       group and symbol names that are written to the object file to be
+       forced to upper case just before being written. Within a source
+       file, NASM is still case-sensitive; but the object file can be
+       written entirely in upper case if desired.
+
+       `UPPERCASE' is used alone on a line; it requires no parameters.
+
+ 7.4.4 `IMPORT': Importing DLL Symbols
+
+       The `IMPORT' format-specific directive defines a symbol to be
+       imported from a DLL, for use if you are writing a DLL's import
+       library in NASM. You still need to declare the symbol as `EXTERN' as
+       well as using the `IMPORT' directive.
+
+       The `IMPORT' directive takes two required parameters, separated by
+       white space, which are (respectively) the name of the symbol you
+       wish to import and the name of the library you wish to import it
+       from. For example:
+
+           import  WSAStartup wsock32.dll
+
+       A third optional parameter gives the name by which the symbol is
+       known in the library you are importing it from, in case this is not
+       the same as the name you wish the symbol to be known by to your code
+       once you have imported it. For example:
+
+           import  asyncsel wsock32.dll WSAAsyncSelect
+
+ 7.4.5 `EXPORT': Exporting DLL Symbols
+
+       The `EXPORT' format-specific directive defines a global symbol to be
+       exported as a DLL symbol, for use if you are writing a DLL in NASM.
+       You still need to declare the symbol as `GLOBAL' as well as using
+       the `EXPORT' directive.
+
+       `EXPORT' takes one required parameter, which is the name of the
+       symbol you wish to export, as it was defined in your source file. An
+       optional second parameter (separated by white space from the first)
+       gives the _external_ name of the symbol: the name by which you wish
+       the symbol to be known to programs using the DLL. If this name is
+       the same as the internal name, you may leave the second parameter
+       off.
+
+       Further parameters can be given to define attributes of the exported
+       symbol. These parameters, like the second, are separated by white
+       space. If further parameters are given, the external name must also
+       be specified, even if it is the same as the internal name. The
+       available attributes are:
+
+       (*) `resident' indicates that the exported name is to be kept
+           resident by the system loader. This is an optimisation for
+           frequently used symbols imported by name.
+
+       (*) `nodata' indicates that the exported symbol is a function which
+           does not make use of any initialized data.
+
+       (*) `parm=NNN', where `NNN' is an integer, sets the number of
+           parameter words for the case in which the symbol is a call gate
+           between 32-bit and 16-bit segments.
+
+       (*) An attribute which is just a number indicates that the symbol
+           should be exported with an identifying number (ordinal), and
+           gives the desired number.
+
+       For example:
+
+           export  myfunc 
+           export  myfunc TheRealMoreFormalLookingFunctionName 
+           export  myfunc myfunc 1234  ; export by ordinal 
+           export  myfunc myfunc resident parm=23 nodata
+
+ 7.4.6 `..start': Defining the Program Entry Point
+
+       `OMF' linkers require exactly one of the object files being linked
+       to define the program entry point, where execution will begin when
+       the program is run. If the object file that defines the entry point
+       is assembled using NASM, you specify the entry point by declaring
+       the special symbol `..start' at the point where you wish execution
+       to begin.
+
+ 7.4.7 `obj' Extensions to the `EXTERN' Directive
+
+       If you declare an external symbol with the directive
+
+           extern  foo
+
+       then references such as `mov ax,foo' will give you the offset of
+       `foo' from its preferred segment base (as specified in whichever
+       module `foo' is actually defined in). So to access the contents of
+       `foo' you will usually need to do something like
+
+               mov     ax,seg foo      ; get preferred segment base 
+               mov     es,ax           ; move it into ES 
+               mov     ax,[es:foo]     ; and use offset `foo' from it
+
+       This is a little unwieldy, particularly if you know that an external
+       is going to be accessible from a given segment or group, say
+       `dgroup'. So if `DS' already contained `dgroup', you could simply
+       code
+
+               mov     ax,[foo wrt dgroup]
+
+       However, having to type this every time you want to access `foo' can
+       be a pain; so NASM allows you to declare `foo' in the alternative
+       form
+
+           extern  foo:wrt dgroup
+
+       This form causes NASM to pretend that the preferred segment base of
+       `foo' is in fact `dgroup'; so the expression `seg foo' will now
+       return `dgroup', and the expression `foo' is equivalent to
+       `foo wrt dgroup'.
+
+       This default-`WRT' mechanism can be used to make externals appear to
+       be relative to any group or segment in your program. It can also be
+       applied to common variables: see section 7.4.8.
+
+ 7.4.8 `obj' Extensions to the `COMMON' Directive
+
+       The `obj' format allows common variables to be either near or far;
+       NASM allows you to specify which your variables should be by the use
+       of the syntax
+
+       common  nearvar 2:near   ; `nearvar' is a near common 
+       common  farvar  10:far   ; and `farvar' is far
+
+       Far common variables may be greater in size than 64Kb, and so the
+       OMF specification says that they are declared as a number of
+       _elements_ of a given size. So a 10-byte far common variable could
+       be declared as ten one-byte elements, five two-byte elements, two
+       five-byte elements or one ten-byte element.
+
+       Some `OMF' linkers require the element size, as well as the variable
+       size, to match when resolving common variables declared in more than
+       one module. Therefore NASM must allow you to specify the element
+       size on your far common variables. This is done by the following
+       syntax:
+
+       common  c_5by2  10:far 5        ; two five-byte elements 
+       common  c_2by5  10:far 2        ; five two-byte elements
+
+       If no element size is specified, the default is 1. Also, the `FAR'
+       keyword is not required when an element size is specified, since
+       only far commons may have element sizes at all. So the above
+       declarations could equivalently be
+
+       common  c_5by2  10:5            ; two five-byte elements 
+       common  c_2by5  10:2            ; five two-byte elements
+
+       In addition to these extensions, the `COMMON' directive in `obj'
+       also supports default-`WRT' specification like `EXTERN' does
+       (explained in section 7.4.7). So you can also declare things like
+
+       common  foo     10:wrt dgroup 
+       common  bar     16:far 2:wrt data 
+       common  baz     24:wrt data:6
+
+   7.5 `win32': Microsoft Win32 Object Files
+
+       The `win32' output format generates Microsoft Win32 object files,
+       suitable for passing to Microsoft linkers such as Visual C++. Note
+       that Borland Win32 compilers do not use this format, but use `obj'
+       instead (see section 7.4).
+
+       `win32' provides a default output file-name extension of `.obj'.
+
+       Note that although Microsoft say that Win32 object files follow the
+       `COFF' (Common Object File Format) standard, the object files
+       produced by Microsoft Win32 compilers are not compatible with COFF
+       linkers such as DJGPP's, and vice versa. This is due to a difference
+       of opinion over the precise semantics of PC-relative relocations. To
+       produce COFF files suitable for DJGPP, use NASM's `coff' output
+       format; conversely, the `coff' format does not produce object files
+       that Win32 linkers can generate correct output from.
+
+ 7.5.1 `win32' Extensions to the `SECTION' Directive
+
+       Like the `obj' format, `win32' allows you to specify additional
+       information on the `SECTION' directive line, to control the type and
+       properties of sections you declare. Section types and properties are
+       generated automatically by NASM for the standard section names
+       `.text', `.data' and `.bss', but may still be overridden by these
+       qualifiers.
+
+       The available qualifiers are:
+
+       (*) `code', or equivalently `text', defines the section to be a code
+           section. This marks the section as readable and executable, but
+           not writable, and also indicates to the linker that the type of
+           the section is code.
+
+       (*) `data' and `bss' define the section to be a data section,
+           analogously to `code'. Data sections are marked as readable and
+           writable, but not executable. `data' declares an initialized
+           data section, whereas `bss' declares an uninitialized data
+           section.
+
+       (*) `rdata' declares an initialized data section that is readable
+           but not writable. Microsoft compilers use this section to place
+           constants in it.
+
+       (*) `info' defines the section to be an informational section, which
+           is not included in the executable file by the linker, but may
+           (for example) pass information _to_ the linker. For example,
+           declaring an `info'-type section called `.drectve' causes the
+           linker to interpret the contents of the section as command-line
+           options.
+
+       (*) `align=', used with a trailing number as in `obj', gives the
+           alignment requirements of the section. The maximum you may
+           specify is 64: the Win32 object file format contains no means to
+           request a greater section alignment than this. If alignment is
+           not explicitly specified, the defaults are 16-byte alignment for
+           code sections, 8-byte alignment for rdata sections and 4-byte
+           alignment for data (and BSS) sections. Informational sections
+           get a default alignment of 1 byte (no alignment), though the
+           value does not matter.
+
+       The defaults assumed by NASM if you do not specify the above
+       qualifiers are:
+
+       section .text    code  align=16 
+       section .data    data  align=4 
+       section .rdata   rdata align=8 
+       section .bss     bss   align=4
+
+       Any other section name is treated by default like `.text'.
+
+ 7.5.2 `win32': Safe Structured Exception Handling
+
+       Among other improvements in Windows XP SP2 and Windows Server 2003
+       Microsoft has introduced concept of "safe structured exception
+       handling." General idea is to collect handlers' entry points in
+       designated read-only table and have alleged entry point verified
+       against this table prior exception control is passed to the handler.
+       In order for an executable module to be equipped with such "safe
+       exception handler table," all object modules on linker command line
+       has to comply with certain criteria. If one single module among them
+       does not, then the table in question is omitted and above mentioned
+       run-time checks will not be performed for application in question.
+       Table omission is by default silent and therefore can be easily
+       overlooked. One can instruct linker to refuse to produce binary
+       without such table by passing `/safeseh' command line option.
+
+       Without regard to this run-time check merits it's natural to expect
+       NASM to be capable of generating modules suitable for `/safeseh'
+       linking. From developer's viewpoint the problem is two-fold:
+
+       (*) how to adapt modules not deploying exception handlers of their
+           own;
+
+       (*) how to adapt/develop modules utilizing custom exception
+           handling;
+
+       Former can be easily achieved with any NASM version by adding
+       following line to source code:
+
+       $@feat.00 equ 1
+
+       As of version 2.03 NASM adds this absolute symbol automatically. If
+       it's not already present to be precise. I.e. if for whatever reason
+       developer would choose to assign another value in source file, it
+       would still be perfectly possible.
+
+       Registering custom exception handler on the other hand requires
+       certain "magic." As of version 2.03 additional directive is
+       implemented, `safeseh', which instructs the assembler to produce
+       appropriately formatted input data for above mentioned "safe
+       exception handler table." Its typical use would be:
+
+       section .text 
+       extern  _MessageBoxA@16 
+       %if     __NASM_VERSION_ID__ >= 0x02030000 
+       safeseh handler         ; register handler as "safe handler" 
+       %endif 
+       handler: 
+               push    DWORD 1 ; MB_OKCANCEL 
+               push    DWORD caption 
+               push    DWORD text 
+               push    DWORD 0 
+               call    _MessageBoxA@16 
+               sub     eax,1   ; incidentally suits as return value 
+                               ; for exception handler 
+               ret 
+       global  _main 
+       _main: 
+               push    DWORD handler 
+               push    DWORD [fs:0] 
+               mov     DWORD [fs:0],esp ; engage exception handler 
+               xor     eax,eax 
+               mov     eax,DWORD[eax]   ; cause exception 
+               pop     DWORD [fs:0]     ; disengage exception handler 
+               add     esp,4 
+               ret 
+       text:   db      'OK to rethrow, CANCEL to generate core dump',0 
+       caption:db      'SEGV',0 
+       
+       section .drectve info 
+               db      '/defaultlib:user32.lib /defaultlib:msvcrt.lib '
+
+       As you might imagine, it's perfectly possible to produce .exe binary
+       with "safe exception handler table" and yet engage unregistered
+       exception handler. Indeed, handler is engaged by simply manipulating
+       `[fs:0]' location at run-time, something linker has no power over,
+       run-time that is. It should be explicitly mentioned that such
+       failure to register handler's entry point with `safeseh' directive
+       has undesired side effect at run-time. If exception is raised and
+       unregistered handler is to be executed, the application is abruptly
+       terminated without any notification whatsoever. One can argue that
+       system could at least have logged some kind "non-safe exception
+       handler in x.exe at address n" message in event log, but no,
+       literally no notification is provided and user is left with no clue
+       on what caused application failure.
+
+       Finally, all mentions of linker in this paragraph refer to Microsoft
+       linker version 7.x and later. Presence of `@feat.00' symbol and
+       input data for "safe exception handler table" causes no backward
+       incompatibilities and "safeseh" modules generated by NASM 2.03 and
+       later can still be linked by earlier versions or non-Microsoft
+       linkers.
+
+   7.6 `win64': Microsoft Win64 Object Files
+
+       The `win64' output format generates Microsoft Win64 object files,
+       which is nearly 100% identical to the `win32' object format (section
+       7.5) with the exception that it is meant to target 64-bit code and
+       the x86-64 platform altogether. This object file is used exactly the
+       same as the `win32' object format (section 7.5), in NASM, with
+       regard to this exception.
+
+ 7.6.1 `win64': Writing Position-Independent Code
+
+       While `REL' takes good care of RIP-relative addressing, there is one
+       aspect that is easy to overlook for a Win64 programmer: indirect
+       references. Consider a switch dispatch table:
+
+               jmp     QWORD[dsptch+rax*8] 
+               ... 
+       dsptch: dq      case0 
+               dq      case1 
+               ...
+
+       Even novice Win64 assembler programmer will soon realize that the
+       code is not 64-bit savvy. Most notably linker will refuse to link it
+       with
+       "`'ADDR32' relocation to '.text' invalid without /LARGEADDRESSAWARE:NO'".
+       So [s]he will have to split jmp instruction as following:
+
+               lea     rbx,[rel dsptch] 
+               jmp     QWORD[rbx+rax*8]
+
+       What happens behind the scene is that effective address in `lea' is
+       encoded relative to instruction pointer, or in perfectly position-
+       independent manner. But this is only part of the problem! Trouble is
+       that in .dll context `caseN' relocations will make their way to the
+       final module and might have to be adjusted at .dll load time. To be
+       specific when it can't be loaded at preferred address. And when this
+       occurs, pages with such relocations will be rendered private to
+       current process, which kind of undermines the idea of sharing .dll.
+       But no worry, it's trivial to fix:
+
+               lea     rbx,[rel dsptch] 
+               add     rbx,QWORD[rbx+rax*8] 
+               jmp     rbx 
+               ... 
+       dsptch: dq      case0-dsptch 
+               dq      case1-dsptch 
+               ...
+
+       NASM version 2.03 and later provides another alternative,
+       `wrt ..imagebase' operator, which returns offset from base address
+       of the current image, be it .exe or .dll module, therefore the name.
+       For those acquainted with PE-COFF format base address denotes start
+       of `IMAGE_DOS_HEADER' structure. Here is how to implement switch
+       with these image-relative references:
+
+               lea     rbx,[rel dsptch] 
+               mov     eax,DWORD[rbx+rax*4] 
+               sub     rbx,dsptch wrt ..imagebase 
+               add     rbx,rax 
+               jmp     rbx 
+               ... 
+       dsptch: dd      case0 wrt ..imagebase 
+               dd      case1 wrt ..imagebase
+
+       One can argue that the operator is redundant. Indeed, snippet before
+       last works just fine with any NASM version and is not even Windows
+       specific... The real reason for implementing `wrt ..imagebase' will
+       become apparent in next paragraph.
+
+       It should be noted that `wrt ..imagebase' is defined as 32-bit
+       operand only:
+
+               dd      label wrt ..imagebase           ; ok 
+               dq      label wrt ..imagebase           ; bad 
+               mov     eax,label wrt ..imagebase       ; ok 
+               mov     rax,label wrt ..imagebase       ; bad
+
+ 7.6.2 `win64': Structured Exception Handling
+
+       Structured exception handing in Win64 is completely different matter
+       from Win32. Upon exception program counter value is noted, and
+       linker-generated table comprising start and end addresses of all the
+       functions [in given executable module] is traversed and compared to
+       the saved program counter. Thus so called `UNWIND_INFO' structure is
+       identified. If it's not found, then offending subroutine is assumed
+       to be "leaf" and just mentioned lookup procedure is attempted for
+       its caller. In Win64 leaf function is such function that does not
+       call any other function _nor_ modifies any Win64 non-volatile
+       registers, including stack pointer. The latter ensures that it's
+       possible to identify leaf function's caller by simply pulling the
+       value from the top of the stack.
+
+       While majority of subroutines written in assembler are not calling
+       any other function, requirement for non-volatile registers'
+       immutability leaves developer with not more than 7 registers and no
+       stack frame, which is not necessarily what [s]he counted with.
+       Customarily one would meet the requirement by saving non-volatile
+       registers on stack and restoring them upon return, so what can go
+       wrong? If [and only if] an exception is raised at run-time and no
+       `UNWIND_INFO' structure is associated with such "leaf" function, the
+       stack unwind procedure will expect to find caller's return address
+       on the top of stack immediately followed by its frame. Given that
+       developer pushed caller's non-volatile registers on stack, would the
+       value on top point at some code segment or even addressable space?
+       Well, developer can attempt copying caller's return address to the
+       top of stack and this would actually work in some very specific
+       circumstances. But unless developer can guarantee that these
+       circumstances are always met, it's more appropriate to assume worst
+       case scenario, i.e. stack unwind procedure going berserk. Relevant
+       question is what happens then? Application is abruptly terminated
+       without any notification whatsoever. Just like in Win32 case, one
+       can argue that system could at least have logged "unwind procedure
+       went berserk in x.exe at address n" in event log, but no, no trace
+       of failure is left.
+
+       Now, when we understand significance of the `UNWIND_INFO' structure,
+       let's discuss what's in it and/or how it's processed. First of all
+       it is checked for presence of reference to custom language-specific
+       exception handler. If there is one, then it's invoked. Depending on
+       the return value, execution flow is resumed (exception is said to be
+       "handled"), _or_ rest of `UNWIND_INFO' structure is processed as
+       following. Beside optional reference to custom handler, it carries
+       information about current callee's stack frame and where non-
+       volatile registers are saved. Information is detailed enough to be
+       able to reconstruct contents of caller's non-volatile registers upon
+       call to current callee. And so caller's context is reconstructed,
+       and then unwind procedure is repeated, i.e. another `UNWIND_INFO'
+       structure is associated, this time, with caller's instruction
+       pointer, which is then checked for presence of reference to
+       language-specific handler, etc. The procedure is recursively
+       repeated till exception is handled. As last resort system "handles"
+       it by generating memory core dump and terminating the application.
+
+       As for the moment of this writing NASM unfortunately does not
+       facilitate generation of above mentioned detailed information about
+       stack frame layout. But as of version 2.03 it implements building
+       blocks for generating structures involved in stack unwinding. As
+       simplest example, here is how to deploy custom exception handler for
+       leaf function:
+
+       default rel 
+       section .text 
+       extern  MessageBoxA 
+       handler: 
+               sub     rsp,40 
+               mov     rcx,0 
+               lea     rdx,[text] 
+               lea     r8,[caption] 
+               mov     r9,1    ; MB_OKCANCEL 
+               call    MessageBoxA 
+               sub     eax,1   ; incidentally suits as return value 
+                               ; for exception handler 
+               add     rsp,40 
+               ret 
+       global  main 
+       main: 
+               xor     rax,rax 
+               mov     rax,QWORD[rax]  ; cause exception 
+               ret 
+       main_end: 
+       text:   db      'OK to rethrow, CANCEL to generate core dump',0 
+       caption:db      'SEGV',0 
+       
+       section .pdata  rdata align=4 
+               dd      main wrt ..imagebase 
+               dd      main_end wrt ..imagebase 
+               dd      xmain wrt ..imagebase 
+       section .xdata  rdata align=8 
+       xmain:  db      9,0,0,0 
+               dd      handler wrt ..imagebase 
+       section .drectve info 
+               db      '/defaultlib:user32.lib /defaultlib:msvcrt.lib '
+
+       What you see in `.pdata' section is element of the "table comprising
+       start and end addresses of function" along with reference to
+       associated `UNWIND_INFO' structure. And what you see in `.xdata'
+       section is `UNWIND_INFO' structure describing function with no
+       frame, but with designated exception handler. References are
+       _required_ to be image-relative (which is the real reason for
+       implementing `wrt ..imagebase' operator). It should be noted that
+       `rdata align=n', as well as `wrt ..imagebase', are optional in these
+       two segments' contexts, i.e. can be omitted. Latter means that _all_
+       32-bit references, not only above listed required ones, placed into
+       these two segments turn out image-relative. Why is it important to
+       understand? Developer is allowed to append handler-specific data to
+       `UNWIND_INFO' structure, and if [s]he adds a 32-bit reference, then
+       [s]he will have to remember to adjust its value to obtain the real
+       pointer.
+
+       As already mentioned, in Win64 terms leaf function is one that does
+       not call any other function _nor_ modifies any non-volatile
+       register, including stack pointer. But it's not uncommon that
+       assembler programmer plans to utilize every single register and
+       sometimes even have variable stack frame. Is there anything one can
+       do with bare building blocks? I.e. besides manually composing fully-
+       fledged `UNWIND_INFO' structure, which would surely be considered
+       error-prone? Yes, there is. Recall that exception handler is called
+       first, before stack layout is analyzed. As it turned out, it's
+       perfectly possible to manipulate current callee's context in custom
+       handler in manner that permits further stack unwinding. General idea
+       is that handler would not actually "handle" the exception, but
+       instead restore callee's context, as it was at its entry point and
+       thus mimic leaf function. In other words, handler would simply
+       undertake part of unwinding procedure. Consider following example:
+
+       function: 
+               mov     rax,rsp         ; copy rsp to volatile register 
+               push    r15             ; save non-volatile registers 
+               push    rbx 
+               push    rbp 
+               mov     r11,rsp         ; prepare variable stack frame 
+               sub     r11,rcx 
+               and     r11,-64 
+               mov     QWORD[r11],rax  ; check for exceptions 
+               mov     rsp,r11         ; allocate stack frame 
+               mov     QWORD[rsp],rax  ; save original rsp value 
+       magic_point: 
+               ... 
+               mov     r11,QWORD[rsp]  ; pull original rsp value 
+               mov     rbp,QWORD[r11-24] 
+               mov     rbx,QWORD[r11-16] 
+               mov     r15,QWORD[r11-8] 
+               mov     rsp,r11         ; destroy frame 
+               ret
+
+       The keyword is that up to `magic_point' original `rsp' value remains
+       in chosen volatile register and no non-volatile register, except for
+       `rsp', is modified. While past `magic_point' `rsp' remains constant
+       till the very end of the `function'. In this case custom language-
+       specific exception handler would look like this:
+
+       EXCEPTION_DISPOSITION handler (EXCEPTION_RECORD *rec,ULONG64 frame, 
+               CONTEXT *context,DISPATCHER_CONTEXT *disp) 
+       {   ULONG64 *rsp; 
+           if (context->Rip<(ULONG64)magic_point) 
+               rsp = (ULONG64 *)context->Rax; 
+           else 
+           {   rsp = ((ULONG64 **)context->Rsp)[0]; 
+               context->Rbp = rsp[-3]; 
+               context->Rbx = rsp[-2]; 
+               context->R15 = rsp[-1]; 
+           } 
+           context->Rsp = (ULONG64)rsp; 
+       
+           memcpy (disp->ContextRecord,context,sizeof(CONTEXT)); 
+           RtlVirtualUnwind(UNW_FLAG_NHANDLER,disp->ImageBase, 
+               dips->ControlPc,disp->FunctionEntry,disp->ContextRecord, 
+               &disp->HandlerData,&disp->EstablisherFrame,NULL); 
+           return ExceptionContinueSearch; 
+       }
+
+       As custom handler mimics leaf function, corresponding `UNWIND_INFO'
+       structure does not have to contain any information about stack frame
+       and its layout.
+
+   7.7 `coff': Common Object File Format
+
+       The `coff' output type produces `COFF' object files suitable for
+       linking with the DJGPP linker.
+
+       `coff' provides a default output file-name extension of `.o'.
+
+       The `coff' format supports the same extensions to the `SECTION'
+       directive as `win32' does, except that the `align' qualifier and the
+       `info' section type are not supported.
+
+   7.8 `macho32' and `macho64': Mach Object File Format
+
+       The `macho32' and `macho64' output formts produces `Mach-O' object
+       files suitable for linking with the MacOS X linker. `macho' is a
+       synonym for `macho32'.
+
+       `macho' provides a default output file-name extension of `.o'.
+
+   7.9 `elf32' and `elf64': Executable and Linkable Format Object Files
+
+       The `elf32' and `elf64' output formats generate `ELF32 and ELF64'
+       (Executable and Linkable Format) object files, as used by Linux as
+       well as Unix System V, including Solaris x86, UnixWare and SCO Unix.
+       `elf' provides a default output file-name extension of `.o'. `elf'
+       is a synonym for `elf32'.
+
+ 7.9.1 ELF specific directive `osabi'
+
+       The ELF header specifies the application binary interface for the
+       target operating system (OSABI). This field can be set by using the
+       `osabi' directive with the numeric value (0-255) of the target
+       system. If this directive is not used, the default value will be
+       "UNIX System V ABI" (0) which will work on most systems which
+       support ELF.
+
+ 7.9.2 `elf' Extensions to the `SECTION' Directive
+
+       Like the `obj' format, `elf' allows you to specify additional
+       information on the `SECTION' directive line, to control the type and
+       properties of sections you declare. Section types and properties are
+       generated automatically by NASM for the standard section names, but
+       may still be overridden by these qualifiers.
+
+       The available qualifiers are:
+
+       (*) `alloc' defines the section to be one which is loaded into
+           memory when the program is run. `noalloc' defines it to be one
+           which is not, such as an informational or comment section.
+
+       (*) `exec' defines the section to be one which should have execute
+           permission when the program is run. `noexec' defines it as one
+           which should not.
+
+       (*) `write' defines the section to be one which should be writable
+           when the program is run. `nowrite' defines it as one which
+           should not.
+
+       (*) `progbits' defines the section to be one with explicit contents
+           stored in the object file: an ordinary code or data section, for
+           example, `nobits' defines the section to be one with no explicit
+           contents given, such as a BSS section.
+
+       (*) `align=', used with a trailing number as in `obj', gives the
+           alignment requirements of the section.
+
+       (*) `tls' defines the section to be one which contains thread local
+           variables.
+
+       The defaults assumed by NASM if you do not specify the above
+       qualifiers are:
+
+
+       section .text    progbits  alloc   exec    nowrite  align=16 
+       section .rodata  progbits  alloc   noexec  nowrite  align=4 
+       section .lrodata progbits  alloc   noexec  nowrite  align=4 
+       section .data    progbits  alloc   noexec  write    align=4 
+       section .ldata   progbits  alloc   noexec  write    align=4 
+       section .bss     nobits    alloc   noexec  write    align=4 
+       section .lbss    nobits    alloc   noexec  write    align=4 
+       section .tdata   progbits  alloc   noexec  write    align=4    tls 
+       section .tbss    nobits    alloc   noexec  write    align=4    tls 
+       section .comment progbits  noalloc noexec  nowrite  align=1 
+       section other    progbits  alloc   noexec  nowrite  align=1
+
+       (Any section name other than those in the above table is treated by
+       default like `other' in the above table. Please note that section
+       names are case sensitive.)
+
+ 7.9.3 Position-Independent Code: `elf' Special Symbols and `WRT'
+
+       The `ELF' specification contains enough features to allow position-
+       independent code (PIC) to be written, which makes ELF shared
+       libraries very flexible. However, it also means NASM has to be able
+       to generate a variety of ELF specific relocation types in ELF object
+       files, if it is to be an assembler which can write PIC.
+
+       Since `ELF' does not support segment-base references, the `WRT'
+       operator is not used for its normal purpose; therefore NASM's `elf'
+       output format makes use of `WRT' for a different purpose, namely the
+       PIC-specific relocation types.
+
+       `elf' defines five special symbols which you can use as the right-
+       hand side of the `WRT' operator to obtain PIC relocation types. They
+       are `..gotpc', `..gotoff', `..got', `..plt' and `..sym'. Their
+       functions are summarized here:
+
+       (*) Referring to the symbol marking the global offset table base
+           using `wrt ..gotpc' will end up giving the distance from the
+           beginning of the current section to the global offset table.
+           (`_GLOBAL_OFFSET_TABLE_' is the standard symbol name used to
+           refer to the GOT.) So you would then need to add `$$' to the
+           result to get the real address of the GOT.
+
+       (*) Referring to a location in one of your own sections using
+           `wrt ..gotoff' will give the distance from the beginning of the
+           GOT to the specified location, so that adding on the address of
+           the GOT would give the real address of the location you wanted.
+
+       (*) Referring to an external or global symbol using `wrt ..got'
+           causes the linker to build an entry _in_ the GOT containing the
+           address of the symbol, and the reference gives the distance from
+           the beginning of the GOT to the entry; so you can add on the
+           address of the GOT, load from the resulting address, and end up
+           with the address of the symbol.
+
+       (*) Referring to a procedure name using `wrt ..plt' causes the
+           linker to build a procedure linkage table entry for the symbol,
+           and the reference gives the address of the PLT entry. You can
+           only use this in contexts which would generate a PC-relative
+           relocation normally (i.e. as the destination for `CALL' or
+           `JMP'), since ELF contains no relocation type to refer to PLT
+           entries absolutely.
+
+       (*) Referring to a symbol name using `wrt ..sym' causes NASM to
+           write an ordinary relocation, but instead of making the
+           relocation relative to the start of the section and then adding
+           on the offset to the symbol, it will write a relocation record
+           aimed directly at the symbol in question. The distinction is a
+           necessary one due to a peculiarity of the dynamic linker.
+
+       A fuller explanation of how to use these relocation types to write
+       shared libraries entirely in NASM is given in section 9.2.
+
+ 7.9.4 Thread Local Storage: `elf' Special Symbols and `WRT'
+
+       (*) In ELF32 mode, referring to an external or global symbol using
+           `wrt ..tlsie'  causes the linker to build an entry _in_ the GOT
+           containing the offset of the symbol within the TLS block, so you
+           can access the value of the symbol with code such as:
+
+              mov  eax,[tid wrt ..tlsie] 
+              mov  [gs:eax],ebx
+
+       (*) In ELF64 mode, referring to an external or global symbol using
+           `wrt ..gottpoff'  causes the linker to build an entry _in_ the
+           GOT containing the offset of the symbol within the TLS block, so
+           you can access the value of the symbol with code such as:
+
+              mov   rax,[rel tid wrt ..gottpoff] 
+              mov   rcx,[fs:rax]
+
+ 7.9.5 `elf' Extensions to the `GLOBAL' Directive
+
+       `ELF' object files can contain more information about a global
+       symbol than just its address: they can contain the size of the
+       symbol and its type as well. These are not merely debugger
+       conveniences, but are actually necessary when the program being
+       written is a shared library. NASM therefore supports some extensions
+       to the `GLOBAL' directive, allowing you to specify these features.
+
+       You can specify whether a global variable is a function or a data
+       object by suffixing the name with a colon and the word `function' or
+       `data'. (`object' is a synonym for `data'.) For example:
+
+       global   hashlookup:function, hashtable:data
+
+       exports the global symbol `hashlookup' as a function and `hashtable'
+       as a data object.
+
+       Optionally, you can control the ELF visibility of the symbol. Just
+       add one of the visibility keywords: `default', `internal', `hidden',
+       or `protected'. The default is `default' of course. For example, to
+       make `hashlookup' hidden:
+
+       global   hashlookup:function hidden
+
+       You can also specify the size of the data associated with the
+       symbol, as a numeric expression (which may involve labels, and even
+       forward references) after the type specifier. Like this:
+
+       global  hashtable:data (hashtable.end - hashtable) 
+       
+       hashtable: 
+               db this,that,theother  ; some data here 
+       .end:
+
+       This makes NASM automatically calculate the length of the table and
+       place that information into the `ELF' symbol table.
+
+       Declaring the type and size of global symbols is necessary when
+       writing shared library code. For more information, see section
+       9.2.4.
+
+ 7.9.6 `elf' Extensions to the `COMMON' Directive 
+
+       `ELF' also allows you to specify alignment requirements on common
+       variables. This is done by putting a number (which must be a power
+       of two) after the name and size of the common variable, separated
+       (as usual) by a colon. For example, an array of doublewords would
+       benefit from 4-byte alignment:
+
+       common  dwordarray 128:4
+
+       This declares the total size of the array to be 128 bytes, and
+       requires that it be aligned on a 4-byte boundary.
+
+ 7.9.7 16-bit code and ELF 
+
+       The `ELF32' specification doesn't provide relocations for 8- and 16-
+       bit values, but the GNU `ld' linker adds these as an extension. NASM
+       can generate GNU-compatible relocations, to allow 16-bit code to be
+       linked as ELF using GNU `ld'. If NASM is used with the
+       `-w+gnu-elf-extensions' option, a warning is issued when one of
+       these relocations is generated.
+
+ 7.9.8 Debug formats and ELF 
+
+       `ELF32' and `ELF64' provide debug information in `STABS' and `DWARF'
+       formats. Line number information is generated for all executable
+       sections, but please note that only the ".text" section is
+       executable by default.
+
+  7.10 `aout': Linux `a.out' Object Files
+
+       The `aout' format generates `a.out' object files, in the form used
+       by early Linux systems (current Linux systems use ELF, see section
+       7.9.) These differ from other `a.out' object files in that the magic
+       number in the first four bytes of the file is different; also, some
+       implementations of `a.out', for example NetBSD's, support position-
+       independent code, which Linux's implementation does not.
+
+       `a.out' provides a default output file-name extension of `.o'.
+
+       `a.out' is a very simple object format. It supports no special
+       directives, no special symbols, no use of `SEG' or `WRT', and no
+       extensions to any standard directives. It supports only the three
+       standard section names `.text', `.data' and `.bss'.
+
+  7.11 `aoutb': NetBSD/FreeBSD/OpenBSD `a.out' Object Files
+
+       The `aoutb' format generates `a.out' object files, in the form used
+       by the various free `BSD Unix' clones, `NetBSD', `FreeBSD' and
+       `OpenBSD'. For simple object files, this object format is exactly
+       the same as `aout' except for the magic number in the first four
+       bytes of the file. However, the `aoutb' format supports
+       position-independent code in the same way as the `elf' format, so
+       you can use it to write `BSD' shared libraries.
+
+       `aoutb' provides a default output file-name extension of `.o'.
+
+       `aoutb' supports no special directives, no special symbols, and only
+       the three standard section names `.text', `.data' and `.bss'.
+       However, it also supports the same use of `WRT' as `elf' does, to
+       provide position-independent code relocation types. See section
+       7.9.3 for full documentation of this feature.
+
+       `aoutb' also supports the same extensions to the `GLOBAL' directive
+       as `elf' does: see section 7.9.5 for documentation of this.
+
+  7.12 `as86': Minix/Linux `as86' Object Files
+
+       The Minix/Linux 16-bit assembler `as86' has its own non-standard
+       object file format. Although its companion linker `ld86' produces
+       something close to ordinary `a.out' binaries as output, the object
+       file format used to communicate between `as86' and `ld86' is not
+       itself `a.out'.
+
+       NASM supports this format, just in case it is useful, as `as86'.
+       `as86' provides a default output file-name extension of `.o'.
+
+       `as86' is a very simple object format (from the NASM user's point of
+       view). It supports no special directives, no use of `SEG' or `WRT',
+       and no extensions to any standard directives. It supports only the
+       three standard section names `.text', `.data' and `.bss'. The only
+       special symbol supported is `..start'.
+
+  7.13 `rdf': Relocatable Dynamic Object File Format
+
+       The `rdf' output format produces `RDOFF' object files. `RDOFF'
+       (Relocatable Dynamic Object File Format) is a home-grown object-file
+       format, designed alongside NASM itself and reflecting in its file
+       format the internal structure of the assembler.
+
+       `RDOFF' is not used by any well-known operating systems. Those
+       writing their own systems, however, may well wish to use `RDOFF' as
+       their object format, on the grounds that it is designed primarily
+       for simplicity and contains very little file-header bureaucracy.
+
+       The Unix NASM archive, and the DOS archive which includes sources,
+       both contain an `rdoff' subdirectory holding a set of RDOFF
+       utilities: an RDF linker, an `RDF' static-library manager, an RDF
+       file dump utility, and a program which will load and execute an RDF
+       executable under Linux.
+
+       `rdf' supports only the standard section names `.text', `.data' and
+       `.bss'.
+
+7.13.1 Requiring a Library: The `LIBRARY' Directive
+
+       `RDOFF' contains a mechanism for an object file to demand a given
+       library to be linked to the module, either at load time or run time.
+       This is done by the `LIBRARY' directive, which takes one argument
+       which is the name of the module:
+
+           library  mylib.rdl
+
+7.13.2 Specifying a Module Name: The `MODULE' Directive
+
+       Special `RDOFF' header record is used to store the name of the
+       module. It can be used, for example, by run-time loader to perform
+       dynamic linking. `MODULE' directive takes one argument which is the
+       name of current module:
+
+           module  mymodname
+
+       Note that when you statically link modules and tell linker to strip
+       the symbols from output file, all module names will be stripped too.
+       To avoid it, you should start module names with `$', like:
+
+           module  $kernel.core
+
+7.13.3 `rdf' Extensions to the `GLOBAL' Directive
+
+       `RDOFF' global symbols can contain additional information needed by
+       the static linker. You can mark a global symbol as exported, thus
+       telling the linker do not strip it from target executable or library
+       file. Like in `ELF', you can also specify whether an exported symbol
+       is a procedure (function) or data object.
+
+       Suffixing the name with a colon and the word `export' you make the
+       symbol exported:
+
+           global  sys_open:export
+
+       To specify that exported symbol is a procedure (function), you add
+       the word `proc' or `function' after declaration:
+
+           global  sys_open:export proc
+
+       Similarly, to specify exported data object, add the word `data' or
+       `object' to the directive:
+
+           global  kernel_ticks:export data
+
+7.13.4 `rdf' Extensions to the `EXTERN' Directive
+
+       By default the `EXTERN' directive in `RDOFF' declares a "pure
+       external" symbol (i.e. the static linker will complain if such a
+       symbol is not resolved). To declare an "imported" symbol, which must
+       be resolved later during a dynamic linking phase, `RDOFF' offers an
+       additional `import' modifier. As in `GLOBAL', you can also specify
+       whether an imported symbol is a procedure (function) or data object.
+       For example:
+
+           library $libc 
+           extern  _open:import 
+           extern  _printf:import proc 
+           extern  _errno:import data
+
+       Here the directive `LIBRARY' is also included, which gives the
+       dynamic linker a hint as to where to find requested symbols.
+
+  7.14 `dbg': Debugging Format
+
+       The `dbg' output format is not built into NASM in the default
+       configuration. If you are building your own NASM executable from the
+       sources, you can define `OF_DBG' in `output/outform.h' or on the
+       compiler command line, and obtain the `dbg' output format.
+
+       The `dbg' format does not output an object file as such; instead, it
+       outputs a text file which contains a complete list of all the
+       transactions between the main body of NASM and the output-format
+       back end module. It is primarily intended to aid people who want to
+       write their own output drivers, so that they can get a clearer idea
+       of the various requests the main program makes of the output driver,
+       and in what order they happen.
+
+       For simple files, one can easily use the `dbg' format like this:
+
+       nasm -f dbg filename.asm
+
+       which will generate a diagnostic file called `filename.dbg'.
+       However, this will not work well on files which were designed for a
+       different object format, because each object format defines its own
+       macros (usually user-level forms of directives), and those macros
+       will not be defined in the `dbg' format. Therefore it can be useful
+       to run NASM twice, in order to do the preprocessing with the native
+       object format selected:
+
+       nasm -e -f rdf -o rdfprog.i rdfprog.asm 
+       nasm -a -f dbg rdfprog.i
+
+       This preprocesses `rdfprog.asm' into `rdfprog.i', keeping the `rdf'
+       object format selected in order to make sure RDF special directives
+       are converted into primitive form correctly. Then the preprocessed
+       source is fed through the `dbg' format to generate the final
+       diagnostic output.
+
+       This workaround will still typically not work for programs intended
+       for `obj' format, because the `obj' `SEGMENT' and `GROUP' directives
+       have side effects of defining the segment and group names as
+       symbols; `dbg' will not do this, so the program will not assemble.
+       You will have to work around that by defining the symbols yourself
+       (using `EXTERN', for example) if you really need to get a `dbg'
+       trace of an `obj'-specific source file.
+
+       `dbg' accepts any section name and any directives at all, and logs
+       them all to its output file.
+
+Chapter 8: Writing 16-bit Code (DOS, Windows 3/3.1)
+---------------------------------------------------
+
+       This chapter attempts to cover some of the common issues encountered
+       when writing 16-bit code to run under `MS-DOS' or `Windows 3.x'. It
+       covers how to link programs to produce `.EXE' or `.COM' files, how
+       to write `.SYS' device drivers, and how to interface assembly
+       language code with 16-bit C compilers and with Borland Pascal.
+
+   8.1 Producing `.EXE' Files
+
+       Any large program written under DOS needs to be built as a `.EXE'
+       file: only `.EXE' files have the necessary internal structure
+       required to span more than one 64K segment. Windows programs, also,
+       have to be built as `.EXE' files, since Windows does not support the
+       `.COM' format.
+
+       In general, you generate `.EXE' files by using the `obj' output
+       format to produce one or more `.OBJ' files, and then linking them
+       together using a linker. However, NASM also supports the direct
+       generation of simple DOS `.EXE' files using the `bin' output format
+       (by using `DB' and `DW' to construct the `.EXE' file header), and a
+       macro package is supplied to do this. Thanks to Yann Guidon for
+       contributing the code for this.
+
+       NASM may also support `.EXE' natively as another output format in
+       future releases.
+
+ 8.1.1 Using the `obj' Format To Generate `.EXE' Files
+
+       This section describes the usual method of generating `.EXE' files
+       by linking `.OBJ' files together.
+
+       Most 16-bit programming language packages come with a suitable
+       linker; if you have none of these, there is a free linker called
+       VAL, available in `LZH' archive format from `x2ftp.oulu.fi'. An LZH
+       archiver can be found at `ftp.simtel.net'. There is another `free'
+       linker (though this one doesn't come with sources) called FREELINK,
+       available from `www.pcorner.com'. A third, `djlink', written by DJ
+       Delorie, is available at `www.delorie.com'. A fourth linker,
+       `ALINK', written by Anthony A.J. Williams, is available at
+       `alink.sourceforge.net'.
+
+       When linking several `.OBJ' files into a `.EXE' file, you should
+       ensure that exactly one of them has a start point defined (using the
+       `..start' special symbol defined by the `obj' format: see section
+       7.4.6). If no module defines a start point, the linker will not know
+       what value to give the entry-point field in the output file header;
+       if more than one defines a start point, the linker will not know
+       _which_ value to use.
+
+       An example of a NASM source file which can be assembled to a `.OBJ'
+       file and linked on its own to a `.EXE' is given here. It
+       demonstrates the basic principles of defining a stack, initialising
+       the segment registers, and declaring a start point. This file is
+       also provided in the `test' subdirectory of the NASM archives, under
+       the name `objexe.asm'.
+
+       segment code 
+       
+       ..start: 
+               mov     ax,data 
+               mov     ds,ax 
+               mov     ax,stack 
+               mov     ss,ax 
+               mov     sp,stacktop
+
+       This initial piece of code sets up `DS' to point to the data
+       segment, and initializes `SS' and `SP' to point to the top of the
+       provided stack. Notice that interrupts are implicitly disabled for
+       one instruction after a move into `SS', precisely for this
+       situation, so that there's no chance of an interrupt occurring
+       between the loads of `SS' and `SP' and not having a stack to execute
+       on.
+
+       Note also that the special symbol `..start' is defined at the
+       beginning of this code, which means that will be the entry point
+       into the resulting executable file.
+
+               mov     dx,hello 
+               mov     ah,9 
+               int     0x21
+
+       The above is the main program: load `DS:DX' with a pointer to the
+       greeting message (`hello' is implicitly relative to the segment
+       `data', which was loaded into `DS' in the setup code, so the full
+       pointer is valid), and call the DOS print-string function.
+
+               mov     ax,0x4c00 
+               int     0x21
+
+       This terminates the program using another DOS system call.
+
+       segment data 
+       
+       hello:  db      'hello, world', 13, 10, '$'
+
+       The data segment contains the string we want to display.
+
+       segment stack stack 
+               resb 64 
+       stacktop:
+
+       The above code declares a stack segment containing 64 bytes of
+       uninitialized stack space, and points `stacktop' at the top of it.
+       The directive `segment stack stack' defines a segment _called_
+       `stack', and also of _type_ `STACK'. The latter is not necessary to
+       the correct running of the program, but linkers are likely to issue
+       warnings or errors if your program has no segment of type `STACK'.
+
+       The above file, when assembled into a `.OBJ' file, will link on its
+       own to a valid `.EXE' file, which when run will print `hello, world'
+       and then exit.
+
+ 8.1.2 Using the `bin' Format To Generate `.EXE' Files
+
+       The `.EXE' file format is simple enough that it's possible to build
+       a `.EXE' file by writing a pure-binary program and sticking a 32-
+       byte header on the front. This header is simple enough that it can
+       be generated using `DB' and `DW' commands by NASM itself, so that
+       you can use the `bin' output format to directly generate `.EXE'
+       files.
+
+       Included in the NASM archives, in the `misc' subdirectory, is a file
+       `exebin.mac' of macros. It defines three macros: `EXE_begin',
+       `EXE_stack' and `EXE_end'.
+
+       To produce a `.EXE' file using this method, you should start by
+       using `%include' to load the `exebin.mac' macro package into your
+       source file. You should then issue the `EXE_begin' macro call (which
+       takes no arguments) to generate the file header data. Then write
+       code as normal for the `bin' format - you can use all three standard
+       sections `.text', `.data' and `.bss'. At the end of the file you
+       should call the `EXE_end' macro (again, no arguments), which defines
+       some symbols to mark section sizes, and these symbols are referred
+       to in the header code generated by `EXE_begin'.
+
+       In this model, the code you end up writing starts at `0x100', just
+       like a `.COM' file - in fact, if you strip off the 32-byte header
+       from the resulting `.EXE' file, you will have a valid `.COM'
+       program. All the segment bases are the same, so you are limited to a
+       64K program, again just like a `.COM' file. Note that an `ORG'
+       directive is issued by the `EXE_begin' macro, so you should not
+       explicitly issue one of your own.
+
+       You can't directly refer to your segment base value, unfortunately,
+       since this would require a relocation in the header, and things
+       would get a lot more complicated. So you should get your segment
+       base by copying it out of `CS' instead.
+
+       On entry to your `.EXE' file, `SS:SP' are already set up to point to
+       the top of a 2Kb stack. You can adjust the default stack size of 2Kb
+       by calling the `EXE_stack' macro. For example, to change the stack
+       size of your program to 64 bytes, you would call `EXE_stack 64'.
+
+       A sample program which generates a `.EXE' file in this way is given
+       in the `test' subdirectory of the NASM archive, as `binexe.asm'.
+
+   8.2 Producing `.COM' Files
+
+       While large DOS programs must be written as `.EXE' files, small ones
+       are often better written as `.COM' files. `.COM' files are pure
+       binary, and therefore most easily produced using the `bin' output
+       format.
+
+ 8.2.1 Using the `bin' Format To Generate `.COM' Files
+
+       `.COM' files expect to be loaded at offset `100h' into their segment
+       (though the segment may change). Execution then begins at `100h',
+       i.e. right at the start of the program. So to write a `.COM'
+       program, you would create a source file looking like
+
+               org 100h 
+       
+       section .text 
+       
+       start: 
+               ; put your code here 
+       
+       section .data 
+       
+               ; put data items here 
+       
+       section .bss 
+       
+               ; put uninitialized data here
+
+       The `bin' format puts the `.text' section first in the file, so you
+       can declare data or BSS items before beginning to write code if you
+       want to and the code will still end up at the front of the file
+       where it belongs.
+
+       The BSS (uninitialized data) section does not take up space in the
+       `.COM' file itself: instead, addresses of BSS items are resolved to
+       point at space beyond the end of the file, on the grounds that this
+       will be free memory when the program is run. Therefore you should
+       not rely on your BSS being initialized to all zeros when you run.
+
+       To assemble the above program, you should use a command line like
+
+       nasm myprog.asm -fbin -o myprog.com
+
+       The `bin' format would produce a file called `myprog' if no explicit
+       output file name were specified, so you have to override it and give
+       the desired file name.
+
+ 8.2.2 Using the `obj' Format To Generate `.COM' Files
+
+       If you are writing a `.COM' program as more than one module, you may
+       wish to assemble several `.OBJ' files and link them together into a
+       `.COM' program. You can do this, provided you have a linker capable
+       of outputting `.COM' files directly (TLINK does this), or
+       alternatively a converter program such as `EXE2BIN' to transform the
+       `.EXE' file output from the linker into a `.COM' file.
+
+       If you do this, you need to take care of several things:
+
+       (*) The first object file containing code should start its code
+           segment with a line like `RESB 100h'. This is to ensure that the
+           code begins at offset `100h' relative to the beginning of the
+           code segment, so that the linker or converter program does not
+           have to adjust address references within the file when
+           generating the `.COM' file. Other assemblers use an `ORG'
+           directive for this purpose, but `ORG' in NASM is a format-
+           specific directive to the `bin' output format, and does not mean
+           the same thing as it does in MASM-compatible assemblers.
+
+       (*) You don't need to define a stack segment.
+
+       (*) All your segments should be in the same group, so that every
+           time your code or data references a symbol offset, all offsets
+           are relative to the same segment base. This is because, when a
+           `.COM' file is loaded, all the segment registers contain the
+           same value.
+
+   8.3 Producing `.SYS' Files
+
+       MS-DOS device drivers - `.SYS' files - are pure binary files,
+       similar to `.COM' files, except that they start at origin zero
+       rather than `100h'. Therefore, if you are writing a device driver
+       using the `bin' format, you do not need the `ORG' directive, since
+       the default origin for `bin' is zero. Similarly, if you are using
+       `obj', you do not need the `RESB 100h' at the start of your code
+       segment.
+
+       `.SYS' files start with a header structure, containing pointers to
+       the various routines inside the driver which do the work. This
+       structure should be defined at the start of the code segment, even
+       though it is not actually code.
+
+       For more information on the format of `.SYS' files, and the data
+       which has to go in the header structure, a list of books is given in
+       the Frequently Asked Questions list for the newsgroup
+       `comp.os.msdos.programmer'.
+
+   8.4 Interfacing to 16-bit C Programs
+
+       This section covers the basics of writing assembly routines that
+       call, or are called from, C programs. To do this, you would
+       typically write an assembly module as a `.OBJ' file, and link it
+       with your C modules to produce a mixed-language program.
+
+ 8.4.1 External Symbol Names
+
+       C compilers have the convention that the names of all global symbols
+       (functions or data) they define are formed by prefixing an
+       underscore to the name as it appears in the C program. So, for
+       example, the function a C programmer thinks of as `printf' appears
+       to an assembly language programmer as `_printf'. This means that in
+       your assembly programs, you can define symbols without a leading
+       underscore, and not have to worry about name clashes with C symbols.
+
+       If you find the underscores inconvenient, you can define macros to
+       replace the `GLOBAL' and `EXTERN' directives as follows:
+
+       %macro  cglobal 1 
+       
+         global  _%1 
+         %define %1 _%1 
+       
+       %endmacro 
+       
+       %macro  cextern 1 
+       
+         extern  _%1 
+         %define %1 _%1 
+       
+       %endmacro
+
+       (These forms of the macros only take one argument at a time; a
+       `%rep' construct could solve this.)
+
+       If you then declare an external like this:
+
+       cextern printf
+
+       then the macro will expand it as
+
+       extern  _printf 
+       %define printf _printf
+
+       Thereafter, you can reference `printf' as if it was a symbol, and
+       the preprocessor will put the leading underscore on where necessary.
+
+       The `cglobal' macro works similarly. You must use `cglobal' before
+       defining the symbol in question, but you would have had to do that
+       anyway if you used `GLOBAL'.
+
+       Also see section 2.1.27.
+
+ 8.4.2 Memory Models
+
+       NASM contains no mechanism to support the various C memory models
+       directly; you have to keep track yourself of which one you are
+       writing for. This means you have to keep track of the following
+       things:
+
+       (*) In models using a single code segment (tiny, small and compact),
+           functions are near. This means that function pointers, when
+           stored in data segments or pushed on the stack as function
+           arguments, are 16 bits long and contain only an offset field
+           (the `CS' register never changes its value, and always gives the
+           segment part of the full function address), and that functions
+           are called using ordinary near `CALL' instructions and return
+           using `RETN' (which, in NASM, is synonymous with `RET' anyway).
+           This means both that you should write your own routines to
+           return with `RETN', and that you should call external C routines
+           with near `CALL' instructions.
+
+       (*) In models using more than one code segment (medium, large and
+           huge), functions are far. This means that function pointers are
+           32 bits long (consisting of a 16-bit offset followed by a 16-bit
+           segment), and that functions are called using `CALL FAR' (or
+           `CALL seg:offset') and return using `RETF'. Again, you should
+           therefore write your own routines to return with `RETF' and use
+           `CALL FAR' to call external routines.
+
+       (*) In models using a single data segment (tiny, small and medium),
+           data pointers are 16 bits long, containing only an offset field
+           (the `DS' register doesn't change its value, and always gives
+           the segment part of the full data item address).
+
+       (*) In models using more than one data segment (compact, large and
+           huge), data pointers are 32 bits long, consisting of a 16-bit
+           offset followed by a 16-bit segment. You should still be careful
+           not to modify `DS' in your routines without restoring it
+           afterwards, but `ES' is free for you to use to access the
+           contents of 32-bit data pointers you are passed.
+
+       (*) The huge memory model allows single data items to exceed 64K in
+           size. In all other memory models, you can access the whole of a
+           data item just by doing arithmetic on the offset field of the
+           pointer you are given, whether a segment field is present or
+           not; in huge model, you have to be more careful of your pointer
+           arithmetic.
+
+       (*) In most memory models, there is a _default_ data segment, whose
+           segment address is kept in `DS' throughout the program. This
+           data segment is typically the same segment as the stack, kept in
+           `SS', so that functions' local variables (which are stored on
+           the stack) and global data items can both be accessed easily
+           without changing `DS'. Particularly large data items are
+           typically stored in other segments. However, some memory models
+           (though not the standard ones, usually) allow the assumption
+           that `SS' and `DS' hold the same value to be removed. Be careful
+           about functions' local variables in this latter case.
+
+       In models with a single code segment, the segment is called `_TEXT',
+       so your code segment must also go by this name in order to be linked
+       into the same place as the main code segment. In models with a
+       single data segment, or with a default data segment, it is called
+       `_DATA'.
+
+ 8.4.3 Function Definitions and Function Calls
+
+       The C calling convention in 16-bit programs is as follows. In the
+       following description, the words _caller_ and _callee_ are used to
+       denote the function doing the calling and the function which gets
+       called.
+
+       (*) The caller pushes the function's parameters on the stack, one
+           after another, in reverse order (right to left, so that the
+           first argument specified to the function is pushed last).
+
+       (*) The caller then executes a `CALL' instruction to pass control to
+           the callee. This `CALL' is either near or far depending on the
+           memory model.
+
+       (*) The callee receives control, and typically (although this is not
+           actually necessary, in functions which do not need to access
+           their parameters) starts by saving the value of `SP' in `BP' so
+           as to be able to use `BP' as a base pointer to find its
+           parameters on the stack. However, the caller was probably doing
+           this too, so part of the calling convention states that `BP'
+           must be preserved by any C function. Hence the callee, if it is
+           going to set up `BP' as a _frame pointer_, must push the
+           previous value first.
+
+       (*) The callee may then access its parameters relative to `BP'. The
+           word at `[BP]' holds the previous value of `BP' as it was
+           pushed; the next word, at `[BP+2]', holds the offset part of the
+           return address, pushed implicitly by `CALL'. In a small-model
+           (near) function, the parameters start after that, at `[BP+4]';
+           in a large-model (far) function, the segment part of the return
+           address lives at `[BP+4]', and the parameters begin at `[BP+6]'.
+           The leftmost parameter of the function, since it was pushed
+           last, is accessible at this offset from `BP'; the others follow,
+           at successively greater offsets. Thus, in a function such as
+           `printf' which takes a variable number of parameters, the
+           pushing of the parameters in reverse order means that the
+           function knows where to find its first parameter, which tells it
+           the number and type of the remaining ones.
+
+       (*) The callee may also wish to decrease `SP' further, so as to
+           allocate space on the stack for local variables, which will then
+           be accessible at negative offsets from `BP'.
+
+       (*) The callee, if it wishes to return a value to the caller, should
+           leave the value in `AL', `AX' or `DX:AX' depending on the size
+           of the value. Floating-point results are sometimes (depending on
+           the compiler) returned in `ST0'.
+
+       (*) Once the callee has finished processing, it restores `SP' from
+           `BP' if it had allocated local stack space, then pops the
+           previous value of `BP', and returns via `RETN' or `RETF'
+           depending on memory model.
+
+       (*) When the caller regains control from the callee, the function
+           parameters are still on the stack, so it typically adds an
+           immediate constant to `SP' to remove them (instead of executing
+           a number of slow `POP' instructions). Thus, if a function is
+           accidentally called with the wrong number of parameters due to a
+           prototype mismatch, the stack will still be returned to a
+           sensible state since the caller, which _knows_ how many
+           parameters it pushed, does the removing.
+
+       It is instructive to compare this calling convention with that for
+       Pascal programs (described in section 8.5.1). Pascal has a simpler
+       convention, since no functions have variable numbers of parameters.
+       Therefore the callee knows how many parameters it should have been
+       passed, and is able to deallocate them from the stack itself by
+       passing an immediate argument to the `RET' or `RETF' instruction, so
+       the caller does not have to do it. Also, the parameters are pushed
+       in left-to-right order, not right-to-left, which means that a
+       compiler can give better guarantees about sequence points without
+       performance suffering.
+
+       Thus, you would define a function in C style in the following way.
+       The following example is for small model:
+
+       global  _myfunc 
+       
+       _myfunc: 
+               push    bp 
+               mov     bp,sp 
+               sub     sp,0x40         ; 64 bytes of local stack space 
+               mov     bx,[bp+4]       ; first parameter to function 
+       
+               ; some more code 
+       
+               mov     sp,bp           ; undo "sub sp,0x40" above 
+               pop     bp 
+               ret
+
+       For a large-model function, you would replace `RET' by `RETF', and
+       look for the first parameter at `[BP+6]' instead of `[BP+4]'. Of
+       course, if one of the parameters is a pointer, then the offsets of
+       _subsequent_ parameters will change depending on the memory model as
+       well: far pointers take up four bytes on the stack when passed as a
+       parameter, whereas near pointers take up two.
+
+       At the other end of the process, to call a C function from your
+       assembly code, you would do something like this:
+
+       extern  _printf 
+       
+             ; and then, further down... 
+       
+             push    word [myint]        ; one of my integer variables 
+             push    word mystring       ; pointer into my data segment 
+             call    _printf 
+             add     sp,byte 4           ; `byte' saves space 
+       
+             ; then those data items... 
+       
+       segment _DATA 
+       
+       myint         dw    1234 
+       mystring      db    'This number -> %d <- should be 1234',10,0
+
+       This piece of code is the small-model assembly equivalent of the C
+       code
+
+           int myint = 1234; 
+           printf("This number -> %d <- should be 1234\n", myint);
+
+       In large model, the function-call code might look more like this. In
+       this example, it is assumed that `DS' already holds the segment base
+       of the segment `_DATA'. If not, you would have to initialize it
+       first.
+
+             push    word [myint] 
+             push    word seg mystring   ; Now push the segment, and... 
+             push    word mystring       ; ... offset of "mystring" 
+             call    far _printf 
+             add    sp,byte 6
+
+       The integer value still takes up one word on the stack, since large
+       model does not affect the size of the `int' data type. The first
+       argument (pushed last) to `printf', however, is a data pointer, and
+       therefore has to contain a segment and offset part. The segment
+       should be stored second in memory, and therefore must be pushed
+       first. (Of course, `PUSH DS' would have been a shorter instruction
+       than `PUSH WORD SEG mystring', if `DS' was set up as the above
+       example assumed.) Then the actual call becomes a far call, since
+       functions expect far calls in large model; and `SP' has to be
+       increased by 6 rather than 4 afterwards to make up for the extra
+       word of parameters.
+
+ 8.4.4 Accessing Data Items
+
+       To get at the contents of C variables, or to declare variables which
+       C can access, you need only declare the names as `GLOBAL' or
+       `EXTERN'. (Again, the names require leading underscores, as stated
+       in section 8.4.1.) Thus, a C variable declared as `int i' can be
+       accessed from assembler as
+
+       extern _i 
+       
+               mov ax,[_i]
+
+       And to declare your own integer variable which C programs can access
+       as `extern int j', you do this (making sure you are assembling in
+       the `_DATA' segment, if necessary):
+
+       global  _j 
+       
+       _j      dw      0
+
+       To access a C array, you need to know the size of the components of
+       the array. For example, `int' variables are two bytes long, so if a
+       C program declares an array as `int a[10]', you can access `a[3]' by
+       coding `mov ax,[_a+6]'. (The byte offset 6 is obtained by
+       multiplying the desired array index, 3, by the size of the array
+       element, 2.) The sizes of the C base types in 16-bit compilers are:
+       1 for `char', 2 for `short' and `int', 4 for `long' and `float', and
+       8 for `double'.
+
+       To access a C data structure, you need to know the offset from the
+       base of the structure to the field you are interested in. You can
+       either do this by converting the C structure definition into a NASM
+       structure definition (using `STRUC'), or by calculating the one
+       offset and using just that.
+
+       To do either of these, you should read your C compiler's manual to
+       find out how it organizes data structures. NASM gives no special
+       alignment to structure members in its own `STRUC' macro, so you have
+       to specify alignment yourself if the C compiler generates it.
+       Typically, you might find that a structure like
+
+       struct { 
+           char c; 
+           int i; 
+       } foo;
+
+       might be four bytes long rather than three, since the `int' field
+       would be aligned to a two-byte boundary. However, this sort of
+       feature tends to be a configurable option in the C compiler, either
+       using command-line options or `#pragma' lines, so you have to find
+       out how your own compiler does it.
+
+ 8.4.5 `c16.mac': Helper Macros for the 16-bit C Interface
+
+       Included in the NASM archives, in the `misc' directory, is a file
+       `c16.mac' of macros. It defines three macros: `proc', `arg' and
+       `endproc'. These are intended to be used for C-style procedure
+       definitions, and they automate a lot of the work involved in keeping
+       track of the calling convention.
+
+       (An alternative, TASM compatible form of `arg' is also now built
+       into NASM's preprocessor. See section 4.8 for details.)
+
+       An example of an assembly function using the macro set is given
+       here:
+
+       proc    _nearproc 
+       
+       %$i     arg 
+       %$j     arg 
+               mov     ax,[bp + %$i] 
+               mov     bx,[bp + %$j] 
+               add     ax,[bx] 
+       
+       endproc
+
+       This defines `_nearproc' to be a procedure taking two arguments, the
+       first (`i') an integer and the second (`j') a pointer to an integer.
+       It returns `i + *j'.
+
+       Note that the `arg' macro has an `EQU' as the first line of its
+       expansion, and since the label before the macro call gets prepended
+       to the first line of the expanded macro, the `EQU' works, defining
+       `%$i' to be an offset from `BP'. A context-local variable is used,
+       local to the context pushed by the `proc' macro and popped by the
+       `endproc' macro, so that the same argument name can be used in later
+       procedures. Of course, you don't _have_ to do that.
+
+       The macro set produces code for near functions (tiny, small and
+       compact-model code) by default. You can have it generate far
+       functions (medium, large and huge-model code) by means of coding
+       `%define FARCODE'. This changes the kind of return instruction
+       generated by `endproc', and also changes the starting point for the
+       argument offsets. The macro set contains no intrinsic dependency on
+       whether data pointers are far or not.
+
+       `arg' can take an optional parameter, giving the size of the
+       argument. If no size is given, 2 is assumed, since it is likely that
+       many function parameters will be of type `int'.
+
+       The large-model equivalent of the above function would look like
+       this:
+
+       %define FARCODE 
+       
+       proc    _farproc 
+       
+       %$i     arg 
+       %$j     arg     4 
+               mov     ax,[bp + %$i] 
+               mov     bx,[bp + %$j] 
+               mov     es,[bp + %$j + 2] 
+               add     ax,[bx] 
+       
+       endproc
+
+       This makes use of the argument to the `arg' macro to define a
+       parameter of size 4, because `j' is now a far pointer. When we load
+       from `j', we must load a segment and an offset.
+
+   8.5 Interfacing to Borland Pascal Programs
+
+       Interfacing to Borland Pascal programs is similar in concept to
+       interfacing to 16-bit C programs. The differences are:
+
+       (*) The leading underscore required for interfacing to C programs is
+           not required for Pascal.
+
+       (*) The memory model is always large: functions are far, data
+           pointers are far, and no data item can be more than 64K long.
+           (Actually, some functions are near, but only those functions
+           that are local to a Pascal unit and never called from outside
+           it. All assembly functions that Pascal calls, and all Pascal
+           functions that assembly routines are able to call, are far.)
+           However, all static data declared in a Pascal program goes into
+           the default data segment, which is the one whose segment address
+           will be in `DS' when control is passed to your assembly code.
+           The only things that do not live in the default data segment are
+           local variables (they live in the stack segment) and dynamically
+           allocated variables. All data _pointers_, however, are far.
+
+       (*) The function calling convention is different - described below.
+
+       (*) Some data types, such as strings, are stored differently.
+
+       (*) There are restrictions on the segment names you are allowed to
+           use - Borland Pascal will ignore code or data declared in a
+           segment it doesn't like the name of. The restrictions are
+           described below.
+
+ 8.5.1 The Pascal Calling Convention
+
+       The 16-bit Pascal calling convention is as follows. In the following
+       description, the words _caller_ and _callee_ are used to denote the
+       function doing the calling and the function which gets called.
+
+       (*) The caller pushes the function's parameters on the stack, one
+           after another, in normal order (left to right, so that the first
+           argument specified to the function is pushed first).
+
+       (*) The caller then executes a far `CALL' instruction to pass
+           control to the callee.
+
+       (*) The callee receives control, and typically (although this is not
+           actually necessary, in functions which do not need to access
+           their parameters) starts by saving the value of `SP' in `BP' so
+           as to be able to use `BP' as a base pointer to find its
+           parameters on the stack. However, the caller was probably doing
+           this too, so part of the calling convention states that `BP'
+           must be preserved by any function. Hence the callee, if it is
+           going to set up `BP' as a frame pointer, must push the previous
+           value first.
+
+       (*) The callee may then access its parameters relative to `BP'. The
+           word at `[BP]' holds the previous value of `BP' as it was
+           pushed. The next word, at `[BP+2]', holds the offset part of the
+           return address, and the next one at `[BP+4]' the segment part.
+           The parameters begin at `[BP+6]'. The rightmost parameter of the
+           function, since it was pushed last, is accessible at this offset
+           from `BP'; the others follow, at successively greater offsets.
+
+       (*) The callee may also wish to decrease `SP' further, so as to
+           allocate space on the stack for local variables, which will then
+           be accessible at negative offsets from `BP'.
+
+       (*) The callee, if it wishes to return a value to the caller, should
+           leave the value in `AL', `AX' or `DX:AX' depending on the size
+           of the value. Floating-point results are returned in `ST0'.
+           Results of type `Real' (Borland's own custom floating-point data
+           type, not handled directly by the FPU) are returned in
+           `DX:BX:AX'. To return a result of type `String', the caller
+           pushes a pointer to a temporary string before pushing the
+           parameters, and the callee places the returned string value at
+           that location. The pointer is not a parameter, and should not be
+           removed from the stack by the `RETF' instruction.
+
+       (*) Once the callee has finished processing, it restores `SP' from
+           `BP' if it had allocated local stack space, then pops the
+           previous value of `BP', and returns via `RETF'. It uses the form
+           of `RETF' with an immediate parameter, giving the number of
+           bytes taken up by the parameters on the stack. This causes the
+           parameters to be removed from the stack as a side effect of the
+           return instruction.
+
+       (*) When the caller regains control from the callee, the function
+           parameters have already been removed from the stack, so it needs
+           to do nothing further.
+
+       Thus, you would define a function in Pascal style, taking two
+       `Integer'-type parameters, in the following way:
+
+       global  myfunc 
+       
+       myfunc: push    bp 
+               mov     bp,sp 
+               sub     sp,0x40         ; 64 bytes of local stack space 
+               mov     bx,[bp+8]       ; first parameter to function 
+               mov     bx,[bp+6]       ; second parameter to function 
+       
+               ; some more code 
+       
+               mov     sp,bp           ; undo "sub sp,0x40" above 
+               pop     bp 
+               retf    4               ; total size of params is 4
+
+       At the other end of the process, to call a Pascal function from your
+       assembly code, you would do something like this:
+
+       extern  SomeFunc 
+       
+              ; and then, further down... 
+       
+              push   word seg mystring   ; Now push the segment, and... 
+              push   word mystring       ; ... offset of "mystring" 
+              push   word [myint]        ; one of my variables 
+              call   far SomeFunc
+
+       This is equivalent to the Pascal code
+
+       procedure SomeFunc(String: PChar; Int: Integer); 
+           SomeFunc(@mystring, myint);
+
+ 8.5.2 Borland Pascal Segment Name Restrictions
+
+       Since Borland Pascal's internal unit file format is completely
+       different from `OBJ', it only makes a very sketchy job of actually
+       reading and understanding the various information contained in a
+       real `OBJ' file when it links that in. Therefore an object file
+       intended to be linked to a Pascal program must obey a number of
+       restrictions:
+
+       (*) Procedures and functions must be in a segment whose name is
+           either `CODE', `CSEG', or something ending in `_TEXT'.
+
+       (*) initialized data must be in a segment whose name is either
+           `CONST' or something ending in `_DATA'.
+
+       (*) Uninitialized data must be in a segment whose name is either
+           `DATA', `DSEG', or something ending in `_BSS'.
+
+       (*) Any other segments in the object file are completely ignored.
+           `GROUP' directives and segment attributes are also ignored.
+
+ 8.5.3 Using `c16.mac' With Pascal Programs
+
+       The `c16.mac' macro package, described in section 8.4.5, can also be
+       used to simplify writing functions to be called from Pascal
+       programs, if you code `%define PASCAL'. This definition ensures that
+       functions are far (it implies `FARCODE'), and also causes procedure
+       return instructions to be generated with an operand.
+
+       Defining `PASCAL' does not change the code which calculates the
+       argument offsets; you must declare your function's arguments in
+       reverse order. For example:
+
+       %define PASCAL 
+       
+       proc    _pascalproc 
+       
+       %$j     arg 4 
+       %$i     arg 
+               mov     ax,[bp + %$i] 
+               mov     bx,[bp + %$j] 
+               mov     es,[bp + %$j + 2] 
+               add     ax,[bx] 
+       
+       endproc
+
+       This defines the same routine, conceptually, as the example in
+       section 8.4.5: it defines a function taking two arguments, an
+       integer and a pointer to an integer, which returns the sum of the
+       integer and the contents of the pointer. The only difference between
+       this code and the large-model C version is that `PASCAL' is defined
+       instead of `FARCODE', and that the arguments are declared in reverse
+       order.
+
+Chapter 9: Writing 32-bit Code (Unix, Win32, DJGPP)
+---------------------------------------------------
+
+       This chapter attempts to cover some of the common issues involved
+       when writing 32-bit code, to run under Win32 or Unix, or to be
+       linked with C code generated by a Unix-style C compiler such as
+       DJGPP. It covers how to write assembly code to interface with 32-bit
+       C routines, and how to write position-independent code for shared
+       libraries.
+
+       Almost all 32-bit code, and in particular all code running under
+       `Win32', `DJGPP' or any of the PC Unix variants, runs in _flat_
+       memory model. This means that the segment registers and paging have
+       already been set up to give you the same 32-bit 4Gb address space no
+       matter what segment you work relative to, and that you should ignore
+       all segment registers completely. When writing flat-model
+       application code, you never need to use a segment override or modify
+       any segment register, and the code-section addresses you pass to
+       `CALL' and `JMP' live in the same address space as the data-section
+       addresses you access your variables by and the stack-section
+       addresses you access local variables and procedure parameters by.
+       Every address is 32 bits long and contains only an offset part.
+
+   9.1 Interfacing to 32-bit C Programs
+
+       A lot of the discussion in section 8.4, about interfacing to 16-bit
+       C programs, still applies when working in 32 bits. The absence of
+       memory models or segmentation worries simplifies things a lot.
+
+ 9.1.1 External Symbol Names
+
+       Most 32-bit C compilers share the convention used by 16-bit
+       compilers, that the names of all global symbols (functions or data)
+       they define are formed by prefixing an underscore to the name as it
+       appears in the C program. However, not all of them do: the `ELF'
+       specification states that C symbols do _not_ have a leading
+       underscore on their assembly-language names.
+
+       The older Linux `a.out' C compiler, all `Win32' compilers, `DJGPP',
+       and `NetBSD' and `FreeBSD', all use the leading underscore; for
+       these compilers, the macros `cextern' and `cglobal', as given in
+       section 8.4.1, will still work. For `ELF', though, the leading
+       underscore should not be used.
+
+       See also section 2.1.27.
+
+ 9.1.2 Function Definitions and Function Calls
+
+       The C calling convention in 32-bit programs is as follows. In the
+       following description, the words _caller_ and _callee_ are used to
+       denote the function doing the calling and the function which gets
+       called.
+
+       (*) The caller pushes the function's parameters on the stack, one
+           after another, in reverse order (right to left, so that the
+           first argument specified to the function is pushed last).
+
+       (*) The caller then executes a near `CALL' instruction to pass
+           control to the callee.
+
+       (*) The callee receives control, and typically (although this is not
+           actually necessary, in functions which do not need to access
+           their parameters) starts by saving the value of `ESP' in `EBP'
+           so as to be able to use `EBP' as a base pointer to find its
+           parameters on the stack. However, the caller was probably doing
+           this too, so part of the calling convention states that `EBP'
+           must be preserved by any C function. Hence the callee, if it is
+           going to set up `EBP' as a frame pointer, must push the previous
+           value first.
+
+       (*) The callee may then access its parameters relative to `EBP'. The
+           doubleword at `[EBP]' holds the previous value of `EBP' as it
+           was pushed; the next doubleword, at `[EBP+4]', holds the return
+           address, pushed implicitly by `CALL'. The parameters start after
+           that, at `[EBP+8]'. The leftmost parameter of the function,
+           since it was pushed last, is accessible at this offset from
+           `EBP'; the others follow, at successively greater offsets. Thus,
+           in a function such as `printf' which takes a variable number of
+           parameters, the pushing of the parameters in reverse order means
+           that the function knows where to find its first parameter, which
+           tells it the number and type of the remaining ones.
+
+       (*) The callee may also wish to decrease `ESP' further, so as to
+           allocate space on the stack for local variables, which will then
+           be accessible at negative offsets from `EBP'.
+
+       (*) The callee, if it wishes to return a value to the caller, should
+           leave the value in `AL', `AX' or `EAX' depending on the size of
+           the value. Floating-point results are typically returned in
+           `ST0'.
+
+       (*) Once the callee has finished processing, it restores `ESP' from
+           `EBP' if it had allocated local stack space, then pops the
+           previous value of `EBP', and returns via `RET' (equivalently,
+           `RETN').
+
+       (*) When the caller regains control from the callee, the function
+           parameters are still on the stack, so it typically adds an
+           immediate constant to `ESP' to remove them (instead of executing
+           a number of slow `POP' instructions). Thus, if a function is
+           accidentally called with the wrong number of parameters due to a
+           prototype mismatch, the stack will still be returned to a
+           sensible state since the caller, which _knows_ how many
+           parameters it pushed, does the removing.
+
+       There is an alternative calling convention used by Win32 programs
+       for Windows API calls, and also for functions called _by_ the
+       Windows API such as window procedures: they follow what Microsoft
+       calls the `__stdcall' convention. This is slightly closer to the
+       Pascal convention, in that the callee clears the stack by passing a
+       parameter to the `RET' instruction. However, the parameters are
+       still pushed in right-to-left order.
+
+       Thus, you would define a function in C style in the following way:
+
+       global  _myfunc 
+       
+       _myfunc: 
+               push    ebp 
+               mov     ebp,esp 
+               sub     esp,0x40        ; 64 bytes of local stack space 
+               mov     ebx,[ebp+8]     ; first parameter to function 
+       
+               ; some more code 
+       
+               leave                   ; mov esp,ebp / pop ebp 
+               ret
+
+       At the other end of the process, to call a C function from your
+       assembly code, you would do something like this:
+
+       extern  _printf 
+       
+               ; and then, further down... 
+       
+               push    dword [myint]   ; one of my integer variables 
+               push    dword mystring  ; pointer into my data segment 
+               call    _printf 
+               add     esp,byte 8      ; `byte' saves space 
+       
+               ; then those data items... 
+       
+       segment _DATA 
+       
+       myint       dd   1234 
+       mystring    db   'This number -> %d <- should be 1234',10,0
+
+       This piece of code is the assembly equivalent of the C code
+
+           int myint = 1234; 
+           printf("This number -> %d <- should be 1234\n", myint);
+
+ 9.1.3 Accessing Data Items
+
+       To get at the contents of C variables, or to declare variables which
+       C can access, you need only declare the names as `GLOBAL' or
+       `EXTERN'. (Again, the names require leading underscores, as stated
+       in section 9.1.1.) Thus, a C variable declared as `int i' can be
+       accessed from assembler as
+
+                 extern _i 
+                 mov eax,[_i]
+
+       And to declare your own integer variable which C programs can access
+       as `extern int j', you do this (making sure you are assembling in
+       the `_DATA' segment, if necessary):
+
+                 global _j 
+       _j        dd 0
+
+       To access a C array, you need to know the size of the components of
+       the array. For example, `int' variables are four bytes long, so if a
+       C program declares an array as `int a[10]', you can access `a[3]' by
+       coding `mov ax,[_a+12]'. (The byte offset 12 is obtained by
+       multiplying the desired array index, 3, by the size of the array
+       element, 4.) The sizes of the C base types in 32-bit compilers are:
+       1 for `char', 2 for `short', 4 for `int', `long' and `float', and 8
+       for `double'. Pointers, being 32-bit addresses, are also 4 bytes
+       long.
+
+       To access a C data structure, you need to know the offset from the
+       base of the structure to the field you are interested in. You can
+       either do this by converting the C structure definition into a NASM
+       structure definition (using `STRUC'), or by calculating the one
+       offset and using just that.
+
+       To do either of these, you should read your C compiler's manual to
+       find out how it organizes data structures. NASM gives no special
+       alignment to structure members in its own `STRUC' macro, so you have
+       to specify alignment yourself if the C compiler generates it.
+       Typically, you might find that a structure like
+
+       struct { 
+           char c; 
+           int i; 
+       } foo;
+
+       might be eight bytes long rather than five, since the `int' field
+       would be aligned to a four-byte boundary. However, this sort of
+       feature is sometimes a configurable option in the C compiler, either
+       using command-line options or `#pragma' lines, so you have to find
+       out how your own compiler does it.
+
+ 9.1.4 `c32.mac': Helper Macros for the 32-bit C Interface
+
+       Included in the NASM archives, in the `misc' directory, is a file
+       `c32.mac' of macros. It defines three macros: `proc', `arg' and
+       `endproc'. These are intended to be used for C-style procedure
+       definitions, and they automate a lot of the work involved in keeping
+       track of the calling convention.
+
+       An example of an assembly function using the macro set is given
+       here:
+
+       proc    _proc32 
+       
+       %$i     arg 
+       %$j     arg 
+               mov     eax,[ebp + %$i] 
+               mov     ebx,[ebp + %$j] 
+               add     eax,[ebx] 
+       
+       endproc
+
+       This defines `_proc32' to be a procedure taking two arguments, the
+       first (`i') an integer and the second (`j') a pointer to an integer.
+       It returns `i + *j'.
+
+       Note that the `arg' macro has an `EQU' as the first line of its
+       expansion, and since the label before the macro call gets prepended
+       to the first line of the expanded macro, the `EQU' works, defining
+       `%$i' to be an offset from `BP'. A context-local variable is used,
+       local to the context pushed by the `proc' macro and popped by the
+       `endproc' macro, so that the same argument name can be used in later
+       procedures. Of course, you don't _have_ to do that.
+
+       `arg' can take an optional parameter, giving the size of the
+       argument. If no size is given, 4 is assumed, since it is likely that
+       many function parameters will be of type `int' or pointers.
+
+   9.2 Writing NetBSD/FreeBSD/OpenBSD and Linux/ELF Shared Libraries
+
+       `ELF' replaced the older `a.out' object file format under Linux
+       because it contains support for position-independent code (PIC),
+       which makes writing shared libraries much easier. NASM supports the
+       `ELF' position-independent code features, so you can write Linux
+       `ELF' shared libraries in NASM.
+
+       NetBSD, and its close cousins FreeBSD and OpenBSD, take a different
+       approach by hacking PIC support into the `a.out' format. NASM
+       supports this as the `aoutb' output format, so you can write BSD
+       shared libraries in NASM too.
+
+       The operating system loads a PIC shared library by memory-mapping
+       the library file at an arbitrarily chosen point in the address space
+       of the running process. The contents of the library's code section
+       must therefore not depend on where it is loaded in memory.
+
+       Therefore, you cannot get at your variables by writing code like
+       this:
+
+               mov     eax,[myvar]             ; WRONG
+
+       Instead, the linker provides an area of memory called the _global
+       offset table_, or GOT; the GOT is situated at a constant distance
+       from your library's code, so if you can find out where your library
+       is loaded (which is typically done using a `CALL' and `POP'
+       combination), you can obtain the address of the GOT, and you can
+       then load the addresses of your variables out of linker-generated
+       entries in the GOT.
+
+       The _data_ section of a PIC shared library does not have these
+       restrictions: since the data section is writable, it has to be
+       copied into memory anyway rather than just paged in from the library
+       file, so as long as it's being copied it can be relocated too. So
+       you can put ordinary types of relocation in the data section without
+       too much worry (but see section 9.2.4 for a caveat).
+
+ 9.2.1 Obtaining the Address of the GOT
+
+       Each code module in your shared library should define the GOT as an
+       external symbol:
+
+       extern  _GLOBAL_OFFSET_TABLE_   ; in ELF 
+       extern  __GLOBAL_OFFSET_TABLE_  ; in BSD a.out
+
+       At the beginning of any function in your shared library which plans
+       to access your data or BSS sections, you must first calculate the
+       address of the GOT. This is typically done by writing the function
+       in this form:
+
+       func:   push    ebp 
+               mov     ebp,esp 
+               push    ebx 
+               call    .get_GOT 
+       .get_GOT: 
+               pop     ebx 
+               add     ebx,_GLOBAL_OFFSET_TABLE_+$$-.get_GOT wrt ..gotpc 
+       
+               ; the function body comes here 
+       
+               mov     ebx,[ebp-4] 
+               mov     esp,ebp 
+               pop     ebp 
+               ret
+
+       (For BSD, again, the symbol `_GLOBAL_OFFSET_TABLE' requires a second
+       leading underscore.)
+
+       The first two lines of this function are simply the standard C
+       prologue to set up a stack frame, and the last three lines are
+       standard C function epilogue. The third line, and the fourth to last
+       line, save and restore the `EBX' register, because PIC shared
+       libraries use this register to store the address of the GOT.
+
+       The interesting bit is the `CALL' instruction and the following two
+       lines. The `CALL' and `POP' combination obtains the address of the
+       label `.get_GOT', without having to know in advance where the
+       program was loaded (since the `CALL' instruction is encoded relative
+       to the current position). The `ADD' instruction makes use of one of
+       the special PIC relocation types: GOTPC relocation. With the
+       `WRT ..gotpc' qualifier specified, the symbol referenced (here
+       `_GLOBAL_OFFSET_TABLE_', the special symbol assigned to the GOT) is
+       given as an offset from the beginning of the section. (Actually,
+       `ELF' encodes it as the offset from the operand field of the `ADD'
+       instruction, but NASM simplifies this deliberately, so you do things
+       the same way for both `ELF' and `BSD'.) So the instruction then
+       _adds_ the beginning of the section, to get the real address of the
+       GOT, and subtracts the value of `.get_GOT' which it knows is in
+       `EBX'. Therefore, by the time that instruction has finished, `EBX'
+       contains the address of the GOT.
+
+       If you didn't follow that, don't worry: it's never necessary to
+       obtain the address of the GOT by any other means, so you can put
+       those three instructions into a macro and safely ignore them:
+
+       %macro  get_GOT 0 
+       
+               call    %%getgot 
+         %%getgot: 
+               pop     ebx 
+               add     ebx,_GLOBAL_OFFSET_TABLE_+$$-%%getgot wrt ..gotpc 
+       
+       %endmacro
+
+ 9.2.2 Finding Your Local Data Items
+
+       Having got the GOT, you can then use it to obtain the addresses of
+       your data items. Most variables will reside in the sections you have
+       declared; they can be accessed using the `..gotoff' special `WRT'
+       type. The way this works is like this:
+
+               lea     eax,[ebx+myvar wrt ..gotoff]
+
+       The expression `myvar wrt ..gotoff' is calculated, when the shared
+       library is linked, to be the offset to the local variable `myvar'
+       from the beginning of the GOT. Therefore, adding it to `EBX' as
+       above will place the real address of `myvar' in `EAX'.
+
+       If you declare variables as `GLOBAL' without specifying a size for
+       them, they are shared between code modules in the library, but do
+       not get exported from the library to the program that loaded it.
+       They will still be in your ordinary data and BSS sections, so you
+       can access them in the same way as local variables, using the above
+       `..gotoff' mechanism.
+
+       Note that due to a peculiarity of the way BSD `a.out' format handles
+       this relocation type, there must be at least one non-local symbol in
+       the same section as the address you're trying to access.
+
+ 9.2.3 Finding External and Common Data Items
+
+       If your library needs to get at an external variable (external to
+       the _library_, not just to one of the modules within it), you must
+       use the `..got' type to get at it. The `..got' type, instead of
+       giving you the offset from the GOT base to the variable, gives you
+       the offset from the GOT base to a GOT _entry_ containing the address
+       of the variable. The linker will set up this GOT entry when it
+       builds the library, and the dynamic linker will place the correct
+       address in it at load time. So to obtain the address of an external
+       variable `extvar' in `EAX', you would code
+
+               mov     eax,[ebx+extvar wrt ..got]
+
+       This loads the address of `extvar' out of an entry in the GOT. The
+       linker, when it builds the shared library, collects together every
+       relocation of type `..got', and builds the GOT so as to ensure it
+       has every necessary entry present.
+
+       Common variables must also be accessed in this way.
+
+ 9.2.4 Exporting Symbols to the Library User
+
+       If you want to export symbols to the user of the library, you have
+       to declare whether they are functions or data, and if they are data,
+       you have to give the size of the data item. This is because the
+       dynamic linker has to build procedure linkage table entries for any
+       exported functions, and also moves exported data items away from the
+       library's data section in which they were declared.
+
+       So to export a function to users of the library, you must use
+
+       global  func:function           ; declare it as a function 
+       
+       func:   push    ebp 
+       
+               ; etc.
+
+       And to export a data item such as an array, you would have to code
+
+       global  array:data array.end-array      ; give the size too 
+       
+       array:  resd    128 
+       .end:
+
+       Be careful: If you export a variable to the library user, by
+       declaring it as `GLOBAL' and supplying a size, the variable will end
+       up living in the data section of the main program, rather than in
+       your library's data section, where you declared it. So you will have
+       to access your own global variable with the `..got' mechanism rather
+       than `..gotoff', as if it were external (which, effectively, it has
+       become).
+
+       Equally, if you need to store the address of an exported global in
+       one of your data sections, you can't do it by means of the standard
+       sort of code:
+
+       dataptr:        dd      global_data_item        ; WRONG
+
+       NASM will interpret this code as an ordinary relocation, in which
+       `global_data_item' is merely an offset from the beginning of the
+       `.data' section (or whatever); so this reference will end up
+       pointing at your data section instead of at the exported global
+       which resides elsewhere.
+
+       Instead of the above code, then, you must write
+
+       dataptr:        dd      global_data_item wrt ..sym
+
+       which makes use of the special `WRT' type `..sym' to instruct NASM
+       to search the symbol table for a particular symbol at that address,
+       rather than just relocating by section base.
+
+       Either method will work for functions: referring to one of your
+       functions by means of
+
+       funcptr:        dd      my_function
+
+       will give the user the address of the code you wrote, whereas
+
+       funcptr:        dd      my_function wrt .sym
+
+       will give the address of the procedure linkage table for the
+       function, which is where the calling program will _believe_ the
+       function lives. Either address is a valid way to call the function.
+
+ 9.2.5 Calling Procedures Outside the Library
+
+       Calling procedures outside your shared library has to be done by
+       means of a _procedure linkage table_, or PLT. The PLT is placed at a
+       known offset from where the library is loaded, so the library code
+       can make calls to the PLT in a position-independent way. Within the
+       PLT there is code to jump to offsets contained in the GOT, so
+       function calls to other shared libraries or to routines in the main
+       program can be transparently passed off to their real destinations.
+
+       To call an external routine, you must use another special PIC
+       relocation type, `WRT ..plt'. This is much easier than the GOT-based
+       ones: you simply replace calls such as `CALL printf' with the PLT-
+       relative version `CALL printf WRT ..plt'.
+
+ 9.2.6 Generating the Library File
+
+       Having written some code modules and assembled them to `.o' files,
+       you then generate your shared library with a command such as
+
+       ld -shared -o library.so module1.o module2.o       # for ELF 
+       ld -Bshareable -o library.so module1.o module2.o   # for BSD
+
+       For ELF, if your shared library is going to reside in system
+       directories such as `/usr/lib' or `/lib', it is usually worth using
+       the `-soname' flag to the linker, to store the final library file
+       name, with a version number, into the library:
+
+       ld -shared -soname library.so.1 -o library.so.1.2 *.o
+
+       You would then copy `library.so.1.2' into the library directory, and
+       create `library.so.1' as a symbolic link to it.
+
+Chapter 10: Mixing 16 and 32 Bit Code
+-------------------------------------
+
+       This chapter tries to cover some of the issues, largely related to
+       unusual forms of addressing and jump instructions, encountered when
+       writing operating system code such as protected-mode initialisation
+       routines, which require code that operates in mixed segment sizes,
+       such as code in a 16-bit segment trying to modify data in a 32-bit
+       one, or jumps between different-size segments.
+
+  10.1 Mixed-Size Jumps
+
+       The most common form of mixed-size instruction is the one used when
+       writing a 32-bit OS: having done your setup in 16-bit mode, such as
+       loading the kernel, you then have to boot it by switching into
+       protected mode and jumping to the 32-bit kernel start address. In a
+       fully 32-bit OS, this tends to be the _only_ mixed-size instruction
+       you need, since everything before it can be done in pure 16-bit
+       code, and everything after it can be pure 32-bit.
+
+       This jump must specify a 48-bit far address, since the target
+       segment is a 32-bit one. However, it must be assembled in a 16-bit
+       segment, so just coding, for example,
+
+               jmp     0x1234:0x56789ABC       ; wrong!
+
+       will not work, since the offset part of the address will be
+       truncated to `0x9ABC' and the jump will be an ordinary 16-bit far
+       one.
+
+       The Linux kernel setup code gets round the inability of `as86' to
+       generate the required instruction by coding it manually, using `DB'
+       instructions. NASM can go one better than that, by actually
+       generating the right instruction itself. Here's how to do it right:
+
+               jmp     dword 0x1234:0x56789ABC         ; right
+
+       The `DWORD' prefix (strictly speaking, it should come _after_ the
+       colon, since it is declaring the _offset_ field to be a doubleword;
+       but NASM will accept either form, since both are unambiguous) forces
+       the offset part to be treated as far, in the assumption that you are
+       deliberately writing a jump from a 16-bit segment to a 32-bit one.
+
+       You can do the reverse operation, jumping from a 32-bit segment to a
+       16-bit one, by means of the `WORD' prefix:
+
+               jmp     word 0x8765:0x4321      ; 32 to 16 bit
+
+       If the `WORD' prefix is specified in 16-bit mode, or the `DWORD'
+       prefix in 32-bit mode, they will be ignored, since each is
+       explicitly forcing NASM into a mode it was in anyway.
+
+  10.2 Addressing Between Different-Size Segments
+
+       If your OS is mixed 16 and 32-bit, or if you are writing a DOS
+       extender, you are likely to have to deal with some 16-bit segments
+       and some 32-bit ones. At some point, you will probably end up
+       writing code in a 16-bit segment which has to access data in a 32-
+       bit segment, or vice versa.
+
+       If the data you are trying to access in a 32-bit segment lies within
+       the first 64K of the segment, you may be able to get away with using
+       an ordinary 16-bit addressing operation for the purpose; but sooner
+       or later, you will want to do 32-bit addressing from 16-bit mode.
+
+       The easiest way to do this is to make sure you use a register for
+       the address, since any effective address containing a 32-bit
+       register is forced to be a 32-bit address. So you can do
+
+               mov     eax,offset_into_32_bit_segment_specified_by_fs 
+               mov     dword [fs:eax],0x11223344
+
+       This is fine, but slightly cumbersome (since it wastes an
+       instruction and a register) if you already know the precise offset
+       you are aiming at. The x86 architecture does allow 32-bit effective
+       addresses to specify nothing but a 4-byte offset, so why shouldn't
+       NASM be able to generate the best instruction for the purpose?
+
+       It can. As in section 10.1, you need only prefix the address with
+       the `DWORD' keyword, and it will be forced to be a 32-bit address:
+
+               mov     dword [fs:dword my_offset],0x11223344
+
+       Also as in section 10.1, NASM is not fussy about whether the `DWORD'
+       prefix comes before or after the segment override, so arguably a
+       nicer-looking way to code the above instruction is
+
+               mov     dword [dword fs:my_offset],0x11223344
+
+       Don't confuse the `DWORD' prefix _outside_ the square brackets,
+       which controls the size of the data stored at the address, with the
+       one `inside' the square brackets which controls the length of the
+       address itself. The two can quite easily be different:
+
+               mov     word [dword 0x12345678],0x9ABC
+
+       This moves 16 bits of data to an address specified by a 32-bit
+       offset.
+
+       You can also specify `WORD' or `DWORD' prefixes along with the `FAR'
+       prefix to indirect far jumps or calls. For example:
+
+               call    dword far [fs:word 0x4321]
+
+       This instruction contains an address specified by a 16-bit offset;
+       it loads a 48-bit far pointer from that (16-bit segment and 32-bit
+       offset), and calls that address.
+
+  10.3 Other Mixed-Size Instructions
+
+       The other way you might want to access data might be using the
+       string instructions (`LODSx', `STOSx' and so on) or the `XLATB'
+       instruction. These instructions, since they take no parameters,
+       might seem to have no easy way to make them perform 32-bit
+       addressing when assembled in a 16-bit segment.
+
+       This is the purpose of NASM's `a16', `a32' and `a64' prefixes. If
+       you are coding `LODSB' in a 16-bit segment but it is supposed to be
+       accessing a string in a 32-bit segment, you should load the desired
+       address into `ESI' and then code
+
+               a32     lodsb
+
+       The prefix forces the addressing size to 32 bits, meaning that
+       `LODSB' loads from `[DS:ESI]' instead of `[DS:SI]'. To access a
+       string in a 16-bit segment when coding in a 32-bit one, the
+       corresponding `a16' prefix can be used.
+
+       The `a16', `a32' and `a64' prefixes can be applied to any
+       instruction in NASM's instruction table, but most of them can
+       generate all the useful forms without them. The prefixes are
+       necessary only for instructions with implicit addressing: `CMPSx',
+       `SCASx', `LODSx', `STOSx', `MOVSx', `INSx', `OUTSx', and `XLATB'.
+       Also, the various push and pop instructions (`PUSHA' and `POPF' as
+       well as the more usual `PUSH' and `POP') can accept `a16', `a32' or
+       `a64' prefixes to force a particular one of `SP', `ESP' or `RSP' to
+       be used as a stack pointer, in case the stack segment in use is a
+       different size from the code segment.
+
+       `PUSH' and `POP', when applied to segment registers in 32-bit mode,
+       also have the slightly odd behaviour that they push and pop 4 bytes
+       at a time, of which the top two are ignored and the bottom two give
+       the value of the segment register being manipulated. To force the
+       16-bit behaviour of segment-register push and pop instructions, you
+       can use the operand-size prefix `o16':
+
+               o16 push    ss 
+               o16 push    ds
+
+       This code saves a doubleword of stack space by fitting two segment
+       registers into the space which would normally be consumed by pushing
+       one.
+
+       (You can also use the `o32' prefix to force the 32-bit behaviour
+       when in 16-bit mode, but this seems less useful.)
+
+Chapter 11: Writing 64-bit Code (Unix, Win64)
+---------------------------------------------
+
+       This chapter attempts to cover some of the common issues involved
+       when writing 64-bit code, to run under Win64 or Unix. It covers how
+       to write assembly code to interface with 64-bit C routines, and how
+       to write position-independent code for shared libraries.
+
+       All 64-bit code uses a flat memory model, since segmentation is not
+       available in 64-bit mode. The one exception is the `FS' and `GS'
+       registers, which still add their bases.
+
+       Position independence in 64-bit mode is significantly simpler, since
+       the processor supports `RIP'-relative addressing directly; see the
+       `REL' keyword (section 3.3). On most 64-bit platforms, it is
+       probably desirable to make that the default, using the directive
+       `DEFAULT REL' (section 6.2).
+
+       64-bit programming is relatively similar to 32-bit programming, but
+       of course pointers are 64 bits long; additionally, all existing
+       platforms pass arguments in registers rather than on the stack.
+       Furthermore, 64-bit platforms use SSE2 by default for floating
+       point. Please see the ABI documentation for your platform.
+
+       64-bit platforms differ in the sizes of the fundamental datatypes,
+       not just from 32-bit platforms but from each other. If a specific
+       size data type is desired, it is probably best to use the types
+       defined in the Standard C header `<inttypes.h>'.
+
+       In 64-bit mode, the default instruction size is still 32 bits. When
+       loading a value into a 32-bit register (but not an 8- or 16-bit
+       register), the upper 32 bits of the corresponding 64-bit register
+       are set to zero.
+
+  11.1 Register Names in 64-bit Mode
+
+       NASM uses the following names for general-purpose registers in 64-
+       bit mode, for 8-, 16-, 32- and 64-bit references, respecitively:
+
+            AL/AH, CL/CH, DL/DH, BL/BH, SPL, BPL, SIL, DIL, R8B-R15B 
+            AX, CX, DX, BX, SP, BP, SI, DI, R8W-R15W 
+            EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI, R8D-R15D 
+            RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI, R8-R15
+
+       This is consistent with the AMD documentation and most other
+       assemblers. The Intel documentation, however, uses the names
+       `R8L-R15L' for 8-bit references to the higher registers. It is
+       possible to use those names by definiting them as macros; similarly,
+       if one wants to use numeric names for the low 8 registers, define
+       them as macros. The standard macro package `altreg' (see section
+       5.1) can be used for this purpose.
+
+  11.2 Immediates and Displacements in 64-bit Mode
+
+       In 64-bit mode, immediates and displacements are generally only 32
+       bits wide. NASM will therefore truncate most displacements and
+       immediates to 32 bits.
+
+       The only instruction which takes a full 64-bit immediate is:
+
+            MOV reg64,imm64
+
+       NASM will produce this instruction whenever the programmer uses
+       `MOV' with an immediate into a 64-bit register. If this is not
+       desirable, simply specify the equivalent 32-bit register, which will
+       be automatically zero-extended by the processor, or specify the
+       immediate as `DWORD':
+
+            mov rax,foo             ; 64-bit immediate 
+            mov rax,qword foo       ; (identical) 
+            mov eax,foo             ; 32-bit immediate, zero-extended 
+            mov rax,dword foo       ; 32-bit immediate, sign-extended
+
+       The length of these instructions are 10, 5 and 7 bytes,
+       respectively.
+
+       The only instructions which take a full 64-bit _displacement_ is
+       loading or storing, using `MOV', `AL', `AX', `EAX' or `RAX' (but no
+       other registers) to an absolute 64-bit address. Since this is a
+       relatively rarely used instruction (64-bit code generally uses
+       relative addressing), the programmer has to explicitly declare the
+       displacement size as `QWORD':
+
+            default abs 
+       
+            mov eax,[foo]           ; 32-bit absolute disp, sign-extended 
+            mov eax,[a32 foo]       ; 32-bit absolute disp, zero-extended 
+            mov eax,[qword foo]     ; 64-bit absolute disp 
+       
+            default rel 
+       
+            mov eax,[foo]           ; 32-bit relative disp 
+            mov eax,[a32 foo]       ; d:o, address truncated to 32 bits(!) 
+            mov eax,[qword foo]     ; error 
+            mov eax,[abs qword foo] ; 64-bit absolute disp
+
+       A sign-extended absolute displacement can access from -2 GB to +2
+       GB; a zero-extended absolute displacement can access from 0 to 4 GB.
+
+  11.3 Interfacing to 64-bit C Programs (Unix)
+
+       On Unix, the 64-bit ABI is defined by the document:
+
+       `http://www.x86-64.org/documentation/abi.pdf'
+
+       Although written for AT&T-syntax assembly, the concepts apply
+       equally well for NASM-style assembly. What follows is a simplified
+       summary.
+
+       The first six integer arguments (from the left) are passed in `RDI',
+       `RSI', `RDX', `RCX', `R8', and `R9', in that order. Additional
+       integer arguments are passed on the stack. These registers, plus
+       `RAX', `R10' and `R11' are destroyed by function calls, and thus are
+       available for use by the function without saving.
+
+       Integer return values are passed in `RAX' and `RDX', in that order.
+
+       Floating point is done using SSE registers, except for
+       `long double'. Floating-point arguments are passed in `XMM0' to
+       `XMM7'; return is `XMM0' and `XMM1'. `long double' are passed on the
+       stack, and returned in `ST0' and `ST1'.
+
+       All SSE and x87 registers are destroyed by function calls.
+
+       On 64-bit Unix, `long' is 64 bits.
+
+       Integer and SSE register arguments are counted separately, so for
+       the case of
+
+            void foo(long a, double b, int c)
+
+       `a' is passed in `RDI', `b' in `XMM0', and `c' in `ESI'.
+
+  11.4 Interfacing to 64-bit C Programs (Win64)
+
+       The Win64 ABI is described at:
+
+       `http://msdn2.microsoft.com/en-gb/library/ms794533.aspx'
+
+       What follows is a simplified summary.
+
+       The first four integer arguments are passed in `RCX', `RDX', `R8'
+       and `R9', in that order. Additional integer arguments are passed on
+       the stack. These registers, plus `RAX', `R10' and `R11' are
+       destroyed by function calls, and thus are available for use by the
+       function without saving.
+
+       Integer return values are passed in `RAX' only.
+
+       Floating point is done using SSE registers, except for
+       `long double'. Floating-point arguments are passed in `XMM0' to
+       `XMM3'; return is `XMM0' only.
+
+       On Win64, `long' is 32 bits; `long long' or `_int64' is 64 bits.
+
+       Integer and SSE register arguments are counted together, so for the
+       case of
+
+            void foo(long long a, double b, int c)
+
+       `a' is passed in `RCX', `b' in `XMM1', and `c' in `R8D'.
+
+Chapter 12: Troubleshooting
+---------------------------
+
+       This chapter describes some of the common problems that users have
+       been known to encounter with NASM, and answers them. It also gives
+       instructions for reporting bugs in NASM if you find a difficulty
+       that isn't listed here.
+
+  12.1 Common Problems
+
+12.1.1 NASM Generates Inefficient Code
+
+       We sometimes get `bug' reports about NASM generating inefficient, or
+       even `wrong', code on instructions such as `ADD ESP,8'. This is a
+       deliberate design feature, connected to predictability of output:
+       NASM, on seeing `ADD ESP,8', will generate the form of the
+       instruction which leaves room for a 32-bit offset. You need to code
+       `ADD ESP,BYTE 8' if you want the space-efficient form of the
+       instruction. This isn't a bug, it's user error: if you prefer to
+       have NASM produce the more efficient code automatically enable
+       optimization with the `-O' option (see section 2.1.22).
+
+12.1.2 My Jumps are Out of Range
+
+       Similarly, people complain that when they issue conditional jumps
+       (which are `SHORT' by default) that try to jump too far, NASM
+       reports `short jump out of range' instead of making the jumps
+       longer.
+
+       This, again, is partly a predictability issue, but in fact has a
+       more practical reason as well. NASM has no means of being told what
+       type of processor the code it is generating will be run on; so it
+       cannot decide for itself that it should generate `Jcc NEAR' type
+       instructions, because it doesn't know that it's working for a 386 or
+       above. Alternatively, it could replace the out-of-range short `JNE'
+       instruction with a very short `JE' instruction that jumps over a
+       `JMP NEAR'; this is a sensible solution for processors below a 386,
+       but hardly efficient on processors which have good branch prediction
+       _and_ could have used `JNE NEAR' instead. So, once again, it's up to
+       the user, not the assembler, to decide what instructions should be
+       generated. See section 2.1.22.
+
+12.1.3 `ORG' Doesn't Work
+
+       People writing boot sector programs in the `bin' format often
+       complain that `ORG' doesn't work the way they'd like: in order to
+       place the `0xAA55' signature word at the end of a 512-byte boot
+       sector, people who are used to MASM tend to code
+
+               ORG 0 
+       
+               ; some boot sector code 
+       
+               ORG 510 
+               DW 0xAA55
+
+       This is not the intended use of the `ORG' directive in NASM, and
+       will not work. The correct way to solve this problem in NASM is to
+       use the `TIMES' directive, like this:
+
+               ORG 0 
+       
+               ; some boot sector code 
+       
+               TIMES 510-($-$$) DB 0 
+               DW 0xAA55
+
+       The `TIMES' directive will insert exactly enough zero bytes into the
+       output to move the assembly point up to 510. This method also has
+       the advantage that if you accidentally fill your boot sector too
+       full, NASM will catch the problem at assembly time and report it, so
+       you won't end up with a boot sector that you have to disassemble to
+       find out what's wrong with it.
+
+12.1.4 `TIMES' Doesn't Work
+
+       The other common problem with the above code is people who write the
+       `TIMES' line as
+
+               TIMES 510-$ DB 0
+
+       by reasoning that `$' should be a pure number, just like 510, so the
+       difference between them is also a pure number and can happily be fed
+       to `TIMES'.
+
+       NASM is a _modular_ assembler: the various component parts are
+       designed to be easily separable for re-use, so they don't exchange
+       information unnecessarily. In consequence, the `bin' output format,
+       even though it has been told by the `ORG' directive that the `.text'
+       section should start at 0, does not pass that information back to
+       the expression evaluator. So from the evaluator's point of view, `$'
+       isn't a pure number: it's an offset from a section base. Therefore
+       the difference between `$' and 510 is also not a pure number, but
+       involves a section base. Values involving section bases cannot be
+       passed as arguments to `TIMES'.
+
+       The solution, as in the previous section, is to code the `TIMES'
+       line in the form
+
+               TIMES 510-($-$$) DB 0
+
+       in which `$' and `$$' are offsets from the same section base, and so
+       their difference is a pure number. This will solve the problem and
+       generate sensible code.
+
+  12.2 Bugs
+
+       We have never yet released a version of NASM with any _known_ bugs.
+       That doesn't usually stop there being plenty we didn't know about,
+       though. Any that you find should be reported firstly via the
+       `bugtracker' at `https://sourceforge.net/projects/nasm/' (click on
+       "Bugs"), or if that fails then through one of the contacts in
+       section 1.2.
+
+       Please read section 2.2 first, and don't report the bug if it's
+       listed in there as a deliberate feature. (If you think the feature
+       is badly thought out, feel free to send us reasons why you think it
+       should be changed, but don't just send us mail saying `This is a
+       bug' if the documentation says we did it on purpose.) Then read
+       section 12.1, and don't bother reporting the bug if it's listed
+       there.
+
+       If you do report a bug, _please_ give us all of the following
+       information:
+
+       (*) What operating system you're running NASM under. DOS, Linux,
+           NetBSD, Win16, Win32, VMS (I'd be impressed), whatever.
+
+       (*) If you're running NASM under DOS or Win32, tell us whether
+           you've compiled your own executable from the DOS source archive,
+           or whether you were using the standard distribution binaries out
+           of the archive. If you were using a locally built executable,
+           try to reproduce the problem using one of the standard binaries,
+           as this will make it easier for us to reproduce your problem
+           prior to fixing it.
+
+       (*) Which version of NASM you're using, and exactly how you invoked
+           it. Give us the precise command line, and the contents of the
+           `NASMENV' environment variable if any.
+
+       (*) Which versions of any supplementary programs you're using, and
+           how you invoked them. If the problem only becomes visible at
+           link time, tell us what linker you're using, what version of it
+           you've got, and the exact linker command line. If the problem
+           involves linking against object files generated by a compiler,
+           tell us what compiler, what version, and what command line or
+           options you used. (If you're compiling in an IDE, please try to
+           reproduce the problem with the command-line version of the
+           compiler.)
+
+       (*) If at all possible, send us a NASM source file which exhibits
+           the problem. If this causes copyright problems (e.g. you can
+           only reproduce the bug in restricted-distribution code) then
+           bear in mind the following two points: firstly, we guarantee
+           that any source code sent to us for the purposes of debugging
+           NASM will be used _only_ for the purposes of debugging NASM, and
+           that we will delete all our copies of it as soon as we have
+           found and fixed the bug or bugs in question; and secondly, we
+           would prefer _not_ to be mailed large chunks of code anyway. The
+           smaller the file, the better. A three-line sample file that does
+           nothing useful _except_ demonstrate the problem is much easier
+           to work with than a fully fledged ten-thousand-line program. (Of
+           course, some errors _do_ only crop up in large files, so this
+           may not be possible.)
+
+       (*) A description of what the problem actually _is_. `It doesn't
+           work' is _not_ a helpful description! Please describe exactly
+           what is happening that shouldn't be, or what isn't happening
+           that should. Examples might be: `NASM generates an error message
+           saying Line 3 for an error that's actually on Line 5'; `NASM
+           generates an error message that I believe it shouldn't be
+           generating at all'; `NASM fails to generate an error message
+           that I believe it _should_ be generating'; `the object file
+           produced from this source code crashes my linker'; `the ninth
+           byte of the output file is 66 and I think it should be 77
+           instead'.
+
+       (*) If you believe the output file from NASM to be faulty, send it
+           to us. That allows us to determine whether our own copy of NASM
+           generates the same file, or whether the problem is related to
+           portability issues between our development platforms and yours.
+           We can handle binary files mailed to us as MIME attachments,
+           uuencoded, and even BinHex. Alternatively, we may be able to
+           provide an FTP site you can upload the suspect files to; but
+           mailing them is easier for us.
+
+       (*) Any other information or data files that might be helpful. If,
+           for example, the problem involves NASM failing to generate an
+           object file while TASM can generate an equivalent file without
+           trouble, then send us _both_ object files, so we can see what
+           TASM is doing differently from us.
+
+Appendix A: Ndisasm
+-------------------
+
+       The Netwide Disassembler, NDISASM
+
+   A.1 Introduction
+
+       The Netwide Disassembler is a small companion program to the Netwide
+       Assembler, NASM. It seemed a shame to have an x86 assembler,
+       complete with a full instruction table, and not make as much use of
+       it as possible, so here's a disassembler which shares the
+       instruction table (and some other bits of code) with NASM.
+
+       The Netwide Disassembler does nothing except to produce
+       disassemblies of _binary_ source files. NDISASM does not have any
+       understanding of object file formats, like `objdump', and it will
+       not understand `DOS .EXE' files like `debug' will. It just
+       disassembles.
+
+   A.2 Getting Started: Installation
+
+       See section 1.3 for installation instructions. NDISASM, like NASM,
+       has a `man page' which you may want to put somewhere useful, if you
+       are on a Unix system.
+
+   A.3 Running NDISASM
+
+       To disassemble a file, you will typically use a command of the form
+
+              ndisasm -b {16|32|64} filename
+
+       NDISASM can disassemble 16-, 32- or 64-bit code equally easily,
+       provided of course that you remember to specify which it is to work
+       with. If no `-b' switch is present, NDISASM works in 16-bit mode by
+       default. The `-u' switch (for USE32) also invokes 32-bit mode.
+
+       Two more command line options are `-r' which reports the version
+       number of NDISASM you are running, and `-h' which gives a short
+       summary of command line options.
+
+ A.3.1 COM Files: Specifying an Origin
+
+       To disassemble a `DOS .COM' file correctly, a disassembler must
+       assume that the first instruction in the file is loaded at address
+       `0x100', rather than at zero. NDISASM, which assumes by default that
+       any file you give it is loaded at zero, will therefore need to be
+       informed of this.
+
+       The `-o' option allows you to declare a different origin for the
+       file you are disassembling. Its argument may be expressed in any of
+       the NASM numeric formats: decimal by default, if it begins with
+       ``$'' or ``0x'' or ends in ``H'' it's `hex', if it ends in ``Q''
+       it's `octal', and if it ends in ``B'' it's `binary'.
+
+       Hence, to disassemble a `.COM' file:
+
+              ndisasm -o100h filename.com
+
+       will do the trick.
+
+ A.3.2 Code Following Data: Synchronisation
+
+       Suppose you are disassembling a file which contains some data which
+       isn't machine code, and _then_ contains some machine code. NDISASM
+       will faithfully plough through the data section, producing machine
+       instructions wherever it can (although most of them will look
+       bizarre, and some may have unusual prefixes, e.g.
+       ``FS OR AX,0x240A''), and generating `DB' instructions ever so often
+       if it's totally stumped. Then it will reach the code section.
+
+       Supposing NDISASM has just finished generating a strange machine
+       instruction from part of the data section, and its file position is
+       now one byte _before_ the beginning of the code section. It's
+       entirely possible that another spurious instruction will get
+       generated, starting with the final byte of the data section, and
+       then the correct first instruction in the code section will not be
+       seen because the starting point skipped over it. This isn't really
+       ideal.
+
+       To avoid this, you can specify a ``synchronisation'' point, or
+       indeed as many synchronisation points as you like (although NDISASM
+       can only handle 2147483647 sync points internally). The definition
+       of a sync point is this: NDISASM guarantees to hit sync points
+       exactly during disassembly. If it is thinking about generating an
+       instruction which would cause it to jump over a sync point, it will
+       discard that instruction and output a ``db'' instead. So it _will_
+       start disassembly exactly from the sync point, and so you _will_ see
+       all the instructions in your code section.
+
+       Sync points are specified using the `-s' option: they are measured
+       in terms of the program origin, not the file position. So if you
+       want to synchronize after 32 bytes of a `.COM' file, you would have
+       to do
+
+              ndisasm -o100h -s120h file.com
+
+       rather than
+
+              ndisasm -o100h -s20h file.com
+
+       As stated above, you can specify multiple sync markers if you need
+       to, just by repeating the `-s' option.
+
+ A.3.3 Mixed Code and Data: Automatic (Intelligent) Synchronisation 
+
+       Suppose you are disassembling the boot sector of a `DOS' floppy
+       (maybe it has a virus, and you need to understand the virus so that
+       you know what kinds of damage it might have done you). Typically,
+       this will contain a `JMP' instruction, then some data, then the rest
+       of the code. So there is a very good chance of NDISASM being
+       _misaligned_ when the data ends and the code begins. Hence a sync
+       point is needed.
+
+       On the other hand, why should you have to specify the sync point
+       manually? What you'd do in order to find where the sync point would
+       be, surely, would be to read the `JMP' instruction, and then to use
+       its target address as a sync point. So can NDISASM do that for you?
+
+       The answer, of course, is yes: using either of the synonymous
+       switches `-a' (for automatic sync) or `-i' (for intelligent sync)
+       will enable `auto-sync' mode. Auto-sync mode automatically generates
+       a sync point for any forward-referring PC-relative jump or call
+       instruction that NDISASM encounters. (Since NDISASM is one-pass, if
+       it encounters a PC-relative jump whose target has already been
+       processed, there isn't much it can do about it...)
+
+       Only PC-relative jumps are processed, since an absolute jump is
+       either through a register (in which case NDISASM doesn't know what
+       the register contains) or involves a segment address (in which case
+       the target code isn't in the same segment that NDISASM is working
+       in, and so the sync point can't be placed anywhere useful).
+
+       For some kinds of file, this mechanism will automatically put sync
+       points in all the right places, and save you from having to place
+       any sync points manually. However, it should be stressed that auto-
+       sync mode is _not_ guaranteed to catch all the sync points, and you
+       may still have to place some manually.
+
+       Auto-sync mode doesn't prevent you from declaring manual sync
+       points: it just adds automatically generated ones to the ones you
+       provide. It's perfectly feasible to specify `-i' _and_ some `-s'
+       options.
+
+       Another caveat with auto-sync mode is that if, by some unpleasant
+       fluke, something in your data section should disassemble to a PC-
+       relative call or jump instruction, NDISASM may obediently place a
+       sync point in a totally random place, for example in the middle of
+       one of the instructions in your code section. So you may end up with
+       a wrong disassembly even if you use auto-sync. Again, there isn't
+       much I can do about this. If you have problems, you'll have to use
+       manual sync points, or use the `-k' option (documented below) to
+       suppress disassembly of the data area.
+
+ A.3.4 Other Options
+
+       The `-e' option skips a header on the file, by ignoring the first N
+       bytes. This means that the header is _not_ counted towards the
+       disassembly offset: if you give `-e10 -o10', disassembly will start
+       at byte 10 in the file, and this will be given offset 10, not 20.
+
+       The `-k' option is provided with two comma-separated numeric
+       arguments, the first of which is an assembly offset and the second
+       is a number of bytes to skip. This _will_ count the skipped bytes
+       towards the assembly offset: its use is to suppress disassembly of a
+       data section which wouldn't contain anything you wanted to see
+       anyway.
+
+   A.4 Bugs and Improvements
+
+       There are no known bugs. However, any you find, with patches if
+       possible, should be sent to `nasm-bugs@lists.sourceforge.net', or to
+       the developer's site at `https://sourceforge.net/projects/nasm/' and
+       we'll try to fix them. Feel free to send contributions and new
+       features as well.
+
+Appendix B: Instruction List
+----------------------------
+
+   B.1 Introduction
+
+       The following sections show the instructions which NASM currently
+       supports. For each instruction, there is a separate entry for each
+       supported addressing mode. The third column shows the processor type
+       in which the instruction was introduced and, when appropriate, one
+       or more usage flags.
+
+ B.1.1 Special instructions...
+
+       DB                                         
+       DW                                         
+       DD                                         
+       DQ                                         
+       DT                                         
+       DO                                         
+       DY                                         
+       RESB             imm                      8086 
+       RESW                                       
+       RESD                                       
+       RESQ                                       
+       REST                                       
+       RESO                                       
+       RESY                                      
+
+ B.1.2 Conventional instructions
+
+       AAA                                       8086,NOLONG 
+       AAD                                       8086,NOLONG 
+       AAD              imm                      8086,NOLONG 
+       AAM                                       8086,NOLONG 
+       AAM              imm                      8086,NOLONG 
+       AAS                                       8086,NOLONG 
+       ADC              mem,reg8                 8086 
+       ADC              reg8,reg8                8086 
+       ADC              mem,reg16                8086 
+       ADC              reg16,reg16              8086 
+       ADC              mem,reg32                386 
+       ADC              reg32,reg32              386 
+       ADC              mem,reg64                X64 
+       ADC              reg64,reg64              X64 
+       ADC              reg8,mem                 8086 
+       ADC              reg8,reg8                8086 
+       ADC              reg16,mem                8086 
+       ADC              reg16,reg16              8086 
+       ADC              reg32,mem                386 
+       ADC              reg32,reg32              386 
+       ADC              reg64,mem                X64 
+       ADC              reg64,reg64              X64 
+       ADC              rm16,imm8                8086 
+       ADC              rm32,imm8                386 
+       ADC              rm64,imm8                X64 
+       ADC              reg_al,imm               8086 
+       ADC              reg_ax,sbyte16           8086 
+       ADC              reg_ax,imm               8086 
+       ADC              reg_eax,sbyte32          386 
+       ADC              reg_eax,imm              386 
+       ADC              reg_rax,sbyte64          X64 
+       ADC              reg_rax,imm              X64 
+       ADC              rm8,imm                  8086 
+       ADC              rm16,imm                 8086 
+       ADC              rm32,imm                 386 
+       ADC              rm64,imm                 X64 
+       ADC              mem,imm8                 8086 
+       ADC              mem,imm16                8086 
+       ADC              mem,imm32                386 
+       ADD              mem,reg8                 8086 
+       ADD              reg8,reg8                8086 
+       ADD              mem,reg16                8086 
+       ADD              reg16,reg16              8086 
+       ADD              mem,reg32                386 
+       ADD              reg32,reg32              386 
+       ADD              mem,reg64                X64 
+       ADD              reg64,reg64              X64 
+       ADD              reg8,mem                 8086 
+       ADD              reg8,reg8                8086 
+       ADD              reg16,mem                8086 
+       ADD              reg16,reg16              8086 
+       ADD              reg32,mem                386 
+       ADD              reg32,reg32              386 
+       ADD              reg64,mem                X64 
+       ADD              reg64,reg64              X64 
+       ADD              rm16,imm8                8086 
+       ADD              rm32,imm8                386 
+       ADD              rm64,imm8                X64 
+       ADD              reg_al,imm               8086 
+       ADD              reg_ax,sbyte16           8086 
+       ADD              reg_ax,imm               8086 
+       ADD              reg_eax,sbyte32          386 
+       ADD              reg_eax,imm              386 
+       ADD              reg_rax,sbyte64          X64 
+       ADD              reg_rax,imm              X64 
+       ADD              rm8,imm                  8086 
+       ADD              rm16,imm                 8086 
+       ADD              rm32,imm                 386 
+       ADD              rm64,imm                 X64 
+       ADD              mem,imm8                 8086 
+       ADD              mem,imm16                8086 
+       ADD              mem,imm32                386 
+       AND              mem,reg8                 8086 
+       AND              reg8,reg8                8086 
+       AND              mem,reg16                8086 
+       AND              reg16,reg16              8086 
+       AND              mem,reg32                386 
+       AND              reg32,reg32              386 
+       AND              mem,reg64                X64 
+       AND              reg64,reg64              X64 
+       AND              reg8,mem                 8086 
+       AND              reg8,reg8                8086 
+       AND              reg16,mem                8086 
+       AND              reg16,reg16              8086 
+       AND              reg32,mem                386 
+       AND              reg32,reg32              386 
+       AND              reg64,mem                X64 
+       AND              reg64,reg64              X64 
+       AND              rm16,imm8                8086 
+       AND              rm32,imm8                386 
+       AND              rm64,imm8                X64 
+       AND              reg_al,imm               8086 
+       AND              reg_ax,sbyte16           8086 
+       AND              reg_ax,imm               8086 
+       AND              reg_eax,sbyte32          386 
+       AND              reg_eax,imm              386 
+       AND              reg_rax,sbyte64          X64 
+       AND              reg_rax,imm              X64 
+       AND              rm8,imm                  8086 
+       AND              rm16,imm                 8086 
+       AND              rm32,imm                 386 
+       AND              rm64,imm                 X64 
+       AND              mem,imm8                 8086 
+       AND              mem,imm16                8086 
+       AND              mem,imm32                386 
+       ARPL             mem,reg16                286,PROT,NOLONG 
+       ARPL             reg16,reg16              286,PROT,NOLONG 
+       BB0_RESET                                 PENT,CYRIX,ND 
+       BB1_RESET                                 PENT,CYRIX,ND 
+       BOUND            reg16,mem                186,NOLONG 
+       BOUND            reg32,mem                386,NOLONG 
+       BSF              reg16,mem                386 
+       BSF              reg16,reg16              386 
+       BSF              reg32,mem                386 
+       BSF              reg32,reg32              386 
+       BSF              reg64,mem                X64 
+       BSF              reg64,reg64              X64 
+       BSR              reg16,mem                386 
+       BSR              reg16,reg16              386 
+       BSR              reg32,mem                386 
+       BSR              reg32,reg32              386 
+       BSR              reg64,mem                X64 
+       BSR              reg64,reg64              X64 
+       BSWAP            reg32                    486 
+       BSWAP            reg64                    X64 
+       BT               mem,reg16                386 
+       BT               reg16,reg16              386 
+       BT               mem,reg32                386 
+       BT               reg32,reg32              386 
+       BT               mem,reg64                X64 
+       BT               reg64,reg64              X64 
+       BT               rm16,imm                 386 
+       BT               rm32,imm                 386 
+       BT               rm64,imm                 X64 
+       BTC              mem,reg16                386 
+       BTC              reg16,reg16              386 
+       BTC              mem,reg32                386 
+       BTC              reg32,reg32              386 
+       BTC              mem,reg64                X64 
+       BTC              reg64,reg64              X64 
+       BTC              rm16,imm                 386 
+       BTC              rm32,imm                 386 
+       BTC              rm64,imm                 X64 
+       BTR              mem,reg16                386 
+       BTR              reg16,reg16              386 
+       BTR              mem,reg32                386 
+       BTR              reg32,reg32              386 
+       BTR              mem,reg64                X64 
+       BTR              reg64,reg64              X64 
+       BTR              rm16,imm                 386 
+       BTR              rm32,imm                 386 
+       BTR              rm64,imm                 X64 
+       BTS              mem,reg16                386 
+       BTS              reg16,reg16              386 
+       BTS              mem,reg32                386 
+       BTS              reg32,reg32              386 
+       BTS              mem,reg64                X64 
+       BTS              reg64,reg64              X64 
+       BTS              rm16,imm                 386 
+       BTS              rm32,imm                 386 
+       BTS              rm64,imm                 X64 
+       CALL             imm                      8086 
+       CALL             imm|near                 8086 
+       CALL             imm|far                  8086,ND,NOLONG 
+       CALL             imm16                    8086 
+       CALL             imm16|near               8086 
+       CALL             imm16|far                8086,ND,NOLONG 
+       CALL             imm32                    386 
+       CALL             imm32|near               386 
+       CALL             imm32|far                386,ND,NOLONG 
+       CALL             imm:imm                  8086,NOLONG 
+       CALL             imm16:imm                8086,NOLONG 
+       CALL             imm:imm16                8086,NOLONG 
+       CALL             imm32:imm                386,NOLONG 
+       CALL             imm:imm32                386,NOLONG 
+       CALL             mem|far                  8086,NOLONG 
+       CALL             mem|far                  X64 
+       CALL             mem16|far                8086 
+       CALL             mem32|far                386 
+       CALL             mem64|far                X64 
+       CALL             mem|near                 8086 
+       CALL             mem16|near               8086 
+       CALL             mem32|near               386,NOLONG 
+       CALL             mem64|near               X64 
+       CALL             reg16                    8086 
+       CALL             reg32                    386,NOLONG 
+       CALL             reg64                    X64 
+       CALL             mem                      8086 
+       CALL             mem16                    8086 
+       CALL             mem32                    386,NOLONG 
+       CALL             mem64                    X64 
+       CBW                                       8086 
+       CDQ                                       386 
+       CDQE                                      X64 
+       CLC                                       8086 
+       CLD                                       8086 
+       CLGI                                      X64,AMD 
+       CLI                                       8086 
+       CLTS                                      286,PRIV 
+       CMC                                       8086 
+       CMP              mem,reg8                 8086 
+       CMP              reg8,reg8                8086 
+       CMP              mem,reg16                8086 
+       CMP              reg16,reg16              8086 
+       CMP              mem,reg32                386 
+       CMP              reg32,reg32              386 
+       CMP              mem,reg64                X64 
+       CMP              reg64,reg64              X64 
+       CMP              reg8,mem                 8086 
+       CMP              reg8,reg8                8086 
+       CMP              reg16,mem                8086 
+       CMP              reg16,reg16              8086 
+       CMP              reg32,mem                386 
+       CMP              reg32,reg32              386 
+       CMP              reg64,mem                X64 
+       CMP              reg64,reg64              X64 
+       CMP              rm16,imm8                8086 
+       CMP              rm32,imm8                386 
+       CMP              rm64,imm8                X64 
+       CMP              reg_al,imm               8086 
+       CMP              reg_ax,sbyte16           8086 
+       CMP              reg_ax,imm               8086 
+       CMP              reg_eax,sbyte32          386 
+       CMP              reg_eax,imm              386 
+       CMP              reg_rax,sbyte64          X64 
+       CMP              reg_rax,imm              X64 
+       CMP              rm8,imm                  8086 
+       CMP              rm16,imm                 8086 
+       CMP              rm32,imm                 386 
+       CMP              rm64,imm                 X64 
+       CMP              mem,imm8                 8086 
+       CMP              mem,imm16                8086 
+       CMP              mem,imm32                386 
+       CMPSB                                     8086 
+       CMPSD                                     386 
+       CMPSQ                                     X64 
+       CMPSW                                     8086 
+       CMPXCHG          mem,reg8                 PENT 
+       CMPXCHG          reg8,reg8                PENT 
+       CMPXCHG          mem,reg16                PENT 
+       CMPXCHG          reg16,reg16              PENT 
+       CMPXCHG          mem,reg32                PENT 
+       CMPXCHG          reg32,reg32              PENT 
+       CMPXCHG          mem,reg64                X64 
+       CMPXCHG          reg64,reg64              X64 
+       CMPXCHG486       mem,reg8                 486,UNDOC,ND 
+       CMPXCHG486       reg8,reg8                486,UNDOC,ND 
+       CMPXCHG486       mem,reg16                486,UNDOC,ND 
+       CMPXCHG486       reg16,reg16              486,UNDOC,ND 
+       CMPXCHG486       mem,reg32                486,UNDOC,ND 
+       CMPXCHG486       reg32,reg32              486,UNDOC,ND 
+       CMPXCHG8B        mem                      PENT 
+       CMPXCHG16B       mem                      X64 
+       CPUID                                     PENT 
+       CPU_READ                                  PENT,CYRIX 
+       CPU_WRITE                                 PENT,CYRIX 
+       CQO                                       X64 
+       CWD                                       8086 
+       CWDE                                      386 
+       DAA                                       8086,NOLONG 
+       DAS                                       8086,NOLONG 
+       DEC              reg16                    8086,NOLONG 
+       DEC              reg32                    386,NOLONG 
+       DEC              rm8                      8086 
+       DEC              rm16                     8086 
+       DEC              rm32                     386 
+       DEC              rm64                     X64 
+       DIV              rm8                      8086 
+       DIV              rm16                     8086 
+       DIV              rm32                     386 
+       DIV              rm64                     X64 
+       DMINT                                     P6,CYRIX 
+       EMMS                                      PENT,MMX 
+       ENTER            imm,imm                  186 
+       EQU              imm                      8086 
+       EQU              imm:imm                  8086 
+       F2XM1                                     8086,FPU 
+       FABS                                      8086,FPU 
+       FADD             mem32                    8086,FPU 
+       FADD             mem64                    8086,FPU 
+       FADD             fpureg|to                8086,FPU 
+       FADD             fpureg                   8086,FPU 
+       FADD             fpureg,fpu0              8086,FPU 
+       FADD             fpu0,fpureg              8086,FPU 
+       FADD                                      8086,FPU,ND 
+       FADDP            fpureg                   8086,FPU 
+       FADDP            fpureg,fpu0              8086,FPU 
+       FADDP                                     8086,FPU,ND 
+       FBLD             mem80                    8086,FPU 
+       FBLD             mem                      8086,FPU 
+       FBSTP            mem80                    8086,FPU 
+       FBSTP            mem                      8086,FPU 
+       FCHS                                      8086,FPU 
+       FCLEX                                     8086,FPU 
+       FCMOVB           fpureg                   P6,FPU 
+       FCMOVB           fpu0,fpureg              P6,FPU 
+       FCMOVB                                    P6,FPU,ND 
+       FCMOVBE          fpureg                   P6,FPU 
+       FCMOVBE          fpu0,fpureg              P6,FPU 
+       FCMOVBE                                   P6,FPU,ND 
+       FCMOVE           fpureg                   P6,FPU 
+       FCMOVE           fpu0,fpureg              P6,FPU 
+       FCMOVE                                    P6,FPU,ND 
+       FCMOVNB          fpureg                   P6,FPU 
+       FCMOVNB          fpu0,fpureg              P6,FPU 
+       FCMOVNB                                   P6,FPU,ND 
+       FCMOVNBE         fpureg                   P6,FPU 
+       FCMOVNBE         fpu0,fpureg              P6,FPU 
+       FCMOVNBE                                  P6,FPU,ND 
+       FCMOVNE          fpureg                   P6,FPU 
+       FCMOVNE          fpu0,fpureg              P6,FPU 
+       FCMOVNE                                   P6,FPU,ND 
+       FCMOVNU          fpureg                   P6,FPU 
+       FCMOVNU          fpu0,fpureg              P6,FPU 
+       FCMOVNU                                   P6,FPU,ND 
+       FCMOVU           fpureg                   P6,FPU 
+       FCMOVU           fpu0,fpureg              P6,FPU 
+       FCMOVU                                    P6,FPU,ND 
+       FCOM             mem32                    8086,FPU 
+       FCOM             mem64                    8086,FPU 
+       FCOM             fpureg                   8086,FPU 
+       FCOM             fpu0,fpureg              8086,FPU 
+       FCOM                                      8086,FPU,ND 
+       FCOMI            fpureg                   P6,FPU 
+       FCOMI            fpu0,fpureg              P6,FPU 
+       FCOMI                                     P6,FPU,ND 
+       FCOMIP           fpureg                   P6,FPU 
+       FCOMIP           fpu0,fpureg              P6,FPU 
+       FCOMIP                                    P6,FPU,ND 
+       FCOMP            mem32                    8086,FPU 
+       FCOMP            mem64                    8086,FPU 
+       FCOMP            fpureg                   8086,FPU 
+       FCOMP            fpu0,fpureg              8086,FPU 
+       FCOMP                                     8086,FPU,ND 
+       FCOMPP                                    8086,FPU 
+       FCOS                                      386,FPU 
+       FDECSTP                                   8086,FPU 
+       FDISI                                     8086,FPU 
+       FDIV             mem32                    8086,FPU 
+       FDIV             mem64                    8086,FPU 
+       FDIV             fpureg|to                8086,FPU 
+       FDIV             fpureg                   8086,FPU 
+       FDIV             fpureg,fpu0              8086,FPU 
+       FDIV             fpu0,fpureg              8086,FPU 
+       FDIV                                      8086,FPU,ND 
+       FDIVP            fpureg                   8086,FPU 
+       FDIVP            fpureg,fpu0              8086,FPU 
+       FDIVP                                     8086,FPU,ND 
+       FDIVR            mem32                    8086,FPU 
+       FDIVR            mem64                    8086,FPU 
+       FDIVR            fpureg|to                8086,FPU 
+       FDIVR            fpureg,fpu0              8086,FPU 
+       FDIVR            fpureg                   8086,FPU 
+       FDIVR            fpu0,fpureg              8086,FPU 
+       FDIVR                                     8086,FPU,ND 
+       FDIVRP           fpureg                   8086,FPU 
+       FDIVRP           fpureg,fpu0              8086,FPU 
+       FDIVRP                                    8086,FPU,ND 
+       FEMMS                                     PENT,3DNOW 
+       FENI                                      8086,FPU 
+       FFREE            fpureg                   8086,FPU 
+       FFREE                                     8086,FPU 
+       FFREEP           fpureg                   286,FPU,UNDOC 
+       FFREEP                                    286,FPU,UNDOC 
+       FIADD            mem32                    8086,FPU 
+       FIADD            mem16                    8086,FPU 
+       FICOM            mem32                    8086,FPU 
+       FICOM            mem16                    8086,FPU 
+       FICOMP           mem32                    8086,FPU 
+       FICOMP           mem16                    8086,FPU 
+       FIDIV            mem32                    8086,FPU 
+       FIDIV            mem16                    8086,FPU 
+       FIDIVR           mem32                    8086,FPU 
+       FIDIVR           mem16                    8086,FPU 
+       FILD             mem32                    8086,FPU 
+       FILD             mem16                    8086,FPU 
+       FILD             mem64                    8086,FPU 
+       FIMUL            mem32                    8086,FPU 
+       FIMUL            mem16                    8086,FPU 
+       FINCSTP                                   8086,FPU 
+       FINIT                                     8086,FPU 
+       FIST             mem32                    8086,FPU 
+       FIST             mem16                    8086,FPU 
+       FISTP            mem32                    8086,FPU 
+       FISTP            mem16                    8086,FPU 
+       FISTP            mem64                    8086,FPU 
+       FISTTP           mem16                    PRESCOTT,FPU 
+       FISTTP           mem32                    PRESCOTT,FPU 
+       FISTTP           mem64                    PRESCOTT,FPU 
+       FISUB            mem32                    8086,FPU 
+       FISUB            mem16                    8086,FPU 
+       FISUBR           mem32                    8086,FPU 
+       FISUBR           mem16                    8086,FPU 
+       FLD              mem32                    8086,FPU 
+       FLD              mem64                    8086,FPU 
+       FLD              mem80                    8086,FPU 
+       FLD              fpureg                   8086,FPU 
+       FLD                                       8086,FPU,ND 
+       FLD1                                      8086,FPU 
+       FLDCW            mem                      8086,FPU,SW 
+       FLDENV           mem                      8086,FPU 
+       FLDL2E                                    8086,FPU 
+       FLDL2T                                    8086,FPU 
+       FLDLG2                                    8086,FPU 
+       FLDLN2                                    8086,FPU 
+       FLDPI                                     8086,FPU 
+       FLDZ                                      8086,FPU 
+       FMUL             mem32                    8086,FPU 
+       FMUL             mem64                    8086,FPU 
+       FMUL             fpureg|to                8086,FPU 
+       FMUL             fpureg,fpu0              8086,FPU 
+       FMUL             fpureg                   8086,FPU 
+       FMUL             fpu0,fpureg              8086,FPU 
+       FMUL                                      8086,FPU,ND 
+       FMULP            fpureg                   8086,FPU 
+       FMULP            fpureg,fpu0              8086,FPU 
+       FMULP                                     8086,FPU,ND 
+       FNCLEX                                    8086,FPU 
+       FNDISI                                    8086,FPU 
+       FNENI                                     8086,FPU 
+       FNINIT                                    8086,FPU 
+       FNOP                                      8086,FPU 
+       FNSAVE           mem                      8086,FPU 
+       FNSTCW           mem                      8086,FPU,SW 
+       FNSTENV          mem                      8086,FPU 
+       FNSTSW           mem                      8086,FPU,SW 
+       FNSTSW           reg_ax                   286,FPU 
+       FPATAN                                    8086,FPU 
+       FPREM                                     8086,FPU 
+       FPREM1                                    386,FPU 
+       FPTAN                                     8086,FPU 
+       FRNDINT                                   8086,FPU 
+       FRSTOR           mem                      8086,FPU 
+       FSAVE            mem                      8086,FPU 
+       FSCALE                                    8086,FPU 
+       FSETPM                                    286,FPU 
+       FSIN                                      386,FPU 
+       FSINCOS                                   386,FPU 
+       FSQRT                                     8086,FPU 
+       FST              mem32                    8086,FPU 
+       FST              mem64                    8086,FPU 
+       FST              fpureg                   8086,FPU 
+       FST                                       8086,FPU,ND 
+       FSTCW            mem                      8086,FPU,SW 
+       FSTENV           mem                      8086,FPU 
+       FSTP             mem32                    8086,FPU 
+       FSTP             mem64                    8086,FPU 
+       FSTP             mem80                    8086,FPU 
+       FSTP             fpureg                   8086,FPU 
+       FSTP                                      8086,FPU,ND 
+       FSTSW            mem                      8086,FPU,SW 
+       FSTSW            reg_ax                   286,FPU 
+       FSUB             mem32                    8086,FPU 
+       FSUB             mem64                    8086,FPU 
+       FSUB             fpureg|to                8086,FPU 
+       FSUB             fpureg,fpu0              8086,FPU 
+       FSUB             fpureg                   8086,FPU 
+       FSUB             fpu0,fpureg              8086,FPU 
+       FSUB                                      8086,FPU,ND 
+       FSUBP            fpureg                   8086,FPU 
+       FSUBP            fpureg,fpu0              8086,FPU 
+       FSUBP                                     8086,FPU,ND 
+       FSUBR            mem32                    8086,FPU 
+       FSUBR            mem64                    8086,FPU 
+       FSUBR            fpureg|to                8086,FPU 
+       FSUBR            fpureg,fpu0              8086,FPU 
+       FSUBR            fpureg                   8086,FPU 
+       FSUBR            fpu0,fpureg              8086,FPU 
+       FSUBR                                     8086,FPU,ND 
+       FSUBRP           fpureg                   8086,FPU 
+       FSUBRP           fpureg,fpu0              8086,FPU 
+       FSUBRP                                    8086,FPU,ND 
+       FTST                                      8086,FPU 
+       FUCOM            fpureg                   386,FPU 
+       FUCOM            fpu0,fpureg              386,FPU 
+       FUCOM                                     386,FPU,ND 
+       FUCOMI           fpureg                   P6,FPU 
+       FUCOMI           fpu0,fpureg              P6,FPU 
+       FUCOMI                                    P6,FPU,ND 
+       FUCOMIP          fpureg                   P6,FPU 
+       FUCOMIP          fpu0,fpureg              P6,FPU 
+       FUCOMIP                                   P6,FPU,ND 
+       FUCOMP           fpureg                   386,FPU 
+       FUCOMP           fpu0,fpureg              386,FPU 
+       FUCOMP                                    386,FPU,ND 
+       FUCOMPP                                   386,FPU 
+       FXAM                                      8086,FPU 
+       FXCH             fpureg                   8086,FPU 
+       FXCH             fpureg,fpu0              8086,FPU 
+       FXCH             fpu0,fpureg              8086,FPU 
+       FXCH                                      8086,FPU,ND 
+       FXTRACT                                   8086,FPU 
+       FYL2X                                     8086,FPU 
+       FYL2XP1                                   8086,FPU 
+       HLT                                       8086,PRIV 
+       IBTS             mem,reg16                386,SW,UNDOC,ND 
+       IBTS             reg16,reg16              386,UNDOC,ND 
+       IBTS             mem,reg32                386,SD,UNDOC,ND 
+       IBTS             reg32,reg32              386,UNDOC,ND 
+       ICEBP                                     386,ND 
+       IDIV             rm8                      8086 
+       IDIV             rm16                     8086 
+       IDIV             rm32                     386 
+       IDIV             rm64                     X64 
+       IMUL             rm8                      8086 
+       IMUL             rm16                     8086 
+       IMUL             rm32                     386 
+       IMUL             rm64                     X64 
+       IMUL             reg16,mem                386 
+       IMUL             reg16,reg16              386 
+       IMUL             reg32,mem                386 
+       IMUL             reg32,reg32              386 
+       IMUL             reg64,mem                X64 
+       IMUL             reg64,reg64              X64 
+       IMUL             reg16,mem,imm8           186 
+       IMUL             reg16,mem,sbyte16        186,ND 
+       IMUL             reg16,mem,imm16          186 
+       IMUL             reg16,mem,imm            186,ND 
+       IMUL             reg16,reg16,imm8         186 
+       IMUL             reg16,reg16,sbyte16      186,ND 
+       IMUL             reg16,reg16,imm16        186 
+       IMUL             reg16,reg16,imm          186,ND 
+       IMUL             reg32,mem,imm8           386 
+       IMUL             reg32,mem,sbyte32        386,ND 
+       IMUL             reg32,mem,imm32          386 
+       IMUL             reg32,mem,imm            386,ND 
+       IMUL             reg32,reg32,imm8         386 
+       IMUL             reg32,reg32,sbyte32      386,ND 
+       IMUL             reg32,reg32,imm32        386 
+       IMUL             reg32,reg32,imm          386,ND 
+       IMUL             reg64,mem,imm8           X64 
+       IMUL             reg64,mem,sbyte64        X64,ND 
+       IMUL             reg64,mem,imm32          X64 
+       IMUL             reg64,mem,imm            X64,ND 
+       IMUL             reg64,reg64,imm8         X64 
+       IMUL             reg64,reg64,sbyte64      X64,ND 
+       IMUL             reg64,reg64,imm32        X64 
+       IMUL             reg64,reg64,imm          X64,ND 
+       IMUL             reg16,imm8               186 
+       IMUL             reg16,sbyte16            186,ND 
+       IMUL             reg16,imm16              186 
+       IMUL             reg16,imm                186,ND 
+       IMUL             reg32,imm8               386 
+       IMUL             reg32,sbyte32            386,ND 
+       IMUL             reg32,imm32              386 
+       IMUL             reg32,imm                386,ND 
+       IMUL             reg64,imm8               X64 
+       IMUL             reg64,sbyte64            X64,ND 
+       IMUL             reg64,imm32              X64 
+       IMUL             reg64,imm                X64,ND 
+       IN               reg_al,imm               8086 
+       IN               reg_ax,imm               8086 
+       IN               reg_eax,imm              386 
+       IN               reg_al,reg_dx            8086 
+       IN               reg_ax,reg_dx            8086 
+       IN               reg_eax,reg_dx           386 
+       INC              reg16                    8086,NOLONG 
+       INC              reg32                    386,NOLONG 
+       INC              rm8                      8086 
+       INC              rm16                     8086 
+       INC              rm32                     386 
+       INC              rm64                     X64 
+       INCBIN                                     
+       INSB                                      186 
+       INSD                                      386 
+       INSW                                      186 
+       INT              imm                      8086 
+       INT01                                     386,ND 
+       INT1                                      386 
+       INT03                                     8086,ND 
+       INT3                                      8086 
+       INTO                                      8086,NOLONG 
+       INVD                                      486,PRIV 
+       INVLPG           mem                      486,PRIV 
+       INVLPGA          reg_ax,reg_ecx           X86_64,AMD,NOLONG 
+       INVLPGA          reg_eax,reg_ecx          X86_64,AMD 
+       INVLPGA          reg_rax,reg_ecx          X64,AMD 
+       INVLPGA                                   X86_64,AMD 
+       IRET                                      8086 
+       IRETD                                     386 
+       IRETQ                                     X64 
+       IRETW                                     8086 
+       JCXZ             imm                      8086,NOLONG 
+       JECXZ            imm                      386 
+       JRCXZ            imm                      X64 
+       JMP              imm|short                8086 
+       JMP              imm                      8086,ND 
+       JMP              imm                      8086 
+       JMP              imm|near                 8086,ND 
+       JMP              imm|far                  8086,ND,NOLONG 
+       JMP              imm16                    8086 
+       JMP              imm16|near               8086,ND 
+       JMP              imm16|far                8086,ND,NOLONG 
+       JMP              imm32                    386 
+       JMP              imm32|near               386,ND 
+       JMP              imm32|far                386,ND,NOLONG 
+       JMP              imm:imm                  8086,NOLONG 
+       JMP              imm16:imm                8086,NOLONG 
+       JMP              imm:imm16                8086,NOLONG 
+       JMP              imm32:imm                386,NOLONG 
+       JMP              imm:imm32                386,NOLONG 
+       JMP              mem|far                  8086,NOLONG 
+       JMP              mem|far                  X64 
+       JMP              mem16|far                8086 
+       JMP              mem32|far                386 
+       JMP              mem64|far                X64 
+       JMP              mem|near                 8086 
+       JMP              mem16|near               8086 
+       JMP              mem32|near               386,NOLONG 
+       JMP              mem64|near               X64 
+       JMP              reg16                    8086 
+       JMP              reg32                    386,NOLONG 
+       JMP              reg64                    X64 
+       JMP              mem                      8086 
+       JMP              mem16                    8086 
+       JMP              mem32                    386,NOLONG 
+       JMP              mem64                    X64 
+       JMPE             imm                      IA64 
+       JMPE             imm16                    IA64 
+       JMPE             imm32                    IA64 
+       JMPE             rm16                     IA64 
+       JMPE             rm32                     IA64 
+       LAHF                                      8086 
+       LAR              reg16,mem                286,PROT,SW 
+       LAR              reg16,reg16              286,PROT 
+       LAR              reg16,reg32              386,PROT 
+       LAR              reg16,reg64              X64,PROT,ND 
+       LAR              reg32,mem                386,PROT,SW 
+       LAR              reg32,reg16              386,PROT 
+       LAR              reg32,reg32              386,PROT 
+       LAR              reg32,reg64              X64,PROT,ND 
+       LAR              reg64,mem                X64,PROT,SW 
+       LAR              reg64,reg16              X64,PROT 
+       LAR              reg64,reg32              X64,PROT 
+       LAR              reg64,reg64              X64,PROT 
+       LDS              reg16,mem                8086,NOLONG 
+       LDS              reg32,mem                386,NOLONG 
+       LEA              reg16,mem                8086 
+       LEA              reg32,mem                386 
+       LEA              reg64,mem                X64 
+       LEAVE                                     186 
+       LES              reg16,mem                8086,NOLONG 
+       LES              reg32,mem                386,NOLONG 
+       LFENCE                                    X64,AMD 
+       LFS              reg16,mem                386 
+       LFS              reg32,mem                386 
+       LGDT             mem                      286,PRIV 
+       LGS              reg16,mem                386 
+       LGS              reg32,mem                386 
+       LIDT             mem                      286,PRIV 
+       LLDT             mem                      286,PROT,PRIV 
+       LLDT             mem16                    286,PROT,PRIV 
+       LLDT             reg16                    286,PROT,PRIV 
+       LMSW             mem                      286,PRIV 
+       LMSW             mem16                    286,PRIV 
+       LMSW             reg16                    286,PRIV 
+       LOADALL                                   386,UNDOC 
+       LOADALL286                                286,UNDOC 
+       LODSB                                     8086 
+       LODSD                                     386 
+       LODSQ                                     X64 
+       LODSW                                     8086 
+       LOOP             imm                      8086 
+       LOOP             imm,reg_cx               8086,NOLONG 
+       LOOP             imm,reg_ecx              386 
+       LOOP             imm,reg_rcx              X64 
+       LOOPE            imm                      8086 
+       LOOPE            imm,reg_cx               8086,NOLONG 
+       LOOPE            imm,reg_ecx              386 
+       LOOPE            imm,reg_rcx              X64 
+       LOOPNE           imm                      8086 
+       LOOPNE           imm,reg_cx               8086,NOLONG 
+       LOOPNE           imm,reg_ecx              386 
+       LOOPNE           imm,reg_rcx              X64 
+       LOOPNZ           imm                      8086 
+       LOOPNZ           imm,reg_cx               8086,NOLONG 
+       LOOPNZ           imm,reg_ecx              386 
+       LOOPNZ           imm,reg_rcx              X64 
+       LOOPZ            imm                      8086 
+       LOOPZ            imm,reg_cx               8086,NOLONG 
+       LOOPZ            imm,reg_ecx              386 
+       LOOPZ            imm,reg_rcx              X64 
+       LSL              reg16,mem                286,PROT,SW 
+       LSL              reg16,reg16              286,PROT 
+       LSL              reg16,reg32              386,PROT 
+       LSL              reg16,reg64              X64,PROT,ND 
+       LSL              reg32,mem                386,PROT,SW 
+       LSL              reg32,reg16              386,PROT 
+       LSL              reg32,reg32              386,PROT 
+       LSL              reg32,reg64              X64,PROT,ND 
+       LSL              reg64,mem                X64,PROT,SW 
+       LSL              reg64,reg16              X64,PROT 
+       LSL              reg64,reg32              X64,PROT 
+       LSL              reg64,reg64              X64,PROT 
+       LSS              reg16,mem                386 
+       LSS              reg32,mem                386 
+       LTR              mem                      286,PROT,PRIV 
+       LTR              mem16                    286,PROT,PRIV 
+       LTR              reg16                    286,PROT,PRIV 
+       MFENCE                                    X64,AMD 
+       MONITOR                                   PRESCOTT 
+       MONITOR          reg_eax,reg_ecx,reg_edx  PRESCOTT,ND 
+       MONITOR          reg_rax,reg_ecx,reg_edx  X64,ND 
+       MOV              mem,reg_sreg             8086 
+       MOV              reg16,reg_sreg           8086 
+       MOV              reg32,reg_sreg           386 
+       MOV              reg_sreg,mem             8086 
+       MOV              reg_sreg,reg16           8086 
+       MOV              reg_sreg,reg32           386 
+       MOV              reg_al,mem_offs          8086 
+       MOV              reg_ax,mem_offs          8086 
+       MOV              reg_eax,mem_offs         386 
+       MOV              reg_rax,mem_offs         X64 
+       MOV              mem_offs,reg_al          8086 
+       MOV              mem_offs,reg_ax          8086 
+       MOV              mem_offs,reg_eax         386 
+       MOV              mem_offs,reg_rax         X64 
+       MOV              reg32,reg_creg           386,PRIV,NOLONG 
+       MOV              reg64,reg_creg           X64,PRIV 
+       MOV              reg_creg,reg32           386,PRIV,NOLONG 
+       MOV              reg_creg,reg64           X64,PRIV 
+       MOV              reg32,reg_dreg           386,PRIV,NOLONG 
+       MOV              reg64,reg_dreg           X64,PRIV 
+       MOV              reg_dreg,reg32           386,PRIV,NOLONG 
+       MOV              reg_dreg,reg64           X64,PRIV 
+       MOV              reg32,reg_treg           386,NOLONG,ND 
+       MOV              reg_treg,reg32           386,NOLONG,ND 
+       MOV              mem,reg8                 8086 
+       MOV              reg8,reg8                8086 
+       MOV              mem,reg16                8086 
+       MOV              reg16,reg16              8086 
+       MOV              mem,reg32                386 
+       MOV              reg32,reg32              386 
+       MOV              mem,reg64                X64 
+       MOV              reg64,reg64              X64 
+       MOV              reg8,mem                 8086 
+       MOV              reg8,reg8                8086 
+       MOV              reg16,mem                8086 
+       MOV              reg16,reg16              8086 
+       MOV              reg32,mem                386 
+       MOV              reg32,reg32              386 
+       MOV              reg64,mem                X64 
+       MOV              reg64,reg64              X64 
+       MOV              reg8,imm                 8086 
+       MOV              reg16,imm                8086 
+       MOV              reg32,imm                386 
+       MOV              reg64,imm                X64 
+       MOV              reg64,imm32              X64 
+       MOV              rm8,imm                  8086 
+       MOV              rm16,imm                 8086 
+       MOV              rm32,imm                 386 
+       MOV              rm64,imm                 X64 
+       MOV              mem,imm8                 8086 
+       MOV              mem,imm16                8086 
+       MOV              mem,imm32                386 
+       MOVD             mmxreg,mem               PENT,MMX,SD 
+       MOVD             mmxreg,reg32             PENT,MMX 
+       MOVD             mem,mmxreg               PENT,MMX,SD 
+       MOVD             reg32,mmxreg             PENT,MMX 
+       MOVD             xmmreg,mem               X64,SD 
+       MOVD             xmmreg,reg32             X64 
+       MOVD             mem,xmmreg               X64,SD 
+       MOVD             reg32,xmmreg             X64,SSE 
+       MOVQ             mmxreg,mmxrm             PENT,MMX 
+       MOVQ             mmxrm,mmxreg             PENT,MMX 
+       MOVQ             mmxreg,rm64              X64,MMX 
+       MOVQ             rm64,mmxreg              X64,MMX 
+       MOVSB                                     8086 
+       MOVSD                                     386 
+       MOVSQ                                     X64 
+       MOVSW                                     8086 
+       MOVSX            reg16,mem                386 
+       MOVSX            reg16,reg8               386 
+       MOVSX            reg32,rm8                386 
+       MOVSX            reg32,rm16               386 
+       MOVSX            reg64,rm8                X64 
+       MOVSX            reg64,rm16               X64 
+       MOVSXD           reg64,rm32               X64 
+       MOVSX            reg64,rm32               X64,ND 
+       MOVZX            reg16,mem                386 
+       MOVZX            reg16,reg8               386 
+       MOVZX            reg32,rm8                386 
+       MOVZX            reg32,rm16               386 
+       MOVZX            reg64,rm8                X64 
+       MOVZX            reg64,rm16               X64 
+       MUL              rm8                      8086 
+       MUL              rm16                     8086 
+       MUL              rm32                     386 
+       MUL              rm64                     X64 
+       MWAIT                                     PRESCOTT 
+       MWAIT            reg_eax,reg_ecx          PRESCOTT,ND 
+       NEG              rm8                      8086 
+       NEG              rm16                     8086 
+       NEG              rm32                     386 
+       NEG              rm64                     X64 
+       NOP                                       8086 
+       NOP              rm16                     P6 
+       NOP              rm32                     P6 
+       NOP              rm64                     X64 
+       NOT              rm8                      8086 
+       NOT              rm16                     8086 
+       NOT              rm32                     386 
+       NOT              rm64                     X64 
+       OR               mem,reg8                 8086 
+       OR               reg8,reg8                8086 
+       OR               mem,reg16                8086 
+       OR               reg16,reg16              8086 
+       OR               mem,reg32                386 
+       OR               reg32,reg32              386 
+       OR               mem,reg64                X64 
+       OR               reg64,reg64              X64 
+       OR               reg8,mem                 8086 
+       OR               reg8,reg8                8086 
+       OR               reg16,mem                8086 
+       OR               reg16,reg16              8086 
+       OR               reg32,mem                386 
+       OR               reg32,reg32              386 
+       OR               reg64,mem                X64 
+       OR               reg64,reg64              X64 
+       OR               rm16,imm8                8086 
+       OR               rm32,imm8                386 
+       OR               rm64,imm8                X64 
+       OR               reg_al,imm               8086 
+       OR               reg_ax,sbyte16           8086 
+       OR               reg_ax,imm               8086 
+       OR               reg_eax,sbyte32          386 
+       OR               reg_eax,imm              386 
+       OR               reg_rax,sbyte64          X64 
+       OR               reg_rax,imm              X64 
+       OR               rm8,imm                  8086 
+       OR               rm16,imm                 8086 
+       OR               rm32,imm                 386 
+       OR               rm64,imm                 X64 
+       OR               mem,imm8                 8086 
+       OR               mem,imm16                8086 
+       OR               mem,imm32                386 
+       OUT              imm,reg_al               8086 
+       OUT              imm,reg_ax               8086 
+       OUT              imm,reg_eax              386 
+       OUT              reg_dx,reg_al            8086 
+       OUT              reg_dx,reg_ax            8086 
+       OUT              reg_dx,reg_eax           386 
+       OUTSB                                     186 
+       OUTSD                                     386 
+       OUTSW                                     186 
+       PACKSSDW         mmxreg,mmxrm             PENT,MMX 
+       PACKSSWB         mmxreg,mmxrm             PENT,MMX 
+       PACKUSWB         mmxreg,mmxrm             PENT,MMX 
+       PADDB            mmxreg,mmxrm             PENT,MMX 
+       PADDD            mmxreg,mmxrm             PENT,MMX 
+       PADDSB           mmxreg,mmxrm             PENT,MMX 
+       PADDSIW          mmxreg,mmxrm             PENT,MMX,CYRIX 
+       PADDSW           mmxreg,mmxrm             PENT,MMX 
+       PADDUSB          mmxreg,mmxrm             PENT,MMX 
+       PADDUSW          mmxreg,mmxrm             PENT,MMX 
+       PADDW            mmxreg,mmxrm             PENT,MMX 
+       PAND             mmxreg,mmxrm             PENT,MMX 
+       PANDN            mmxreg,mmxrm             PENT,MMX 
+       PAUSE                                     8086 
+       PAVEB            mmxreg,mmxrm             PENT,MMX,CYRIX 
+       PAVGUSB          mmxreg,mmxrm             PENT,3DNOW 
+       PCMPEQB          mmxreg,mmxrm             PENT,MMX 
+       PCMPEQD          mmxreg,mmxrm             PENT,MMX 
+       PCMPEQW          mmxreg,mmxrm             PENT,MMX 
+       PCMPGTB          mmxreg,mmxrm             PENT,MMX 
+       PCMPGTD          mmxreg,mmxrm             PENT,MMX 
+       PCMPGTW          mmxreg,mmxrm             PENT,MMX 
+       PDISTIB          mmxreg,mem               PENT,MMX,CYRIX 
+       PF2ID            mmxreg,mmxrm             PENT,3DNOW 
+       PFACC            mmxreg,mmxrm             PENT,3DNOW 
+       PFADD            mmxreg,mmxrm             PENT,3DNOW 
+       PFCMPEQ          mmxreg,mmxrm             PENT,3DNOW 
+       PFCMPGE          mmxreg,mmxrm             PENT,3DNOW 
+       PFCMPGT          mmxreg,mmxrm             PENT,3DNOW 
+       PFMAX            mmxreg,mmxrm             PENT,3DNOW 
+       PFMIN            mmxreg,mmxrm             PENT,3DNOW 
+       PFMUL            mmxreg,mmxrm             PENT,3DNOW 
+       PFRCP            mmxreg,mmxrm             PENT,3DNOW 
+       PFRCPIT1         mmxreg,mmxrm             PENT,3DNOW 
+       PFRCPIT2         mmxreg,mmxrm             PENT,3DNOW 
+       PFRSQIT1         mmxreg,mmxrm             PENT,3DNOW 
+       PFRSQRT          mmxreg,mmxrm             PENT,3DNOW 
+       PFSUB            mmxreg,mmxrm             PENT,3DNOW 
+       PFSUBR           mmxreg,mmxrm             PENT,3DNOW 
+       PI2FD            mmxreg,mmxrm             PENT,3DNOW 
+       PMACHRIW         mmxreg,mem               PENT,MMX,CYRIX 
+       PMADDWD          mmxreg,mmxrm             PENT,MMX 
+       PMAGW            mmxreg,mmxrm             PENT,MMX,CYRIX 
+       PMULHRIW         mmxreg,mmxrm             PENT,MMX,CYRIX 
+       PMULHRWA         mmxreg,mmxrm             PENT,3DNOW 
+       PMULHRWC         mmxreg,mmxrm             PENT,MMX,CYRIX 
+       PMULHW           mmxreg,mmxrm             PENT,MMX 
+       PMULLW           mmxreg,mmxrm             PENT,MMX 
+       PMVGEZB          mmxreg,mem               PENT,MMX,CYRIX 
+       PMVLZB           mmxreg,mem               PENT,MMX,CYRIX 
+       PMVNZB           mmxreg,mem               PENT,MMX,CYRIX 
+       PMVZB            mmxreg,mem               PENT,MMX,CYRIX 
+       POP              reg16                    8086 
+       POP              reg32                    386,NOLONG 
+       POP              reg64                    X64 
+       POP              rm16                     8086 
+       POP              rm32                     386,NOLONG 
+       POP              rm64                     X64 
+       POP              reg_cs                   8086,UNDOC,ND 
+       POP              reg_dess                 8086,NOLONG 
+       POP              reg_fsgs                 386 
+       POPA                                      186,NOLONG 
+       POPAD                                     386,NOLONG 
+       POPAW                                     186,NOLONG 
+       POPF                                      8086 
+       POPFD                                     386,NOLONG 
+       POPFQ                                     X64 
+       POPFW                                     8086 
+       POR              mmxreg,mmxrm             PENT,MMX 
+       PREFETCH         mem                      PENT,3DNOW 
+       PREFETCHW        mem                      PENT,3DNOW 
+       PSLLD            mmxreg,mmxrm             PENT,MMX 
+       PSLLD            mmxreg,imm               PENT,MMX 
+       PSLLQ            mmxreg,mmxrm             PENT,MMX 
+       PSLLQ            mmxreg,imm               PENT,MMX 
+       PSLLW            mmxreg,mmxrm             PENT,MMX 
+       PSLLW            mmxreg,imm               PENT,MMX 
+       PSRAD            mmxreg,mmxrm             PENT,MMX 
+       PSRAD            mmxreg,imm               PENT,MMX 
+       PSRAW            mmxreg,mmxrm             PENT,MMX 
+       PSRAW            mmxreg,imm               PENT,MMX 
+       PSRLD            mmxreg,mmxrm             PENT,MMX 
+       PSRLD            mmxreg,imm               PENT,MMX 
+       PSRLQ            mmxreg,mmxrm             PENT,MMX 
+       PSRLQ            mmxreg,imm               PENT,MMX 
+       PSRLW            mmxreg,mmxrm             PENT,MMX 
+       PSRLW            mmxreg,imm               PENT,MMX 
+       PSUBB            mmxreg,mmxrm             PENT,MMX 
+       PSUBD            mmxreg,mmxrm             PENT,MMX 
+       PSUBSB           mmxreg,mmxrm             PENT,MMX 
+       PSUBSIW          mmxreg,mmxrm             PENT,MMX,CYRIX 
+       PSUBSW           mmxreg,mmxrm             PENT,MMX 
+       PSUBUSB          mmxreg,mmxrm             PENT,MMX 
+       PSUBUSW          mmxreg,mmxrm             PENT,MMX 
+       PSUBW            mmxreg,mmxrm             PENT,MMX 
+       PUNPCKHBW        mmxreg,mmxrm             PENT,MMX 
+       PUNPCKHDQ        mmxreg,mmxrm             PENT,MMX 
+       PUNPCKHWD        mmxreg,mmxrm             PENT,MMX 
+       PUNPCKLBW        mmxreg,mmxrm             PENT,MMX 
+       PUNPCKLDQ        mmxreg,mmxrm             PENT,MMX 
+       PUNPCKLWD        mmxreg,mmxrm             PENT,MMX 
+       PUSH             reg16                    8086 
+       PUSH             reg32                    386,NOLONG 
+       PUSH             reg64                    X64 
+       PUSH             rm16                     8086 
+       PUSH             rm32                     386,NOLONG 
+       PUSH             rm64                     X64 
+       PUSH             reg_cs                   8086,NOLONG 
+       PUSH             reg_dess                 8086,NOLONG 
+       PUSH             reg_fsgs                 386 
+       PUSH             imm8                     186 
+       PUSH             imm16                    186,AR0,SZ 
+       PUSH             imm32                    386,NOLONG,AR0,SZ 
+       PUSH             imm32                    386,NOLONG,SD 
+       PUSH             imm64                    X64,AR0,SZ 
+       PUSHA                                     186,NOLONG 
+       PUSHAD                                    386,NOLONG 
+       PUSHAW                                    186,NOLONG 
+       PUSHF                                     8086 
+       PUSHFD                                    386,NOLONG 
+       PUSHFQ                                    X64 
+       PUSHFW                                    8086 
+       PXOR             mmxreg,mmxrm             PENT,MMX 
+       RCL              rm8,unity                8086 
+       RCL              rm8,reg_cl               8086 
+       RCL              rm8,imm                  186 
+       RCL              rm16,unity               8086 
+       RCL              rm16,reg_cl              8086 
+       RCL              rm16,imm                 186 
+       RCL              rm32,unity               386 
+       RCL              rm32,reg_cl              386 
+       RCL              rm32,imm                 386 
+       RCL              rm64,unity               X64 
+       RCL              rm64,reg_cl              X64 
+       RCL              rm64,imm                 X64 
+       RCR              rm8,unity                8086 
+       RCR              rm8,reg_cl               8086 
+       RCR              rm8,imm                  186 
+       RCR              rm16,unity               8086 
+       RCR              rm16,reg_cl              8086 
+       RCR              rm16,imm                 186 
+       RCR              rm32,unity               386 
+       RCR              rm32,reg_cl              386 
+       RCR              rm32,imm                 386 
+       RCR              rm64,unity               X64 
+       RCR              rm64,reg_cl              X64 
+       RCR              rm64,imm                 X64 
+       RDSHR            rm32                     P6,CYRIXM 
+       RDMSR                                     PENT,PRIV 
+       RDPMC                                     P6 
+       RDTSC                                     PENT 
+       RDTSCP                                    X86_64 
+       RET                                       8086 
+       RET              imm                      8086,SW 
+       RETF                                      8086 
+       RETF             imm                      8086,SW 
+       RETN                                      8086 
+       RETN             imm                      8086,SW 
+       ROL              rm8,unity                8086 
+       ROL              rm8,reg_cl               8086 
+       ROL              rm8,imm                  186 
+       ROL              rm16,unity               8086 
+       ROL              rm16,reg_cl              8086 
+       ROL              rm16,imm                 186 
+       ROL              rm32,unity               386 
+       ROL              rm32,reg_cl              386 
+       ROL              rm32,imm                 386 
+       ROL              rm64,unity               X64 
+       ROL              rm64,reg_cl              X64 
+       ROL              rm64,imm                 X64 
+       ROR              rm8,unity                8086 
+       ROR              rm8,reg_cl               8086 
+       ROR              rm8,imm                  186 
+       ROR              rm16,unity               8086 
+       ROR              rm16,reg_cl              8086 
+       ROR              rm16,imm                 186 
+       ROR              rm32,unity               386 
+       ROR              rm32,reg_cl              386 
+       ROR              rm32,imm                 386 
+       ROR              rm64,unity               X64 
+       ROR              rm64,reg_cl              X64 
+       ROR              rm64,imm                 X64 
+       RDM                                       P6,CYRIX,ND 
+       RSDC             reg_sreg,mem80           486,CYRIXM 
+       RSLDT            mem80                    486,CYRIXM 
+       RSM                                       PENTM 
+       RSTS             mem80                    486,CYRIXM 
+       SAHF                                      8086 
+       SAL              rm8,unity                8086,ND 
+       SAL              rm8,reg_cl               8086,ND 
+       SAL              rm8,imm                  186,ND 
+       SAL              rm16,unity               8086,ND 
+       SAL              rm16,reg_cl              8086,ND 
+       SAL              rm16,imm                 186,ND 
+       SAL              rm32,unity               386,ND 
+       SAL              rm32,reg_cl              386,ND 
+       SAL              rm32,imm                 386,ND 
+       SAL              rm64,unity               X64,ND 
+       SAL              rm64,reg_cl              X64,ND 
+       SAL              rm64,imm                 X64,ND 
+       SALC                                      8086,UNDOC 
+       SAR              rm8,unity                8086 
+       SAR              rm8,reg_cl               8086 
+       SAR              rm8,imm                  186 
+       SAR              rm16,unity               8086 
+       SAR              rm16,reg_cl              8086 
+       SAR              rm16,imm                 186 
+       SAR              rm32,unity               386 
+       SAR              rm32,reg_cl              386 
+       SAR              rm32,imm                 386 
+       SAR              rm64,unity               X64 
+       SAR              rm64,reg_cl              X64 
+       SAR              rm64,imm                 X64 
+       SBB              mem,reg8                 8086 
+       SBB              reg8,reg8                8086 
+       SBB              mem,reg16                8086 
+       SBB              reg16,reg16              8086 
+       SBB              mem,reg32                386 
+       SBB              reg32,reg32              386 
+       SBB              mem,reg64                X64 
+       SBB              reg64,reg64              X64 
+       SBB              reg8,mem                 8086 
+       SBB              reg8,reg8                8086 
+       SBB              reg16,mem                8086 
+       SBB              reg16,reg16              8086 
+       SBB              reg32,mem                386 
+       SBB              reg32,reg32              386 
+       SBB              reg64,mem                X64 
+       SBB              reg64,reg64              X64 
+       SBB              rm16,imm8                8086 
+       SBB              rm32,imm8                386 
+       SBB              rm64,imm8                X64 
+       SBB              reg_al,imm               8086 
+       SBB              reg_ax,sbyte16           8086 
+       SBB              reg_ax,imm               8086 
+       SBB              reg_eax,sbyte32          386 
+       SBB              reg_eax,imm              386 
+       SBB              reg_rax,sbyte64          X64 
+       SBB              reg_rax,imm              X64 
+       SBB              rm8,imm                  8086 
+       SBB              rm16,imm                 8086 
+       SBB              rm32,imm                 386 
+       SBB              rm64,imm                 X64 
+       SBB              mem,imm8                 8086 
+       SBB              mem,imm16                8086 
+       SBB              mem,imm32                386 
+       SCASB                                     8086 
+       SCASD                                     386 
+       SCASQ                                     X64 
+       SCASW                                     8086 
+       SFENCE                                    X64,AMD 
+       SGDT             mem                      286 
+       SHL              rm8,unity                8086 
+       SHL              rm8,reg_cl               8086 
+       SHL              rm8,imm                  186 
+       SHL              rm16,unity               8086 
+       SHL              rm16,reg_cl              8086 
+       SHL              rm16,imm                 186 
+       SHL              rm32,unity               386 
+       SHL              rm32,reg_cl              386 
+       SHL              rm32,imm                 386 
+       SHL              rm64,unity               X64 
+       SHL              rm64,reg_cl              X64 
+       SHL              rm64,imm                 X64 
+       SHLD             mem,reg16,imm            3862 
+       SHLD             reg16,reg16,imm          3862 
+       SHLD             mem,reg32,imm            3862 
+       SHLD             reg32,reg32,imm          3862 
+       SHLD             mem,reg64,imm            X642 
+       SHLD             reg64,reg64,imm          X642 
+       SHLD             mem,reg16,reg_cl         386 
+       SHLD             reg16,reg16,reg_cl       386 
+       SHLD             mem,reg32,reg_cl         386 
+       SHLD             reg32,reg32,reg_cl       386 
+       SHLD             mem,reg64,reg_cl         X64 
+       SHLD             reg64,reg64,reg_cl       X64 
+       SHR              rm8,unity                8086 
+       SHR              rm8,reg_cl               8086 
+       SHR              rm8,imm                  186 
+       SHR              rm16,unity               8086 
+       SHR              rm16,reg_cl              8086 
+       SHR              rm16,imm                 186 
+       SHR              rm32,unity               386 
+       SHR              rm32,reg_cl              386 
+       SHR              rm32,imm                 386 
+       SHR              rm64,unity               X64 
+       SHR              rm64,reg_cl              X64 
+       SHR              rm64,imm                 X64 
+       SHRD             mem,reg16,imm            3862 
+       SHRD             reg16,reg16,imm          3862 
+       SHRD             mem,reg32,imm            3862 
+       SHRD             reg32,reg32,imm          3862 
+       SHRD             mem,reg64,imm            X642 
+       SHRD             reg64,reg64,imm          X642 
+       SHRD             mem,reg16,reg_cl         386 
+       SHRD             reg16,reg16,reg_cl       386 
+       SHRD             mem,reg32,reg_cl         386 
+       SHRD             reg32,reg32,reg_cl       386 
+       SHRD             mem,reg64,reg_cl         X64 
+       SHRD             reg64,reg64,reg_cl       X64 
+       SIDT             mem                      286 
+       SLDT             mem                      286 
+       SLDT             mem16                    286 
+       SLDT             reg16                    286 
+       SLDT             reg32                    386 
+       SLDT             reg64                    X64,ND 
+       SLDT             reg64                    X64 
+       SKINIT                                    X64 
+       SMI                                       386,UNDOC 
+       SMINT                                     P6,CYRIX,ND 
+       SMINTOLD                                  486,CYRIX,ND 
+       SMSW             mem                      286 
+       SMSW             mem16                    286 
+       SMSW             reg16                    286 
+       SMSW             reg32                    386 
+       STC                                       8086 
+       STD                                       8086 
+       STGI                                      X64 
+       STI                                       8086 
+       STOSB                                     8086 
+       STOSD                                     386 
+       STOSQ                                     X64 
+       STOSW                                     8086 
+       STR              mem                      286,PROT 
+       STR              mem16                    286,PROT 
+       STR              reg16                    286,PROT 
+       STR              reg32                    386,PROT 
+       STR              reg64                    X64 
+       SUB              mem,reg8                 8086 
+       SUB              reg8,reg8                8086 
+       SUB              mem,reg16                8086 
+       SUB              reg16,reg16              8086 
+       SUB              mem,reg32                386 
+       SUB              reg32,reg32              386 
+       SUB              mem,reg64                X64 
+       SUB              reg64,reg64              X64 
+       SUB              reg8,mem                 8086 
+       SUB              reg8,reg8                8086 
+       SUB              reg16,mem                8086 
+       SUB              reg16,reg16              8086 
+       SUB              reg32,mem                386 
+       SUB              reg32,reg32              386 
+       SUB              reg64,mem                X64 
+       SUB              reg64,reg64              X64 
+       SUB              rm16,imm8                8086 
+       SUB              rm32,imm8                386 
+       SUB              rm64,imm8                X64 
+       SUB              reg_al,imm               8086 
+       SUB              reg_ax,sbyte16           8086 
+       SUB              reg_ax,imm               8086 
+       SUB              reg_eax,sbyte32          386 
+       SUB              reg_eax,imm              386 
+       SUB              reg_rax,sbyte64          X64 
+       SUB              reg_rax,imm              X64 
+       SUB              rm8,imm                  8086 
+       SUB              rm16,imm                 8086 
+       SUB              rm32,imm                 386 
+       SUB              rm64,imm                 X64 
+       SUB              mem,imm8                 8086 
+       SUB              mem,imm16                8086 
+       SUB              mem,imm32                386 
+       SVDC             mem80,reg_sreg           486,CYRIXM 
+       SVLDT            mem80                    486,CYRIXM,ND 
+       SVTS             mem80                    486,CYRIXM 
+       SWAPGS                                    X64 
+       SYSCALL                                   P6,AMD 
+       SYSENTER                                  P6 
+       SYSEXIT                                   P6,PRIV 
+       SYSRET                                    P6,PRIV,AMD 
+       TEST             mem,reg8                 8086 
+       TEST             reg8,reg8                8086 
+       TEST             mem,reg16                8086 
+       TEST             reg16,reg16              8086 
+       TEST             mem,reg32                386 
+       TEST             reg32,reg32              386 
+       TEST             mem,reg64                X64 
+       TEST             reg64,reg64              X64 
+       TEST             reg8,mem                 8086 
+       TEST             reg16,mem                8086 
+       TEST             reg32,mem                386 
+       TEST             reg64,mem                X64 
+       TEST             reg_al,imm               8086 
+       TEST             reg_ax,imm               8086 
+       TEST             reg_eax,imm              386 
+       TEST             reg_rax,imm              X64 
+       TEST             rm8,imm                  8086 
+       TEST             rm16,imm                 8086 
+       TEST             rm32,imm                 386 
+       TEST             rm64,imm                 X64 
+       TEST             mem,imm8                 8086 
+       TEST             mem,imm16                8086 
+       TEST             mem,imm32                386 
+       UD0                                       186,UNDOC 
+       UD1                                       186,UNDOC 
+       UD2B                                      186,UNDOC,ND 
+       UD2                                       186 
+       UD2A                                      186,ND 
+       UMOV             mem,reg8                 386,UNDOC,ND 
+       UMOV             reg8,reg8                386,UNDOC,ND 
+       UMOV             mem,reg16                386,UNDOC,ND 
+       UMOV             reg16,reg16              386,UNDOC,ND 
+       UMOV             mem,reg32                386,UNDOC,ND 
+       UMOV             reg32,reg32              386,UNDOC,ND 
+       UMOV             reg8,mem                 386,UNDOC,ND 
+       UMOV             reg8,reg8                386,UNDOC,ND 
+       UMOV             reg16,mem                386,UNDOC,ND 
+       UMOV             reg16,reg16              386,UNDOC,ND 
+       UMOV             reg32,mem                386,UNDOC,ND 
+       UMOV             reg32,reg32              386,UNDOC,ND 
+       VERR             mem                      286,PROT 
+       VERR             mem16                    286,PROT 
+       VERR             reg16                    286,PROT 
+       VERW             mem                      286,PROT 
+       VERW             mem16                    286,PROT 
+       VERW             reg16                    286,PROT 
+       FWAIT                                     8086 
+       WBINVD                                    486,PRIV 
+       WRSHR            rm32                     P6,CYRIXM 
+       WRMSR                                     PENT,PRIV 
+       XADD             mem,reg8                 486 
+       XADD             reg8,reg8                486 
+       XADD             mem,reg16                486 
+       XADD             reg16,reg16              486 
+       XADD             mem,reg32                486 
+       XADD             reg32,reg32              486 
+       XADD             mem,reg64                X64 
+       XADD             reg64,reg64              X64 
+       XBTS             reg16,mem                386,SW,UNDOC,ND 
+       XBTS             reg16,reg16              386,UNDOC,ND 
+       XBTS             reg32,mem                386,SD,UNDOC,ND 
+       XBTS             reg32,reg32              386,UNDOC,ND 
+       XCHG             reg_ax,reg16             8086 
+       XCHG             reg_eax,reg32na          386 
+       XCHG             reg_rax,reg64            X64 
+       XCHG             reg16,reg_ax             8086 
+       XCHG             reg32na,reg_eax          386 
+       XCHG             reg64,reg_rax            X64 
+       XCHG             reg_eax,reg_eax          386,NOLONG 
+       XCHG             reg8,mem                 8086 
+       XCHG             reg8,reg8                8086 
+       XCHG             reg16,mem                8086 
+       XCHG             reg16,reg16              8086 
+       XCHG             reg32,mem                386 
+       XCHG             reg32,reg32              386 
+       XCHG             reg64,mem                X64 
+       XCHG             reg64,reg64              X64 
+       XCHG             mem,reg8                 8086 
+       XCHG             reg8,reg8                8086 
+       XCHG             mem,reg16                8086 
+       XCHG             reg16,reg16              8086 
+       XCHG             mem,reg32                386 
+       XCHG             reg32,reg32              386 
+       XCHG             mem,reg64                X64 
+       XCHG             reg64,reg64              X64 
+       XLATB                                     8086 
+       XLAT                                      8086 
+       XOR              mem,reg8                 8086 
+       XOR              reg8,reg8                8086 
+       XOR              mem,reg16                8086 
+       XOR              reg16,reg16              8086 
+       XOR              mem,reg32                386 
+       XOR              reg32,reg32              386 
+       XOR              mem,reg64                X64 
+       XOR              reg64,reg64              X64 
+       XOR              reg8,mem                 8086 
+       XOR              reg8,reg8                8086 
+       XOR              reg16,mem                8086 
+       XOR              reg16,reg16              8086 
+       XOR              reg32,mem                386 
+       XOR              reg32,reg32              386 
+       XOR              reg64,mem                X64 
+       XOR              reg64,reg64              X64 
+       XOR              rm16,imm8                8086 
+       XOR              rm32,imm8                386 
+       XOR              rm64,imm8                X64 
+       XOR              reg_al,imm               8086 
+       XOR              reg_ax,sbyte16           8086 
+       XOR              reg_ax,imm               8086 
+       XOR              reg_eax,sbyte32          386 
+       XOR              reg_eax,imm              386 
+       XOR              reg_rax,sbyte64          X64 
+       XOR              reg_rax,imm              X64 
+       XOR              rm8,imm                  8086 
+       XOR              rm16,imm                 8086 
+       XOR              rm32,imm                 386 
+       XOR              rm64,imm                 X64 
+       XOR              mem,imm8                 8086 
+       XOR              mem,imm16                8086 
+       XOR              mem,imm32                386 
+       CMOVcc           reg16,mem                P6 
+       CMOVcc           reg16,reg16              P6 
+       CMOVcc           reg32,mem                P6 
+       CMOVcc           reg32,reg32              P6 
+       CMOVcc           reg64,mem                X64 
+       CMOVcc           reg64,reg64              X64 
+       Jcc              imm|near                 386 
+       Jcc              imm16|near               386 
+       Jcc              imm32|near               386 
+       Jcc              imm|short                8086,ND 
+       Jcc              imm                      8086,ND 
+       Jcc              imm                      386,ND 
+       Jcc              imm                      8086,ND 
+       Jcc              imm                      8086 
+       SETcc            mem                      386 
+       SETcc            reg8                     386
+
+ B.1.3 Katmai Streaming SIMD instructions (SSE -- a.k.a. KNI, XMM, MMX2)
+
+       ADDPS            xmmreg,xmmrm             KATMAI,SSE 
+       ADDSS            xmmreg,xmmrm             KATMAI,SSE,SD 
+       ANDNPS           xmmreg,xmmrm             KATMAI,SSE 
+       ANDPS            xmmreg,xmmrm             KATMAI,SSE 
+       CMPEQPS          xmmreg,xmmrm             KATMAI,SSE 
+       CMPEQSS          xmmreg,xmmrm             KATMAI,SSE 
+       CMPLEPS          xmmreg,xmmrm             KATMAI,SSE 
+       CMPLESS          xmmreg,xmmrm             KATMAI,SSE 
+       CMPLTPS          xmmreg,xmmrm             KATMAI,SSE 
+       CMPLTSS          xmmreg,xmmrm             KATMAI,SSE 
+       CMPNEQPS         xmmreg,xmmrm             KATMAI,SSE 
+       CMPNEQSS         xmmreg,xmmrm             KATMAI,SSE 
+       CMPNLEPS         xmmreg,xmmrm             KATMAI,SSE 
+       CMPNLESS         xmmreg,xmmrm             KATMAI,SSE 
+       CMPNLTPS         xmmreg,xmmrm             KATMAI,SSE 
+       CMPNLTSS         xmmreg,xmmrm             KATMAI,SSE 
+       CMPORDPS         xmmreg,xmmrm             KATMAI,SSE 
+       CMPORDSS         xmmreg,xmmrm             KATMAI,SSE 
+       CMPUNORDPS       xmmreg,xmmrm             KATMAI,SSE 
+       CMPUNORDSS       xmmreg,xmmrm             KATMAI,SSE 
+       CMPPS            xmmreg,mem,imm           KATMAI,SSE 
+       CMPPS            xmmreg,xmmreg,imm        KATMAI,SSE 
+       CMPSS            xmmreg,mem,imm           KATMAI,SSE 
+       CMPSS            xmmreg,xmmreg,imm        KATMAI,SSE 
+       COMISS           xmmreg,xmmrm             KATMAI,SSE 
+       CVTPI2PS         xmmreg,mmxrm             KATMAI,SSE,MMX 
+       CVTPS2PI         mmxreg,xmmrm             KATMAI,SSE,MMX 
+       CVTSI2SS         xmmreg,mem               KATMAI,SSE,SD,AR1,ND 
+       CVTSI2SS         xmmreg,rm32              KATMAI,SSE,SD,AR1 
+       CVTSI2SS         xmmreg,rm64              X64,SSE,AR1 
+       CVTSS2SI         reg32,xmmreg             KATMAI,SSE,SD,AR1 
+       CVTSS2SI         reg32,mem                KATMAI,SSE,SD,AR1 
+       CVTSS2SI         reg64,xmmreg             X64,SSE,SD,AR1 
+       CVTSS2SI         reg64,mem                X64,SSE,SD,AR1 
+       CVTTPS2PI        mmxreg,xmmrm             KATMAI,SSE,MMX 
+       CVTTSS2SI        reg32,xmmrm              KATMAI,SSE,SD,AR1 
+       CVTTSS2SI        reg64,xmmrm              X64,SSE,SD,AR1 
+       DIVPS            xmmreg,xmmrm             KATMAI,SSE 
+       DIVSS            xmmreg,xmmrm             KATMAI,SSE 
+       LDMXCSR          mem                      KATMAI,SSE,SD 
+       MAXPS            xmmreg,xmmrm             KATMAI,SSE 
+       MAXSS            xmmreg,xmmrm             KATMAI,SSE 
+       MINPS            xmmreg,xmmrm             KATMAI,SSE 
+       MINSS            xmmreg,xmmrm             KATMAI,SSE 
+       MOVAPS           xmmreg,mem               KATMAI,SSE 
+       MOVAPS           mem,xmmreg               KATMAI,SSE 
+       MOVAPS           xmmreg,xmmreg            KATMAI,SSE 
+       MOVAPS           xmmreg,xmmreg            KATMAI,SSE 
+       MOVHPS           xmmreg,mem               KATMAI,SSE 
+       MOVHPS           mem,xmmreg               KATMAI,SSE 
+       MOVLHPS          xmmreg,xmmreg            KATMAI,SSE 
+       MOVLPS           xmmreg,mem               KATMAI,SSE 
+       MOVLPS           mem,xmmreg               KATMAI,SSE 
+       MOVHLPS          xmmreg,xmmreg            KATMAI,SSE 
+       MOVMSKPS         reg32,xmmreg             KATMAI,SSE 
+       MOVMSKPS         reg64,xmmreg             X64,SSE 
+       MOVNTPS          mem,xmmreg               KATMAI,SSE 
+       MOVSS            xmmreg,mem               KATMAI,SSE 
+       MOVSS            mem,xmmreg               KATMAI,SSE 
+       MOVSS            xmmreg,xmmreg            KATMAI,SSE 
+       MOVSS            xmmreg,xmmreg            KATMAI,SSE 
+       MOVUPS           xmmreg,mem               KATMAI,SSE 
+       MOVUPS           mem,xmmreg               KATMAI,SSE 
+       MOVUPS           xmmreg,xmmreg            KATMAI,SSE 
+       MOVUPS           xmmreg,xmmreg            KATMAI,SSE 
+       MULPS            xmmreg,xmmrm             KATMAI,SSE 
+       MULSS            xmmreg,xmmrm             KATMAI,SSE 
+       ORPS             xmmreg,xmmrm             KATMAI,SSE 
+       RCPPS            xmmreg,xmmrm             KATMAI,SSE 
+       RCPSS            xmmreg,xmmrm             KATMAI,SSE 
+       RSQRTPS          xmmreg,xmmrm             KATMAI,SSE 
+       RSQRTSS          xmmreg,xmmrm             KATMAI,SSE 
+       SHUFPS           xmmreg,mem,imm           KATMAI,SSE 
+       SHUFPS           xmmreg,xmmreg,imm        KATMAI,SSE 
+       SQRTPS           xmmreg,xmmrm             KATMAI,SSE 
+       SQRTSS           xmmreg,xmmrm             KATMAI,SSE 
+       STMXCSR          mem                      KATMAI,SSE,SD 
+       SUBPS            xmmreg,xmmrm             KATMAI,SSE 
+       SUBSS            xmmreg,xmmrm             KATMAI,SSE 
+       UCOMISS          xmmreg,xmmrm             KATMAI,SSE 
+       UNPCKHPS         xmmreg,xmmrm             KATMAI,SSE 
+       UNPCKLPS         xmmreg,xmmrm             KATMAI,SSE 
+       XORPS            xmmreg,xmmrm             KATMAI,SSE
+
+ B.1.4 Introduced in Deschutes but necessary for SSE support
+
+       FXRSTOR          mem                      P6,SSE,FPU 
+       FXSAVE           mem                      P6,SSE,FPU
+
+ B.1.5 XSAVE group (AVX and extended state)
+
+       XGETBV                                    NEHALEM 
+       XSETBV                                    NEHALEM,PRIV 
+       XSAVE            mem                      NEHALEM 
+       XRSTOR           mem                      NEHALEM
+
+ B.1.6 Generic memory operations
+
+       PREFETCHNTA      mem                      KATMAI 
+       PREFETCHT0       mem                      KATMAI 
+       PREFETCHT1       mem                      KATMAI 
+       PREFETCHT2       mem                      KATMAI 
+       SFENCE                                    KATMAI
+
+ B.1.7 New MMX instructions introduced in Katmai
+
+       MASKMOVQ         mmxreg,mmxreg            KATMAI,MMX 
+       MOVNTQ           mem,mmxreg               KATMAI,MMX 
+       PAVGB            mmxreg,mmxrm             KATMAI,MMX 
+       PAVGW            mmxreg,mmxrm             KATMAI,MMX 
+       PEXTRW           reg32,mmxreg,imm         KATMAI,MMX 
+       PINSRW           mmxreg,mem,imm           KATMAI,MMX 
+       PINSRW           mmxreg,rm16,imm          KATMAI,MMX 
+       PINSRW           mmxreg,reg32,imm         KATMAI,MMX 
+       PMAXSW           mmxreg,mmxrm             KATMAI,MMX 
+       PMAXUB           mmxreg,mmxrm             KATMAI,MMX 
+       PMINSW           mmxreg,mmxrm             KATMAI,MMX 
+       PMINUB           mmxreg,mmxrm             KATMAI,MMX 
+       PMOVMSKB         reg32,mmxreg             KATMAI,MMX 
+       PMULHUW          mmxreg,mmxrm             KATMAI,MMX 
+       PSADBW           mmxreg,mmxrm             KATMAI,MMX 
+       PSHUFW           mmxreg,mmxrm,imm         KATMAI,MMX2
+
+ B.1.8 AMD Enhanced 3DNow! (Athlon) instructions
+
+       PF2IW            mmxreg,mmxrm             PENT,3DNOW 
+       PFNACC           mmxreg,mmxrm             PENT,3DNOW 
+       PFPNACC          mmxreg,mmxrm             PENT,3DNOW 
+       PI2FW            mmxreg,mmxrm             PENT,3DNOW 
+       PSWAPD           mmxreg,mmxrm             PENT,3DNOW
+
+ B.1.9 Willamette SSE2 Cacheability Instructions
+
+       MASKMOVDQU       xmmreg,xmmreg            WILLAMETTE,SSE2 
+       CLFLUSH          mem                      WILLAMETTE,SSE2 
+       MOVNTDQ          mem,xmmreg               WILLAMETTE,SSE2,SO 
+       MOVNTI           mem,reg32                WILLAMETTE,SD 
+       MOVNTI           mem,reg64                X64 
+       MOVNTPD          mem,xmmreg               WILLAMETTE,SSE2,SO 
+       LFENCE                                    WILLAMETTE,SSE2 
+       MFENCE                                    WILLAMETTE,SSE2
+
+B.1.10 Willamette MMX instructions (SSE2 SIMD Integer Instructions)
+
+       MOVD             mem,xmmreg               WILLAMETTE,SSE2,SD 
+       MOVD             xmmreg,mem               WILLAMETTE,SSE2,SD 
+       MOVD             xmmreg,rm32              WILLAMETTE,SSE2 
+       MOVD             rm32,xmmreg              WILLAMETTE,SSE2 
+       MOVDQA           xmmreg,xmmreg            WILLAMETTE,SSE2 
+       MOVDQA           mem,xmmreg               WILLAMETTE,SSE2,SO 
+       MOVDQA           xmmreg,mem               WILLAMETTE,SSE2,SO 
+       MOVDQA           xmmreg,xmmreg            WILLAMETTE,SSE2 
+       MOVDQU           xmmreg,xmmreg            WILLAMETTE,SSE2 
+       MOVDQU           mem,xmmreg               WILLAMETTE,SSE2,SO 
+       MOVDQU           xmmreg,mem               WILLAMETTE,SSE2,SO 
+       MOVDQU           xmmreg,xmmreg            WILLAMETTE,SSE2 
+       MOVDQ2Q          mmxreg,xmmreg            WILLAMETTE,SSE2 
+       MOVQ             xmmreg,xmmreg            WILLAMETTE,SSE2 
+       MOVQ             xmmreg,xmmreg            WILLAMETTE,SSE2 
+       MOVQ             mem,xmmreg               WILLAMETTE,SSE2 
+       MOVQ             xmmreg,mem               WILLAMETTE,SSE2 
+       MOVQ             xmmreg,rm64              X64,SSE2 
+       MOVQ             rm64,xmmreg              X64,SSE2 
+       MOVQ2DQ          xmmreg,mmxreg            WILLAMETTE,SSE2 
+       PACKSSWB         xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PACKSSDW         xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PACKUSWB         xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PADDB            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PADDW            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PADDD            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PADDQ            mmxreg,mmxrm             WILLAMETTE,MMX 
+       PADDQ            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PADDSB           xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PADDSW           xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PADDUSB          xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PADDUSW          xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PAND             xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PANDN            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PAVGB            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PAVGW            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PCMPEQB          xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PCMPEQW          xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PCMPEQD          xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PCMPGTB          xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PCMPGTW          xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PCMPGTD          xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PEXTRW           reg32,xmmreg,imm         WILLAMETTE,SSE2 
+       PINSRW           xmmreg,reg16,imm         WILLAMETTE,SSE2 
+       PINSRW           xmmreg,reg32,imm         WILLAMETTE,SSE2,ND 
+       PINSRW           xmmreg,mem,imm           WILLAMETTE,SSE2 
+       PINSRW           xmmreg,mem16,imm         WILLAMETTE,SSE2 
+       PMADDWD          xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PMAXSW           xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PMAXUB           xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PMINSW           xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PMINUB           xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PMOVMSKB         reg32,xmmreg             WILLAMETTE,SSE2 
+       PMULHUW          xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PMULHW           xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PMULLW           xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PMULUDQ          mmxreg,mmxrm             WILLAMETTE,SSE2,SO 
+       PMULUDQ          xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       POR              xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSADBW           xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSHUFD           xmmreg,xmmreg,imm        WILLAMETTE,SSE2 
+       PSHUFD           xmmreg,mem,imm           WILLAMETTE,SSE22 
+       PSHUFHW          xmmreg,xmmreg,imm        WILLAMETTE,SSE2 
+       PSHUFHW          xmmreg,mem,imm           WILLAMETTE,SSE22 
+       PSHUFLW          xmmreg,xmmreg,imm        WILLAMETTE,SSE2 
+       PSHUFLW          xmmreg,mem,imm           WILLAMETTE,SSE22 
+       PSLLDQ           xmmreg,imm               WILLAMETTE,SSE2,AR1 
+       PSLLW            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSLLW            xmmreg,imm               WILLAMETTE,SSE2,AR1 
+       PSLLD            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSLLD            xmmreg,imm               WILLAMETTE,SSE2,AR1 
+       PSLLQ            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSLLQ            xmmreg,imm               WILLAMETTE,SSE2,AR1 
+       PSRAW            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSRAW            xmmreg,imm               WILLAMETTE,SSE2,AR1 
+       PSRAD            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSRAD            xmmreg,imm               WILLAMETTE,SSE2,AR1 
+       PSRLDQ           xmmreg,imm               WILLAMETTE,SSE2,AR1 
+       PSRLW            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSRLW            xmmreg,imm               WILLAMETTE,SSE2,AR1 
+       PSRLD            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSRLD            xmmreg,imm               WILLAMETTE,SSE2,AR1 
+       PSRLQ            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSRLQ            xmmreg,imm               WILLAMETTE,SSE2,AR1 
+       PSUBB            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSUBW            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSUBD            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSUBQ            mmxreg,mmxrm             WILLAMETTE,SSE2,SO 
+       PSUBQ            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSUBSB           xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSUBSW           xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSUBUSB          xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PSUBUSW          xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PUNPCKHBW        xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PUNPCKHWD        xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PUNPCKHDQ        xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PUNPCKHQDQ       xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PUNPCKLBW        xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PUNPCKLWD        xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PUNPCKLDQ        xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PUNPCKLQDQ       xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       PXOR             xmmreg,xmmrm             WILLAMETTE,SSE2,SO
+
+B.1.11 Willamette Streaming SIMD instructions (SSE2)
+
+       ADDPD            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       ADDSD            xmmreg,xmmrm             WILLAMETTE,SSE2 
+       ANDNPD           xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       ANDPD            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CMPEQPD          xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CMPEQSD          xmmreg,xmmrm             WILLAMETTE,SSE2 
+       CMPLEPD          xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CMPLESD          xmmreg,xmmrm             WILLAMETTE,SSE2 
+       CMPLTPD          xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CMPLTSD          xmmreg,xmmrm             WILLAMETTE,SSE2 
+       CMPNEQPD         xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CMPNEQSD         xmmreg,xmmrm             WILLAMETTE,SSE2 
+       CMPNLEPD         xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CMPNLESD         xmmreg,xmmrm             WILLAMETTE,SSE2 
+       CMPNLTPD         xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CMPNLTSD         xmmreg,xmmrm             WILLAMETTE,SSE2 
+       CMPORDPD         xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CMPORDSD         xmmreg,xmmrm             WILLAMETTE,SSE2 
+       CMPUNORDPD       xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CMPUNORDSD       xmmreg,xmmrm             WILLAMETTE,SSE2 
+       CMPPD            xmmreg,xmmrm,imm         WILLAMETTE,SSE22 
+       CMPSD            xmmreg,xmmrm,imm         WILLAMETTE,SSE2 
+       COMISD           xmmreg,xmmrm             WILLAMETTE,SSE2 
+       CVTDQ2PD         xmmreg,xmmrm             WILLAMETTE,SSE2 
+       CVTDQ2PS         xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CVTPD2DQ         xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CVTPD2PI         mmxreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CVTPD2PS         xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CVTPI2PD         xmmreg,mmxrm             WILLAMETTE,SSE2 
+       CVTPS2DQ         xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CVTPS2PD         xmmreg,xmmrm             WILLAMETTE,SSE2 
+       CVTSD2SI         reg32,xmmreg             WILLAMETTE,SSE2,AR1 
+       CVTSD2SI         reg32,mem                WILLAMETTE,SSE2,AR1 
+       CVTSD2SI         reg64,xmmreg             X64,SSE2,AR1 
+       CVTSD2SI         reg64,mem                X64,SSE2,AR1 
+       CVTSD2SS         xmmreg,xmmrm             WILLAMETTE,SSE2 
+       CVTSI2SD         xmmreg,mem               WILLAMETTE,SSE2,SD,AR1,ND 
+       CVTSI2SD         xmmreg,rm32              WILLAMETTE,SSE2,SD,AR1 
+       CVTSI2SD         xmmreg,rm64              X64,SSE2,AR1 
+       CVTSS2SD         xmmreg,xmmrm             WILLAMETTE,SSE2,SD 
+       CVTTPD2PI        mmxreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CVTTPD2DQ        xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CVTTPS2DQ        xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       CVTTSD2SI        reg32,xmmreg             WILLAMETTE,SSE2,AR1 
+       CVTTSD2SI        reg32,mem                WILLAMETTE,SSE2,AR1 
+       CVTTSD2SI        reg64,xmmreg             X64,SSE2,AR1 
+       CVTTSD2SI        reg64,mem                X64,SSE2,AR1 
+       DIVPD            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       DIVSD            xmmreg,xmmrm             WILLAMETTE,SSE2 
+       MAXPD            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       MAXSD            xmmreg,xmmrm             WILLAMETTE,SSE2 
+       MINPD            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       MINSD            xmmreg,xmmrm             WILLAMETTE,SSE2 
+       MOVAPD           xmmreg,xmmreg            WILLAMETTE,SSE2 
+       MOVAPD           xmmreg,xmmreg            WILLAMETTE,SSE2 
+       MOVAPD           mem,xmmreg               WILLAMETTE,SSE2,SO 
+       MOVAPD           xmmreg,mem               WILLAMETTE,SSE2,SO 
+       MOVHPD           mem,xmmreg               WILLAMETTE,SSE2 
+       MOVHPD           xmmreg,mem               WILLAMETTE,SSE2 
+       MOVLPD           mem,xmmreg               WILLAMETTE,SSE2 
+       MOVLPD           xmmreg,mem               WILLAMETTE,SSE2 
+       MOVMSKPD         reg32,xmmreg             WILLAMETTE,SSE2 
+       MOVMSKPD         reg64,xmmreg             X64,SSE2 
+       MOVSD            xmmreg,xmmreg            WILLAMETTE,SSE2 
+       MOVSD            xmmreg,xmmreg            WILLAMETTE,SSE2 
+       MOVSD            mem,xmmreg               WILLAMETTE,SSE2 
+       MOVSD            xmmreg,mem               WILLAMETTE,SSE2 
+       MOVUPD           xmmreg,xmmreg            WILLAMETTE,SSE2 
+       MOVUPD           xmmreg,xmmreg            WILLAMETTE,SSE2 
+       MOVUPD           mem,xmmreg               WILLAMETTE,SSE2,SO 
+       MOVUPD           xmmreg,mem               WILLAMETTE,SSE2,SO 
+       MULPD            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       MULSD            xmmreg,xmmrm             WILLAMETTE,SSE2 
+       ORPD             xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       SHUFPD           xmmreg,xmmreg,imm        WILLAMETTE,SSE2 
+       SHUFPD           xmmreg,mem,imm           WILLAMETTE,SSE2 
+       SQRTPD           xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       SQRTSD           xmmreg,xmmrm             WILLAMETTE,SSE2 
+       SUBPD            xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       SUBSD            xmmreg,xmmrm             WILLAMETTE,SSE2 
+       UCOMISD          xmmreg,xmmrm             WILLAMETTE,SSE2 
+       UNPCKHPD         xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       UNPCKLPD         xmmreg,xmmrm             WILLAMETTE,SSE2,SO 
+       XORPD            xmmreg,xmmrm             WILLAMETTE,SSE2,SO
+
+B.1.12 Prescott New Instructions (SSE3)
+
+       ADDSUBPD         xmmreg,xmmrm             PRESCOTT,SSE3,SO 
+       ADDSUBPS         xmmreg,xmmrm             PRESCOTT,SSE3,SO 
+       HADDPD           xmmreg,xmmrm             PRESCOTT,SSE3,SO 
+       HADDPS           xmmreg,xmmrm             PRESCOTT,SSE3,SO 
+       HSUBPD           xmmreg,xmmrm             PRESCOTT,SSE3,SO 
+       HSUBPS           xmmreg,xmmrm             PRESCOTT,SSE3,SO 
+       LDDQU            xmmreg,mem               PRESCOTT,SSE3,SO 
+       MOVDDUP          xmmreg,xmmrm             PRESCOTT,SSE3 
+       MOVSHDUP         xmmreg,xmmrm             PRESCOTT,SSE3 
+       MOVSLDUP         xmmreg,xmmrm             PRESCOTT,SSE3
+
+B.1.13 VMX Instructions
+
+       VMCALL                                    VMX 
+       VMCLEAR          mem                      VMX 
+       VMLAUNCH                                  VMX 
+       VMLOAD                                    X64,VMX 
+       VMMCALL                                   X64,VMX 
+       VMPTRLD          mem                      VMX 
+       VMPTRST          mem                      VMX 
+       VMREAD           rm32,reg32               VMX,NOLONG,SD 
+       VMREAD           rm64,reg64               X64,VMX 
+       VMRESUME                                  VMX 
+       VMRUN                                     X64,VMX 
+       VMSAVE                                    X64,VMX 
+       VMWRITE          reg32,rm32               VMX,NOLONG,SD 
+       VMWRITE          reg64,rm64               X64,VMX 
+       VMXOFF                                    VMX 
+       VMXON            mem                      VMX
+
+B.1.14 Extended Page Tables VMX instructions
+
+       INVEPT           reg32,mem                VMX,SO,NOLONG 
+       INVEPT           reg64,mem                VMX,SO,LONG 
+       INVVPID          reg32,mem                VMX,SO,NOLONG 
+       INVVPID          reg64,mem                VMX,SO,LONG
+
+B.1.15 Tejas New Instructions (SSSE3)
+
+       PABSB            mmxreg,mmxrm             SSSE3,MMX 
+       PABSB            xmmreg,xmmrm             SSSE3 
+       PABSW            mmxreg,mmxrm             SSSE3,MMX 
+       PABSW            xmmreg,xmmrm             SSSE3 
+       PABSD            mmxreg,mmxrm             SSSE3,MMX 
+       PABSD            xmmreg,xmmrm             SSSE3 
+       PALIGNR          mmxreg,mmxrm,imm         SSSE3,MMX 
+       PALIGNR          xmmreg,xmmrm,imm         SSSE3 
+       PHADDW           mmxreg,mmxrm             SSSE3,MMX 
+       PHADDW           xmmreg,xmmrm             SSSE3 
+       PHADDD           mmxreg,mmxrm             SSSE3,MMX 
+       PHADDD           xmmreg,xmmrm             SSSE3 
+       PHADDSW          mmxreg,mmxrm             SSSE3,MMX 
+       PHADDSW          xmmreg,xmmrm             SSSE3 
+       PHSUBW           mmxreg,mmxrm             SSSE3,MMX 
+       PHSUBW           xmmreg,xmmrm             SSSE3 
+       PHSUBD           mmxreg,mmxrm             SSSE3,MMX 
+       PHSUBD           xmmreg,xmmrm             SSSE3 
+       PHSUBSW          mmxreg,mmxrm             SSSE3,MMX 
+       PHSUBSW          xmmreg,xmmrm             SSSE3 
+       PMADDUBSW        mmxreg,mmxrm             SSSE3,MMX 
+       PMADDUBSW        xmmreg,xmmrm             SSSE3 
+       PMULHRSW         mmxreg,mmxrm             SSSE3,MMX 
+       PMULHRSW         xmmreg,xmmrm             SSSE3 
+       PSHUFB           mmxreg,mmxrm             SSSE3,MMX 
+       PSHUFB           xmmreg,xmmrm             SSSE3 
+       PSIGNB           mmxreg,mmxrm             SSSE3,MMX 
+       PSIGNB           xmmreg,xmmrm             SSSE3 
+       PSIGNW           mmxreg,mmxrm             SSSE3,MMX 
+       PSIGNW           xmmreg,xmmrm             SSSE3 
+       PSIGND           mmxreg,mmxrm             SSSE3,MMX 
+       PSIGND           xmmreg,xmmrm             SSSE3
+
+B.1.16 AMD SSE4A
+
+       EXTRQ            xmmreg,imm,imm           SSE4A,AMD 
+       EXTRQ            xmmreg,xmmreg            SSE4A,AMD 
+       INSERTQ          xmmreg,xmmreg,imm,imm    SSE4A,AMD 
+       INSERTQ          xmmreg,xmmreg            SSE4A,AMD 
+       MOVNTSD          mem,xmmreg               SSE4A,AMD 
+       MOVNTSS          mem,xmmreg               SSE4A,AMD,SD
+
+B.1.17 New instructions in Barcelona
+
+       LZCNT            reg16,rm16               P6,AMD 
+       LZCNT            reg32,rm32               P6,AMD 
+       LZCNT            reg64,rm64               X64,AMD
+
+B.1.18 Penryn New Instructions (SSE4.1)
+
+       BLENDPD          xmmreg,xmmrm,imm         SSE41 
+       BLENDPS          xmmreg,xmmrm,imm         SSE41 
+       BLENDVPD         xmmreg,xmmrm,xmm0        SSE41 
+       BLENDVPS         xmmreg,xmmrm,xmm0        SSE41 
+       DPPD             xmmreg,xmmrm,imm         SSE41 
+       DPPS             xmmreg,xmmrm,imm         SSE41 
+       EXTRACTPS        rm32,xmmreg,imm          SSE41 
+       EXTRACTPS        reg64,xmmreg,imm         SSE41,X64 
+       INSERTPS         xmmreg,xmmrm,imm         SSE41,SD 
+       MOVNTDQA         xmmreg,mem               SSE41 
+       MPSADBW          xmmreg,xmmrm,imm         SSE41 
+       PACKUSDW         xmmreg,xmmrm             SSE41 
+       PBLENDVB         xmmreg,xmmrm,xmm0        SSE41 
+       PBLENDW          xmmreg,xmmrm,imm         SSE41 
+       PCMPEQQ          xmmreg,xmmrm             SSE41 
+       PEXTRB           reg32,xmmreg,imm         SSE41 
+       PEXTRB           mem8,xmmreg,imm          SSE41 
+       PEXTRB           reg64,xmmreg,imm         SSE41,X64 
+       PEXTRD           rm32,xmmreg,imm          SSE41 
+       PEXTRQ           rm64,xmmreg,imm          SSE41,X64 
+       PEXTRW           reg32,xmmreg,imm         SSE41 
+       PEXTRW           mem16,xmmreg,imm         SSE41 
+       PEXTRW           reg64,xmmreg,imm         SSE41,X64 
+       PHMINPOSUW       xmmreg,xmmrm             SSE41 
+       PINSRB           xmmreg,mem,imm           SSE41 
+       PINSRB           xmmreg,rm8,imm           SSE41 
+       PINSRB           xmmreg,reg32,imm         SSE41 
+       PINSRD           xmmreg,mem,imm           SSE41 
+       PINSRD           xmmreg,rm32,imm          SSE41 
+       PINSRQ           xmmreg,mem,imm           SSE41,X64 
+       PINSRQ           xmmreg,rm64,imm          SSE41,X64 
+       PMAXSB           xmmreg,xmmrm             SSE41 
+       PMAXSD           xmmreg,xmmrm             SSE41 
+       PMAXUD           xmmreg,xmmrm             SSE41 
+       PMAXUW           xmmreg,xmmrm             SSE41 
+       PMINSB           xmmreg,xmmrm             SSE41 
+       PMINSD           xmmreg,xmmrm             SSE41 
+       PMINUD           xmmreg,xmmrm             SSE41 
+       PMINUW           xmmreg,xmmrm             SSE41 
+       PMOVSXBW         xmmreg,xmmrm             SSE41 
+       PMOVSXBD         xmmreg,xmmrm             SSE41,SD 
+       PMOVSXBQ         xmmreg,xmmrm             SSE41,SW 
+       PMOVSXWD         xmmreg,xmmrm             SSE41 
+       PMOVSXWQ         xmmreg,xmmrm             SSE41,SD 
+       PMOVSXDQ         xmmreg,xmmrm             SSE41 
+       PMOVZXBW         xmmreg,xmmrm             SSE41 
+       PMOVZXBD         xmmreg,xmmrm             SSE41,SD 
+       PMOVZXBQ         xmmreg,xmmrm             SSE41,SW 
+       PMOVZXWD         xmmreg,xmmrm             SSE41 
+       PMOVZXWQ         xmmreg,xmmrm             SSE41,SD 
+       PMOVZXDQ         xmmreg,xmmrm             SSE41 
+       PMULDQ           xmmreg,xmmrm             SSE41 
+       PMULLD           xmmreg,xmmrm             SSE41 
+       PTEST            xmmreg,xmmrm             SSE41 
+       ROUNDPD          xmmreg,xmmrm,imm         SSE41 
+       ROUNDPS          xmmreg,xmmrm,imm         SSE41 
+       ROUNDSD          xmmreg,xmmrm,imm         SSE41 
+       ROUNDSS          xmmreg,xmmrm,imm         SSE41
+
+B.1.19 Nehalem New Instructions (SSE4.2)
+
+       CRC32            reg32,rm8                SSE42 
+       CRC32            reg32,rm16               SSE42 
+       CRC32            reg32,rm32               SSE42 
+       CRC32            reg64,rm8                SSE42,X64 
+       CRC32            reg64,rm64               SSE42,X64 
+       PCMPESTRI        xmmreg,xmmrm,imm         SSE42 
+       PCMPESTRM        xmmreg,xmmrm,imm         SSE42 
+       PCMPISTRI        xmmreg,xmmrm,imm         SSE42 
+       PCMPISTRM        xmmreg,xmmrm,imm         SSE42 
+       PCMPGTQ          xmmreg,xmmrm             SSE42 
+       POPCNT           reg16,rm16               NEHALEM,SW 
+       POPCNT           reg32,rm32               NEHALEM,SD 
+       POPCNT           reg64,rm64               NEHALEM,X64
+
+B.1.20 Intel SMX
+
+       GETSEC                                    KATMAI
+
+B.1.21 Geode (Cyrix) 3DNow! additions
+
+       PFRCPV           mmxreg,mmxrm             PENT,3DNOW,CYRIX 
+       PFRSQRTV         mmxreg,mmxrm             PENT,3DNOW,CYRIX
+
+B.1.22 Intel new instructions in ???
+
+       MOVBE            reg16,mem16              NEHALEM 
+       MOVBE            reg32,mem32              NEHALEM 
+       MOVBE            reg64,mem64              NEHALEM 
+       MOVBE            mem16,reg16              NEHALEM 
+       MOVBE            mem32,reg32              NEHALEM 
+       MOVBE            mem64,reg64              NEHALEM
+
+B.1.23 Intel AES instructions
+
+       AESENC           xmmreg,xmmrm128          SSE,WESTMERE 
+       AESENCLAST       xmmreg,xmmrm128          SSE,WESTMERE 
+       AESDEC           xmmreg,xmmrm128          SSE,WESTMERE 
+       AESDECLAST       xmmreg,xmmrm128          SSE,WESTMERE 
+       AESIMC           xmmreg,xmmrm128          SSE,WESTMERE 
+       AESKEYGENASSIST  xmmreg,xmmrm128,imm8     SSE,WESTMERE
+
+B.1.24 Intel AVX AES instructions
+
+       VAESENC          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VAESENCLAST      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VAESDEC          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VAESDECLAST      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VAESIMC          xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VAESKEYGENASSIST xmmreg,xmmrm128,imm8     AVX,SANDYBRIDGE
+
+B.1.25 Intel AVX instructions
+
+       VADDPD           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VADDPD           ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VADDPS           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VADDPS           ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VADDSD           xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VADDSS           xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VADDSUBPD        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VADDSUBPD        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VADDSUBPS        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VADDSUBPS        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VANDPD           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VANDPD           ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VANDPS           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VANDPS           ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VANDNPD          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VANDNPD          ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VANDNPS          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VANDNPS          ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VBLENDPD         xmmreg,xmmreg*,xmmrm128,imm8 AVX,SANDYBRIDGE 
+       VBLENDPD         ymmreg,ymmreg*,ymmrm256,imm8 AVX,SANDYBRIDGE 
+       VBLENDPS         xmmreg,xmmreg*,xmmrm128,imm8 AVX,SANDYBRIDGE 
+       VBLENDPS         ymmreg,ymmreg*,ymmrm256,imm8 AVX,SANDYBRIDGE 
+       VBLENDVPD        xmmreg,xmmreg,xmmrm128,xmmreg AVX,SANDYBRIDGE 
+       VBLENDVPD        xmmreg,xmmrm128,xmm0     AVX,SANDYBRIDGE 
+       VBLENDVPD        ymmreg,ymmreg,ymmrm256,ymmreg AVX,SANDYBRIDGE 
+       VBLENDVPD        ymmreg,ymmrm256,ymm0     AVX,SANDYBRIDGE 
+       VBLENDVPS        xmmreg,xmmreg,xmmrm128,xmmreg AVX,SANDYBRIDGE 
+       VBLENDVPS        xmmreg,xmmrm128,xmm0     AVX,SANDYBRIDGE 
+       VBLENDVPS        ymmreg,ymmreg,ymmrm256,ymmreg AVX,SANDYBRIDGE 
+       VBLENDVPD        ymmreg,ymmrm256,ymm0     AVX,SANDYBRIDGE 
+       VBROADCASTSS     xmmreg,mem32             AVX,SANDYBRIDGE 
+       VBROADCASTSS     ymmreg,mem32             AVX,SANDYBRIDGE 
+       VBROADCASTSD     ymmreg,mem64             AVX,SANDYBRIDGE 
+       VBROADCASTF128   ymmreg,mem128            AVX,SANDYBRIDGE 
+       VCMPEQPD         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPEQPD         ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPLTPD         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPLTPD         ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPLEPD         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPLEPD         ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPUNORDPD      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPUNORDPD      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNEQPD        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNEQPD        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNLTPD        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNLTPD        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNLEPD        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNLEPD        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPORDPD        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPORDPD        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPEQ_UQPD      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPEQ_UQPD      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNGEPD        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNGEPD        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNGTPD        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNGTPD        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPFALSEPD      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPFALSEPD      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNEQ_OQPD     xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNEQ_OQPD     ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPGEPD         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPGEPD         ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPGTPD         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPGTPD         ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPTRUEPD       xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPTRUEPD       ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPEQ_OSPD      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPEQ_OSPD      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPLT_OQPD      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPLT_OQPD      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPLE_OQPD      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPLE_OQPD      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPUNORD_SPD    xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPUNORD_SPD    ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNEQ_USPD     xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNEQ_USPD     ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNLT_UQPD     xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNLT_UQPD     ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNLE_UQPD     xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNLE_UQPD     ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPORD_SPD      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPORD_SPD      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPEQ_USPD      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPEQ_USPD      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNGE_UQPD     xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNGE_UQPD     ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNGT_UQPD     xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNGT_UQPD     ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPFALSE_OSPD   xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPFALSE_OSPD   ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNEQ_OSPD     xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNEQ_OSPD     ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPGE_OQPD      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPGE_OQPD      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPGT_OQPD      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPGT_OQPD      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPTRUE_USPD    xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPTRUE_USPD    ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPPD           xmmreg,xmmreg*,xmmrm128,imm8 AVX,SANDYBRIDGE 
+       VCMPPD           ymmreg,ymmreg*,ymmrm256,imm8 AVX,SANDYBRIDGE 
+       VCMPEQPS         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPEQPS         ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPLTPS         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPLTPS         ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPLEPS         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPLEPS         ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPUNORDPS      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPUNORDPS      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNEQPS        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNEQPS        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNLTPS        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNLTPS        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNLEPS        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNLEPS        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPORDPS        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPORDPS        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPEQ_UQPS      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPEQ_UQPS      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNGEPS        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNGEPS        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNGTPS        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNGTPS        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPFALSEPS      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPFALSEPS      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNEQ_OQPS     xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNEQ_OQPS     ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPGEPS         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPGEPS         ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPGTPS         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPGTPS         ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPTRUEPS       xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPTRUEPS       ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPEQ_OSPS      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPEQ_OSPS      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPLT_OQPS      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPLT_OQPS      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPLE_OQPS      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPLE_OQPS      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPUNORD_SPS    xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPUNORD_SPS    ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNEQ_USPS     xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNEQ_USPS     ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNLT_UQPS     xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNLT_UQPS     ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNLE_UQPS     xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNLE_UQPS     ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPORD_SPS      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPORD_SPS      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPEQ_USPS      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPEQ_USPS      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNGE_UQPS     xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNGE_UQPS     ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNGT_UQPS     xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNGT_UQPS     ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPFALSE_OSPS   xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPFALSE_OSPS   ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPNEQ_OSPS     xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPNEQ_OSPS     ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPGE_OQPS      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPGE_OQPS      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPGT_OQPS      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPGT_OQPS      ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPTRUE_USPS    xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VCMPTRUE_USPS    ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VCMPPS           xmmreg,xmmreg*,xmmrm128,imm8 AVX,SANDYBRIDGE 
+       VCMPPS           ymmreg,ymmreg*,ymmrm256,imm8 AVX,SANDYBRIDGE 
+       VCMPEQSD         xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPLTSD         xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPLESD         xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPUNORDSD      xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPNEQSD        xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPNLTSD        xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPNLESD        xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPORDSD        xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPEQ_UQSD      xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPNGESD        xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPNGTSD        xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPFALSESD      xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPNEQ_OQSD     xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPGESD         xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPGTSD         xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPTRUESD       xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPEQ_OSSD      xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPLT_OQSD      xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPLE_OQSD      xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPUNORD_SSD    xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPNEQ_USSD     xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPNLT_UQSD     xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPNLE_UQSD     xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPORD_SSD      xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPEQ_USSD      xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPNGE_UQSD     xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPNGT_UQSD     xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPFALSE_OSSD   xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPNEQ_OSSD     xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPGE_OQSD      xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPGT_OQSD      xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPTRUE_USSD    xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCMPSD           xmmreg,xmmreg*,xmmrm64,imm8 AVX,SANDYBRIDGE 
+       VCMPEQSS         xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPLTSS         xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPLESS         xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPUNORDSS      xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPNEQSS        xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPNLTSS        xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPNLESS        xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPORDSS        xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPEQ_UQSS      xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPNGESS        xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPNGTSS        xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPFALSESS      xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPNEQ_OQSS     xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPGESS         xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPGTSS         xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPTRUESS       xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPEQ_OSSS      xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPLT_OQSS      xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPLE_OQSS      xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPUNORD_SSS    xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPNEQ_USSS     xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPNLT_UQSS     xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPNLE_UQSS     xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPORD_SSS      xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPEQ_USSS      xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPNGE_UQSS     xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPNGT_UQSS     xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPFALSE_OSSS   xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPNEQ_OSSS     xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPGE_OQSS      xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPGT_OQSS      xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPTRUE_USSS    xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCMPSS           xmmreg,xmmreg*,xmmrm32,imm8 AVX,SANDYBRIDGE 
+       VCOMISD          xmmreg,xmmrm64           AVX,SANDYBRIDGE 
+       VCOMISS          xmmreg,xmmrm32           AVX,SANDYBRIDGE 
+       VCVTDQ2PD        xmmreg,xmmrm64           AVX,SANDYBRIDGE 
+       VCVTDQ2PD        ymmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VCVTDQ2PS        xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VCVTDQ2PS        ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VCVTPD2DQ        xmmreg,xmmreg            AVX,SANDYBRIDGE 
+       VCVTPD2DQ        xmmreg,mem128            AVX,SANDYBRIDGE,SO 
+       VCVTPD2DQ        xmmreg,ymmreg            AVX,SANDYBRIDGE 
+       VCVTPD2DQ        xmmreg,mem256            AVX,SANDYBRIDGE,SY 
+       VCVTPD2PS        xmmreg,xmmreg            AVX,SANDYBRIDGE 
+       VCVTPD2PS        xmmreg,mem128            AVX,SANDYBRIDGE,SO 
+       VCVTPD2PS        xmmreg,ymmreg            AVX,SANDYBRIDGE 
+       VCVTPD2PS        xmmreg,mem256            AVX,SANDYBRIDGE,SY 
+       VCVTPS2DQ        xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VCVTPS2DQ        ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VCVTPS2PD        xmmreg,xmmrm64           AVX,SANDYBRIDGE 
+       VCVTPS2PD        ymmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VCVTSD2SI        reg32,xmmrm64            AVX,SANDYBRIDGE 
+       VCVTSD2SI        reg64,xmmrm64            AVX,SANDYBRIDGE,LONG 
+       VCVTSD2SS        xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VCVTSI2SD        xmmreg,xmmreg*,rm32      AVX,SANDYBRIDGE,SD 
+       VCVTSI2SD        xmmreg,xmmreg*,mem32     AVX,SANDYBRIDGE,ND,SD 
+       VCVTSI2SD        xmmreg,xmmreg*,rm64      AVX,SANDYBRIDGE,LONG 
+       VCVTSI2SS        xmmreg,xmmreg*,rm32      AVX,SANDYBRIDGE,SD 
+       VCVTSI2SS        xmmreg,xmmreg*,mem32     AVX,SANDYBRIDGE,ND,SD 
+       VCVTSI2SS        xmmreg,xmmreg*,rm64      AVX,SANDYBRIDGE,LONG 
+       VCVTSS2SD        xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VCVTSS2SI        reg32,xmmrm32            AVX,SANDYBRIDGE 
+       VCVTSS2SI        reg64,xmmrm32            AVX,SANDYBRIDGE,LONG 
+       VCVTTPD2DQ       xmmreg,xmmreg            AVX,SANDYBRIDGE 
+       VCVTTPD2DQ       xmmreg,mem128            AVX,SANDYBRIDGE,SO 
+       VCVTTPD2DQ       xmmreg,ymmreg            AVX,SANDYBRIDGE 
+       VCVTTPD2DQ       xmmreg,mem256            AVX,SANDYBRIDGE,SY 
+       VCVTTPS2DQ       xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VCVTTPS2DQ       ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VCVTTSD2SI       reg32,xmmrm64            AVX,SANDYBRIDGE 
+       VCVTTSD2SI       reg64,xmmrm64            AVX,SANDYBRIDGE,LONG 
+       VCVTTSS2SI       reg32,xmmrm32            AVX,SANDYBRIDGE 
+       VCVTTSS2SI       reg64,xmmrm32            AVX,SANDYBRIDGE,LONG 
+       VDIVPD           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VDIVPD           ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VDIVPS           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VDIVPS           ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VDIVSD           xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VDIVSS           xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VDPPD            xmmreg,xmmreg*,xmmrm128,imm8 AVX,SANDYBRIDGE 
+       VDPPS            xmmreg,xmmreg*,xmmrm128,imm8 AVX,SANDYBRIDGE 
+       VDPPS            ymmreg,ymmreg*,ymmrm256,imm8 AVX,SANDYBRIDGE 
+       VEXTRACTF128     xmmrm128,xmmreg,imm8     AVX,SANDYBRIDGE 
+       VEXTRACTPS       rm32,xmmreg,imm8         AVX,SANDYBRIDGE 
+       VHADDPD          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VHADDPD          ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VHADDPS          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VHADDPS          ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VHSUBPD          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VHSUBPD          ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VHSUBPS          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VHSUBPS          ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VINSERTF128      ymmreg,ymmreg,xmmrm128,imm8 AVX,SANDYBRIDGE 
+       VINSERTPS        xmmreg,xmmreg*,xmmrm32,imm8 AVX,SANDYBRIDGE 
+       VLDDQU           xmmreg,mem128            AVX,SANDYBRIDGE 
+       VLDQQU           ymmreg,mem256            AVX,SANDYBRIDGE 
+       VLDDQU           ymmreg,mem256            AVX,SANDYBRIDGE 
+       VLDMXCSR         mem32                    AVX,SANDYBRIDGE 
+       VMASKMOVDQU      xmmreg,xmmreg            AVX,SANDYBRIDGE 
+       VMASKMOVPS       xmmreg,xmmreg,mem128     AVX,SANDYBRIDGE 
+       VMASKMOVPS       ymmreg,ymmreg,mem256     AVX,SANDYBRIDGE 
+       VMASKMOVPS       mem128,xmmreg,xmmreg     AVX,SANDYBRIDGE,SO 
+       VMASKMOVPS       mem256,xmmreg,xmmreg     AVX,SANDYBRIDGE,SY 
+       VMASKMOVPD       xmmreg,xmmreg,mem128     AVX,SANDYBRIDGE 
+       VMASKMOVPD       ymmreg,ymmreg,mem256     AVX,SANDYBRIDGE 
+       VMASKMOVPD       mem128,xmmreg,xmmreg     AVX,SANDYBRIDGE 
+       VMASKMOVPD       mem256,ymmreg,ymmreg     AVX,SANDYBRIDGE 
+       VMAXPD           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VMAXPD           ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VMAXPS           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VMAXPS           ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VMAXSD           xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VMAXSS           xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VMINPD           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VMINPD           ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VMINPS           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VMINPS           ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VMINSD           xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VMINSS           xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VMOVAPD          xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VMOVAPD          xmmrm128,xmmreg          AVX,SANDYBRIDGE 
+       VMOVAPD          ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VMOVAPD          ymmrm256,ymmreg          AVX,SANDYBRIDGE 
+       VMOVAPS          xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VMOVAPS          xmmrm128,xmmreg          AVX,SANDYBRIDGE 
+       VMOVAPS          ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VMOVAPS          ymmrm256,ymmreg          AVX,SANDYBRIDGE 
+       VMOVQ            xmmreg,xmmrm64           AVX,SANDYBRIDGE 
+       VMOVQ            xmmrm64,xmmreg           AVX,SANDYBRIDGE 
+       VMOVQ            xmmreg,rm64              AVX,SANDYBRIDGE,LONG 
+       VMOVQ            rm64,xmmreg              AVX,SANDYBRIDGE,LONG 
+       VMOVD            xmmreg,rm32              AVX,SANDYBRIDGE 
+       VMOVD            rm32,xmmreg              AVX,SANDYBRIDGE 
+       VMOVDDUP         xmmreg,xmmrm64           AVX,SANDYBRIDGE 
+       VMOVDDUP         ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VMOVDQA          xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VMOVDQA          xmmrm128,xmmreg          AVX,SANDYBRIDGE 
+       VMOVQQA          ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VMOVQQA          ymmrm256,ymmreg          AVX,SANDYBRIDGE 
+       VMOVDQA          ymmreg,ymmrm             AVX,SANDYBRIDGE 
+       VMOVDQA          ymmrm256,ymmreg          AVX,SANDYBRIDGE 
+       VMOVDQU          xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VMOVDQU          xmmrm128,xmmreg          AVX,SANDYBRIDGE 
+       VMOVQQU          ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VMOVQQU          ymmrm256,ymmreg          AVX,SANDYBRIDGE 
+       VMOVDQU          ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VMOVDQU          ymmrm256,ymmreg          AVX,SANDYBRIDGE 
+       VMOVHLPS         xmmreg,xmmreg*,xmmreg    AVX,SANDYBRIDGE 
+       VMOVHPD          xmmreg,xmmreg*,mem64     AVX,SANDYBRIDGE 
+       VMOVHPD          mem64,xmmreg             AVX,SANDYBRIDGE 
+       VMOVHPS          xmmreg,xmmreg*,mem64     AVX,SANDYBRIDGE 
+       VMOVHPS          mem64,xmmreg             AVX,SANDYBRIDGE 
+       VMOVLHPS         xmmreg,xmmreg*,xmmreg    AVX,SANDYBRIDGE 
+       VMOVLPD          xmmreg,xmmreg*,mem64     AVX,SANDYBRIDGE 
+       VMOVLPD          mem64,xmmreg             AVX,SANDYBRIDGE 
+       VMOVLPS          xmmreg,xmmreg*,mem64     AVX,SANDYBRIDGE 
+       VMOVLPS          mem64,xmmreg             AVX,SANDYBRIDGE 
+       VMOVMSKPD        reg64,xmmreg             AVX,SANDYBRIDGE,LONG 
+       VMOVMSKPD        reg32,xmmreg             AVX,SANDYBRIDGE 
+       VMOVMSKPD        reg64,ymmreg             AVX,SANDYBRIDGE,LONG 
+       VMOVMSKPD        reg32,ymmreg             AVX,SANDYBRIDGE 
+       VMOVMSKPS        reg64,xmmreg             AVX,SANDYBRIDGE,LONG 
+       VMOVMSKPS        reg32,xmmreg             AVX,SANDYBRIDGE 
+       VMOVMSKPS        reg64,ymmreg             AVX,SANDYBRIDGE,LONG 
+       VMOVMSKPS        reg32,ymmreg             AVX,SANDYBRIDGE 
+       VMOVNTDQ         mem128,xmmreg            AVX,SANDYBRIDGE 
+       VMOVNTQQ         mem256,ymmreg            AVX,SANDYBRIDGE 
+       VMOVNTDQ         mem256,ymmreg            AVX,SANDYBRIDGE 
+       VMOVNTDQA        xmmreg,mem128            AVX,SANDYBRIDGE 
+       VMOVNTPD         mem128,xmmreg            AVX,SANDYBRIDGE 
+       VMOVNTPD         mem256,ymmreg            AVX,SANDYBRIDGE 
+       VMOVNTPS         mem128,xmmreg            AVX,SANDYBRIDGE 
+       VMOVNTPS         mem128,ymmreg            AVX,SANDYBRIDGE 
+       VMOVSD           xmmreg,xmmreg*,xmmreg    AVX,SANDYBRIDGE 
+       VMOVSD           xmmreg,mem64             AVX,SANDYBRIDGE 
+       VMOVSD           xmmreg,xmmreg*,xmmreg    AVX,SANDYBRIDGE 
+       VMOVSD           mem64,xmmreg             AVX,SANDYBRIDGE 
+       VMOVSHDUP        xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VMOVSHDUP        ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VMOVSLDUP        xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VMOVSLDUP        ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VMOVSS           xmmreg,xmmreg*,xmmreg    AVX,SANDYBRIDGE 
+       VMOVSS           xmmreg,mem64             AVX,SANDYBRIDGE 
+       VMOVSS           xmmreg,xmmreg*,xmmreg    AVX,SANDYBRIDGE 
+       VMOVSS           mem64,xmmreg             AVX,SANDYBRIDGE 
+       VMOVUPD          xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VMOVUPD          xmmrm128,xmmreg          AVX,SANDYBRIDGE 
+       VMOVUPD          ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VMOVUPD          ymmrm256,ymmreg          AVX,SANDYBRIDGE 
+       VMOVUPS          xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VMOVUPS          xmmrm128,xmmreg          AVX,SANDYBRIDGE 
+       VMOVUPS          ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VMOVUPS          ymmrm256,ymmreg          AVX,SANDYBRIDGE 
+       VMPSADBW         xmmreg,xmmreg*,xmmrm128,imm8 AVX,SANDYBRIDGE 
+       VMULPD           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VMULPD           ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VMULPS           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VMULPS           ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VMULSD           xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VMULSS           xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VORPD            xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VORPD            ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VORPS            xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VORPS            ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VPABSB           xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VPABSW           xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VPABSD           xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VPACKSSWB        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPACKSSDW        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPACKUSWB        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPACKUSDW        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPADDB           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPADDW           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPADDD           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPADDQ           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPADDSB          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPADDSW          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPADDUSB         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPADDUSW         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPALIGNR         xmmreg,xmmreg*,xmmrm128,imm8 AVX,SANDYBRIDGE 
+       VPAND            xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPANDN           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPAVGB           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPAVGW           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPBLENDVB        xmmreg,xmmreg*,xmmrm128,xmmreg AVX,SANDYBRIDGE 
+       VPBLENDW         xmmreg,xmmreg*,xmmrm128,imm8 AVX,SANDYBRIDGE 
+       VPCMPESTRI       xmmreg,xmmrm128,imm8     AVX,SANDYBRIDGE 
+       VPCMPESTRM       xmmreg,xmmrm128,imm8     AVX,SANDYBRIDGE 
+       VPCMPISTRI       xmmreg,xmmrm128,imm8     AVX,SANDYBRIDGE 
+       VPCMPISTRM       xmmreg,xmmrm128,imm8     AVX,SANDYBRIDGE 
+       VPCMPEQB         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPCMPEQW         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPCMPEQD         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPCMPEQQ         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPCMPGTB         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPCMPGTW         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPCMPGTD         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPCMPGTQ         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPERMILPD        xmmreg,xmmreg,xmmrm128   AVX,SANDYBRIDGE 
+       VPERMILPD        ymmreg,ymmreg,ymmrm256   AVX,SANDYBRIDGE 
+       VPERMILPD        xmmreg,xmmrm128,imm8     AVX,SANDYBRIDGE 
+       VPERMILPD        ymmreg,ymmrm256,imm8     AVX,SANDYBRIDGE 
+       VPERMILTD2PD     xmmreg,xmmreg,xmmrm128,xmmreg AVX,SANDYBRIDGE 
+       VPERMILTD2PD     xmmreg,xmmreg,xmmreg,xmmrm128 AVX,SANDYBRIDGE 
+       VPERMILTD2PD     ymmreg,ymmreg,ymmrm256,ymmreg AVX,SANDYBRIDGE 
+       VPERMILTD2PD     ymmreg,ymmreg,ymmreg,ymmrm256 AVX,SANDYBRIDGE 
+       VPERMILMO2PD     xmmreg,xmmreg,xmmrm128,xmmreg AVX,SANDYBRIDGE 
+       VPERMILMO2PD     xmmreg,xmmreg,xmmreg,xmmrm128 AVX,SANDYBRIDGE 
+       VPERMILMO2PD     ymmreg,ymmreg,ymmrm256,ymmreg AVX,SANDYBRIDGE 
+       VPERMILMO2PD     ymmreg,ymmreg,ymmreg,ymmrm256 AVX,SANDYBRIDGE 
+       VPERMILMZ2PD     xmmreg,xmmreg,xmmrm128,xmmreg AVX,SANDYBRIDGE 
+       VPERMILMZ2PD     xmmreg,xmmreg,xmmreg,xmmrm128 AVX,SANDYBRIDGE 
+       VPERMILMZ2PD     ymmreg,ymmreg,ymmrm256,ymmreg AVX,SANDYBRIDGE 
+       VPERMILMZ2PD     ymmreg,ymmreg,ymmreg,ymmrm256 AVX,SANDYBRIDGE 
+       VPERMIL2PD       xmmreg,xmmreg,xmmrm128,xmmreg,imm8 AVX,SANDYBRIDGE 
+       VPERMIL2PD       xmmreg,xmmreg,xmmreg,xmmrm128,imm8 AVX,SANDYBRIDGE 
+       VPERMIL2PD       ymmreg,ymmreg,ymmrm256,ymmreg,imm8 AVX,SANDYBRIDGE 
+       VPERMIL2PD       ymmreg,ymmreg,ymmreg,ymmrm256,imm8 AVX,SANDYBRIDGE 
+       VPERMILPS        xmmreg,xmmreg,xmmrm128   AVX,SANDYBRIDGE 
+       VPERMILPS        ymmreg,ymmreg,ymmrm256   AVX,SANDYBRIDGE 
+       VPERMILPS        xmmreg,xmmrm128,imm8     AVX,SANDYBRIDGE 
+       VPERMILPS        ymmreg,ymmrm256,imm8     AVX,SANDYBRIDGE 
+       VPERMILTD2PS     xmmreg,xmmreg,xmmrm128,xmmreg AVX,SANDYBRIDGE 
+       VPERMILTD2PS     xmmreg,xmmreg,xmmreg,xmmrm128 AVX,SANDYBRIDGE 
+       VPERMILTD2PS     ymmreg,ymmreg,ymmrm256,ymmreg AVX,SANDYBRIDGE 
+       VPERMILTD2PS     ymmreg,ymmreg,ymmreg,ymmrm256 AVX,SANDYBRIDGE 
+       VPERMILMO2PS     xmmreg,xmmreg,xmmrm128,xmmreg AVX,SANDYBRIDGE 
+       VPERMILMO2PS     xmmreg,xmmreg,xmmreg,xmmrm128 AVX,SANDYBRIDGE 
+       VPERMILMO2PS     ymmreg,ymmreg,ymmrm256,ymmreg AVX,SANDYBRIDGE 
+       VPERMILMO2PS     ymmreg,ymmreg,ymmreg,ymmrm256 AVX,SANDYBRIDGE 
+       VPERMILMZ2PS     xmmreg,xmmreg,xmmrm128,xmmreg AVX,SANDYBRIDGE 
+       VPERMILMZ2PS     xmmreg,xmmreg,xmmreg,xmmrm128 AVX,SANDYBRIDGE 
+       VPERMILMZ2PS     ymmreg,ymmreg,ymmrm256,ymmreg AVX,SANDYBRIDGE 
+       VPERMILMZ2PS     ymmreg,ymmreg,ymmreg,ymmrm256 AVX,SANDYBRIDGE 
+       VPERMIL2PS       xmmreg,xmmreg,xmmrm128,xmmreg,imm8 AVX,SANDYBRIDGE 
+       VPERMIL2PS       xmmreg,xmmreg,xmmreg,xmmrm128,imm8 AVX,SANDYBRIDGE 
+       VPERMIL2PS       ymmreg,ymmreg,ymmrm256,ymmreg,imm8 AVX,SANDYBRIDGE 
+       VPERMIL2PS       ymmreg,ymmreg,ymmreg,ymmrm256,imm8 AVX,SANDYBRIDGE 
+       VPERM2F128       ymmreg,ymmreg,ymmrm256,imm8 AVX,SANDYBRIDGE 
+       VPEXTRB          reg64,xmmreg,imm8        AVX,SANDYBRIDGE,LONG 
+       VPEXTRB          reg32,xmmreg,imm8        AVX,SANDYBRIDGE 
+       VPEXTRB          mem8,xmmreg,imm8         AVX,SANDYBRIDGE 
+       VPEXTRW          reg64,xmmreg,imm8        AVX,SANDYBRIDGE,LONG 
+       VPEXTRW          reg32,xmmreg,imm8        AVX,SANDYBRIDGE 
+       VPEXTRW          mem16,xmmreg,imm8        AVX,SANDYBRIDGE 
+       VPEXTRW          reg64,xmmreg,imm8        AVX,SANDYBRIDGE,LONG 
+       VPEXTRW          reg32,xmmreg,imm8        AVX,SANDYBRIDGE 
+       VPEXTRW          mem16,xmmreg,imm8        AVX,SANDYBRIDGE 
+       VPEXTRD          reg64,xmmreg,imm8        AVX,SANDYBRIDGE,LONG 
+       VPEXTRD          rm32,xmmreg,imm8         AVX,SANDYBRIDGE 
+       VPEXTRQ          rm64,xmmreg,imm8         AVX,SANDYBRIDGE,LONG 
+       VPHADDW          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPHADDD          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPHADDSW         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPHMINPOSUW      xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VPHSUBW          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPHSUBD          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPHSUBSW         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPINSRB          xmmreg,xmmreg*,mem8,imm8 AVX,SANDYBRIDGE 
+       VPINSRB          xmmreg,xmmreg*,rm8,imm8  AVX,SANDYBRIDGE 
+       VPINSRB          xmmreg,xmmreg*,reg32,imm8 AVX,SANDYBRIDGE 
+       VPINSRW          xmmreg,xmmreg*,mem16,imm8 AVX,SANDYBRIDGE 
+       VPINSRW          xmmreg,xmmreg*,rm16,imm8 AVX,SANDYBRIDGE 
+       VPINSRW          xmmreg,xmmreg*,reg32,imm8 AVX,SANDYBRIDGE 
+       VPINSRD          xmmreg,xmmreg*,mem32,imm8 AVX,SANDYBRIDGE 
+       VPINSRD          xmmreg,xmmreg*,rm32,imm8 AVX,SANDYBRIDGE 
+       VPINSRQ          xmmreg,xmmreg*,mem64,imm8 AVX,SANDYBRIDGE,LONG 
+       VPINSRQ          xmmreg,xmmreg*,rm64,imm8 AVX,SANDYBRIDGE,LONG 
+       VPMADDWD         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMADDUBSW       xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMAXSB          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMAXSW          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMAXSD          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMAXUB          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMAXUW          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMAXUD          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMINSB          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMINSW          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMINSD          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMINUB          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMINUW          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMINUD          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMOVMSKB        reg64,xmmreg             AVX,SANDYBRIDGE,LONG 
+       VPMOVMSKB        reg32,xmmreg             AVX,SANDYBRIDGE 
+       VPMOVSXBW        xmmreg,xmmrm64           AVX,SANDYBRIDGE 
+       VPMOVSXBD        xmmreg,xmmrm32           AVX,SANDYBRIDGE 
+       VPMOVSXBQ        xmmreg,xmmrm16           AVX,SANDYBRIDGE 
+       VPMOVSXWD        xmmreg,xmmrm64           AVX,SANDYBRIDGE 
+       VPMOVSXWQ        xmmreg,xmmrm32           AVX,SANDYBRIDGE 
+       VPMOVSXDQ        xmmreg,xmmrm64           AVX,SANDYBRIDGE 
+       VPMOVZXBW        xmmreg,xmmrm64           AVX,SANDYBRIDGE 
+       VPMOVZXBD        xmmreg,xmmrm32           AVX,SANDYBRIDGE 
+       VPMOVZXBQ        xmmreg,xmmrm16           AVX,SANDYBRIDGE 
+       VPMOVZXWD        xmmreg,xmmrm64           AVX,SANDYBRIDGE 
+       VPMOVZXWQ        xmmreg,xmmrm32           AVX,SANDYBRIDGE 
+       VPMOVZXDQ        xmmreg,xmmrm64           AVX,SANDYBRIDGE 
+       VPMULHUW         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMULHRSW        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMULHW          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMULLW          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMULLD          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMULUDQ         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPMULDQ          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPOR             xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSADBW          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSHUFB          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSHUFD          xmmreg,xmmrm128,imm8     AVX,SANDYBRIDGE 
+       VPSHUFHW         xmmreg,xmmrm128,imm8     AVX,SANDYBRIDGE 
+       VPSHUFLW         xmmreg,xmmrm128,imm8     AVX,SANDYBRIDGE 
+       VPSIGNB          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSIGNW          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSIGND          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSLLDQ          xmmreg,xmmreg*,imm8      AVX,SANDYBRIDGE 
+       VPSRLDQ          xmmreg,xmmreg*,imm8      AVX,SANDYBRIDGE 
+       VPSLLW           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSLLW           xmmreg,xmmreg*,imm8      AVX,SANDYBRIDGE 
+       VPSLLD           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSLLD           xmmreg,xmmreg*,imm8      AVX,SANDYBRIDGE 
+       VPSLLQ           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSLLQ           xmmreg,xmmreg*,imm8      AVX,SANDYBRIDGE 
+       VPSRAW           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSRAW           xmmreg,xmmreg*,imm8      AVX,SANDYBRIDGE 
+       VPSRAD           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSRAD           xmmreg,xmmreg*,imm8      AVX,SANDYBRIDGE 
+       VPSRLW           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSRLW           xmmreg,xmmreg*,imm8      AVX,SANDYBRIDGE 
+       VPSRLD           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSRLD           xmmreg,xmmreg*,imm8      AVX,SANDYBRIDGE 
+       VPSRLQ           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSRLQ           xmmreg,xmmreg*,imm8      AVX,SANDYBRIDGE 
+       VPTEST           xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VPTEST           ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VPSUBB           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSUBW           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSUBD           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSUBQ           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSUBSB          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSUBSW          xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSUBUSB         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPSUBUSW         xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPUNPCKHBW       xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPUNPCKHWD       xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPUNPCKHDQ       xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPUNPCKHQDQ      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPUNPCKLBW       xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPUNPCKLWD       xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPUNPCKLDQ       xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPUNPCKLQDQ      xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPXOR            xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VRCPPS           xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VRCPPS           ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VRCPSS           xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VRSQRTPS         xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VRSQRTPS         ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VRSQRTSS         xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VROUNDPD         xmmreg,xmmrm128,imm8     AVX,SANDYBRIDGE 
+       VROUNDPD         ymmreg,ymmrm256,imm8     AVX,SANDYBRIDGE 
+       VROUNDPS         xmmreg,xmmrm128,imm8     AVX,SANDYBRIDGE 
+       VROUNDPS         ymmreg,ymmrm256,imm8     AVX,SANDYBRIDGE 
+       VROUNDSD         xmmreg,xmmreg*,xmmrm64,imm8 AVX,SANDYBRIDGE 
+       VROUNDSS         xmmreg,xmmreg*,xmmrm32,imm8 AVX,SANDYBRIDGE 
+       VSHUFPD          xmmreg,xmmreg*,xmmrm128,imm8 AVX,SANDYBRIDGE 
+       VSHUFPD          ymmreg,ymmreg*,ymmrm256,imm8 AVX,SANDYBRIDGE 
+       VSHUFPS          xmmreg,xmmreg*,xmmrm128,imm8 AVX,SANDYBRIDGE 
+       VSHUFPS          ymmreg,ymmreg*,ymmrm256,imm8 AVX,SANDYBRIDGE 
+       VSQRTPD          xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VSQRTPD          ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VSQRTPS          xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VSQRTPS          ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VSQRTSD          xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VSQRTSS          xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VSTMXCSR         mem32                    AVX,SANDYBRIDGE 
+       VSUBPD           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VSUBPD           ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VSUBPS           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VSUBPS           ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VSUBSD           xmmreg,xmmreg*,xmmrm64   AVX,SANDYBRIDGE 
+       VSUBSS           xmmreg,xmmreg*,xmmrm32   AVX,SANDYBRIDGE 
+       VTESTPS          xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VTESTPS          ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VTESTPD          xmmreg,xmmrm128          AVX,SANDYBRIDGE 
+       VTESTPD          ymmreg,ymmrm256          AVX,SANDYBRIDGE 
+       VUCOMISD         xmmreg,xmmrm64           AVX,SANDYBRIDGE 
+       VUCOMISS         xmmreg,xmmrm32           AVX,SANDYBRIDGE 
+       VUNPCKHPD        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VUNPCKHPD        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VUNPCKHPS        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VUNPCKHPS        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VUNPCKLPD        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VUNPCKLPD        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VUNPCKLPS        xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VUNPCKLPS        ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VXORPD           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VXORPD           ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VXORPS           xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VXORPS           ymmreg,ymmreg*,ymmrm256  AVX,SANDYBRIDGE 
+       VZEROALL                                  AVX,SANDYBRIDGE 
+       VZEROUPPER                                AVX,SANDYBRIDGE
+
+B.1.26 Intel Carry-Less Multiplication instructions (CLMUL)
+
+       PCLMULLQLQDQ     xmmreg,xmmrm128          SSE,WESTMERE 
+       PCLMULHQLQDQ     xmmreg,xmmrm128          SSE,WESTMERE 
+       PCLMULLQHQDQ     xmmreg,xmmrm128          SSE,WESTMERE 
+       PCLMULHQHQDQ     xmmreg,xmmrm128          SSE,WESTMERE 
+       PCLMULQDQ        xmmreg,xmmrm128,imm8     SSE,WESTMERE
+
+B.1.27 Intel AVX Carry-Less Multiplication instructions (CLMUL)
+
+       VPCLMULLQLQDQ    xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPCLMULHQLQDQ    xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPCLMULLQHQDQ    xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPCLMULHQHQDQ    xmmreg,xmmreg*,xmmrm128  AVX,SANDYBRIDGE 
+       VPCLMULQDQ       xmmreg,xmmreg*,xmmrm128,imm8 AVX,SANDYBRIDGE
+
+B.1.28 Intel Fused Multiply-Add instructions (FMA)
+
+       VFMADD132PS      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADD132PS      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADD132PD      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADD132PD      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADD312PS      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADD312PS      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADD312PD      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADD312PD      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADD213PS      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADD213PS      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADD213PD      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADD213PD      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADD123PS      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADD123PS      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADD123PD      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADD123PD      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADD231PS      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADD231PS      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADD231PD      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADD231PD      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADD321PS      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADD321PS      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADD321PD      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADD321PD      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADDSUB132PS   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADDSUB132PS   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADDSUB132PD   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADDSUB132PD   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADDSUB312PS   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADDSUB312PS   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADDSUB312PD   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADDSUB312PD   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADDSUB213PS   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADDSUB213PS   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADDSUB213PD   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADDSUB213PD   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADDSUB123PS   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADDSUB123PS   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADDSUB123PD   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADDSUB123PD   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADDSUB231PS   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADDSUB231PS   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADDSUB231PD   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADDSUB231PD   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADDSUB321PS   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADDSUB321PS   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADDSUB321PD   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMADDSUB321PD   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUB132PS      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUB132PS      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUB132PD      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUB132PD      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUB312PS      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUB312PS      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUB312PD      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUB312PD      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUB213PS      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUB213PS      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUB213PD      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUB213PD      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUB123PS      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUB123PS      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUB123PD      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUB123PD      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUB231PS      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUB231PS      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUB231PD      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUB231PD      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUB321PS      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUB321PS      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUB321PD      xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUB321PD      ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUBADD132PS   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUBADD132PS   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUBADD132PD   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUBADD132PD   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUBADD312PS   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUBADD312PS   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUBADD312PD   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUBADD312PD   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUBADD213PS   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUBADD213PS   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUBADD213PD   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUBADD213PD   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUBADD123PS   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUBADD123PS   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUBADD123PD   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUBADD123PD   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUBADD231PS   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUBADD231PS   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUBADD231PD   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUBADD231PD   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUBADD321PS   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUBADD321PS   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMSUBADD321PD   xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFMSUBADD321PD   ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMADD132PS     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMADD132PS     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMADD132PD     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMADD132PD     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMADD312PS     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMADD312PS     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMADD312PD     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMADD312PD     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMADD213PS     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMADD213PS     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMADD213PD     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMADD213PD     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMADD123PS     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMADD123PS     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMADD123PD     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMADD123PD     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMADD231PS     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMADD231PS     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMADD231PD     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMADD231PD     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMADD321PS     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMADD321PS     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMADD321PD     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMADD321PD     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMSUB132PS     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMSUB132PS     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMSUB132PD     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMSUB132PD     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMSUB312PS     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMSUB312PS     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMSUB312PD     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMSUB312PD     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMSUB213PS     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMSUB213PS     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMSUB213PD     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMSUB213PD     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMSUB123PS     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMSUB123PS     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMSUB123PD     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMSUB123PD     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMSUB231PS     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMSUB231PS     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMSUB231PD     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMSUB231PD     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMSUB321PS     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMSUB321PS     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFNMSUB321PD     xmmreg,xmmreg,xmmrm128   FMA,FUTURE 
+       VFNMSUB321PD     ymmreg,ymmreg,ymmrm256   FMA,FUTURE 
+       VFMADD132SS      xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFMADD132SD      xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFMADD312SS      xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFMADD312SD      xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFMADD213SS      xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFMADD213SD      xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFMADD123SS      xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFMADD123SD      xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFMADD231SS      xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFMADD231SD      xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFMADD321SS      xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFMADD321SD      xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFMSUB132SS      xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFMSUB132SD      xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFMSUB312SS      xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFMSUB312SD      xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFMSUB213SS      xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFMSUB213SD      xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFMSUB123SS      xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFMSUB123SD      xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFMSUB231SS      xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFMSUB231SD      xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFMSUB321SS      xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFMSUB321SD      xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFNMADD132SS     xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFNMADD132SD     xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFNMADD312SS     xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFNMADD312SD     xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFNMADD213SS     xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFNMADD213SD     xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFNMADD123SS     xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFNMADD123SD     xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFNMADD231SS     xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFNMADD231SD     xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFNMADD321SS     xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFNMADD321SD     xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFNMSUB132SS     xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFNMSUB132SD     xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFNMSUB312SS     xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFNMSUB312SD     xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFNMSUB213SS     xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFNMSUB213SD     xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFNMSUB123SS     xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFNMSUB123SD     xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFNMSUB231SS     xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFNMSUB231SD     xmmreg,xmmreg,xmmrm64    FMA,FUTURE 
+       VFNMSUB321SS     xmmreg,xmmreg,xmmrm32    FMA,FUTURE 
+       VFNMSUB321SD     xmmreg,xmmreg,xmmrm64    FMA,FUTURE
+
+B.1.29 VIA (Centaur) security instructions
+
+       XSTORE                                    PENT,CYRIX 
+       XCRYPTECB                                 PENT,CYRIX 
+       XCRYPTCBC                                 PENT,CYRIX 
+       XCRYPTCTR                                 PENT,CYRIX 
+       XCRYPTCFB                                 PENT,CYRIX 
+       XCRYPTOFB                                 PENT,CYRIX 
+       MONTMUL                                   PENT,CYRIX 
+       XSHA1                                     PENT,CYRIX 
+       XSHA256                                   PENT,CYRIX
+
+B.1.30 AMD Lightweight Profiling (LWP) instructions
+
+       LLWPCB           reg16                    AMD 
+       LLWPCB           reg32                    AMD,386 
+       LLWPCB           reg64                    AMD,X64 
+       SLWPCB           reg16                    AMD 
+       SLWPCB           reg32                    AMD,386 
+       SLWPCB           reg64                    AMD,X64 
+       LWPVAL           reg16,rm32,imm16         AMD,386 
+       LWPVAL           reg32,rm32,imm32         AMD,386 
+       LWPVAL           reg64,rm32,imm32         AMD,X64 
+       LWPINS           reg16,rm32,imm16         AMD,386 
+       LWPINS           reg32,rm32,imm32         AMD,386 
+       LWPINS           reg64,rm32,imm32         AMD,X64
+
+B.1.31 AMD XOP, FMA4 and CVT16 instructions (SSE5)
+
+       VCVTPH2PS        xmmreg,xmmrm64*,imm8     AMD,SSE5 
+       VCVTPH2PS        ymmreg,xmmrm128,imm8     AMD,SSE5 
+       VCVTPH2PS        ymmreg,ymmrm128*,imm8    AMD,SSE5 
+       VCVTPS2PH        xmmrm64,xmmreg*,imm8     AMD,SSE5 
+       VCVTPS2PH        xmmrm128,ymmreg,imm8     AMD,SSE5 
+       VCVTPS2PH        ymmrm128,ymmreg*,imm8    AMD,SSE5 
+       VFMADDPD         xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VFMADDPD         ymmreg,ymmreg*,ymmrm256,ymmreg AMD,SSE5 
+       VFMADDPD         xmmreg,xmmreg*,xmmreg,xmmrm128 AMD,SSE5 
+       VFMADDPD         ymmreg,ymmreg*,ymmreg,ymmrm256 AMD,SSE5 
+       VFMADDPS         xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VFMADDPS         ymmreg,ymmreg*,ymmrm256,ymmreg AMD,SSE5 
+       VFMADDPS         xmmreg,xmmreg*,xmmreg,xmmrm128 AMD,SSE5 
+       VFMADDPS         ymmreg,ymmreg*,ymmreg,ymmrm256 AMD,SSE5 
+       VFMADDSD         xmmreg,xmmreg*,xmmrm64,xmmreg AMD,SSE5 
+       VFMADDSD         xmmreg,xmmreg*,xmmreg,xmmrm64 AMD,SSE5 
+       VFMADDSS         xmmreg,xmmreg*,xmmrm32,xmmreg AMD,SSE5 
+       VFMADDSS         xmmreg,xmmreg*,xmmreg,xmmrm32 AMD,SSE5 
+       VFMADDSUBPD      xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VFMADDSUBPD      ymmreg,ymmreg*,ymmrm256,ymmreg AMD,SSE5 
+       VFMADDSUBPD      xmmreg,xmmreg*,xmmreg,xmmrm128 AMD,SSE5 
+       VFMADDSUBPD      ymmreg,ymmreg*,ymmreg,ymmrm256 AMD,SSE5 
+       VFMADDSUBPS      xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VFMADDSUBPS      ymmreg,ymmreg*,ymmrm256,ymmreg AMD,SSE5 
+       VFMADDSUBPS      xmmreg,xmmreg*,xmmreg,xmmrm128 AMD,SSE5 
+       VFMADDSUBPS      ymmreg,ymmreg*,ymmreg,ymmrm256 AMD,SSE5 
+       VFMSUBADDPD      xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VFMSUBADDPD      ymmreg,ymmreg*,ymmrm256,ymmreg AMD,SSE5 
+       VFMSUBADDPD      xmmreg,xmmreg*,xmmreg,xmmrm128 AMD,SSE5 
+       VFMSUBADDPD      ymmreg,ymmreg*,ymmreg,ymmrm256 AMD,SSE5 
+       VFMSUBADDPS      xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VFMSUBADDPS      ymmreg,ymmreg*,ymmrm256,ymmreg AMD,SSE5 
+       VFMSUBADDPS      xmmreg,xmmreg*,xmmreg,xmmrm128 AMD,SSE5 
+       VFMSUBADDPS      ymmreg,ymmreg*,ymmreg,ymmrm256 AMD,SSE5 
+       VFMSUBPD         xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VFMSUBPD         ymmreg,ymmreg*,ymmrm256,ymmreg AMD,SSE5 
+       VFMSUBPD         xmmreg,xmmreg*,xmmreg,xmmrm128 AMD,SSE5 
+       VFMSUBPD         ymmreg,ymmreg*,ymmreg,ymmrm256 AMD,SSE5 
+       VFMSUBPS         xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VFMSUBPS         ymmreg,ymmreg*,ymmrm256,ymmreg AMD,SSE5 
+       VFMSUBPS         xmmreg,xmmreg*,xmmreg,xmmrm128 AMD,SSE5 
+       VFMSUBPS         ymmreg,ymmreg*,ymmreg,ymmrm256 AMD,SSE5 
+       VFMSUBSD         xmmreg,xmmreg*,xmmrm64,xmmreg AMD,SSE5 
+       VFMSUBSD         xmmreg,xmmreg*,xmmreg,xmmrm64 AMD,SSE5 
+       VFMSUBSS         xmmreg,xmmreg*,xmmrm32,xmmreg AMD,SSE5 
+       VFMSUBSS         xmmreg,xmmreg*,xmmreg,xmmrm32 AMD,SSE5 
+       VFNMADDPD        xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VFNMADDPD        ymmreg,ymmreg*,ymmrm256,ymmreg AMD,SSE5 
+       VFNMADDPD        xmmreg,xmmreg*,xmmreg,xmmrm128 AMD,SSE5 
+       VFNMADDPD        ymmreg,ymmreg*,ymmreg,ymmrm256 AMD,SSE5 
+       VFNMADDPS        xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VFNMADDPS        ymmreg,ymmreg*,ymmrm256,ymmreg AMD,SSE5 
+       VFNMADDPS        xmmreg,xmmreg*,xmmreg,xmmrm128 AMD,SSE5 
+       VFNMADDPS        ymmreg,ymmreg*,ymmreg,ymmrm256 AMD,SSE5 
+       VFNMADDSD        xmmreg,xmmreg*,xmmrm64,xmmreg AMD,SSE5 
+       VFNMADDSD        xmmreg,xmmreg*,xmmreg,xmmrm64 AMD,SSE5 
+       VFNMADDSS        xmmreg,xmmreg*,xmmrm32,xmmreg AMD,SSE5 
+       VFNMADDSS        xmmreg,xmmreg*,xmmreg,xmmrm32 AMD,SSE5 
+       VFNMSUBPD        xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VFNMSUBPD        ymmreg,ymmreg*,ymmrm256,ymmreg AMD,SSE5 
+       VFNMSUBPD        xmmreg,xmmreg*,xmmreg,xmmrm128 AMD,SSE5 
+       VFNMSUBPD        ymmreg,ymmreg*,ymmreg,ymmrm256 AMD,SSE5 
+       VFNMSUBPS        xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VFNMSUBPS        ymmreg,ymmreg*,ymmrm256,ymmreg AMD,SSE5 
+       VFNMSUBPS        xmmreg,xmmreg*,xmmreg,xmmrm128 AMD,SSE5 
+       VFNMSUBPS        ymmreg,ymmreg*,ymmreg,ymmrm256 AMD,SSE5 
+       VFNMSUBSD        xmmreg,xmmreg*,xmmrm64,xmmreg AMD,SSE5 
+       VFNMSUBSD        xmmreg,xmmreg*,xmmreg,xmmrm64 AMD,SSE5 
+       VFNMSUBSS        xmmreg,xmmreg*,xmmrm32,xmmreg AMD,SSE5 
+       VFNMSUBSS        xmmreg,xmmreg*,xmmreg,xmmrm32 AMD,SSE5 
+       VFRCZPD          xmmreg,xmmrm128*         AMD,SSE5 
+       VFRCZPD          ymmreg,ymmrm256*         AMD,SSE5 
+       VFRCZPS          xmmreg,xmmrm128*         AMD,SSE5 
+       VFRCZPS          ymmreg,ymmrm256*         AMD,SSE5 
+       VFRCZSD          xmmreg,xmmrm64*          AMD,SSE5 
+       VFRCZSS          xmmreg,xmmrm32*          AMD,SSE5 
+       VPCMOV           xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VPCMOV           ymmreg,ymmreg*,ymmrm256,ymmreg AMD,SSE5 
+       VPCMOV           xmmreg,xmmreg*,xmmreg,xmmrm128 AMD,SSE5 
+       VPCMOV           ymmreg,ymmreg*,ymmreg,ymmrm256 AMD,SSE5 
+       VPCOMB           xmmreg,xmmreg*,xmmrm128,imm8 AMD,SSE5 
+       VPCOMD           xmmreg,xmmreg*,xmmrm128,imm8 AMD,SSE5 
+       VPCOMQ           xmmreg,xmmreg*,xmmrm128,imm8 AMD,SSE5 
+       VPCOMUB          xmmreg,xmmreg*,xmmrm128,imm8 AMD,SSE5 
+       VPCOMUD          xmmreg,xmmreg*,xmmrm128,imm8 AMD,SSE5 
+       VPCOMUQ          xmmreg,xmmreg*,xmmrm128,imm8 AMD,SSE5 
+       VPCOMUW          xmmreg,xmmreg*,xmmrm128,imm8 AMD,SSE5 
+       VPCOMW           xmmreg,xmmreg*,xmmrm128,imm8 AMD,SSE5 
+       VPHADDBD         xmmreg,xmmrm128*         AMD,SSE5 
+       VPHADDBQ         xmmreg,xmmrm128*         AMD,SSE5 
+       VPHADDBW         xmmreg,xmmrm128*         AMD,SSE5 
+       VPHADDDQ         xmmreg,xmmrm128*         AMD,SSE5 
+       VPHADDUBD        xmmreg,xmmrm128*         AMD,SSE5 
+       VPHADDUBQ        xmmreg,xmmrm128*         AMD,SSE5 
+       VPHADDUBW        xmmreg,xmmrm128*         AMD,SSE5 
+       VPHADDUDQ        xmmreg,xmmrm128*         AMD,SSE5 
+       VPHADDUWD        xmmreg,xmmrm128*         AMD,SSE5 
+       VPHADDUWQ        xmmreg,xmmrm128*         AMD,SSE5 
+       VPHADDWD         xmmreg,xmmrm128*         AMD,SSE5 
+       VPHADDWQ         xmmreg,xmmrm128*         AMD,SSE5 
+       VPHSUBBW         xmmreg,xmmrm128*         AMD,SSE5 
+       VPHSUBDQ         xmmreg,xmmrm128*         AMD,SSE5 
+       VPHSUBWD         xmmreg,xmmrm128*         AMD,SSE5 
+       VPMACSDD         xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VPMACSDQH        xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VPMACSDQL        xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VPMACSSDD        xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VPMACSSDQH       xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VPMACSSDQL       xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VPMACSSWD        xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VPMACSSWW        xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VPMACSWD         xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VPMACSWW         xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VPMADCSSWD       xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VPMADCSWD        xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VPPERM           xmmreg,xmmreg*,xmmreg,xmmrm128 AMD,SSE5 
+       VPPERM           xmmreg,xmmreg*,xmmrm128,xmmreg AMD,SSE5 
+       VPROTB           xmmreg,xmmrm128*,xmmreg  AMD,SSE5 
+       VPROTB           xmmreg,xmmreg*,xmmrm128  AMD,SSE5 
+       VPROTB           xmmreg,xmmrm128*,imm8    AMD,SSE5 
+       VPROTD           xmmreg,xmmrm128*,xmmreg  AMD,SSE5 
+       VPROTD           xmmreg,xmmreg*,xmmrm128  AMD,SSE5 
+       VPROTD           xmmreg,xmmrm128*,imm8    AMD,SSE5 
+       VPROTQ           xmmreg,xmmrm128*,xmmreg  AMD,SSE5 
+       VPROTQ           xmmreg,xmmreg*,xmmrm128  AMD,SSE5 
+       VPROTQ           xmmreg,xmmrm128*,imm8    AMD,SSE5 
+       VPROTW           xmmreg,xmmrm128*,xmmreg  AMD,SSE5 
+       VPROTW           xmmreg,xmmreg*,xmmrm128  AMD,SSE5 
+       VPROTW           xmmreg,xmmrm128*,imm8    AMD,SSE5 
+       VPSHAB           xmmreg,xmmrm128*,xmmreg  AMD,SSE5 
+       VPSHAB           xmmreg,xmmreg*,xmmrm128  AMD,SSE5 
+       VPSHAD           xmmreg,xmmrm128*,xmmreg  AMD,SSE5 
+       VPSHAD           xmmreg,xmmreg*,xmmrm128  AMD,SSE5 
+       VPSHAQ           xmmreg,xmmrm128*,xmmreg  AMD,SSE5 
+       VPSHAQ           xmmreg,xmmreg*,xmmrm128  AMD,SSE5 
+       VPSHAW           xmmreg,xmmrm128*,xmmreg  AMD,SSE5 
+       VPSHAW           xmmreg,xmmreg*,xmmrm128  AMD,SSE5 
+       VPSHLB           xmmreg,xmmrm128*,xmmreg  AMD,SSE5 
+       VPSHLB           xmmreg,xmmreg*,xmmrm128  AMD,SSE5 
+       VPSHLD           xmmreg,xmmrm128*,xmmreg  AMD,SSE5 
+       VPSHLD           xmmreg,xmmreg*,xmmrm128  AMD,SSE5 
+       VPSHLQ           xmmreg,xmmrm128*,xmmreg  AMD,SSE5 
+       VPSHLQ           xmmreg,xmmreg*,xmmrm128  AMD,SSE5 
+       VPSHLW           xmmreg,xmmrm128*,xmmreg  AMD,SSE5 
+       VPSHLW           xmmreg,xmmreg*,xmmrm128  AMD,SSE5
+
+B.1.32 Systematic names for the hinting nop instructions
+
+       HINT_NOP0        rm16                     P6,UNDOC 
+       HINT_NOP0        rm32                     P6,UNDOC 
+       HINT_NOP0        rm64                     X64,UNDOC 
+       HINT_NOP1        rm16                     P6,UNDOC 
+       HINT_NOP1        rm32                     P6,UNDOC 
+       HINT_NOP1        rm64                     X64,UNDOC 
+       HINT_NOP2        rm16                     P6,UNDOC 
+       HINT_NOP2        rm32                     P6,UNDOC 
+       HINT_NOP2        rm64                     X64,UNDOC 
+       HINT_NOP3        rm16                     P6,UNDOC 
+       HINT_NOP3        rm32                     P6,UNDOC 
+       HINT_NOP3        rm64                     X64,UNDOC 
+       HINT_NOP4        rm16                     P6,UNDOC 
+       HINT_NOP4        rm32                     P6,UNDOC 
+       HINT_NOP4        rm64                     X64,UNDOC 
+       HINT_NOP5        rm16                     P6,UNDOC 
+       HINT_NOP5        rm32                     P6,UNDOC 
+       HINT_NOP5        rm64                     X64,UNDOC 
+       HINT_NOP6        rm16                     P6,UNDOC 
+       HINT_NOP6        rm32                     P6,UNDOC 
+       HINT_NOP6        rm64                     X64,UNDOC 
+       HINT_NOP7        rm16                     P6,UNDOC 
+       HINT_NOP7        rm32                     P6,UNDOC 
+       HINT_NOP7        rm64                     X64,UNDOC 
+       HINT_NOP8        rm16                     P6,UNDOC 
+       HINT_NOP8        rm32                     P6,UNDOC 
+       HINT_NOP8        rm64                     X64,UNDOC 
+       HINT_NOP9        rm16                     P6,UNDOC 
+       HINT_NOP9        rm32                     P6,UNDOC 
+       HINT_NOP9        rm64                     X64,UNDOC 
+       HINT_NOP10       rm16                     P6,UNDOC 
+       HINT_NOP10       rm32                     P6,UNDOC 
+       HINT_NOP10       rm64                     X64,UNDOC 
+       HINT_NOP11       rm16                     P6,UNDOC 
+       HINT_NOP11       rm32                     P6,UNDOC 
+       HINT_NOP11       rm64                     X64,UNDOC 
+       HINT_NOP12       rm16                     P6,UNDOC 
+       HINT_NOP12       rm32                     P6,UNDOC 
+       HINT_NOP12       rm64                     X64,UNDOC 
+       HINT_NOP13       rm16                     P6,UNDOC 
+       HINT_NOP13       rm32                     P6,UNDOC 
+       HINT_NOP13       rm64                     X64,UNDOC 
+       HINT_NOP14       rm16                     P6,UNDOC 
+       HINT_NOP14       rm32                     P6,UNDOC 
+       HINT_NOP14       rm64                     X64,UNDOC 
+       HINT_NOP15       rm16                     P6,UNDOC 
+       HINT_NOP15       rm32                     P6,UNDOC 
+       HINT_NOP15       rm64                     X64,UNDOC 
+       HINT_NOP16       rm16                     P6,UNDOC 
+       HINT_NOP16       rm32                     P6,UNDOC 
+       HINT_NOP16       rm64                     X64,UNDOC 
+       HINT_NOP17       rm16                     P6,UNDOC 
+       HINT_NOP17       rm32                     P6,UNDOC 
+       HINT_NOP17       rm64                     X64,UNDOC 
+       HINT_NOP18       rm16                     P6,UNDOC 
+       HINT_NOP18       rm32                     P6,UNDOC 
+       HINT_NOP18       rm64                     X64,UNDOC 
+       HINT_NOP19       rm16                     P6,UNDOC 
+       HINT_NOP19       rm32                     P6,UNDOC 
+       HINT_NOP19       rm64                     X64,UNDOC 
+       HINT_NOP20       rm16                     P6,UNDOC 
+       HINT_NOP20       rm32                     P6,UNDOC 
+       HINT_NOP20       rm64                     X64,UNDOC 
+       HINT_NOP21       rm16                     P6,UNDOC 
+       HINT_NOP21       rm32                     P6,UNDOC 
+       HINT_NOP21       rm64                     X64,UNDOC 
+       HINT_NOP22       rm16                     P6,UNDOC 
+       HINT_NOP22       rm32                     P6,UNDOC 
+       HINT_NOP22       rm64                     X64,UNDOC 
+       HINT_NOP23       rm16                     P6,UNDOC 
+       HINT_NOP23       rm32                     P6,UNDOC 
+       HINT_NOP23       rm64                     X64,UNDOC 
+       HINT_NOP24       rm16                     P6,UNDOC 
+       HINT_NOP24       rm32                     P6,UNDOC 
+       HINT_NOP24       rm64                     X64,UNDOC 
+       HINT_NOP25       rm16                     P6,UNDOC 
+       HINT_NOP25       rm32                     P6,UNDOC 
+       HINT_NOP25       rm64                     X64,UNDOC 
+       HINT_NOP26       rm16                     P6,UNDOC 
+       HINT_NOP26       rm32                     P6,UNDOC 
+       HINT_NOP26       rm64                     X64,UNDOC 
+       HINT_NOP27       rm16                     P6,UNDOC 
+       HINT_NOP27       rm32                     P6,UNDOC 
+       HINT_NOP27       rm64                     X64,UNDOC 
+       HINT_NOP28       rm16                     P6,UNDOC 
+       HINT_NOP28       rm32                     P6,UNDOC 
+       HINT_NOP28       rm64                     X64,UNDOC 
+       HINT_NOP29       rm16                     P6,UNDOC 
+       HINT_NOP29       rm32                     P6,UNDOC 
+       HINT_NOP29       rm64                     X64,UNDOC 
+       HINT_NOP30       rm16                     P6,UNDOC 
+       HINT_NOP30       rm32                     P6,UNDOC 
+       HINT_NOP30       rm64                     X64,UNDOC 
+       HINT_NOP31       rm16                     P6,UNDOC 
+       HINT_NOP31       rm32                     P6,UNDOC 
+       HINT_NOP31       rm64                     X64,UNDOC 
+       HINT_NOP32       rm16                     P6,UNDOC 
+       HINT_NOP32       rm32                     P6,UNDOC 
+       HINT_NOP32       rm64                     X64,UNDOC 
+       HINT_NOP33       rm16                     P6,UNDOC 
+       HINT_NOP33       rm32                     P6,UNDOC 
+       HINT_NOP33       rm64                     X64,UNDOC 
+       HINT_NOP34       rm16                     P6,UNDOC 
+       HINT_NOP34       rm32                     P6,UNDOC 
+       HINT_NOP34       rm64                     X64,UNDOC 
+       HINT_NOP35       rm16                     P6,UNDOC 
+       HINT_NOP35       rm32                     P6,UNDOC 
+       HINT_NOP35       rm64                     X64,UNDOC 
+       HINT_NOP36       rm16                     P6,UNDOC 
+       HINT_NOP36       rm32                     P6,UNDOC 
+       HINT_NOP36       rm64                     X64,UNDOC 
+       HINT_NOP37       rm16                     P6,UNDOC 
+       HINT_NOP37       rm32                     P6,UNDOC 
+       HINT_NOP37       rm64                     X64,UNDOC 
+       HINT_NOP38       rm16                     P6,UNDOC 
+       HINT_NOP38       rm32                     P6,UNDOC 
+       HINT_NOP38       rm64                     X64,UNDOC 
+       HINT_NOP39       rm16                     P6,UNDOC 
+       HINT_NOP39       rm32                     P6,UNDOC 
+       HINT_NOP39       rm64                     X64,UNDOC 
+       HINT_NOP40       rm16                     P6,UNDOC 
+       HINT_NOP40       rm32                     P6,UNDOC 
+       HINT_NOP40       rm64                     X64,UNDOC 
+       HINT_NOP41       rm16                     P6,UNDOC 
+       HINT_NOP41       rm32                     P6,UNDOC 
+       HINT_NOP41       rm64                     X64,UNDOC 
+       HINT_NOP42       rm16                     P6,UNDOC 
+       HINT_NOP42       rm32                     P6,UNDOC 
+       HINT_NOP42       rm64                     X64,UNDOC 
+       HINT_NOP43       rm16                     P6,UNDOC 
+       HINT_NOP43       rm32                     P6,UNDOC 
+       HINT_NOP43       rm64                     X64,UNDOC 
+       HINT_NOP44       rm16                     P6,UNDOC 
+       HINT_NOP44       rm32                     P6,UNDOC 
+       HINT_NOP44       rm64                     X64,UNDOC 
+       HINT_NOP45       rm16                     P6,UNDOC 
+       HINT_NOP45       rm32                     P6,UNDOC 
+       HINT_NOP45       rm64                     X64,UNDOC 
+       HINT_NOP46       rm16                     P6,UNDOC 
+       HINT_NOP46       rm32                     P6,UNDOC 
+       HINT_NOP46       rm64                     X64,UNDOC 
+       HINT_NOP47       rm16                     P6,UNDOC 
+       HINT_NOP47       rm32                     P6,UNDOC 
+       HINT_NOP47       rm64                     X64,UNDOC 
+       HINT_NOP48       rm16                     P6,UNDOC 
+       HINT_NOP48       rm32                     P6,UNDOC 
+       HINT_NOP48       rm64                     X64,UNDOC 
+       HINT_NOP49       rm16                     P6,UNDOC 
+       HINT_NOP49       rm32                     P6,UNDOC 
+       HINT_NOP49       rm64                     X64,UNDOC 
+       HINT_NOP50       rm16                     P6,UNDOC 
+       HINT_NOP50       rm32                     P6,UNDOC 
+       HINT_NOP50       rm64                     X64,UNDOC 
+       HINT_NOP51       rm16                     P6,UNDOC 
+       HINT_NOP51       rm32                     P6,UNDOC 
+       HINT_NOP51       rm64                     X64,UNDOC 
+       HINT_NOP52       rm16                     P6,UNDOC 
+       HINT_NOP52       rm32                     P6,UNDOC 
+       HINT_NOP52       rm64                     X64,UNDOC 
+       HINT_NOP53       rm16                     P6,UNDOC 
+       HINT_NOP53       rm32                     P6,UNDOC 
+       HINT_NOP53       rm64                     X64,UNDOC 
+       HINT_NOP54       rm16                     P6,UNDOC 
+       HINT_NOP54       rm32                     P6,UNDOC 
+       HINT_NOP54       rm64                     X64,UNDOC 
+       HINT_NOP55       rm16                     P6,UNDOC 
+       HINT_NOP55       rm32                     P6,UNDOC 
+       HINT_NOP55       rm64                     X64,UNDOC 
+       HINT_NOP56       rm16                     P6,UNDOC 
+       HINT_NOP56       rm32                     P6,UNDOC 
+       HINT_NOP56       rm64                     X64,UNDOC 
+       HINT_NOP57       rm16                     P6,UNDOC 
+       HINT_NOP57       rm32                     P6,UNDOC 
+       HINT_NOP57       rm64                     X64,UNDOC 
+       HINT_NOP58       rm16                     P6,UNDOC 
+       HINT_NOP58       rm32                     P6,UNDOC 
+       HINT_NOP58       rm64                     X64,UNDOC 
+       HINT_NOP59       rm16                     P6,UNDOC 
+       HINT_NOP59       rm32                     P6,UNDOC 
+       HINT_NOP59       rm64                     X64,UNDOC 
+       HINT_NOP60       rm16                     P6,UNDOC 
+       HINT_NOP60       rm32                     P6,UNDOC 
+       HINT_NOP60       rm64                     X64,UNDOC 
+       HINT_NOP61       rm16                     P6,UNDOC 
+       HINT_NOP61       rm32                     P6,UNDOC 
+       HINT_NOP61       rm64                     X64,UNDOC 
+       HINT_NOP62       rm16                     P6,UNDOC 
+       HINT_NOP62       rm32                     P6,UNDOC 
+       HINT_NOP62       rm64                     X64,UNDOC 
+       HINT_NOP63       rm16                     P6,UNDOC 
+       HINT_NOP63       rm32                     P6,UNDOC 
+       HINT_NOP63       rm64                     X64,UNDOC
+
+Appendix C: NASM Version History
+--------------------------------
+
+   C.1 NASM 2 Series
+
+       The NASM 2 series support x86-64, and is the production version of
+       NASM since 2007.
+
+ C.1.1 Version 2.08
+
+       (*) A number of enhancements/fixes in macros area.
+
+       (*) Support for arbitrarily terminating macro expansions
+           `%exitmacro'. See section 4.3.12.
+
+       (*) Support for recursive macro expansion `%rmacro/irmacro'. See
+           section 4.3.1.
+
+       (*) Support for converting strings to tokens. See section 4.1.9.
+
+       (*) Fuzzy operand size logic introduced.
+
+       (*) Fix COFF stack overrun on too long export identifiers.
+
+       (*) Fix Macho-O alignment bug.
+
+       (*) Fix crashes with -fwin32 on file with many exports.
+
+       (*) Fix stack overrun for too long [DEBUG id].
+
+       (*) Fix incorrect sbyte usage in IMUL (hit only if optimization flag
+           passed).
+
+       (*) Append ending token for `.stabs' records in the ELF output
+           format.
+
+       (*) New NSIS script which uses ModernUI and MultiUser approach.
+
+       (*) Visual Studio 2008 NASM integration (rules file).
+
+       (*) Warn a user if a constant is too long (and as result will be
+           stripped).
+
+       (*) The obsoleted pre-XOP AMD SSE5 instruction set which was never
+           actualized was removed.
+
+       (*) Fix stack overrun on too long error file name passed from the
+           command line.
+
+       (*) Bind symbols to the .text section by default (ie in case if
+           SECTION directive was omitted) in the ELF output format.
+
+       (*) Fix sync points array index wrapping.
+
+       (*) A few fixes for FMA4 and XOP instruction templates.
+
+       (*) Add AMD Lightweight Profiling (LWP) instructions.
+
+ C.1.2 Version 2.07
+
+       (*) NASM is now under the 2-clause BSD license. See section 1.1.2.
+
+       (*) Fix the section type for the `.strtab' section in the `elf64'
+           output format.
+
+       (*) Fix the handling of `COMMON' directives in the `obj' output
+           format.
+
+       (*) New `ith' and `srec' output formats; these are variants of the
+           `bin' output format which output Intel hex and Motorola S-
+           records, respectively. See section 7.2 and section 7.3.
+
+       (*) `rdf2ihx' replaced with an enhanced `rdf2bin', which can output
+           binary, COM, Intel hex or Motorola S-records.
+
+       (*) The Windows installer now puts the NASM directory first in the
+           `PATH' of the "NASM Shell".
+
+       (*) Revert the early expansion behavior of `%+' to pre-2.06
+           behavior: `%+' is only expanded late.
+
+       (*) Yet another Mach-O alignment fix.
+
+       (*) Don't delete the list file on errors. Also, include error and
+           warning information in the list file.
+
+       (*) Support for 64-bit Mach-O output, see section 7.8.
+
+       (*) Fix assert failure on certain operations that involve strings
+           with high-bit bytes.
+
+ C.1.3 Version 2.06
+
+       (*) This release is dedicated to the memory of Charles A. Crayne,
+           long time NASM developer as well as moderator of
+           `comp.lang.asm.x86' and author of the book _Serious Assembler_.
+           We miss you, Chuck.
+
+       (*) Support for indirect macro expansion (`%[...]'). See section
+           4.1.3.
+
+       (*) `%pop' can now take an argument, see section 4.7.1.
+
+       (*) The argument to `%use' is no longer macro-expanded. Use `%[...]'
+           if macro expansion is desired.
+
+       (*) Support for thread-local storage in ELF32 and ELF64. See section
+           7.9.4.
+
+       (*) Fix crash on `%ifmacro' without an argument.
+
+       (*) Correct the arguments to the `POPCNT' instruction.
+
+       (*) Fix section alignment in the Mach-O format.
+
+       (*) Update AVX support to version 5 of the Intel specification.
+
+       (*) Fix the handling of accesses to context-local macros from higher
+           levels in the context stack.
+
+       (*) Treat `WAIT' as a prefix rather than as an instruction, thereby
+           allowing constructs like `O16 FSAVE' to work correctly.
+
+       (*) Support for structures with a non-zero base offset. See section
+           4.11.10.
+
+       (*) Correctly handle preprocessor token concatenation (see section
+           4.3.8) involving floating-point numbers.
+
+       (*) The `PINSR' series of instructions have been corrected and
+           rationalized.
+
+       (*) Removed AMD SSE5, replaced with the new XOP/FMA4/CVT16 (rev
+           3.03) spec.
+
+       (*) The ELF backends no longer automatically generate a `.comment'
+           section.
+
+       (*) Add additional "well-known" ELF sections with default
+           attributes. See section 7.9.2.
+
+ C.1.4 Version 2.05.01
+
+       (*) Fix the `-w'/`-W' option parsing, which was broken in NASM 2.05.
+
+ C.1.5 Version 2.05
+
+       (*) Fix redundant REX.W prefix on `JMP reg64'.
+
+       (*) Make the behaviour of `-O0' match NASM 0.98 legacy behavior. See
+           section 2.1.22.
+
+       (*) `-w-user' can be used to suppress the output of `%warning'
+           directives. See section 2.1.24.
+
+       (*) Fix bug where `ALIGN' would issue a full alignment datum instead
+           of zero bytes.
+
+       (*) Fix offsets in list files.
+
+       (*) Fix `%include' inside multi-line macros or loops.
+
+       (*) Fix error where NASM would generate a spurious warning on valid
+           optimizations of immediate values.
+
+       (*) Fix arguments to a number of the `CVT' SSE instructions.
+
+       (*) Fix RIP-relative offsets when the instruction carries an
+           immediate.
+
+       (*) Massive overhaul of the ELF64 backend for spec compliance.
+
+       (*) Fix the Geode `PFRCPV' and `PFRSQRTV' instruction.
+
+       (*) Fix the SSE 4.2 `CRC32' instruction.
+
+ C.1.6 Version 2.04
+
+       (*) Sanitize macro handing in the `%error' directive.
+
+       (*) New `%warning' directive to issue user-controlled warnings.
+
+       (*) `%error' directives are now deferred to the final assembly
+           phase.
+
+       (*) New `%fatal' directive to immediately terminate assembly.
+
+       (*) New `%strcat' directive to join quoted strings together.
+
+       (*) New `%use' macro directive to support standard macro directives.
+           See section 4.6.4.
+
+       (*) Excess default parameters to `%macro' now issues a warning by
+           default. See section 4.3.
+
+       (*) Fix `%ifn' and `%elifn'.
+
+       (*) Fix nested `%else' clauses.
+
+       (*) Correct the handling of nested `%rep's.
+
+       (*) New `%unmacro' directive to undeclare a multi-line macro. See
+           section 4.3.11.
+
+       (*) Builtin macro `__PASS__' which expands to the current assembly
+           pass. See section 4.11.9.
+
+       (*) `__utf16__' and `__utf32__' operators to generate UTF-16 and
+           UTF-32 strings. See section 3.4.5.
+
+       (*) Fix bug in case-insensitive matching when compiled on platforms
+           that don't use the `configure' script. Of the official release
+           binaries, that only affected the OS/2 binary.
+
+       (*) Support for x87 packed BCD constants. See section 3.4.7.
+
+       (*) Correct the `LTR' and `SLDT' instructions in 64-bit mode.
+
+       (*) Fix unnecessary REX.W prefix on indirect jumps in 64-bit mode.
+
+       (*) Add AVX versions of the AES instructions (`VAES'...).
+
+       (*) Fix the 256-bit FMA instructions.
+
+       (*) Add 256-bit AVX stores per the latest AVX spec.
+
+       (*) VIA XCRYPT instructions can now be written either with or
+           without `REP', apparently different versions of the VIA spec
+           wrote them differently.
+
+       (*) Add missing 64-bit `MOVNTI' instruction.
+
+       (*) Fix the operand size of `VMREAD' and `VMWRITE'.
+
+       (*) Numerous bug fixes, especially to the AES, AVX and VTX
+           instructions.
+
+       (*) The optimizer now always runs until it converges. It also runs
+           even when disabled, but doesn't optimize. This allows most
+           forward references to be resolved properly.
+
+       (*) `%push' no longer needs a context identifier; omitting the
+           context identifier results in an anonymous context.
+
+ C.1.7 Version 2.03.01
+
+       (*) Fix buffer overflow in the listing module.
+
+       (*) Fix the handling of hexadecimal escape codes in `...` strings.
+
+       (*) The Postscript/PDF documentation has been reformatted.
+
+       (*) The `-F' option now implies `-g'.
+
+ C.1.8 Version 2.03
+
+       (*) Add support for Intel AVX, CLMUL and FMA instructions, including
+           YMM registers.
+
+       (*) `dy', `resy' and `yword' for 32-byte operands.
+
+       (*) Fix some SSE5 instructions.
+
+       (*) Intel `INVEPT', `INVVPID' and `MOVBE' instructions.
+
+       (*) Fix checking for critical expressions when the optimizer is
+           enabled.
+
+       (*) Support the DWARF debugging format for ELF targets.
+
+       (*) Fix optimizations of signed bytes.
+
+       (*) Fix operation on bigendian machines.
+
+       (*) Fix buffer overflow in the preprocessor.
+
+       (*) `SAFESEH' support for Win32, `IMAGEREL' for Win64 (SEH).
+
+       (*) `%?' and `%??' to refer to the name of a macro itself. In
+           particular, `%idefine keyword $%?' can be used to make a keyword
+           "disappear".
+
+       (*) New options for dependency generation: `-MD', `-MF', `-MP',
+           `-MT', `-MQ'.
+
+       (*) New preprocessor directives `%pathsearch' and `%depend'; INCBIN
+           reimplemented as a macro.
+
+       (*) `%include' now resolves macros in a sane manner.
+
+       (*) `%substr' can now be used to get other than one-character
+           substrings.
+
+       (*) New type of character/string constants, using backquotes
+           (``...`'), which support C-style escape sequences.
+
+       (*) `%defstr' and `%idefstr' to stringize macro definitions before
+           creation.
+
+       (*) Fix forward references used in `EQU' statements.
+
+ C.1.9 Version 2.02
+
+       (*) Additional fixes for MMX operands with explicit `qword', as well
+           as (hopefully) SSE operands with `oword'.
+
+       (*) Fix handling of truncated strings with `DO'.
+
+       (*) Fix segfaults due to memory overwrites when floating-point
+           constants were used.
+
+       (*) Fix segfaults due to missing include files.
+
+       (*) Fix OpenWatcom Makefiles for DOS and OS/2.
+
+       (*) Add autogenerated instruction list back into the documentation.
+
+       (*) ELF: Fix segfault when generating stabs, and no symbols have
+           been defined.
+
+       (*) ELF: Experimental support for DWARF debugging information.
+
+       (*) New compile date and time standard macros.
+
+       (*) `%ifnum' now returns true for negative numbers.
+
+       (*) New `%iftoken' test for a single token.
+
+       (*) New `%ifempty' test for empty expansion.
+
+       (*) Add support for the `XSAVE' instruction group.
+
+       (*) Makefile for Netware/gcc.
+
+       (*) Fix issue with some warnings getting emitted way too many times.
+
+       (*) Autogenerated instruction list added to the documentation.
+
+C.1.10 Version 2.01
+
+       (*) Fix the handling of MMX registers with explicit `qword' tags on
+           memory (broken in 2.00 due to 64-bit changes.)
+
+       (*) Fix the PREFETCH instructions.
+
+       (*) Fix the documentation.
+
+       (*) Fix debugging info when using `-f elf' (backwards compatibility
+           alias for `-f elf32').
+
+       (*) Man pages for rdoff tools (from the Debian project.)
+
+       (*) ELF: handle large numbers of sections.
+
+       (*) Fix corrupt output when the optimizer runs out of passes.
+
+C.1.11 Version 2.00
+
+       (*) Added c99 data-type compliance.
+
+       (*) Added general x86-64 support.
+
+       (*) Added win64 (x86-64 COFF) output format.
+
+       (*) Added `__BITS__' standard macro.
+
+       (*) Renamed the `elf' output format to `elf32' for clarity.
+
+       (*) Added `elf64' and `macho' (MacOS X) output formats.
+
+       (*) Added Numeric constants in `dq' directive.
+
+       (*) Added `oword', `do' and `reso' pseudo operands.
+
+       (*) Allow underscores in numbers.
+
+       (*) Added 8-, 16- and 128-bit floating-point formats.
+
+       (*) Added binary, octal and hexadecimal floating-point.
+
+       (*) Correct the generation of floating-point constants.
+
+       (*) Added floating-point option control.
+
+       (*) Added Infinity and NaN floating point support.
+
+       (*) Added ELF Symbol Visibility support.
+
+       (*) Added setting OSABI value in ELF header directive.
+
+       (*) Added Generate Makefile Dependencies option.
+
+       (*) Added Unlimited Optimization Passes option.
+
+       (*) Added `%IFN' and `%ELIFN' support.
+
+       (*) Added Logical Negation Operator.
+
+       (*) Enhanced Stack Relative Preprocessor Directives.
+
+       (*) Enhanced ELF Debug Formats.
+
+       (*) Enhanced Send Errors to a File option.
+
+       (*) Added SSSE3, SSE4.1, SSE4.2, SSE5 support.
+
+       (*) Added a large number of additional instructions.
+
+       (*) Significant performance improvements.
+
+       (*) `-w+warning' and `-w-warning' can now be written as -Wwarning
+           and -Wno-warning, respectively. See section 2.1.24.
+
+       (*) Add `-w+error' to treat warnings as errors. See section 2.1.24.
+
+       (*) Add `-w+all' and `-w-all' to enable or disable all suppressible
+           warnings. See section 2.1.24.
+
+   C.2 NASM 0.98 Series
+
+       The 0.98 series was the production versions of NASM from 1999 to
+       2007.
+
+ C.2.1 Version 0.98.39
+
+       (*) fix buffer overflow
+
+       (*) fix outas86's `.bss' handling
+
+       (*) "make spotless" no longer deletes config.h.in.
+
+       (*) `%(el)if(n)idn' insensitivity to string quotes difference
+           (#809300).
+
+       (*) (nasm.c)`__OUTPUT_FORMAT__' changed to string value instead of
+           symbol.
+
+ C.2.2 Version 0.98.38
+
+       (*) Add Makefile for 16-bit DOS binaries under OpenWatcom, and
+           modify `mkdep.pl' to be able to generate completely pathless
+           dependencies, as required by OpenWatcom wmake (it supports path
+           searches, but not explicit paths.)
+
+       (*) Fix the `STR' instruction.
+
+       (*) Fix the ELF output format, which was broken under certain
+           circumstances due to the addition of stabs support.
+
+       (*) Quick-fix Borland format debug-info for `-f obj'
+
+       (*) Fix for `%rep' with no arguments (#560568)
+
+       (*) Fix concatenation of preprocessor function call (#794686)
+
+       (*) Fix long label causes coredump (#677841)
+
+       (*) Use autoheader as well as autoconf to keep configure from
+           generating ridiculously long command lines.
+
+       (*) Make sure that all of the formats which support debugging output
+           actually will suppress debugging output when `-g' not specified.
+
+ C.2.3 Version 0.98.37
+
+       (*) Paths given in `-I' switch searched for `incbin'-ed as well as
+           `%include'-ed files.
+
+       (*) Added stabs debugging for the ELF output format, patch from
+           Martin Wawro.
+
+       (*) Fix `output/outbin.c' to allow origin > 80000000h.
+
+       (*) Make `-U' switch work.
+
+       (*) Fix the use of relative offsets with explicit prefixes, e.g.
+           `a32 loop foo'.
+
+       (*) Remove `backslash()'.
+
+       (*) Fix the `SMSW' and `SLDT' instructions.
+
+       (*) `-O2' and `-O3' are no longer aliases for `-O10' and `-O15'. If
+           you mean the latter, please say so! :)
+
+ C.2.4 Version 0.98.36
+
+       (*) Update rdoff - librarian/archiver - common rec - docs!
+
+       (*) Fix signed/unsigned problems.
+
+       (*) Fix `JMP FAR label' and `CALL FAR label'.
+
+       (*) Add new multisection support - map files - fix align bug
+
+       (*) Fix sysexit, movhps/movlps reg,reg bugs in insns.dat
+
+       (*) `Q' or `O' suffixes indicate octal
+
+       (*) Support Prescott new instructions (PNI).
+
+       (*) Cyrix `XSTORE' instruction.
+
+ C.2.5 Version 0.98.35
+
+       (*) Fix build failure on 16-bit DOS (Makefile.bc3 workaround for
+           compiler bug.)
+
+       (*) Fix dependencies and compiler warnings.
+
+       (*) Add "const" in a number of places.
+
+       (*) Add -X option to specify error reporting format (use -Xvc to
+           integrate with Microsoft Visual Studio.)
+
+       (*) Minor changes for code legibility.
+
+       (*) Drop use of tmpnam() in rdoff (security fix.)
+
+ C.2.6 Version 0.98.34
+
+       (*) Correct additional address-size vs. operand-size confusions.
+
+       (*) Generate dependencies for all Makefiles automatically.
+
+       (*) Add support for unimplemented (but theoretically available)
+           registers such as tr0 and cr5. Segment registers 6 and 7 are
+           called segr6 and segr7 for the operations which they can be
+           represented.
+
+       (*) Correct some disassembler bugs related to redundant address-size
+           prefixes. Some work still remains in this area.
+
+       (*) Correctly generate an error for things like "SEG eax".
+
+       (*) Add the JMPE instruction, enabled by "CPU IA64".
+
+       (*) Correct compilation on newer gcc/glibc platforms.
+
+       (*) Issue an error on things like "jmp far eax".
+
+ C.2.7 Version 0.98.33
+
+       (*) New __NASM_PATCHLEVEL__ and __NASM_VERSION_ID__ standard macros
+           to round out the version-query macros. version.pl now
+           understands X.YYplWW or X.YY.ZZplWW as a version number,
+           equivalent to X.YY.ZZ.WW (or X.YY.0.WW, as appropriate).
+
+       (*) New keyword "strict" to disable the optimization of specific
+           operands.
+
+       (*) Fix the handing of size overrides with JMP instructions
+           (instructions such as "jmp dword foo".)
+
+       (*) Fix the handling of "ABSOLUTE label", where "label" points into
+           a relocatable segment.
+
+       (*) Fix OBJ output format with lots of externs.
+
+       (*) More documentation updates.
+
+       (*) Add -Ov option to get verbose information about optimizations.
+
+       (*) Undo a braindead change which broke `%elif' directives.
+
+       (*) Makefile updates.
+
+ C.2.8 Version 0.98.32
+
+       (*) Fix NASM crashing when `%macro' directives were left
+           unterminated.
+
+       (*) Lots of documentation updates.
+
+       (*) Complete rewrite of the PostScript/PDF documentation generator.
+
+       (*) The MS Visual C++ Makefile was updated and corrected.
+
+       (*) Recognize .rodata as a standard section name in ELF.
+
+       (*) Fix some obsolete Perl4-isms in Perl scripts.
+
+       (*) Fix configure.in to work with autoconf 2.5x.
+
+       (*) Fix a couple of "make cleaner" misses.
+
+       (*) Make the normal "./configure && make" work with Cygwin.
+
+ C.2.9 Version 0.98.31
+
+       (*) Correctly build in a separate object directory again.
+
+       (*) Derive all references to the version number from the version
+           file.
+
+       (*) New standard macros __NASM_SUBMINOR__ and __NASM_VER__ macros.
+
+       (*) Lots of Makefile updates and bug fixes.
+
+       (*) New `%ifmacro' directive to test for multiline macros.
+
+       (*) Documentation updates.
+
+       (*) Fixes for 16-bit OBJ format output.
+
+       (*) Changed the NASM environment variable to NASMENV.
+
+C.2.10 Version 0.98.30
+
+       (*) Changed doc files a lot: completely removed old READMExx and
+           Wishlist files, incorporating all information in CHANGES and
+           TODO.
+
+       (*) I waited a long time to rename zoutieee.c to (original)
+           outieee.c
+
+       (*) moved all output modules to output/ subdirectory.
+
+       (*) Added 'make strip' target to strip debug info from nasm &
+           ndisasm.
+
+       (*) Added INSTALL file with installation instructions.
+
+       (*) Added -v option description to nasm man.
+
+       (*) Added dist makefile target to produce source distributions.
+
+       (*) 16-bit support for ELF output format (GNU extension, but
+           useful.)
+
+C.2.11 Version 0.98.28
+
+       (*) Fastcooked this for Debian's Woody release: Frank applied the
+           INCBIN bug patch to 0.98.25alt and called it 0.98.28 to not
+           confuse poor little apt-get.
+
+C.2.12 Version 0.98.26
+
+       (*) Reorganised files even better from 0.98.25alt
+
+C.2.13 Version 0.98.25alt
+
+       (*) Prettified the source tree. Moved files to more reasonable
+           places.
+
+       (*) Added findleak.pl script to misc/ directory.
+
+       (*) Attempted to fix doc.
+
+C.2.14 Version 0.98.25
+
+       (*) Line continuation character `\'.
+
+       (*) Docs inadvertantly reverted - "dos packaging".
+
+C.2.15 Version 0.98.24p1
+
+       (*) FIXME: Someone, document this please.
+
+C.2.16 Version 0.98.24
+
+       (*) Documentation - Ndisasm doc added to Nasm.doc.
+
+C.2.17 Version 0.98.23
+
+       (*) Attempted to remove rdoff version1
+
+       (*) Lino Mastrodomenico's patches to preproc.c (%$$ bug?).
+
+C.2.18 Version 0.98.22
+
+       (*) Update rdoff2 - attempt to remove v1.
+
+C.2.19 Version 0.98.21
+
+       (*) Optimization fixes.
+
+C.2.20 Version 0.98.20
+
+       (*) Optimization fixes.
+
+C.2.21 Version 0.98.19
+
+       (*) H. J. Lu's patch back out.
+
+C.2.22 Version 0.98.18
+
+       (*) Added ".rdata" to "-f win32".
+
+C.2.23 Version 0.98.17
+
+       (*) H. J. Lu's "bogus elf" patch. (Red Hat problem?)
+
+C.2.24 Version 0.98.16
+
+       (*) Fix whitespace before "[section ..." bug.
+
+C.2.25 Version 0.98.15
+
+       (*) Rdoff changes (?).
+
+       (*) Fix fixes to memory leaks.
+
+C.2.26 Version 0.98.14
+
+       (*) Fix memory leaks.
+
+C.2.27 Version 0.98.13
+
+       (*) There was no 0.98.13
+
+C.2.28 Version 0.98.12
+
+       (*) Update optimization (new function of "-O1")
+
+       (*) Changes to test/bintest.asm (?).
+
+C.2.29 Version 0.98.11
+
+       (*) Optimization changes.
+
+       (*) Ndisasm fixed.
+
+C.2.30 Version 0.98.10
+
+       (*) There was no 0.98.10
+
+C.2.31 Version 0.98.09
+
+       (*) Add multiple sections support to "-f bin".
+
+       (*) Changed GLOBAL_TEMP_BASE in outelf.c from 6 to 15.
+
+       (*) Add "-v" as an alias to the "-r" switch.
+
+       (*) Remove "#ifdef" from Tasm compatibility options.
+
+       (*) Remove redundant size-overrides on "mov ds, ex", etc.
+
+       (*) Fixes to SSE2, other insns.dat (?).
+
+       (*) Enable uppercase "I" and "P" switches.
+
+       (*) Case insinsitive "seg" and "wrt".
+
+       (*) Update install.sh (?).
+
+       (*) Allocate tokens in blocks.
+
+       (*) Improve "invalid effective address" messages.
+
+C.2.32 Version 0.98.08
+
+       (*) Add "`%strlen'" and "`%substr'" macro operators
+
+       (*) Fixed broken c16.mac.
+
+       (*) Unterminated string error reported.
+
+       (*) Fixed bugs as per 0.98bf
+
+C.2.33 Version 0.98.09b with John Coffman patches released 28-Oct-2001
+
+       Changes from 0.98.07 release to 98.09b as of 28-Oct-2001
+
+       (*) More closely compatible with 0.98 when -O0 is implied or
+           specified. Not strictly identical, since backward branches in
+           range of short offsets are recognized, and signed byte values
+           with no explicit size specification will be assembled as a
+           single byte.
+
+       (*) More forgiving with the PUSH instruction. 0.98 requires a size
+           to be specified always. 0.98.09b will imply the size from the
+           current BITS setting (16 or 32).
+
+       (*) Changed definition of the optimization flag:
+
+       -O0 strict two-pass assembly, JMP and Jcc are handled more like
+       0.98, except that back- ward JMPs are short, if possible.
+
+       -O1 strict two-pass assembly, but forward branches are assembled
+       with code guaranteed to reach; may produce larger code than -O0, but
+       will produce successful assembly more often if branch offset sizes
+       are not specified.
+
+       -O2 multi-pass optimization, minimize branch offsets; also will
+       minimize signed immed- iate bytes, overriding size specification.
+
+       -O3 like -O2, but more passes taken, if needed
+
+C.2.34 Version 0.98.07 released 01/28/01
+
+       (*) Added Stepane Denis' SSE2 instructions to a *working* version of
+           the code - some earlier versions were based on broken code -
+           sorry 'bout that. version "0.98.07"
+
+       01/28/01
+
+       (*) Cosmetic modifications to nasm.c, nasm.h, AUTHORS, MODIFIED
+
+C.2.35 Version 0.98.06f released 01/18/01
+
+       (*) - Add "metalbrain"s jecxz bug fix in insns.dat - alter
+           nasmdoc.src to match - version "0.98.06f"
+
+C.2.36 Version 0.98.06e released 01/09/01
+
+       (*) Removed the "outforms.h" file - it appears to be someone's old
+           backup of "outform.h". version "0.98.06e"
+
+       01/09/01
+
+       (*) fbk - finally added the fix for the "multiple %includes bug",
+           known since 7/27/99 - reported originally (?) and sent to us by
+           Austin Lunnen - he reports that John Fine had a fix within the
+           day. Here it is...
+
+       (*) Nelson Rush resigns from the group. Big thanks to Nelson for his
+           leadership and enthusiasm in getting these changes incorporated
+           into Nasm!
+
+       (*) fbk - [list +], [list -] directives - ineptly implemented,
+           should be re-written or removed, perhaps.
+
+       (*) Brian Raiter / fbk - "elfso bug" fix - applied to aoutb format
+           as well - testing might be desirable...
+
+       08/07/00
+
+       (*) James Seter - -postfix, -prefix command line switches.
+
+       (*) Yuri Zaporogets - rdoff utility changes.
+
+C.2.37 Version 0.98p1
+
+       (*) GAS-like palign (Panos Minos)
+
+       (*) FIXME: Someone, fill this in with details
+
+C.2.38 Version 0.98bf (bug-fixed)
+
+       (*) Fixed - elf and aoutb bug - shared libraries - multiple
+           "%include" bug in "-f obj" - jcxz, jecxz bug - unrecognized
+           option bug in ndisasm
+
+C.2.39 Version 0.98.03 with John Coffman's changes released 27-Jul-2000
+
+       (*) Added signed byte optimizations for the 0x81/0x83 class of
+           instructions: ADC, ADD, AND, CMP, OR, SBB, SUB, XOR: when used
+           as 'ADD reg16,imm' or 'ADD reg32,imm.' Also optimization of
+           signed byte form of 'PUSH imm' and 'IMUL reg,imm'/'IMUL
+           reg,reg,imm.' No size specification is needed.
+
+       (*) Added multi-pass JMP and Jcc offset optimization. Offsets on
+           forward references will preferentially use the short form,
+           without the need to code a specific size (short or near) for the
+           branch. Added instructions for 'Jcc label' to use the form
+           'Jnotcc $+3/JMP label', in cases where a short offset is out of
+           bounds. If compiling for a 386 or higher CPU, then the 386 form
+           of Jcc will be used instead.
+
+       This feature is controlled by a new command-line switch: "O", (upper
+       case letter O). "-O0" reverts the assembler to no extra optimization
+       passes, "-O1" allows up to 5 extra passes, and "-O2"(default),
+       allows up to 10 extra optimization passes.
+
+       (*) Added a new directive: 'cpu XXX', where XXX is any of: 8086,
+           186, 286, 386, 486, 586, pentium, 686, PPro, P2, P3 or Katmai.
+           All are case insensitive. All instructions will be selected only
+           if they apply to the selected cpu or lower. Corrected a couple
+           of bugs in cpu-dependence in 'insns.dat'.
+
+       (*) Added to 'standard.mac', the "use16" and "use32" forms of the
+           "bits 16/32" directive. This is nothing new, just conforms to a
+           lot of other assemblers. (minor)
+
+       (*) Changed label allocation from 320/32 (10000 labels @ 200K+) to
+           32/37 (1000 labels); makes running under DOS much easier. Since
+           additional label space is allocated dynamically, this should
+           have no effect on large programs with lots of labels. The 37 is
+           a prime, believed to be better for hashing. (minor)
+
+C.2.40 Version 0.98.03
+
+       "Integrated patchfile 0.98-0.98.01. I call this version 0.98.03 for
+       historical reasons: 0.98.02 was trashed." --John Coffman
+       <johninsd@san.rr.com>, 27-Jul-2000
+
+       (*) Kendall Bennett's SciTech MGL changes
+
+       (*) Note that you must define "TASM_COMPAT" at compile-time to get
+           the Tasm Ideal Mode compatibility.
+
+       (*) All changes can be compiled in and out using the TASM_COMPAT
+           macros, and when compiled without TASM_COMPAT defined we get the
+           exact same binary as the unmodified 0.98 sources.
+
+       (*) standard.mac, macros.c: Added macros to ignore TASM directives
+           before first include
+
+       (*) nasm.h: Added extern declaration for tasm_compatible_mode
+
+       (*) nasm.c: Added global variable tasm_compatible_mode
+
+       (*) Added command line switch for TASM compatible mode (-t)
+
+       (*) Changed version command line to reflect when compiled with TASM
+           additions
+
+       (*) Added response file processing to allow all arguments on a
+           single line (response file is @resp rather than -@resp for NASM
+           format).
+
+       (*) labels.c: Changes islocal() macro to support TASM style @@local
+           labels.
+
+       (*) Added islocalchar() macro to support TASM style @@local labels.
+
+       (*) parser.c: Added support for TASM style memory references (ie:
+           mov [DWORD eax],10 rather than the NASM style mov DWORD
+           [eax],10).
+
+       (*) preproc.c: Added new directives, `%arg', `%local', `%stacksize'
+           to directives table
+
+       (*) Added support for TASM style directives without a leading %
+           symbol.
+
+       (*) Integrated a block of changes from Andrew Zabolotny
+           <bit@eltech.ru>:
+
+       (*) A new keyword `%xdefine' and its case-insensitive counterpart
+           `%ixdefine'. They work almost the same way as `%define' and
+           `%idefine' but expand the definition immediately, not on the
+           invocation. Something like a cross between `%define' and
+           `%assign'. The "x" suffix stands for "eXpand", so "xdefine" can
+           be deciphered as "expand-and-define". Thus you can do things
+           like this:
+
+            %assign ofs     0 
+       
+            %macro  arg     1 
+                    %xdefine %1 dword [esp+ofs] 
+                    %assign ofs ofs+4 
+            %endmacro
+
+       (*) Changed the place where the expansion of %$name macros are
+           expanded. Now they are converted into ..@ctxnum.name form when
+           detokenizing, so there are no quirks as before when using %$name
+           arguments to macros, in macros etc. For example:
+
+            %macro  abc     1 
+                    %define %1 hello 
+            %endm 
+       
+            abc     %$here 
+            %$here
+
+       Now last line will be expanded into "hello" as expected. This also
+       allows for lots of goodies, a good example are extended "proc"
+       macros included in this archive.
+
+       (*) Added a check for "cstk" in smacro_defined() before calling
+           get_ctx() - this allows for things like:
+
+            %ifdef %$abc 
+            %endif
+
+       to work without warnings even in no context.
+
+       (*) Added a check for "cstk" in %if*ctx and %elif*ctx directives -
+           this allows to use `%ifctx' without excessive warnings. If there
+           is no active context, `%ifctx' goes through "false" branch.
+
+       (*) Removed "user error: " prefix with `%error' directive: it just
+           clobbers the output and has absolutely no functionality.
+           Besides, this allows to write macros that does not differ from
+           built-in functions in any way.
+
+       (*) Added expansion of string that is output by `%error' directive.
+           Now you can do things like:
+
+            %define hello(x) Hello, x! 
+       
+            %define %$name andy 
+            %error "hello(%$name)"
+
+       Same happened with `%include' directive.
+
+       (*) Now all directives that expect an identifier will try to expand
+           and concatenate everything without whitespaces in between before
+           usage. For example, with "unfixed" nasm the commands
+
+            %define %$abc hello 
+            %define __%$abc goodbye 
+            __%$abc
+
+       would produce "incorrect" output: last line will expand to
+
+            hello goodbyehello
+
+       Not quite what you expected, eh? :-) The answer is that preprocessor
+       treats the `%define' construct as if it would be
+
+            %define __ %$abc goodbye
+
+       (note the white space between __ and %$abc). After my "fix" it will
+       "correctly" expand into
+
+            goodbye
+
+       as expected. Note that I use quotes around words "correct",
+       "incorrect" etc because this is rather a feature not a bug; however
+       current behaviour is more logical (and allows more advanced macro
+       usage :-).
+
+       Same change was applied to:
+       `%push',`%macro',`%imacro',`%define',`%idefine',`%xdefine',`%ixdefine',
+       `%assign',`%iassign',`%undef'
+
+       (*) A new directive [WARNING {+|-}warning-id] have been added. It
+           works only if the assembly phase is enabled (i.e. it doesn't
+           work with nasm -e).
+
+       (*) A new warning type: macro-selfref. By default this warning is
+           disabled; when enabled NASM warns when a macro self-references
+           itself; for example the following source:
+
+              [WARNING macro-selfref] 
+       
+              %macro          push    1-* 
+                      %rep    %0 
+                              push    %1 
+                              %rotate 1 
+                      %endrep 
+              %endmacro 
+       
+                              push    eax,ebx,ecx
+
+       will produce a warning, but if we remove the first line we won't see
+       it anymore (which is The Right Thing To Do {tm} IMHO since C
+       preprocessor eats such constructs without warnings at all).
+
+       (*) Added a "error" routine to preprocessor which always will set
+           ERR_PASS1 bit in severity_code. This removes annoying repeated
+           errors on first and second passes from preprocessor.
+
+       (*) Added the %+ operator in single-line macros for concatenating
+           two identifiers. Usage example:
+
+              %define _myfunc _otherfunc 
+              %define cextern(x) _ %+ x 
+              cextern (myfunc)
+
+       After first expansion, third line will become "_myfunc". After this
+       expansion is performed again so it becomes "_otherunc".
+
+       (*) Now if preprocessor is in a non-emitting state, no warning or
+           error will be emitted. Example:
+
+              %if 1 
+                      mov     eax,ebx 
+              %else 
+                      put anything you want between these two brackets, 
+                      even macro-parameter references %1 or local 
+                      labels %$zz or macro-local labels %%zz - no 
+                      warning will be emitted. 
+              %endif
+
+       (*) Context-local variables on expansion as a last resort are looked
+           up in outer contexts. For example, the following piece:
+
+              %push   outer 
+              %define %$a [esp] 
+       
+                      %push   inner 
+                      %$a 
+                      %pop 
+              %pop
+
+       will expand correctly the fourth line to [esp]; if we'll define
+       another %$a inside the "inner" context, it will take precedence over
+       outer definition. However, this modification has been applied only
+       to expand_smacro and not to smacro_define: as a consequence
+       expansion looks in outer contexts, but `%ifdef' won't look in outer
+       contexts.
+
+       This behaviour is needed because we don't want nested contexts to
+       act on already defined local macros. Example:
+
+              %define %$arg1  [esp+4] 
+              test    eax,eax 
+              if      nz 
+                      mov     eax,%$arg1 
+              endif
+
+       In this example the "if" mmacro enters into the "if" context, so
+       %$arg1 is not valid anymore inside "if". Of course it could be
+       worked around by using explicitely %$$arg1 but this is ugly IMHO.
+
+       (*) Fixed memory leak in `%undef'. The origline wasn't freed before
+           exiting on success.
+
+       (*) Fixed trap in preprocessor when line expanded to empty set of
+           tokens. This happens, for example, in the following case:
+
+              #define SOMETHING 
+              SOMETHING
+
+C.2.41 Version 0.98
+
+       All changes since NASM 0.98p3 have been produced by H. Peter Anvin
+       <hpa@zytor.com>.
+
+       (*) The documentation comment delimiter is
+
+       (*) Allow EQU definitions to refer to external labels; reported by
+           Pedro Gimeno.
+
+       (*) Re-enable support for RDOFF v1; reported by Pedro Gimeno.
+
+       (*) Updated License file per OK from Simon and Julian.
+
+C.2.42 Version 0.98p9
+
+       (*) Update documentation (although the instruction set reference
+           will have to wait; I don't want to hold up the 0.98 release for
+           it.)
+
+       (*) Verified that the NASM implementation of the PEXTRW and PMOVMSKB
+           instructions is correct. The encoding differs from what the
+           Intel manuals document, but the Pentium III behaviour matches
+           NASM, not the Intel manuals.
+
+       (*) Fix handling of implicit sizes in PSHUFW and PINSRW, reported by
+           Stefan Hoffmeister.
+
+       (*) Resurrect the -s option, which was removed when changing the
+           diagnostic output to stdout.
+
+C.2.43 Version 0.98p8
+
+       (*) Fix for "DB" when NASM is running on a bigendian machine.
+
+       (*) Invoke insns.pl once for each output script, making Makefile.in
+           legal for "make -j".
+
+       (*) Improve the Unix configure-based makefiles to make package
+           creation easier.
+
+       (*) Included an RPM .spec file for building RPM (RedHat Package
+           Manager) packages on Linux or Unix systems.
+
+       (*) Fix Makefile dependency problems.
+
+       (*) Change src/rdsrc.pl to include sectioning information in info
+           output; required for install-info to work.
+
+       (*) Updated the RDOFF distribution to version 2 from Jules; minor
+           massaging to make it compile in my environment.
+
+       (*) Split doc files that can be built by anyone with a Perl
+           interpreter off into a separate archive.
+
+       (*) "Dress rehearsal" release!
+
+C.2.44 Version 0.98p7
+
+       (*) Fixed opcodes with a third byte-sized immediate argument to not
+           complain if given "byte" on the immediate.
+
+       (*) Allow `%undef' to remove single-line macros with arguments. This
+           matches the behaviour of #undef in the C preprocessor.
+
+       (*) Allow -d, -u, -i and -p to be specified as -D, -U, -I and -P for
+           compatibility with most C compilers and preprocessors. This
+           allows Makefile options to be shared between cc and nasm, for
+           example.
+
+       (*) Minor cleanups.
+
+       (*) Went through the list of Katmai instructions and hopefully fixed
+           the (rather few) mistakes in it.
+
+       (*) (Hopefully) fixed a number of disassembler bugs related to
+           ambiguous instructions (disambiguated by -p) and SSE
+           instructions with REP.
+
+       (*) Fix for bug reported by Mark Junger: "call dword 0x12345678"
+           should work and may add an OSP (affected CALL, JMP, Jcc).
+
+       (*) Fix for environments when "stderr" isn't a compile-time
+           constant.
+
+C.2.45 Version 0.98p6
+
+       (*) Took officially over coordination of the 0.98 release; so drop
+           the p3.x notation. Skipped p4 and p5 to avoid confusion with
+           John Fine's J4 and J5 releases.
+
+       (*) Update the documentation; however, it still doesn't include
+           documentation for the various new instructions. I somehow wonder
+           if it makes sense to have an instruction set reference in the
+           assembler manual when Intel et al have PDF versions of their
+           manuals online.
+
+       (*) Recognize "idt" or "centaur" for the -p option to ndisasm.
+
+       (*) Changed error messages back to stderr where they belong, but add
+           an -E option to redirect them elsewhere (the DOS shell cannot
+           redirect stderr.)
+
+       (*) -M option to generate Makefile dependencies (based on code from
+           Alex Verstak.)
+
+       (*) `%undef' preprocessor directive, and -u option, that undefines a
+           single-line macro.
+
+       (*) OS/2 Makefile (Mkfiles/Makefile.os2) for Borland under OS/2;
+           from Chuck Crayne.
+
+       (*) Various minor bugfixes (reported by): - Dangling `%s' in
+           preproc.c (Martin Junker)
+
+       (*) THERE ARE KNOWN BUGS IN SSE AND THE OTHER KATMAI INSTRUCTIONS. I
+           am on a trip and didn't bring the Katmai instruction reference,
+           so I can't work on them right now.
+
+       (*) Updated the License file per agreement with Simon and Jules to
+           include a GPL distribution clause.
+
+C.2.46 Version 0.98p3.7
+
+       (*) (Hopefully) fixed the canned Makefiles to include the outrdf2
+           and zoutieee modules.
+
+       (*) Renamed changes.asm to changed.asm.
+
+C.2.47 Version 0.98p3.6
+
+       (*) Fixed a bunch of instructions that were added in 0.98p3.5 which
+           had memory operands, and the address-size prefix was missing
+           from the instruction pattern.
+
+C.2.48 Version 0.98p3.5
+
+       (*) Merged in changes from John S. Fine's 0.98-J5 release. John's
+           based 0.98-J5 on my 0.98p3.3 release; this merges the changes.
+
+       (*) Expanded the instructions flag field to a long so we can fit
+           more flags; mark SSE (KNI) and AMD or Katmai-specific
+           instructions as such.
+
+       (*) Fix the "PRIV" flag on a bunch of instructions, and create new
+           "PROT" flag for protected-mode-only instructions (orthogonal to
+           if the instruction is privileged!) and new "SMM" flag for SMM-
+           only instructions.
+
+       (*) Added AMD-only SYSCALL and SYSRET instructions.
+
+       (*) Make SSE actually work, and add new Katmai MMX instructions.
+
+       (*) Added a -p (preferred vendor) option to ndisasm so that it can
+           distinguish e.g. Cyrix opcodes also used in SSE. For example:
+
+            ndisasm -p cyrix aliased.bin 
+            00000000  670F514310        paddsiw mm0,[ebx+0x10] 
+            00000005  670F514320        paddsiw mm0,[ebx+0x20] 
+            ndisasm -p intel aliased.bin 
+            00000000  670F514310        sqrtps xmm0,[ebx+0x10] 
+            00000005  670F514320        sqrtps xmm0,[ebx+0x20]
+
+       (*) Added a bunch of Cyrix-specific instructions.
+
+C.2.49 Version 0.98p3.4
+
+       (*) Made at least an attempt to modify all the additional Makefiles
+           (in the Mkfiles directory). I can't test it, but this was the
+           best I could do.
+
+       (*) DOS DJGPP+"Opus Make" Makefile from John S. Fine.
+
+       (*) changes.asm changes from John S. Fine.
+
+C.2.50 Version 0.98p3.3
+
+       (*) Patch from Conan Brink to allow nesting of `%rep' directives.
+
+       (*) If we're going to allow INT01 as an alias for INT1/ICEBP (one of
+           Jules 0.98p3 changes), then we should allow INT03 as an alias
+           for INT3 as well.
+
+       (*) Updated changes.asm to include the latest changes.
+
+       (*) Tried to clean up the <CR>s that had snuck in from a DOS/Windows
+           environment into my Unix environment, and try to make sure than
+           DOS/Windows users get them back.
+
+       (*) We would silently generate broken tools if insns.dat wasn't
+           sorted properly. Change insns.pl so that the order doesn't
+           matter.
+
+       (*) Fix bug in insns.pl (introduced by me) which would cause
+           conditional instructions to have an extra "cc" in disassembly,
+           e.g. "jnz" disassembled as "jccnz".
+
+C.2.51 Version 0.98p3.2
+
+       (*) Merged in John S. Fine's changes from his 0.98-J4 prerelease;
+           see http://www.csoft.net/cz/johnfine/
+
+       (*) Changed previous "spotless" Makefile target (appropriate for
+           distribution) to "distclean", and added "cleaner" target which
+           is same as "clean" except deletes files generated by Perl
+           scripts; "spotless" is union.
+
+       (*) Removed BASIC programs from distribution. Get a Perl interpreter
+           instead (see below.)
+
+       (*) Calling this "pre-release 3.2" rather than "p3-hpa2" because of
+           John's contributions.
+
+       (*) Actually link in the IEEE output format (zoutieee.c); fix a
+           bunch of compiler warnings in that file. Note I don't know what
+           IEEE output is supposed to look like, so these changes were made
+           "blind".
+
+C.2.52 Version 0.98p3-hpa
+
+       (*) Merged nasm098p3.zip with nasm-0.97.tar.gz to create a fully
+           buildable version for Unix systems (Makefile.in updates, etc.)
+
+       (*) Changed insns.pl to create the instruction tables in nasm.h and
+           names.c, so that a new instruction can be added by adding it
+           *only* to insns.dat.
+
+       (*) Added the following new instructions: SYSENTER, SYSEXIT, FXSAVE,
+           FXRSTOR, UD1, UD2 (the latter two are two opcodes that Intel
+           guarantee will never be used; one of them is documented as UD2
+           in Intel documentation, the other one just as "Undefined Opcode"
+           -- calling it UD1 seemed to make sense.)
+
+       (*) MAX_SYMBOL was defined to be 9, but LOADALL286 and LOADALL386
+           are 10 characters long. Now MAX_SYMBOL is derived from
+           insns.dat.
+
+       (*) A note on the BASIC programs included: forget them. insns.bas is
+           already out of date. Get yourself a Perl interpreter for your
+           platform of choice at http://www.cpan.org/ports/index.html.
+
+C.2.53 Version 0.98 pre-release 3
+
+       (*) added response file support, improved command line handling, new
+           layout help screen
+
+       (*) fixed limit checking bug, 'OUT byte nn, reg' bug, and a couple
+           of rdoff related bugs, updated Wishlist; 0.98 Prerelease 3.
+
+C.2.54 Version 0.98 pre-release 2
+
+       (*) fixed bug in outcoff.c to do with truncating section names
+           longer than 8 characters, referencing beyond end of string; 0.98
+           pre-release 2
+
+C.2.55 Version 0.98 pre-release 1
+
+       (*) Fixed a bug whereby STRUC didn't work at all in RDF.
+
+       (*) Fixed a problem with group specification in PUBDEFs in OBJ.
+
+       (*) Improved ease of adding new output formats. Contribution due to
+           Fox Cutter.
+
+       (*) Fixed a bug in relocations in the `bin' format: was showing up
+           when a relocatable reference crossed an 8192-byte boundary in
+           any output section.
+
+       (*) Fixed a bug in local labels: local-label lookups were
+           inconsistent between passes one and two if an EQU occurred
+           between the definition of a global label and the subsequent use
+           of a local label local to that global.
+
+       (*) Fixed a seg-fault in the preprocessor (again) which happened
+           when you use a blank line as the first line of a multi-line
+           macro definition and then defined a label on the same line as a
+           call to that macro.
+
+       (*) Fixed a stale-pointer bug in the handling of the NASM
+           environment variable. Thanks to Thomas McWilliams.
+
+       (*) ELF had a hard limit on the number of sections which caused
+           segfaults when transgressed. Fixed.
+
+       (*) Added ability for ndisasm to read from stdin by using `-' as the
+           filename.
+
+       (*) ndisasm wasn't outputting the TO keyword. Fixed.
+
+       (*) Fixed error cascade on bogus expression in `%if' - an error in
+           evaluation was causing the entire `%if' to be discarded, thus
+           creating trouble later when the `%else' or `%endif' was
+           encountered.
+
+       (*) Forward reference tracking was instruction-granular not operand-
+           granular, which was causing 286-specific code to be generated
+           needlessly on code of the form `shr word [forwardref],1'. Thanks
+           to Jim Hague for sending a patch.
+
+       (*) All messages now appear on stdout, as sending them to stderr
+           serves no useful purpose other than to make redirection
+           difficult.
+
+       (*) Fixed the problem with EQUs pointing to an external symbol -
+           this now generates an error message.
+
+       (*) Allowed multiple size prefixes to an operand, of which only the
+           first is taken into account.
+
+       (*) Incorporated John Fine's changes, including fixes of a large
+           number of preprocessor bugs, some small problems in OBJ, and a
+           reworking of label handling to define labels before their line
+           is assembled, rather than after.
+
+       (*) Reformatted a lot of the source code to be more readable.
+           Included 'coding.txt' as a guideline for how to format code for
+           contributors.
+
+       (*) Stopped nested `%reps' causing a panic - they now cause a
+           slightly more friendly error message instead.
+
+       (*) Fixed floating point constant problems (patch by Pedro Gimeno)
+
+       (*) Fixed the return value of insn_size() not being checked for -1,
+           indicating an error.
+
+       (*) Incorporated 3Dnow! instructions.
+
+       (*) Fixed the 'mov eax, eax + ebx' bug.
+
+       (*) Fixed the GLOBAL EQU bug in ELF. Released developers release 3.
+
+       (*) Incorporated John Fine's command line parsing changes
+
+       (*) Incorporated David Lindauer's OMF debug support
+
+       (*) Made changes for LCC 4.0 support (`__NASM_CDecl__', removed
+           register size specification warning when sizes agree).
+
+   C.3 NASM 0.9 Series
+
+       Revisions before 0.98.
+
+ C.3.1 Version 0.97 released December 1997
+
+       (*) This was entirely a bug-fix release to 0.96, which seems to have
+           got cursed. Silly me.
+
+       (*) Fixed stupid mistake in OBJ which caused `MOV EAX,<constant>' to
+           fail. Caused by an error in the `MOV EAX,<segment>' support.
+
+       (*) ndisasm hung at EOF when compiled with lcc on Linux because lcc
+           on Linux somehow breaks feof(). ndisasm now does not rely on
+           feof().
+
+       (*) A heading in the documentation was missing due to a markup error
+           in the indexing. Fixed.
+
+       (*) Fixed failure to update all pointers on realloc() within
+           extended- operand code in parser.c. Was causing wrong behaviour
+           and seg faults on lines such as `dd 0.0,0.0,0.0,0.0,...'
+
+       (*) Fixed a subtle preprocessor bug whereby invoking one multi-line
+           macro on the first line of the expansion of another, when the
+           second had been invoked with a label defined before it, didn't
+           expand the inner macro.
+
+       (*) Added internal.doc back in to the distribution archives - it was
+           missing in 0.96 *blush*
+
+       (*) Fixed bug causing 0.96 to be unable to assemble its own test
+           files, specifically objtest.asm. *blush again*
+
+       (*) Fixed seg-faults and bogus error messages caused by mismatching
+           `%rep' and `%endrep' within multi-line macro definitions.
+
+       (*) Fixed a problem with buffer overrun in OBJ, which was causing
+           corruption at ends of long PUBDEF records.
+
+       (*) Separated DOS archives into main-program and documentation to
+           reduce download size.
+
+ C.3.2 Version 0.96 released November 1997
+
+       (*) Fixed a bug whereby, if `nasm sourcefile' would cause a filename
+           collision warning and put output into `nasm.out', then `nasm
+           sourcefile -o outputfile' still gave the warning even though the
+           `-o' was honoured. Fixed name pollution under Digital UNIX: one
+           of its header files defined R_SP, which broke the enum in
+           nasm.h.
+
+       (*) Fixed minor instruction table problems: FUCOM and FUCOMP didn't
+           have two-operand forms; NDISASM didn't recognise the longer
+           register forms of PUSH and POP (eg FF F3 for PUSH BX); TEST
+           mem,imm32 was flagged as undocumented; the 32-bit forms of CMOV
+           had 16-bit operand size prefixes; `AAD imm' and `AAM imm' are no
+           longer flagged as undocumented because the Intel Architecture
+           reference documents them.
+
+       (*) Fixed a problem with the local-label mechanism, whereby strange
+           types of symbol (EQUs, auto-defined OBJ segment base symbols)
+           interfered with the `previous global label' value and screwed up
+           local labels.
+
+       (*) Fixed a bug whereby the stub preprocessor didn't communicate
+           with the listing file generator, so that the -a and -l options
+           in conjunction would produce a useless listing file.
+
+       (*) Merged `os2' object file format back into `obj', after
+           discovering that `obj' _also_ shouldn't have a link pass
+           separator in a module containing a non-trivial MODEND. Flat
+           segments are now declared using the FLAT attribute. `os2' is no
+           longer a valid object format name: use `obj'.
+
+       (*) Removed the fixed-size temporary storage in the evaluator. Very
+           very long expressions (like `mov ax,1+1+1+1+...' for two hundred
+           1s or so) should now no longer crash NASM.
+
+       (*) Fixed a bug involving segfaults on disassembly of MMX
+           instructions, by changing the meaning of one of the operand-type
+           flags in nasm.h. This may cause other apparently unrelated MMX
+           problems; it needs to be tested thoroughly.
+
+       (*) Fixed some buffer overrun problems with large OBJ output files.
+           Thanks to DJ Delorie for the bug report and fix.
+
+       (*) Made preprocess-only mode actually listen to the `%line' markers
+           as it prints them, so that it can report errors more sanely.
+
+       (*) Re-designed the evaluator to keep more sensible track of
+           expressions involving forward references: can now cope with
+           previously-nightmare situations such as:
+
+         mov ax,foo | bar 
+         foo equ 1 
+         bar equ 2
+
+       (*) Added the ALIGN and ALIGNB standard macros.
+
+       (*) Added PIC support in ELF: use of WRT to obtain the four extra
+           relocation types needed.
+
+       (*) Added the ability for output file formats to define their own
+           extensions to the GLOBAL, COMMON and EXTERN directives.
+
+       (*) Implemented common-variable alignment, and global-symbol type
+           and size declarations, in ELF.
+
+       (*) Implemented NEAR and FAR keywords for common variables, plus
+           far-common element size specification, in OBJ.
+
+       (*) Added a feature whereby EXTERNs and COMMONs in OBJ can be given
+           a default WRT specification (either a segment or a group).
+
+       (*) Transformed the Unix NASM archive into an auto-configuring
+           package.
+
+       (*) Added a sanity-check for people applying SEG to things which are
+           already segment bases: this previously went unnoticed by the SEG
+           processing and caused OBJ-driver panics later.
+
+       (*) Added the ability, in OBJ format, to deal with `MOV
+           EAX,<segment>' type references: OBJ doesn't directly support
+           dword-size segment base fixups, but as long as the low two bytes
+           of the constant term are zero, a word-size fixup can be
+           generated instead and it will work.
+
+       (*) Added the ability to specify sections' alignment requirements in
+           Win32 object files and pure binary files.
+
+       (*) Added preprocess-time expression evaluation: the `%assign' (and
+           `%iassign') directive and the bare `%if' (and `%elif')
+           conditional. Added relational operators to the evaluator, for
+           use only in `%if' constructs: the standard relationals = < > <=
+           >= <> (and C-like synonyms == and !=) plus low-precedence
+           logical operators &&, ^^ and ||.
+
+       (*) Added a preprocessor repeat construct: `%rep' / `%exitrep' /
+           `%endrep'.
+
+       (*) Added the __FILE__ and __LINE__ standard macros.
+
+       (*) Added a sanity check for number constants being greater than
+           0xFFFFFFFF. The warning can be disabled.
+
+       (*) Added the %0 token whereby a variadic multi-line macro can tell
+           how many parameters it's been given in a specific invocation.
+
+       (*) Added `%rotate', allowing multi-line macro parameters to be
+           cycled.
+
+       (*) Added the `*' option for the maximum parameter count on multi-
+           line macros, allowing them to take arbitrarily many parameters.
+
+       (*) Added the ability for the user-level forms of EXTERN, GLOBAL and
+           COMMON to take more than one argument.
+
+       (*) Added the IMPORT and EXPORT directives in OBJ format, to deal
+           with Windows DLLs.
+
+       (*) Added some more preprocessor `%if' constructs: `%ifidn' /
+           `%ifidni' (exact textual identity), and `%ifid' / `%ifnum' /
+           `%ifstr' (token type testing).
+
+       (*) Added the ability to distinguish SHL AX,1 (the 8086 version)
+           from SHL AX,BYTE 1 (the 286-and-upwards version whose constant
+           happens to be 1).
+
+       (*) Added NetBSD/FreeBSD/OpenBSD's variant of a.out format, complete
+           with PIC shared library features.
+
+       (*) Changed NASM's idiosyncratic handling of FCLEX, FDISI, FENI,
+           FINIT, FSAVE, FSTCW, FSTENV, and FSTSW to bring it into line
+           with the otherwise accepted standard. The previous behaviour,
+           though it was a deliberate feature, was a deliberate feature
+           based on a misunderstanding. Apologies for the inconvenience.
+
+       (*) Improved the flexibility of ABSOLUTE: you can now give it an
+           expression rather than being restricted to a constant, and it
+           can take relocatable arguments as well.
+
+       (*) Added the ability for a variable to be declared as EXTERN
+           multiple times, and the subsequent definitions are just ignored.
+
+       (*) We now allow instruction prefixes (CS, DS, LOCK, REPZ etc) to be
+           alone on a line (without a following instruction).
+
+       (*) Improved sanity checks on whether the arguments to EXTERN,
+           GLOBAL and COMMON are valid identifiers.
+
+       (*) Added misc/exebin.mac to allow direct generation of .EXE files
+           by hacking up an EXE header using DB and DW; also added
+           test/binexe.asm to demonstrate the use of this. Thanks to Yann
+           Guidon for contributing the EXE header code.
+
+       (*) ndisasm forgot to check whether the input file had been
+           successfully opened. Now it does. Doh!
+
+       (*) Added the Cyrix extensions to the MMX instruction set.
+
+       (*) Added a hinting mechanism to allow [EAX+EBX] and [EBX+EAX] to be
+           assembled differently. This is important since [ESI+EBP] and
+           [EBP+ESI] have different default base segment registers.
+
+       (*) Added support for the PharLap OMF extension for 4096-byte
+           segment alignment.
+
+ C.3.3 Version 0.95 released July 1997
+
+       (*) Fixed yet another ELF bug. This one manifested if the user
+           relied on the default segment, and attempted to define global
+           symbols without first explicitly declaring the target segment.
+
+       (*) Added makefiles (for NASM and the RDF tools) to build Win32
+           console apps under Symantec C++. Donated by Mark Junker.
+
+       (*) Added `macros.bas' and `insns.bas', QBasic versions of the Perl
+           scripts that convert `standard.mac' to `macros.c' and convert
+           `insns.dat' to `insnsa.c' and `insnsd.c'. Also thanks to Mark
+           Junker.
+
+       (*) Changed the diassembled forms of the conditional instructions so
+           that JB is now emitted as JC, and other similar changes.
+           Suggested list by Ulrich Doewich.
+
+       (*) Added `@' to the list of valid characters to begin an identifier
+           with.
+
+       (*) Documentary changes, notably the addition of the `Common
+           Problems' section in nasm.doc.
+
+       (*) Fixed a bug relating to 32-bit PC-relative fixups in OBJ.
+
+       (*) Fixed a bug in perm_copy() in labels.c which was causing
+           exceptions in cleanup_labels() on some systems.
+
+       (*) Positivity sanity check in TIMES argument changed from a warning
+           to an error following a further complaint.
+
+       (*) Changed the acceptable limits on byte and word operands to allow
+           things like `~10111001b' to work.
+
+       (*) Fixed a major problem in the preprocessor which caused seg-
+           faults if macro definitions contained blank lines or comment-
+           only lines.
+
+       (*) Fixed inadequate error checking on the commas separating the
+           arguments to `db', `dw' etc.
+
+       (*) Fixed a crippling bug in the handling of macros with operand
+           counts defined with a `+' modifier.
+
+       (*) Fixed a bug whereby object file formats which stored the input
+           file name in the output file (such as OBJ and COFF) weren't
+           doing so correctly when the output file name was specified on
+           the command line.
+
+       (*) Removed [INC] and [INCLUDE] support for good, since they were
+           obsolete anyway.
+
+       (*) Fixed a bug in OBJ which caused all fixups to be output in 16-
+           bit (old-format) FIXUPP records, rather than putting the 32-bit
+           ones in FIXUPP32 (new-format) records.
+
+       (*) Added, tentatively, OS/2 object file support (as a minor variant
+           on OBJ).
+
+       (*) Updates to Fox Cutter's Borland C makefile, Makefile.bc2.
+
+       (*) Removed a spurious second fclose() on the output file.
+
+       (*) Added the `-s' command line option to redirect all messages
+           which would go to stderr (errors, help text) to stdout instead.
+
+       (*) Added the `-w' command line option to selectively suppress some
+           classes of assembly warning messages.
+
+       (*) Added the `-p' pre-include and `-d' pre-define command-line
+           options.
+
+       (*) Added an include file search path: the `-i' command line option.
+
+       (*) Fixed a silly little preprocessor bug whereby starting a line
+           with a `%!' environment-variable reference caused an `unknown
+           directive' error.
+
+       (*) Added the long-awaited listing file support: the `-l' command
+           line option.
+
+       (*) Fixed a problem with OBJ format whereby, in the absence of any
+           explicit segment definition, non-global symbols declared in the
+           implicit default segment generated spurious EXTDEF records in
+           the output.
+
+       (*) Added the NASM environment variable.
+
+       (*) From this version forward, Win32 console-mode binaries will be
+           included in the DOS distribution in addition to the 16-bit
+           binaries. Added Makefile.vc for this purpose.
+
+       (*) Added `return 0;' to test/objlink.c to prevent compiler
+           warnings.
+
+       (*) Added the __NASM_MAJOR__ and __NASM_MINOR__ standard defines.
+
+       (*) Added an alternative memory-reference syntax in which prefixing
+           an operand with `&' is equivalent to enclosing it in square
+           brackets, at the request of Fox Cutter.
+
+       (*) Errors in pass two now cause the program to return a non-zero
+           error code, which they didn't before.
+
+       (*) Fixed the single-line macro cycle detection, which didn't work
+           at all on macros with no parameters (caused an infinite loop).
+           Also changed the behaviour of single-line macro cycle detection
+           to work like cpp, so that macros like `extrn' as given in the
+           documentation can be implemented.
+
+       (*) Fixed the implementation of WRT, which was too restrictive in
+           that you couldn't do `mov ax,[di+abc wrt dgroup]' because
+           (di+abc) wasn't a relocatable reference.
+
+ C.3.4 Version 0.94 released April 1997
+
+       (*) Major item: added the macro processor.
+
+       (*) Added undocumented instructions SMI, IBTS, XBTS and LOADALL286.
+           Also reorganised CMPXCHG instruction into early-486 and Pentium
+           forms. Thanks to Thobias Jones for the information.
+
+       (*) Fixed two more stupid bugs in ELF, which were causing `ld' to
+           continue to seg-fault in a lot of non-trivial cases.
+
+       (*) Fixed a seg-fault in the label manager.
+
+       (*) Stopped FBLD and FBSTP from _requiring_ the TWORD keyword, which
+           is the only option for BCD loads/stores in any case.
+
+       (*) Ensured FLDCW, FSTCW and FSTSW can cope with the WORD keyword,
+           if anyone bothers to provide it. Previously they complained
+           unless no keyword at all was present.
+
+       (*) Some forms of FDIV/FDIVR and FSUB/FSUBR were still inverted: a
+           vestige of a bug that I thought had been fixed in 0.92. This was
+           fixed, hopefully for good this time...
+
+       (*) Another minor phase error (insofar as a phase error can _ever_
+           be minor) fixed, this one occurring in code of the form
+
+         rol ax,forward_reference 
+         forward_reference equ 1
+
+       (*) The number supplied to TIMES is now sanity-checked for
+           positivity, and also may be greater than 64K (which previously
+           didn't work on 16-bit systems).
+
+       (*) Added Watcom C makefiles, and misc/pmw.bat, donated by Dominik
+           Behr.
+
+       (*) Added the INCBIN pseudo-opcode.
+
+       (*) Due to the advent of the preprocessor, the [INCLUDE] and [INC]
+           directives have become obsolete. They are still supported in
+           this version, with a warning, but won't be in the next.
+
+       (*) Fixed a bug in OBJ format, which caused incorrect object records
+           to be output when absolute labels were made global.
+
+       (*) Updates to RDOFF subdirectory, and changes to outrdf.c.
+
+ C.3.5 Version 0.93 released January 1997
+
+       This release went out in a great hurry after semi-crippling bugs
+       were found in 0.92.
+
+       (*) Really _did_ fix the stack overflows this time. *blush*
+
+       (*) Had problems with EA instruction sizes changing between passes,
+           when an offset contained a forward reference and so 4 bytes were
+           allocated for the offset in pass one; by pass two the symbol had
+           been defined and happened to be a small absolute value, so only
+           1 byte got allocated, causing instruction size mismatch between
+           passes and hence incorrect address calculations. Fixed.
+
+       (*) Stupid bug in the revised ELF section generation fixed
+           (associated string-table section for .symtab was hard-coded as
+           7, even when this didn't fit with the real section table). Was
+           causing `ld' to seg-fault under Linux.
+
+       (*) Included a new Borland C makefile, Makefile.bc2, donated by Fox
+           Cutter <lmb@comtch.iea.com>.
+
+ C.3.6 Version 0.92 released January 1997
+
+       (*) The FDIVP/FDIVRP and FSUBP/FSUBRP pairs had been inverted: this
+           was fixed. This also affected the LCC driver.
+
+       (*) Fixed a bug regarding 32-bit effective addresses of the form
+           `[other_register+ESP]'.
+
+       (*) Documentary changes, notably documentation of the fact that
+           Borland Win32 compilers use `obj' rather than `win32' object
+           format.
+
+       (*) Fixed the COMENT record in OBJ files, which was formatted
+           incorrectly.
+
+       (*) Fixed a bug causing segfaults in large RDF files.
+
+       (*) OBJ format now strips initial periods from segment and group
+           definitions, in order to avoid complications with the local
+           label syntax.
+
+       (*) Fixed a bug in disassembling far calls and jumps in NDISASM.
+
+       (*) Added support for user-defined sections in COFF and ELF files.
+
+       (*) Compiled the DOS binaries with a sensible amount of stack, to
+           prevent stack overflows on any arithmetic expression containing
+           parentheses.
+
+       (*) Fixed a bug in handling of files that do not terminate in a
+           newline.
+
+ C.3.7 Version 0.91 released November 1996
+
+       (*) Loads of bug fixes.
+
+       (*) Support for RDF added.
+
+       (*) Support for DBG debugging format added.
+
+       (*) Support for 32-bit extensions to Microsoft OBJ format added.
+
+       (*) Revised for Borland C: some variable names changed, makefile
+           added.
+
+       (*) LCC support revised to actually work.
+
+       (*) JMP/CALL NEAR/FAR notation added.
+
+       (*) `a16', `o16', `a32' and `o32' prefixes added.
+
+       (*) Range checking on short jumps implemented.
+
+       (*) MMX instruction support added.
+
+       (*) Negative floating point constant support added.
+
+       (*) Memory handling improved to bypass 64K barrier under DOS.
+
+       (*) `$' prefix to force treatment of reserved words as identifiers
+           added.
+
+       (*) Default-size mechanism for object formats added.
+
+       (*) Compile-time configurability added.
+
+       (*) `#', `@', `~' and c{?} are now valid characters in labels.
+
+       (*) `-e' and `-k' options in NDISASM added.
+
+ C.3.8 Version 0.90 released October 1996
+
+       First release version. First support for object file output. Other
+       changes from previous version (0.3x) too numerous to document.