author | Christophe Varoqui <cvaroqui@hera.kernel.org> | 2005-05-01 15:05:22 -0700 |
---|---|---|
committer | Christophe Varoqui <cvaroqui@hera.kernel.org> | 2005-05-01 15:05:22 -0700 |
commit | e1233f6e33c6034e8718b396c59cdf5040ac1122 (patch) | |
tree | 42bd8e6e6d5ed3d493d0bae936a1df02944c12de | |
download | multipath-tools-e1233f6e33c6034e8718b396c59cdf5040ac1122.tar.gz multipath-tools-e1233f6e33c6034e8718b396c59cdf5040ac1122.tar.bz2 multipath-tools-e1233f6e33c6034e8718b396c59cdf5040ac1122.zip |
Initial git import.
Release 0.4.5-pre2
124 files changed, 21214 insertions, 0 deletions
@@ -0,0 +1 @@ +Christophe Varoqui, <christophe.varoqui@free.fr> @@ -0,0 +1,483 @@ + + GNU LIBRARY GENERAL PUBLIC LICENSE + Version 2, June 1991 + + Copyright (C) 1991 Free Software Foundation, Inc. + 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + +[This is the first released version of the library GPL. It is + numbered 2 because it goes with version 2 of the ordinary GPL.] + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +Licenses are intended to guarantee your freedom to share and change +free software--to make sure the software is free for all its users. + + This license, the Library General Public License, applies to some +specially designated Free Software Foundation software, and to any +other libraries whose authors decide to use it. You can use it for +your libraries, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if +you distribute copies of the library, or if you modify it. + + For example, if you distribute copies of the library, whether gratis +or for a fee, you must give the recipients all the rights that we gave +you. You must make sure that they, too, receive or can get the source +code. 
If you link a program with the library, you must provide +complete object files to the recipients so that they can relink them +with the library, after making changes to the library and recompiling +it. And you must show them these terms so they know their rights. + + Our method of protecting your rights has two steps: (1) copyright +the library, and (2) offer you this license which gives you legal +permission to copy, distribute and/or modify the library. + + Also, for each distributor's protection, we want to make certain +that everyone understands that there is no warranty for this free +library. If the library is modified by someone else and passed on, we +want its recipients to know that what they have is not the original +version, so that any problems introduced by others will not reflect on +the original authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that companies distributing free +software will individually obtain patent licenses, thus in effect +transforming the program into proprietary software. To prevent this, +we have made it clear that any patent must be licensed for everyone's +free use or not licensed at all. + + Most GNU software, including some libraries, is covered by the ordinary +GNU General Public License, which was designed for utility programs. This +license, the GNU Library General Public License, applies to certain +designated libraries. This license is quite different from the ordinary +one; be sure to read it in full, and don't assume that anything in it is +the same as in the ordinary license. + + The reason we have a separate public license for some libraries is that +they blur the distinction we usually make between modifying or adding to a +program and simply using it. Linking a program with a library, without +changing the library, is in some sense simply using the library, and is +analogous to running a utility program or application program. 
However, in +a textual and legal sense, the linked executable is a combined work, a +derivative of the original library, and the ordinary General Public License +treats it as such. + + Because of this blurred distinction, using the ordinary General +Public License for libraries did not effectively promote software +sharing, because most developers did not use the libraries. We +concluded that weaker conditions might promote sharing better. + + However, unrestricted linking of non-free programs would deprive the +users of those programs of all benefit from the free status of the +libraries themselves. This Library General Public License is intended to +permit developers of non-free programs to use free libraries, while +preserving your freedom as a user of such programs to change the free +libraries that are incorporated in them. (We have not seen how to achieve +this as regards changes in header files, but we have achieved it as regards +changes in the actual functions of the Library.) The hope is that this +will lead to faster development of free libraries. + + The precise terms and conditions for copying, distribution and +modification follow. Pay close attention to the difference between a +"work based on the library" and a "work that uses the library". The +former contains code derived from the library, while the latter only +works together with the library. + + Note that it is possible for a library to be covered by the ordinary +General Public License rather than by this special one. + + GNU LIBRARY GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License Agreement applies to any software library which +contains a notice placed by the copyright holder or other authorized +party saying it may be distributed under the terms of this Library +General Public License (also called "this License"). Each licensee is +addressed as "you". 
+ + A "library" means a collection of software functions and/or data +prepared so as to be conveniently linked with application programs +(which use some of those functions and data) to form executables. + + The "Library", below, refers to any such software library or work +which has been distributed under these terms. A "work based on the +Library" means either the Library or any derivative work under +copyright law: that is to say, a work containing the Library or a +portion of it, either verbatim or with modifications and/or translated +straightforwardly into another language. (Hereinafter, translation is +included without limitation in the term "modification".) + + "Source code" for a work means the preferred form of the work for +making modifications to it. For a library, complete source code means +all the source code for all modules it contains, plus any associated +interface definition files, plus the scripts used to control compilation +and installation of the library. + + Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running a program using the Library is not restricted, and output from +such a program is covered only if its contents constitute a work based +on the Library (independent of the use of the Library in a tool for +writing it). Whether that is true depends on what the Library does +and what the program that uses the Library does. + + 1. You may copy and distribute verbatim copies of the Library's +complete source code as you receive it, in any medium, provided that +you conspicuously and appropriately publish on each copy an +appropriate copyright notice and disclaimer of warranty; keep intact +all the notices that refer to this License and to the absence of any +warranty; and distribute a copy of this License along with the +Library. 
+ + You may charge a fee for the physical act of transferring a copy, +and you may at your option offer warranty protection in exchange for a +fee. + + 2. You may modify your copy or copies of the Library or any portion +of it, thus forming a work based on the Library, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) The modified work must itself be a software library. + + b) You must cause the files modified to carry prominent notices + stating that you changed the files and the date of any change. + + c) You must cause the whole of the work to be licensed at no + charge to all third parties under the terms of this License. + + d) If a facility in the modified Library refers to a function or a + table of data to be supplied by an application program that uses + the facility, other than as an argument passed when the facility + is invoked, then you must make a good faith effort to ensure that, + in the event an application does not supply such function or + table, the facility still operates, and performs whatever part of + its purpose remains meaningful. + + (For example, a function in a library to compute square roots has + a purpose that is entirely well-defined independent of the + application. Therefore, Subsection 2d requires that any + application-supplied function or table used by this function must + be optional: if the application does not supply it, the square + root function must still compute square roots.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Library, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. 
But when you +distribute the same sections as part of a whole which is a work based +on the Library, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote +it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Library. + +In addition, mere aggregation of another work not based on the Library +with the Library (or with a work based on the Library) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may opt to apply the terms of the ordinary GNU General Public +License instead of this License to a given copy of the Library. To do +this, you must alter all the notices that refer to this License, so +that they refer to the ordinary GNU General Public License, version 2, +instead of to this License. (If a newer version than version 2 of the +ordinary GNU General Public License has appeared, then you can specify +that version instead if you wish.) Do not make any other change in +these notices. + + Once this change is made in a given copy, it is irreversible for +that copy, so the ordinary GNU General Public License applies to all +subsequent copies and derivative works made from that copy. + + This option is useful when you wish to copy part of the code of +the Library into a program that is not a library. + + 4. 
You may copy and distribute the Library (or a portion or +derivative of it, under Section 2) in object code or executable form +under the terms of Sections 1 and 2 above provided that you accompany +it with the complete corresponding machine-readable source code, which +must be distributed under the terms of Sections 1 and 2 above on a +medium customarily used for software interchange. + + If distribution of object code is made by offering access to copy +from a designated place, then offering equivalent access to copy the +source code from the same place satisfies the requirement to +distribute the source code, even though third parties are not +compelled to copy the source along with the object code. + + 5. A program that contains no derivative of any portion of the +Library, but is designed to work with the Library by being compiled or +linked with it, is called a "work that uses the Library". Such a +work, in isolation, is not a derivative work of the Library, and +therefore falls outside the scope of this License. + + However, linking a "work that uses the Library" with the Library +creates an executable that is a derivative of the Library (because it +contains portions of the Library), rather than a "work that uses the +library". The executable is therefore covered by this License. +Section 6 states terms for distribution of such executables. + + When a "work that uses the Library" uses material from a header file +that is part of the Library, the object code for the work may be a +derivative work of the Library even though the source code is not. +Whether this is true is especially significant if the work can be +linked without the Library, or if the work is itself a library. The +threshold for this to be true is not precisely defined by law. 
+ + If such an object file uses only numerical parameters, data +structure layouts and accessors, and small macros and small inline +functions (ten lines or less in length), then the use of the object +file is unrestricted, regardless of whether it is legally a derivative +work. (Executables containing this object code plus portions of the +Library will still fall under Section 6.) + + Otherwise, if the work is a derivative of the Library, you may +distribute the object code for the work under the terms of Section 6. +Any executables containing that work also fall under Section 6, +whether or not they are linked directly with the Library itself. + + 6. As an exception to the Sections above, you may also compile or +link a "work that uses the Library" with the Library to produce a +work containing portions of the Library, and distribute that work +under terms of your choice, provided that the terms permit +modification of the work for the customer's own use and reverse +engineering for debugging such modifications. + + You must give prominent notice with each copy of the work that the +Library is used in it and that the Library and its use are covered by +this License. You must supply a copy of this License. If the work +during execution displays copyright notices, you must include the +copyright notice for the Library among them, as well as a reference +directing the user to the copy of this License. Also, you must do one +of these things: + + a) Accompany the work with the complete corresponding + machine-readable source code for the Library including whatever + changes were used in the work (which must be distributed under + Sections 1 and 2 above); and, if the work is an executable linked + with the Library, with the complete machine-readable "work that + uses the Library", as object code and/or source code, so that the + user can modify the Library and then relink to produce a modified + executable containing the modified Library. 
(It is understood + that the user who changes the contents of definitions files in the + Library will not necessarily be able to recompile the application + to use the modified definitions.) + + b) Accompany the work with a written offer, valid for at + least three years, to give the same user the materials + specified in Subsection 6a, above, for a charge no more + than the cost of performing this distribution. + + c) If distribution of the work is made by offering access to copy + from a designated place, offer equivalent access to copy the above + specified materials from the same place. + + d) Verify that the user has already received a copy of these + materials or that you have already sent this user a copy. + + For an executable, the required form of the "work that uses the +Library" must include any data and utility programs needed for +reproducing the executable from it. However, as a special exception, +the source code distributed need not include anything that is normally +distributed (in either source or binary form) with the major +components (compiler, kernel, and so on) of the operating system on +which the executable runs, unless that component itself accompanies +the executable. + + It may happen that this requirement contradicts the license +restrictions of other proprietary libraries that do not normally +accompany the operating system. Such a contradiction means you cannot +use both them and the Library together in an executable that you +distribute. + + 7. 
You may place library facilities that are a work based on the +Library side-by-side in a single library together with other library +facilities not covered by this License, and distribute such a combined +library, provided that the separate distribution of the work based on +the Library and of the other library facilities is otherwise +permitted, and provided that you do these two things: + + a) Accompany the combined library with a copy of the same work + based on the Library, uncombined with any other library + facilities. This must be distributed under the terms of the + Sections above. + + b) Give prominent notice with the combined library of the fact + that part of it is a work based on the Library, and explaining + where to find the accompanying uncombined form of the same work. + + 8. You may not copy, modify, sublicense, link with, or distribute +the Library except as expressly provided under this License. Any +attempt otherwise to copy, modify, sublicense, link with, or +distribute the Library is void, and will automatically terminate your +rights under this License. However, parties who have received copies, +or rights, from you under this License will not have their licenses +terminated so long as such parties remain in full compliance. + + 9. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Library or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Library (or any work based on the +Library), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Library or works based on it. + + 10. 
Each time you redistribute the Library (or any work based on the +Library), the recipient automatically receives a license from the +original licensor to copy, distribute, link with or modify the Library +subject to these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties to +this License. + + 11. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Library at all. For example, if a patent +license would not permit royalty-free redistribution of the Library by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Library. + +If any portion of this section is held invalid or unenforceable under any +particular circumstance, the balance of the section is intended to apply, +and the section as a whole is intended to apply in other circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system which is +implemented by public license practices. 
Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 12. If the distribution and/or use of the Library is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Library under this License may add +an explicit geographical distribution limitation excluding those countries, +so that distribution is permitted only in or among countries not thus +excluded. In such case, this License incorporates the limitation as if +written in the body of this License. + + 13. The Free Software Foundation may publish revised and/or new +versions of the Library General Public License from time to time. +Such new versions will be similar in spirit to the present version, +but may differ in detail to address new problems or concerns. + +Each version is given a distinguishing version number. If the Library +specifies a version number of this License which applies to it and +"any later version", you have the option of following the terms and +conditions either of that version or of any later version published by +the Free Software Foundation. If the Library does not specify a +license version number, you may choose any version ever published by +the Free Software Foundation. + + 14. If you wish to incorporate parts of the Library into other free +programs whose distribution conditions are incompatible with these, +write to the author to ask for permission. For software which is +copyrighted by the Free Software Foundation, write to the Free +Software Foundation; we sometimes make exceptions for this. 
Our +decision will be guided by the two goals of preserving the free status +of all derivatives of our free software and of promoting the sharing +and reuse of software generally. + + NO WARRANTY + + 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO +WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. +EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR +OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY +KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE +LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME +THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN +WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY +AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU +FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR +CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE +LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING +RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A +FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF +SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH +DAMAGES. + + END OF TERMS AND CONDITIONS + + Appendix: How to Apply These Terms to Your New Libraries + + If you develop a new library, and you want it to be of the greatest +possible use to the public, we recommend making it free software that +everyone can redistribute and change. You can do so by permitting +redistribution under these terms (or, alternatively, under the terms of the +ordinary General Public License). + + To apply these terms, attach the following notices to the library. 
It is +safest to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least the +"copyright" line and a pointer to where the full notice is found. + + <one line to give the library's name and a brief idea of what it does.> + Copyright (C) <year> <name of author> + + This library is free software; you can redistribute it and/or + modify it under the terms of the GNU Library General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later version. + + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Library General Public License for more details. + + You should have received a copy of the GNU Library General Public + License along with this library; if not, write to the Free + Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, + MA 02111-1307, USA + +Also add information on how to contact you by electronic and paper mail. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the library, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the + library `Frob' (a library for tweaking knobs) written by James Random Hacker. + + <signature of Ty Coon>, 1 April 1990 + Ty Coon, President of Vice + +That's all there is to it! diff --git a/ChangeLog b/ChangeLog new file mode 100644 index 0000000..2f19064 --- /dev/null +++ b/ChangeLog @@ -0,0 +1,1068 @@ +2005-05-23 multipath-tools-0.4.5 + + * [libmultipath] default_prio and prio_callout keyword can be + explicitly set to "none". Suggested by Kiyoshi Ueda, NEC + * [path_prio] don't exit pp_balance_units with error when + find_controler() is not successful. 
It just means no other + path is currently active on this controler. + * [path_prio] move balance_units in its own dir + * [multipathd] proactively fail_path upon checker up->down + transitions. Suggested by Edward Goggin, EMC + * [libmultipath] .priority is clearly an int, not an unsigned + int. /bin/false is now personna non grata as a prio callout. + Kiyoshi Ueda, NEC + * [libmultipath] callout.c argv parsing fix. Kiyoshi Ueda, + NEC + * [multipathd] check return codes in init_paths(), split out + init_event(). + * [libmultipath] add find_slot(vec, addr) to vector lib. + * [multipath] remove signal sending + * [multipathd] use uevent to do paths list housekeeping for + checkers. Remove signal handling. + * [libmultipath] add uevent.[ch] + +2005-04-23 multipath-tools-0.4.4 + + * [path_prio] clarify pp_alua licensing. Stefan Bader, IBM. + * [devmap_name] add a target_type filter (suggested by Hannes) + and set DM task device by major:minor rather than parsing + the full map list. + * [libmultipath] propagate an error on getprio callout + failures, so that multipath can mark the map as immutable. + Reported by Lars Marowsky-Brée, Suse. + * [libmultipath] move push_callout() from dict.c to config.c + Use it in store_hwe() to get in multipathd's ramfs the + callout defined in hwtable.c when no config file is used. + Reported by Lars Marowsky-Brée, Suse. + * [checkers] zero sense buffers before use in EMC checker. + Lars Marowsky-Brée, Suse. + * [all] pre-release bugfixing effort from Alasdair, Hannes, + Lars, Benjamin Marzinski + * [multipathd] set oom_adj to -17 when kernel permits. + Immune to OOM killer ? agk says : watch out for mem + leaks :/ + * [multipathd] safety nets from udevd : exit early if + not root, chdir to / to avoid pining a mount. + * [multipathd] multipathd could loose events coming from + sighup or DM waitev. Add a pending_event counter to + track that. + * [path_prio] add pp_emc from Lars M Bree, Suse. 
+ * [path_prio] add pp_alua from Stefan Bader, IBM. + * [libmultipath] add config.c:load_config(), which sucks + a big chunk of code out of multipath/main.c. + * [libmultipath] don't allocate memory in : + * devmapper.c:dm_get_map() + * devmapper.c:dm_get_status() + * [libmultipath] devinfo() a la carte fetching + * [libmultipath] merge keepalived memory audit framework + (Thanks again, M. Cassen). Already found and fixed a + couple of leaks. + * [libmultipath] flatten/optimize dm_map_present() and + dm_mapname(). Inspired by Alasdair Kergon, RedHat. + * [kpartx] dm_map_name() free before use bugfix. Kiyoshi + Ueda, NEC + * [kpartx] add hotplug mode. To enable name the binary + "kpartx.dev". Kiyoshi Ueda, NEC + * [multipathd] don't loose events in event waiter thread. + Suggested and prototyped by Edward Goggin, EMC + * [libmultipath] add return values to vector manipulation + fonctions. Mem alloc return code audit. + * [libmultipath] Use "config->udev_dir + path->dev" as + a path devnode to open instead of mknod'ing for each + one. Fix some DoS issues regarding usage of /tmp in + libmultipath/discovery.c:opennode(). Kill unlinknode() + * [multipathd] merged the redhat init script and stop + installing one on "make install" + * [libmultipath] fold safe_sprintf.h into util.h + * [libmultipath] move blacklist to a real regex matcher + Example config files updated : check yours !! + * [multipath] fix path group compare logic to not stop + comparing at first path in each PG. + * [multipathd] check if pidfile is a dead remnent of a + crashed daemon. If so, overwrite it. Suggested by + Alasdair Kergon, RedHat. Code heavily based on work + by Andrew Tridgell, Samba Fame. + * [build] dropped libdevmapper/ and libsysfs/ from the + package. klibc build is now broken until distributors + provide klibc compiled static libraries in their + respective packages. + * [libmultipath] dm_task_no_open_count() before each DM + ioctl. 
Not doing that is reported to cause deadlocks + in kernel-space. Reported by Edward Goggin, EMC, fix + suggested by Alasdair Kergon, RedHat + Note minimal libdevmapper version bumped up to 1.01. + * [multipath] switched to condlog(). "make DEBUG=N" is + deprecated. Debug is spat with "-v3" or more. + * [multipathd] "multipathd -vN" cmdline arg to control + daemon verbosity. 0 < N < 4. "make LOGLEVEL=N" is + deprecated. + * [libmultipath] provide a common condlog() primitive to + switch lib messages to syslog or stdout depending on + who uses the lib (daemon or tool). + * [kpartx] give kpartx a private, slim copy of devmap.[ch] + * [multipath] allow wwid in blacklist{} config section. + Kiyoshi Ueda, NEC. + * [multipathd] set mode value before use (S_IRWXU). Fixes + RedHat Bugzilla Bug 150665. + * [all] add ->fd to "struct path *". remove fd from all + checker context declaration. remove lots of duplicate + logic. Now a fd is opened only once for path. It should + also bring a bit safety in contended memory scenarii + * [libcheckers] remove redundant sg_include.h + * [libmultipath] merge multipath:dict.[ch] with + multipathd:dict.[ch] into libmultipath/. move config.h + there too, add some helper functions to alloc/free + "struct config *" in a new config.c. Start using a + config in the daemon. + * [libmultipath] move dm_geteventnr(), dm_get_maps() and + dm_switchgroup() in devmapper.[ch] + * [libmultipath] move path discovery logic in + libmultipath. merge devinfo.[ch] and sysfs_devinfo.[ch] + into discovery.[ch] + * [libmultipath] move config.h in libmultipath. Move + find_[mp|hw]e in a new config.c. Move "struct hwtable" + declaration in config.h. Move propsel.[ch] in the + lib too. + * [multipathd] use libmultipath:dm_type() instead of + duplacate and bogus devmap discovery code. + * [multipathd] asynchronous & non-blocking logger + thread. Implementation split into a generic log + lib and a pthread wrapper for locking and events. 
+ An ipc wrapper could be easily created by + interested parties. + * [multipath] add "multipath -l -v2 [devname|devt]" + support in complement to [mapname|wwid] + * [kpartx] suppress loop.h private copy. Should fix + the reported build problems + * [multipath] do sysfs_get_mnt_path() only one time + and store in a global var. + * [multipath] further path discovery optimization + * [multipath] purge superfluous includes in main.c + * [libmultipath] introduce a cache file. expiry set + to 5 seconds to cover the hotplug event storms. + * [multipath] split get_pathvec_sysfs(). Introduce + get_refwwid() and filter_pathvec() + +2005-03-19 multipath-tools-0.4.3 + + * [libmultipath] rename find_[mp|hw] to find_[mp|hw]e and + introduce a real find_mp(). + * [priority] provision for recursive compilation of prio + subdirs, in preparation of merging more significant + prioritizers. Stephan Bader, IBM + * [libmultipath] add a netapp controller to the hwtable + * [libmultipath] blacklist() not to discard sda[0-9]* + when sda is blacklisted + * [multipath] add a rr_min_io keyword in config file. + Suggested by Igor Feoktistov, NetApp + * [multipath] stop trying to avoid running in parallel + * [multipath] bump up params size to 1024 + * [multipathd] put prio callouts in to ramfs. Stephan + Bader, IBM + * [multipath] simplify multibus pgpolicy : no need to + copy mp->paths into mp->pg->paths then free source : + just copy the ptr and set source to NULL. + * [multipath] sort PG by descending prio value in + group_by_prio. Stephan Bader, IBM + * [multipath] fix a bug in group_by_prio that led to + creation of multiple PG for a single prio value + * [multipath] don't store multipaths in a vector anymore : + free the "struct multipath" after usage. + * [multipath] multiple optimizations in the exec plan + * [multipath] allow "multipath -l -v2 [mapname|wwid]" + * [build] rip off klibc and move to klcc, at last. + Good job hpa. 
multipath-tools now depends on klibc + > 1.0 to build with BUILD=klibc flag. + * [multipath] never reload a map if no path is up in the + computed new map + * [multipath] don't flush maps with open count != 0 + * [libmultipath] add "int *dm_get_opencount(char *map)" + to devmapper.c + * [multipath] plug leaks and optimize devinfo.c. From + Edward Goggin, EMC + * [multipath] fix the multipath.dev hotplug script to not + do kpartx stuff in the same run as multipath stuff. + Igor Feoktistov, NetApp, noted the devmap symlink was + not yet present for kpartx to use. + * [devmap_name] accept major:minor syntax + * [libmultipath] add "char *dm_mapname(int maj, int min)", + needed to fail paths from checker thread + * [libmultipath] move dm_reinstate() in the lib, and add + dm_fail_path() + * [multipathd] mark failed path as failed upon DM + event. This should fix the design bug noticed by + Ramesh Caushik, Intel, where the daemon didn't run + multipath when a path went down and up in between 2 + checks. + * [libmultipath] allow NULL as a pathvec in disassemble_map + as is passed only for memory optimization + * [libmultipath] add structs.c to store alloc_*() and + free_*() + * [libmultipath] move dmparser.[ch] to the lib. + remove devinfo.[ch] dependency. + * [build] fix compilation glitch with BUILD=klibc, + flags to force use of local libs, remove the link + dependency in klibc, try to guess kernel sources + and build dirs. Stefan Bader, IBM + * [libmultipath] find_hw matching logic to take str + lengths in account. Noticed by Ramesh Caushik, Intel + * [multipath] select_action matching logic to take str + length in account. + * [multipath] lookup mp alias name earlier (in coalesce) + Edward Goggin, EMC, noticed we tried to use it before + it was set up. + +2005-01-23 multipath-tools-0.4.2 + + * [libmultipath] add symmetrix controller family to the + hwtable. Edward Goggin, EMC + * [libmultipath] factorize core structs (path, ...) 
+ and defaults (pidfile, configfile, ...). Convert + callers. + * [multipath] fix dmparser to properly fetch non-default + hwhandler. Edward Goggin, EMC + * [multipath] fix devt2devname matching 8:1 with 8:10 + for example. Edward Goggin, EMC + * [multipath] switch_pg upon devmap create or reload + Noticed by Ake. + * [libmultipath] move find_hw() to the library. Convert + users. Now multipathd understands '*' as a product + string + * [multipath] disassemble_map() fix to avoid + interpreting 'D' as a disable PG (not 'F'). Edward + Goggin, EMC + * [multipath] find_path() fix to avoid matching 8:1 + with 8:10 for example. Edward Goggin, EMC + * [libmultipath] move some sysfs fetching routines + to library, under sysfs_devinfo.[ch]. Convert + callers. + * [multipath] fix -v0 : avoids the daemon waiting + for the initial multipath run to complete, which + will never happen because of a flooded pipe + * [multipathd] add scsi_id to default binvec + * [libmultipath] move hwtable related logic to the + library. Convert multipath and multipathd + * [multipath] move first blacklist call down after + setup_default_blist() + * [libmultipath] move basename() to the lib. Convert + multipath and multipathd. + * [libmultipath] move blacklist related logic to the + library. Convert multipath and multipathd + * [multipath] fix bug in the default hardware table + matching logic (Lars M-B, Suse) + * [multipath] allow "*" as scsi model string wildcard + (Lars M-B, Suse) + * [multipath] provide a macro to fill all hwe fields, + use it to declare Clariion models (Lars M-B, Suse) + * [multipath] use DEFAULT_GETUID instead of hardcoded + *and* incorrect "/bin/scsi_id -g -s" (Lars M-B, Suse) + * [multipath] kill superfluous suspend before table + reload. 
The code was unsafe, as spotted by Edward + Goggin (EMC) + * [multipath] exit early if device parameter is + blacklisted + * [multipath] don't check for prefix in initrd's + multipath.dev : this is the tool's responsibility to + exit early based on its blacklist. + * [multipath] don't signal the daemon in initrd + (Guido Guenther, Debian tracker) + * [multipath] better fail to run kpartx in initrd + than crashing the whole system. So don't sleep + waiting for udev to create the DM node. Maybe udev + has made progress in this regard ... (noticed by + Paul Wagland, Debian tracker) + * [multipath] don't reinstate when listing, ie list + implies dry_run + * [checkers] fix the emc checker (Hergen Lange) + * [multipath] node_name fetching shouldn't exit on + error. FC SAN are not the only multipathed context + (noticed by Ramesh Caushik) + +2004-12-20 multipath-tools-0.4.1 + + * [multipath] bump SERIAL_SIZE to 19 + * [multipath] add a new group_by_node_name pgpolicy + * [multipath] move getopt policy parser to + get_policy_id() + * [multipath] remove get_evpd_wwid() + * [checkers] fix the wwn_set test in emc checker + (Hergen Lange) + * [checkers] treat the emc checker in the name to + index translator function (Hergen Lange) + * [multipath] print to stderr DM unmet requirement + (Guido Guenther) + * [multipath] fix realloc return value store not + propagated to caller by merge_word() (Nicola Ranaldo) + +2004-12-10 multipath-tools-0.4.0 + + * [checkers] forgot to return back to caller the newly + allocated context. Led to fd leak notably. 
+ * [checkers] heavy check logic fix + * [checkers] really malloc the checker context size, + not the pointer size (stupidity may kill) + * [multipathd] check more sysfs calls return values + * [multipathd] search for sysfs mount point only once, + not on each updatepaths() iteration + * [multipathd] plug (char *) leak in the daemon + * [multipath] change pgcmp logic : we want to reload a + map only if a path is in computed map but not in cur + map (ie accept to load a map if it brings more paths) + * [multipath] undust coalesce_paths() + * [multipath] don't print unchanged multipath + * [multipath] store the action to take in the multipath + struct + * [multipath] print mp size with kB, MB, GB or TB units + * [multipath] compilation fix for merge_words() (Andy) + * [multipath] don't feed the kernel DM maps with paths of + different sizes : DM fails and we end up with an empty + map ... not fun + * [multipath] cover a new corner case : path listed in + status string but disappeared from sysfs + * [multipath] remove the "-D" command line flag : now + we can pass major:minor directly as last argument, like + device names or device map names. Update multipathd + accordingly. + * [multipath] try reinstate again paths after a switchpg + * [multipath] reinstate condition change : + +2004-12-05 multipath-tools-0.3.9 + + * [multipath] add a "-l" flag to list the current + multipath maps and their status info + * [priority] zalloc controller to avoid random path_count + at allocation time + * [multipath] add configlet pointers in struct multipath + to avoid searching for an entry over and over again + * [multipath] new reinstate policy : on multipath exec, + reinstate all failed paths the checkers report as ready + if they belong to enabled path groups (not disabled, not + active path group) + * [multipath] fork a print_mp() out of print_all_mp() + * [multipath] introduce PG priority, which is the sum of + its path priorities. 
Set first_pg in the map string to + the highest prio PG index. + * [multipath] assemble maps scanning PG top down now that + PG vector is unsorted + * [multipath] move select_*() to propsel.c + * [multipath] move devinfo() to devinfo.c + * [multipath] move h/b/t/l fetching to sysfs_devinfo() + * [multipath] move devt2devname() to devinfo.c so we can + use it from dmparser.c too + * [multipath] introduce select_alias() and clarify a bit + of code + * [multipath] don't sort PG anymore. We want the map as + static as possible. + * [multipath] fix a segfault in apply_format() triggered + when no config file found. + * [multipath] kill unused vars all over the place + * [multipath] add a struct pathgroup in struct multipath + Store the pathvec in it. We now have a place to store + PG status, etc ... + * [multipath] new dmparser.c, with disassemble_map(), + disassemble_status() + * [multipath] suppress *selector_args keywords. Merge + in the selector string. Update config file templates. + +2004-11-26 multipath-tools-0.3.8 + + * [priority] teach multipath to read callout keywords + formatted as /sbin/scsi_id -g -u -s /block/%n + Apply one substitution out of : + * %n : blockdev basename (ie sdb) + * %d : blockdev major:minor string (ie 8:16) + update sample config files + * [priority] fix find_controler(). Now works, verified + on IBM T200 at OSDL (thanks again, Dave). Add to the + main build process + * [multipath] add a controller specific "prio_callout" + keyword. Noticed by Ake + * [multipath] normalize the debug output + * [multipath] add select_getuid(). De-spaghetti + devinfo() thanks to that helper. + * [libmultipath] add VECTOR_LAST_SLOT macro. + multipath/dict.h now use it heavily. 
+ * [multipath] policies selectors speedup and cleanup + (pgpolicy, features, hwhandler, selector) + * [multipath] new "flush" command flag + * [libmultipath] add dm_type() and dm_flush_maps() + * [multipath] move dm_get_map() to libmultipath + * [multipath] rename iopolicy to pgpolicy everywhere. + Dual terminology was getting confusing. + * [multipath] assemble_map() to always set next_pg to 1 + for now. + * [multipath] update config file to show new keywords. + Add an IBM array tested at OSDL. + * [multipath] fork select_iopolicy() from setup_map() + * [multipath] introduce select_features() and + select_hwhandler(). Should merge select_* one day ... + * [multipath] add features and hardware_handler keywords + and use them in the map setup + * [build] make clean really clean. Noticed by Dave Olien, + OSDL + * [multipath] group_by_serial bugfix + * [multipath] dm_addmap() return value fix. Now multipath + really creates the maps + * [multipath] try dm_log_init_verbose() instead of dup() + + close() to silence libdevmapper (Ake at umu) + * [libcheckers] remove checkpath() wrapper, obsoleted by + the "fd in context" changes + * [multipathd] let pathcheckers allocate their context. + No more over or unneeded allocation. Suggested by Lars, + Suse + * [multipathd] store the pathcheckers fd in their context. + No more open / close on each check. Suggested by Lars, + Suse + +2004-11-05 multipath-tools-0.3.7 + + * [multipathd] fix off by one memory allocation (Hannes, + Suse) + * [multipathd] introduce a default callout handler that + just remembers to put the callout in ramfs, even if the + daemon has no direct use of them. multipath needs some + that were forgotten, so parse them and use that default + handler. 
+ * [libcheckers] emc_clariion checker update (Lars, Suse) + * [build] exit build process on failure (Lars, Suse) + * [kpartx] exit early if DM prereq not met + * [multipath] exit early if DM prereq not met + * [libmultipath] new dm_prereq() fn to check out if all DM + prerequisites are met + * [libmultipath] move callout.[ch] function in there. + multipath and multipathd impacted + * [libmultipath] move dm_* function in there. kpartx, + multipath are impacted + * [priority] pp_balance_lun should use DM_DEVICE_TABLE ioctl + instead of DM_DEVICE_STATUS to find out paths from the + primary path groups. + * [klibc] drop in "Stable" version 0.190 + * [build] add manpages for kpartx and multipathd (Patrick + Caulfield) + * [build] use system's sysfs for multipathd linking + * [build] make glibc the default build + * [build] "make BUILD=klibc" is enough, deprecate the + "make BUILD=klibc klibc" syntax + +2004-10-30 multipath-tools-0.3.6 + + * Patrick Caulfield took over debian packaging. Showing + evident expertise, his first wish is to see debian/ + disappear. :) So be it. + * [libmultipath] add a vector_foreach_slot macro. Still + needs an iterator but saves 1 line per loop occurrence and + tames this UPPERCASE MACROS bad taste. + * [multipathd] don't load sg anymore on multipathd startup + * [multipathd] change killall for kill `cat $PIDFILE` in + init script (Jaime Peñalba & Cesar Solera) + * [multipathd] the fork fallback was borked (just exiting) + noticed by Jaime Peñalba & Cesar Solera + * [multipathd] try without the FLOATING_STACKS flag. Does + it matter anyway ? + * [multipathd] merge clone_platform.h from LTP and cover + the hppa special case. 
+ * [multipath] since we will be able to create a devmap with + paths too small, don't rely anymore on the first path's + size blindly : verify the path is up, before assigning its + size to the multipath + * [priority] add a path priority fetcher to balance LU across + controllers based on the controller serial detection. Untested + but provides a good example of what can be done with the + priority framework. + * [priority] create subdir and drop a test pp_random + * [multipath] add dev_t reporting to print_path() to ease + devmap decoding by humans + * [multipath] change default path priority to 1 + * [multipath] add wits to the sort_by_prio policy, so that + sort_pathvec_by_prio() is now useless. Remove it. + * [multipath] invert sort_pg_by_summed_prio sort order : + highest prio leftmost + * [libmultipath] add vector_del_slot + * revert multipath.rules change : devmap_name still takes + "major minor" and not "major:minor" as argument + * Makefile refinement : you can now enter any tool directory + and build from here, deps are solved + +2004-10-26 multipath-tools-0.3.5 + + * [multipathd] fix broken test for path going up or shaky + that kept executing multipath when it shouldn't + * change multipath.dev to exit early when udev's DEVNAME is + a devmap (/dev/dm-*). This avoids a recursion case when + the kernel devmapper keeps removing a map after multipath + configures it. + * change multipath.rules to follow the new -D syntax + * [multipath] "-D major minor" syntax changed to + "-D major:minor" to match the sysfs attribute value. + This change removes a few translations in multipath and + multipathd. + * [multipath] fix segfault in test if conf->dev is a devmap + (the one forwarded by MikeAnd) + * SG_IO ioctl seems to work in lk 2.6.10+, so remove all sg + device knowledge and advertise (here) the new dependency. 
+ * [multipath] remove unused do_tur() + * [multipath] fix sort_pg_by_summed_prio(), and don't add up + failed path priority + +2004-10-26 multipath-tools-0.3.4 + + * [multipathd] exec multipath precisely : pass in the path + or the devmap to update. No more full reconfiguration, and + really use the reinstate feature of multipath. + * [multipathd] check all paths, not only failed ones. Path + checkers now trigger on state change (formerly triggered on + state == UP condition) + * [multipathd] incremental updatepaths() instead of scrap / + refresh all logic. + * [multipathd] path checkers now take *msg and *context + params. consensus w/ lmb at suse. tur.c modified as example + * [multipath] assemble maps in PG vector descending order to + fit the layered policies design + * [multipath] stop playing with strings in pgpolicies, as it + uses more memory and loses info for no gain + * [multipath] remove lk2.4 scsi ioctl scsi_type remnant + * [multipath] layered pgpolicies : (see pgpolicies.c) + * group_by_status + * group_by_serial | multibus | failover | group_by_prio + * sort_pg_by_summed_prio + thus remove duplicated failedpath logic in pgpolicies + * [libmultipath] add vector_insert_slot + * [checkers] framework for arbitrate checkers return values + * [multipathd] scrap yet another reinvented wheel in the + name of the LOG macro : learn the existence of setloglevel + and LOG_UPTO macro + * glibc make with "make BUILD=glibc", asked by lmb at suse + +2004-10-20 multipath-tools-0.3.3 + + * [checkers] add the emc_clariion path checker (lmb at Suse) + * [multipath] introduce safe_snprintf macro to complement the + safe_sprintf. 
Needed to cover the sizeof(pointer) cases + pointed by Dave Olien at OSDL + * [multipath] move to the common libchecker framework and + activate the selector + * [multipath] fix an iopolicy selector bug (initialized lun + iopolicy overrode controller-wide iopolicy) + * [multipathd] cleanly separate out the checker selector, as + done with iopolicy selector + * [multipathd] move out the checkers into a common libcheckers + * [multipath] fix the anti-parallel-exec logic : use a write + lease for the task. From Dave Olien at osdl. + * [multipath] fix reinstate : pass a devt, not a devname + +2004-10-16 multipath-tools-0.3.2 + + * [multipath] add path reinstate logic : + * if a path is given as multipath arg + * if the map containing that path already exists + * if this map is the same as the one that would be + created by this multipath run + * THEN reinstate the path + multipathd is thus unchanged, while now supporting + reinstate + * audit and ensafe all sprintf usage + * [multipath] fix the annoying \n after each dev_t in + params string reporting + * [multipath] print out devmaps in "-v2 -d" mode + * [kpartx] bump up the params string size (lmb at suse) + * [kpartx] replace sprintf by snprintf (lmb at suse) + * [kpartx] initialize some more vars (lmb at suse) + * [multipath] mp->pg == NULL safety net before calling + assemble_map() (for Andy who happened to hit the bug) + * [multipath] last rampant bug in map CREATE or UPDATE switch + logic due to the device alias feature + * [kpartx] zero "struct slice all" (lmb at suse) + +2004-10-11 multipath-tools-0.3.1 + + * [kpartx] move back to getopt, originally removed from the + original partx because of lack of implementation in klibc + * [kpartx] don't map extended partitions + * [kpartx] add a -p command flag to allow admin to force a + delimiting string between disk name and part number. When + specified always use it, when unspecified use 'p' as a delim + when last char of disk name is a digit, NUL otherwise. 
+ * [kpartx] clean up + * bump klibc to 0.182 + * one step further : use klibc MCONFIG for all klibc specific + FLAGS definitions, ie massive Makefile.inc cleanup + * follow the klibc compilation rules by appending its OPTFLAGS + to multipath-tools' CFLAGS. This corrects the segfaults seen + on i386 where klibc is built with regparm=3 and tools are not + * [multipathd] fall back to fork when clone not available + like in Debian Woody + * [kpartx] move .start and .size from uint to ulong (Ake) + * briefly document system-disk-on-multipath in the FAQ file + +2004-10-06 multipath-tools-0.3.0 + + * first cut at making scripts to create multipath-aware initrds + those scripts are tested on Debian SID, and must be copied into + /etc/mkinitrd/scripts. it works here. + * [multipath] verify presence of the /sys/block/... node before + reading sysfs attributes. Avoids libsysfs and scsi_id stderr + garbage + * [multipath] move down the stderr close (Ake Sandgren at umu.se) + * [multipath] don't care about 0-sized mp (Ake Sandgren at umu.se) + * [multipath] bump mp size field to ulong (Ake Sandgren at umu.se) + * [multipath] replace quiet/verbose flags by a verbosity one. + introduce a new verbosity level : 1 == print only devmap names + thus we can feed kpartx with that output + * [multipath] update man page to reflect the hotplug.d -> dev.d + transition and replace the obsolete group_by_tur policy by the + forgotten group_by_prio + * [multipath] provide a /etc/udev/rules.d/multipath.rules for + multipath devices naming. Cleaner than the previously suggested + rule addition in the main udev.rules + * [multipath] move out of hotplug.d to dev.d : kill synchronisation + problems between device node creation and multipath execution. + Incidentally the unfriendly $DEVPATH param becomes a friendly + $DEVNAME (simply /dev/sdb) + * [multipath] rework the iopolicies name-to-id & id-to-name + translations. 
kills the last compilation warning here too + * [kpartx] kill last compilation warnings + * bump klibc to 0.181 + * add the debian/ packaging dir (make deb) + * prototype __clone & __clone2 + +2004-09-24 multipath-tools-0.2.9 + + * [multipathd] finally tame the clone compilation glitch on IA64 + move from sys_clone to __clone / __clone2 + * [kpartx] rework from Stephan Bader, IBM : + * handle s390x arch + * endianness fixes + * push the partname string size to handle wwids + * quieten implicit cast warnings + * [multipath] add an 'alias' multipath keyword for friendlier device + names. This was "asked" by OSDL's CGL board of secret reviewers + * [multipath] last pass with JBOD and parallel SCSI support : + hard-code scsi_id as a fallback when disk strings don't match + any hwtable entry + * [multipath & multipathd] change the parser to not coalesce + consecutive spaces (Patrick Mansfield) + * [multipath] remove the [UN]: output prefix, so that stdout can be + easily fed to a tool like dmsetup + * [multipathd] DEBUG=3 logs more readable/useful + * [multipathd] add a multipath_tool config keyword + * [multipathd] move to execute_program() like multipath already did + * [multipath] don't print the "no path" msg in quiet mode + * [multipathd] include linux/unistd.h for _syscall2 + definition on RedHat systems. Remove superfluous + asm/unistd.h include + * [libsysfs] forked : last version uses mntent, which + klibc doesn't provide. That, plus the fact we use + only 1/3 of the lib, pushed me to freeze the version + and strip all unused stuff. + * [multipathd] prepare_namespace() cleanup : no more "multipath" + special casing since we push it to binvec vector, like the other + callouts detected in the config file. 
+ +2004-08-13 multipath-tools-0.2.8 + + * [multipathd] setsched prio to RT + * [multipath] massive include cleanup + * [multipath] add a "default_prio_callout" keyword + first user will be SPC-3 ALUA priority field fetcher + from IBM + * [multipath] reenable stdout on some error paths + * [build] split KERNEL_DIR into KERNEL_SOURCE & + KERNEL_BUILD as per 2.6 and SuSE convention + * [klibc] kill warnings due to awk parsing wrong locale in + arch/i386/MCONFIG + * [multipath] implement a generic group_by_prio pgpolicy + * [multipath] fix the broken failover pgpolicy + +2004-07-24 multipath-tools-0.2.7 + + * [multipath] args parser moved to getopt + <genanr@emsphone.com> + * [multipath] zero conf->hotplugdev at allocation + <genanr@emsphone.com> + * [multipath] clean up failed devmap creation attempt + * [libs] update to libdevmapper 1.00.19 + * [multipath] framework for claimed device skipping + still lacks a reliable way to know if the device is + claimed and by who (fs, swap, md, dm, ...). If you + think it is valid to let libdevmapper hit the wall, + please speak up and tell so. + * [multipath] shut down stderr when calling into libdm + * [multipath] reformat the verbose output + * [multipath] framework for path priority handling (ALUA) + * [multipath] kill all reference to group_by_tur + * [multipath] integrate path state logic into multibus & + failover pgpolicies. This obsoletes the group_by_tur one + which is now the same as multibus. + * [multipath] zalloc mp structs to avoid garbage in ->size + * bump version requisite for scsi_id to 0.6 to support the new + '-u' flag (s/ /_/ for proper JBOD device map naming) + * [multipath] correct the for(;;) limits to accept 1-slot + pathvecs + * [multipath] push WWID_SIZE to 64 char (scsi_id w/ JBODs) + * [multipath] add an exit_tool() wrapper fn for runfile unlink + * [multipath] add a "default_path_grouping_policy" keyword in the + "defaults" block. 
+ * [multipath] add a "default_getuid_callout" keyword in the + "defaults" block. Now multipath is going to work with JBODs + * [multipath] fix segfault when pathvec is empty after + get_pathvec() + * move to template based specfile to avoid regular version skew + +2004-07-16 multipath-tools-0.2.6 + + * [multipathd] implement the system-disk-on-SAN safety net + * [multipathd] add exit_daemon() wrapper function + * [multipathd] mlockall() all daemon threads + * [multipath] fix a bug in the mp_iopolicy_handler that kept + the iopolicy per LUN override from working + * [multipath] display the tur bit value in print_path + as requested by SUN + * try to open /$udev/reverse/$major:$minor before falling back + to mknod + * add "udev_dir" to the defaults block in the config file + * merge "daemon" & "device_maps" config blocks into a new + "defaults" block + * [multipath] properly comment the config file + * [multipath] generalize the dbg() macro usage + Makefile now has a DEBUG flag + * [multipath] move to callout based WWID fetching. Default to + scsi_id callout. I merged execute_program from udev for that + (so credit goes to GregKH) + * [multipath] get rid of "devnodes in /dev" assumption + ie move to "maj:min" device mapper target syntax + +2004-07-10 multipath-tools-0.2.5 + + * [multipathd] fix misbehaviour noted by <genanr@emsphone.com> + improper tar directive in Makefile on some systems + * [multipathd] fix bug noted by <genanr@emsphone.com> + get_devmaps fills a private vector and forgets to pass its + address to caller + * [multipath] extend EVPD 0x83 id fetching logic. 
+ Code borrowed from scsi_id (thanks goes to Patrick + Mansfield @IBM) and merged by Hannes Reinecke @SUSE + * [multipathd] fix regression noted by <genanr@emsphone.com> + (segfault when no config file) + +2004-06-20 multipath-tools-0.2.4 + + * [multipathd] break free from system's libsysfs for now + as it is not that common these days + * [multipath] introduce per LUN policies in the config + file : path_grouping_policy, path_selector and + path_selector_args are supported. + See updated sample config file. + * [multipath] move ->iopolicy to multipath struct (from + path struct) + * [multipath] fill the voids left in the config file with + defaults + * [multipath] group config & flags in a global struct * + * [multipath] fix segfault when no config file (was a + regression since hwtable vectorisation in 0.2.2) + * [multipath] default path selector override in config file + * [multipath] don't play with strings in pgpolicies, leave + that to a new assemble_map fn. policies now use vector + * [multipathd] compilation fix for gentoo (Franck Denis) + * [multipath] strcmp fix (Franck Denis) + +2004-06-14 multipath-tools-0.2.3 + + * [multipath] group_by_serial try to be smart with LUN + balancing across controllers (for STK / LSI) : + 1st multipath : 1st pg made of path through 1st controller + 2nd multipath : 1st pg made of path through 2nd controller + 3rd multipath : 1st pg made of path through 1st controller + ... + * [multipath] drop .pindex[] in struct multipath in favor + of a *paths vector : much cleaner + * [multipath] fix group_by_serial pgpolicy broken by + vectorisation in 0.2.2 + * add a StorageTek array in the sample multipath.conf + * [multipathd] strcmp fix from Franck Denis + * [multipathd] convert to vector api + * [multipathd] add a configfile option for path checking + interval. See sample configfile for syntax. + +2004-06-07 multipath-tools-0.2.2 + + * [multipath] leave out 2.4 compat code. Is there + interest anyway ? 
+ * [multipath] convert all_paths table to vector api. + Rename to pathvec. Get rid of max_devs + * [multipath] convert mp table to vector api + * convert blacklist to vector api + * 2.6.7-rc? adds _user annotations to scsi/sg.h, causing + compilation breakage. Add a "#define _user" in all + sg_include.h (and remove cruft) + * merge a real parser (from keepalived) courtesy of + Alexandre Cassen. Now multipath and multipathd share a + config file. This comes with a nice vector lib. + * devnode blacklist moved from hardcoded to config file + * Guy Coates noted -O2 CFLAGS lead to multipathd crashes + on IA64. Remove the needless optimisation for now. + +2004-06-05 multipath-tools-0.2.1 + + * [multipath] add a flag to inhibit the final SIGHUP to + multipathd. Needed to avoid recursion with the correction + below + * [multipathd] devmap event now triggers a multipath exec + in addition to the usual updatepaths() + * [multipathd] move checkers from sg_io on BLK onto CHR + readsector0 goes from read to sg_read + * [multipathd] rely on sysfs for failedpaths enum and no + longer on the device mapper + * [multipathd] convert get_lun_strings from ioctl to sysfs + so we can benefit from strings persistency for failed + paths + * [multipath] readconfig() to take only 8 chars from vendor + string (ake) + * [multipath] remove unnecessary and wrong getuid == NULL + check from devinfo() (ake) + * [multipathd] make readsector0 open path O_DIRECT + * [multipathd] sizeof(path) -> sizeof(struct path) (MikeC) + * [Makefile] don't try to install and uninstall libs + * [devmap_name] kill the wrong trailing '\n' + (Mike Christie) + * [kpartx] works with device nodes outside /dev + * [kpartx] correctly display the delimiter in partition + name outputs + +2004-05-17 multipath-tools-0.2.0 + + * change the default klibc by greg's : + corrects the segfaults reported by Ling Hwa Hing + +2004-05-16 multipath-tools-0.1.9 + + * break free from udev : package klibc and libsysfs + * add a spec file 
and a "make rpm" rule + * pensum on klibc changes needed : + * mmap.c & fork.c : invert includes + * make clean wipes .*.d + * auto create the linux symlink + * remove tools and specfiles (files and Makefile + targets) + +2004-05-15 multipath-tools-0.1.8 + + * Makefiles cleanup and factorisation + * Compilation fixes for non-ix86 archs, tested on x86_64 + * strip execs harder for a 10% size reduction + * blacklist /dev/fd* and /dev/loop* + * dmadm works with sysfs nodes with '!' (cciss for ex) + +2004-05-10 multipath-tools-0.1.7 + + * bugfixes from Andy <genanr@emsphone.com> : + * read the last line of the config file + * add an entry for the 3PARData storage ctlrs + * read the last char of vendor and model strings + +2004-04-25 multipath-tools-0.1.6 + + * add the dmadm WIP tool (read MD superblocks and create + corresponding devmaps when possible) + * plug fd leak in TUR path checker + +2004-03-25 multipath-tools-0.1.5 + + * kpartx to manage the nested bdevs as /dev/cciss/c0d0. + parts are named sysfs style : cciss!c0d0p* + * kpartx loop support + * kpartx do DM updates if part maps already present + * merge kpartx for partitioned multipath support + * add get_null_uid to getuid methods. assign it the "0" index + devices with this getuid are thus ignored by multipath. + warning : change /etc/multipath.conf (get_evpd_wwid == 1) + * mv all_scsi_ids out of the 2.6 code path, into the 2.4 one + * unlink runfile on malloc exit path + * update multipath manpage (MikeC) + +2004-03-17 multipath-tools-0.1.4 + + * multipath clean up + * split default hw table in hwtable.h + * split grouping policies in pgpolicies.c + * pass *mp to setup_map instead of mp[]+index + * ensure defhwtable is used if /etc/multipath.conf is buggy + * hwtable is not global anymore + * unlink the runfile in various error paths + +2004-03-13 multipath-tools-0.1.3 + + * multipath config tool now has a config file parser + (did I say I make the ugliest parsers ?) 
+ * example multipath.conf to put in /etc (manualy) + +2004-03-12 multipath-tools-0.1.2 + + * detach the per devmap waiter threads + * set the thread stack size to lower limits + (VSZ down to 4MB from 85 MB here) + +2004-03-06 multipath-tools-0.1.1 + + * include dlist.h in multipath main.c (PM Hahn) + * typo in hotplug script (PM Hahn) + * pass -9 opt to gzip for manpages (PM Hahn) + +2004-03-05 multipath-tools-0.1.0 + + * add the group_by_tur policy + * add the multipathd daemon for pathchecking & DM hot-reconfig + * multipath doesn't run twice + * massive cleanups, and code restructuring + * Avoid Kernel Bug when passing too small a buffer in do_inq() + * Sync with 2.6.3-udm4 target synthax (no more PG prio) + +2004-02-21 multipath-018 + + * From the Debian SID inclusion review (Philipp Matthias Hahn) + * use DESTDIR install prefix in the Makefile + * add man pages for devmap_name & multipath + * correct libsysfs.h includes + * fork the hotplug script in its own shell + * Sync with the kernel device mapper code as of 2.6.3-udm3 + ie. Remove the test interval parameter and its uses + * Remove superfluous scsi parameter passed from hotplug + * Add the man pages to the [un]install targets + +2004-02-17 multipath-017 + + * remove the restrictive -f flag. + Introduce a more generic "-m iopolicy" one. + * remove useless "int with_sysfs" in env struct + +2004-02-04 multipath-016 + + * add a GROUP_BY_SERIAL flag. This should be useful for + controlers that activate they spare paths on simple IO + submition with a penalty. The StorageWorks HW defaults to + this mode, even if the MULTIBUS mode is OK. + * remove unused sg_err.c + * big restructuring : split devinfo.c from main.c. 
Export : + * void basename (char *, char *); + * int get_serial (int, char *); + * int get_lun_strings (char *, char *, char *, char *); + * int get_evpd_wwid(char *, char *); + * long get_disk_size (char *); + * stop passing struct env as param + * add devmap_name proggy for udev to name devmaps as per their + internal DM name and not only by their sysfs enum name (dm-*) + The corresponding udev.rules line is : + KERNEL="dm-[0-9]*", PROGRAM="/sbin/devmap_name %M %m", \ + NAME="%k", SYMLINK="%c" + * remove make_dm_node fn & call. Rely on udev for this. + * don't rely on the linux symlink in the udev/klibc dir since + udev build doesn't use it anymore. This corrects build breakage + +2004-01-19 multipath-013 + + * update the DM target synthax to the 2.6.0-udm5 style + +2003-12-29 multipath-012 + + * check hotplug event refers to a block device; if not exit early + * refresh doc + * add the uninstall target in Makefile + +2003-12-22 multipath-010 + + * tweak the install target in Makefile + * stop passing fds as argument : this change enable a strict + segregation of ugly 2.4 code + * sysfs version of get_lun_strings() + * be careful about the return of get_unique_id() since errors + formerly caught up by if(open()) in the caller fn are now returned + by get_unique_id() + * send get_serial() in unused.c + * introduce dm-simplecmd for RESUME & SUSPEND requests + * split add_map() in setup_map() & dm-addmap() + * setup_map() correctly submits "SUSPEND-RELOAD-RESUME or CREATE" + sequences instead of the bogus "RELOAD or CREATE" + * don't print .sg_dev if equal to .dev (2.6) in print_path() + * since the kernel code handles defective paths, remove all + code to cope with them : + * move do_tur() to unused.c + * remove .state from path struct + * remove .state settings & conditionals + * add a cmdline switch to force maps to failover mode, + ie 1 path per priority group + * add default policies to the whitelist array (spread io == + MULTIBUS / io forced to 1 path == 
FAILOVER) + * move get_disk_size() call out of add_map() to coalesce() + * comment tricky coalesce() fn + * bogus unsused.c file renamed to unused.c + +2003-12-20 multipath-010 + + * big ChangeLog update + * start to give a little control over target params : + introduce cmdline arg -i to control polling interval + * cope with hotplug-style calling convention : + ie "multipath scsi $DEVPATH" ... to avoid messing with + online maps not concerned by an event + * example hotplug agent to drop in /etc/hotplug.d/scsi + * revert the run & resched patch : unless someone proves me + wrong, this was overdesigned + * move commented out functions in unused.c + * update multipath target params to "udm[23] style" + * mp target now supports nr_path == 1, so do we + * add gratuitous free() + * push version forward + +2003-12-15 multipath-009 + + * Make the HW-specific get_unique_id switch pretty + * Prepare to field-test by whitelisting all known fibre array, + try to fetch WWID from the standard EVPD 0x83 off 8 for everyone + * configure the multipath target with round-robin path selector and + conservative default for a start (udm1 style) : + yes it makes this release the firstreally useful one. + * temporarily disable map creation for single path device + due to current restrictive defaults in the kernel target. + Sistina should work it out. + * correct the strncmp logic in blacklist function. + * update the Makefiles to autodetect libgcc.a & gcc includes + "ulibc-style". Factorisation of udevdirs & others niceties + * drop a hint about absent /dev/sd? on failed open() + * implement a reschedule flag in /var/run. + Last thing the prog do before exit is check if a call to multipath + was done (but canceled by /var/run/multipath.run check) during its + execution. If so restart themain loop. + * implement a blacklist of sysfs bdev to not bother with for now + (hd,md, dm, sr, scd, ram, raw). + This avoid sending SG_IO to unappropiate devices. 
+ * Adds a /var/run/multipath.run handling to avoid simultaneous runs. + * Remove a commented-out "printf" + * drop a libdevmapper copy in extras/multipath; + maybe discussions w/Sistina folks will bring a better solution + in the future. + * drop a putchar usage in libdevmapper to compile cleanly with klibc + * drop another such usage of my own in main.c + * massage the Makefile to compile libdevmapper against klibc + * use "ld" to produce the binary rather than "gcc -static" + * stop being stupid w/ uneeded major, minor & dev in main.c:dm_mk_node() + * reverse to creating striped target for now because the multipath + target is more hairy than expected initialy + * push the version code to 009 to be in synch w/ udev + +2003-11-27 multipath-007 + + * removes sg_err.[ch] deps + * makes sure the core code play nice with klibc + * port the sysfs calls to dlist helpers + * links against udev's sysfs (need libsysfs.a & dlist.a) + * finally define DM_TARGET as "multipath" as Joe posted the code today + (not tested yet) + * push version forward (do you want it in sync with udev version?) + +2003-11-19 multipath-006 + + * merged in udev-006 tree + +2003-09-18 multipath-0.0.1 + + * multipath 0.0.1 released. + * Initial release. @@ -0,0 +1,63 @@ +1. How to set up System-on-multipath ? +====================================== + +prerequisite : udev and multipath-tools installed + +here are the steps on a Debian SID system : + +* add dm-mpath and dm-multipath to /etc/mkinitrd/modules +* copy $tools_dir/multipath/0[12]_* to /etc/mkinitrd/scripts +* define a friendly alias for your multipathed root disk + (in /etc/multipath.conf). 
Example : "system" +* enable busybox in /etc/mkinitrd/mkinitrd.conf and set ROOT + to any valid block-device (but not a /dev/dm-* one, due to + an mkintrd misbelief that all dm-* are managed by LVM2) +* run mkinitrd +* in /boot/grub/menu.lst, define the root= kernel parameter + Example : root=/dev/system1 +* modify /etc/fstab to reference /dev/system* partitions + +At reboot, you should see some info like : + +path /dev/sda : multipath system +... +gpt: 0 slices +dos: 5 slices +reduced size of partition #2 to 63 +Added system1 : 0 70685937 linear /dev/system 63 +Added system2 : 0 63 linear /dev/system 7068600 +Added system5 : 0 995967 linear /dev/system 70686063 +... + +2. How does it compare to commercial product XXX ? +================================================== + +Here are a few distinctive features : + +* you can mix HBA models, even different vendors, different speed ... +* you can mix storage controllers on your SAN, and access them all, applying + different path grouping policy +* completely event-driven model : no administration burden if you accept the + default behaviours +* extensible : you can plug your own policies if the available ones don't fill + your needs +* supports root FS on multipathed SAN +* free, open-source software + +3. LVM2 doesn't see my multipathed devices as PV, what's up ? +============================================================= + +By default, lvm2 does not consider device-mapper block devices (such as a +dm-crypt device) for use as physical volumes. 
+ +In order to use a dm-crypt device as an lvm2 pv, add this line to the +devices block in /etc/lvm/lvm.conf: + +types = [ "device-mapper", 16 ] + +If /etc/lvm/lvm.conf does not exist, you can create one based on your +current/default config like so: + +lvm dumpconfig > /etc/lvm/lvm.conf + +(tip from Christophe Saout) diff --git a/Makefile b/Makefile new file mode 100644 index 0000000..4919bfa --- /dev/null +++ b/Makefile @@ -0,0 +1,66 @@ +# Makefile +# +# Copyright (C) 2003 Christophe Varoqui, <christophe.varoqui@free.fr> + +BUILD = glibc + +# +# Try to supply the linux kernel headers. +# +ifeq ($(KRNLSRC),) +KRNLLIB = /lib/modules/$(shell uname -r) +ifeq ($(shell test -r $(KRNLLIB)/source && echo 1),1) +KRNLSRC = $(KRNLLIB)/source +KRNLOBJ = $(KRNLLIB)/build +else +KRNLSRC = $(KRNLLIB)/build +KRNLOBJ = $(KRNLLIB)/build +endif +endif +export KRNLSRC +export KRNLOBJ + +BUILDDIRS = libmultipath libcheckers path_priority \ + devmap_name multipath multipathd kpartx +ALLDIRS = $(shell find . 
-type d -maxdepth 1 -mindepth 1) + +VERSION = $(shell basename ${PWD} | cut -d'-' -f3) +INSTALLDIRS = devmap_name multipath multipathd kpartx path_priority + +all: recurse + +recurse: + @for dir in $(BUILDDIRS); do \ + $(MAKE) -C $$dir BUILD=$(BUILD) VERSION=$(VERSION) \ + KRNLSRC=$(KRNLSRC) KRNLOBJ=$(KRNLOBJ) || exit $?; \ + done + +recurse_clean: + @for dir in $(ALLDIRS); do\ + $(MAKE) -C $$dir clean || exit $?; \ + done + +recurse_install: + @for dir in $(INSTALLDIRS); do\ + $(MAKE) -C $$dir install || exit $?; \ + done + +recurse_uninstall: + @for dir in $(INSTALLDIRS); do\ + $(MAKE) -C $$dir uninstall || exit $?; \ + done + +clean: recurse_clean + rm -f multipath-tools.spec + rm -rf rpms + +install: recurse_install + +uninstall: recurse_uninstall + +release: + sed -e "s/__VERSION__/${VERSION}/" \ + multipath-tools.spec.in > multipath-tools.spec + +rpm: release + rpmbuild -bb multipath-tools.spec diff --git a/Makefile.inc b/Makefile.inc new file mode 100644 index 0000000..37a0847 --- /dev/null +++ b/Makefile.inc @@ -0,0 +1,34 @@ +# Makefile.inc +# +# Copyright (C) 2004 Christophe Varoqui, <christophe.varoqui@free.fr> + +# +# Allow to force some libraries to be used statically. (Uncomment one of the +# following lines or define the values when calling make.) +# +# WITH_LOCAL_LIBDM = 1 +# WITH_LOCAL_LIBSYSFS = 1 + +ifeq ($(TOPDIR),) + TOPDIR = .. 
+endif + +ifeq ($(strip $(BUILD)),klibc) + CC = klcc + klibcdir = /usr/lib/klibc + libdm = $(klibcdir)/lib/libdevmapper.a + libsysfs = $(klibcdir)/lib/libsysfs.a +endif + +prefix = +exec_prefix = $(prefix) +bindir = $(exec_prefix)/sbin +checkersdir = $(TOPDIR)/libcheckers +multipathdir = $(TOPDIR)/libmultipath +mandir = /usr/share/man/man8 + +GZIP = /bin/gzip -9 -c +STRIP = strip --strip-all -R .comment -R .note + +CHECKERSLIB = $(checkersdir)/libcheckers +MULTIPATHLIB = $(multipathdir)/libmultipath @@ -0,0 +1,52 @@ +Dependancies : +============== + +These libs have been dropped in the multipath tree : + +o libdevmapper : comes with device-mapper-XXXX.tar.gz + See www.sistina.com +o libsysfs : comes with sysutils or udev + See ftp.kernel.org/pub/linux/utils/kernel/hotplug/ +o klibc + See ftp.kernel.org/pub/linux/libs/klibc/ + +External : + +o Linux kernel 2.6.10-rc with udm2 patchset (or greater) + ftp://sources.redhat.com/pub/dm/ + +How it works : +============== + +Get a path list in sysfs. + +For each path, a wwid is retrieved by a callout program. +Only White Listed HW can retrieve this info. + +Coalesce the paths according to pluggable policies and store + the result in mp array. + +Feed the maps to the kernel device mapper. + +The naming of the corresponding block device is handeld +by udev with the help of the devmap_name proggy. It is +called by the following rule in /etc/udev/udev.rules : + +KERNEL="dm-[0-9]*", PROGRAM="/sbin/devmap_name %M %m", \ +NAME="%k", SYMLINK="%c" + +Notes : +======= + +o 2.4.21 version of DM does not like even segment size. + if you enconter pbs with this, upgrade DM. + +Credits : +========= + +o Heavy cut'n paste from sg_utils. Thanks goes to D. + Gilbert. +o Light cut'n paste from dmsetup. Thanks Joe Thornber. +o Greg KH for the nice sysfs API. 
+o The klibc guys (Starving Linux Artists :), espacially + for their nice Makefiles and specfile @@ -0,0 +1,3 @@ +Things to do + +o activate group dm mesg fn diff --git a/devmap_name/Makefile b/devmap_name/Makefile new file mode 100644 index 0000000..32edfea --- /dev/null +++ b/devmap_name/Makefile @@ -0,0 +1,45 @@ +# Makefile +# +# Copyright (C) 2003 Christophe Varoqui, <christophe.varoqui@free.fr> +BUILD = glibc + +include ../Makefile.inc + +OBJS = devmap_name.o +CFLAGS = -pipe -g -Wall -Wunused -Wstrict-prototypes + +ifeq ($(strip $(BUILD)),klibc) + OBJS += $(libdm) +else + LDFLAGS = -ldevmapper +endif + +EXEC = devmap_name + +all: $(BUILD) + +prepare: + rm -f core *.o *.gz + +glibc: prepare $(OBJS) + $(CC) $(OBJS) -o $(EXEC) $(LDFLAGS) + $(STRIP) $(EXEC) + $(GZIP) $(EXEC).8 > $(EXEC).8.gz + +klibc: prepare $(OBJS) + $(CC) -static -o $(EXEC) $(OBJS) + $(STRIP) $(EXEC) + $(GZIP) $(EXEC).8 > $(EXEC).8.gz + +install: + install -d $(DESTDIR)$(bindir) + install -m 755 $(EXEC) $(DESTDIR)$(bindir)/ + install -d $(DESTDIR)$(mandir) + install -m 644 $(EXEC).8.gz $(DESTDIR)$(mandir) + +uninstall: + rm $(DESTDIR)$(bindir)/$(EXEC) + rm $(DESTDIR)$(mandir)/$(EXEC).8.gz + +clean: + rm -f core *.o $(EXEC) *.gz diff --git a/devmap_name/devmap_name.8 b/devmap_name/devmap_name.8 new file mode 100644 index 0000000..f4f03c3 --- /dev/null +++ b/devmap_name/devmap_name.8 @@ -0,0 +1,30 @@ +.TH DEVMAP_NAME 8 "February 2004" "" "Linux Administrator's Manual" +.SH NAME +devmap_name \- Query device-mapper name +.SH SYNOPSIS +.BI devmap_name " major minor" +.SH DESCRIPTION +.B devmap_name +queries the device-mapper for the name for the device +specified by +.I major +and +.I minor +number. 
+.br +.B devmap_name +can be called from +.B udev +by the following rule in +.IR /etc/udev/udev.rules : +.sp +.nf +KERNEL="dm-[0-9]*", PROGRAM="/sbin/devmap_name %M %m", \\ + NAME="%k", SYMLINK="%c" +.fi +.SH "SEE ALSO" +.BR udev (8), +.BR dmsetup (8) +.SH AUTHORS +.B devmap_name +was developed by Christophe Varoqui, <christophe.varoqui@free.fr> and others. diff --git a/devmap_name/devmap_name.c b/devmap_name/devmap_name.c new file mode 100644 index 0000000..6a2124e --- /dev/null +++ b/devmap_name/devmap_name.c @@ -0,0 +1,85 @@ +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <ctype.h> +#include <unistd.h> +#include <linux/kdev_t.h> +#include <libdevmapper.h> + +static void usage(char * progname) { + fprintf(stderr, "usage : %s [-t target type] dev_t\n", progname); + fprintf(stderr, "where dev_t is either 'major minor' or 'major:minor'\n"); + exit(1); +} + +int dm_target_type(int major, int minor, char *type) +{ + struct dm_task *dmt; + void *next = NULL; + uint64_t start, length; + char *target_type = NULL; + char *params; + int r = 1; + + if (!(dmt = dm_task_create(DM_DEVICE_STATUS))) + return 1; + + if (!dm_task_set_major(dmt, major) || + !dm_task_set_minor(dmt, minor)) + goto bad; + + dm_task_no_open_count(dmt); + + if (!dm_task_run(dmt)) + goto bad; + + if (!type) + goto good; + + do { + next = dm_get_next_target(dmt, next, &start, &length, + &target_type, &params); + if (target_type && strcmp(target_type, type)) + goto bad; + } while (next); + +good: + printf("%s\n", dm_task_get_name(dmt)); + r = 0; +bad: + dm_task_destroy(dmt); + return r; +} + +int main(int argc, char **argv) +{ + int c; + int major, minor; + char *target_type = NULL; + + while ((c = getopt(argc, argv, "t:")) != -1) { + switch (c) { + case 't': + target_type = optarg; + break; + default: + usage(argv[0]); + return 1; + break; + } + } + + /* sanity check */ + if (optind == argc - 2) { + major = atoi(argv[argc - 2]); + minor = atoi(argv[argc - 1]); + } else if (optind != 
argc - 1 || + 2 != sscanf(argv[argc - 1], "%i:%i", &major, &minor)) + usage(argv[0]); + + if (dm_target_type(major, minor, target_type)) + return 1; + + return 0; +} + diff --git a/kpartx/ChangeLog b/kpartx/ChangeLog new file mode 100644 index 0000000..cd0c6c1 --- /dev/null +++ b/kpartx/ChangeLog @@ -0,0 +1,9 @@ +002: +* convert to kpartx name everywhere +* remove all HDGEO ioctl code +* now work with files by mapping loops on the fly +* merged and massage lopart.[ch] from lomount.[ch] + (due credit to original author here : hpa ?) +* added a fn find_loop_by_file in lopart.[ch] +001: +* Initial release diff --git a/kpartx/Makefile b/kpartx/Makefile new file mode 100644 index 0000000..44f327e --- /dev/null +++ b/kpartx/Makefile @@ -0,0 +1,52 @@ +# Makefile +# +# Copyright (C) 2003 Christophe Varoqui, <christophe.varoqui@free.fr> +# +BUILD=glibc + +include ../Makefile.inc + +CFLAGS = -pipe -g -Wall -Wunused -Wstrict-prototypes -I. + +ifeq ($(strip $(BUILD)),klibc) + OBJS = bsd.o dos.o kpartx.o solaris.o unixware.o gpt.o crc32.o \ + lopart.o xstrncpy.o devmapper.o \ + $(MULTIPATHLIB)-$(BUILD).a $(libdm) +else + LDFLAGS = -ldevmapper + OBJS = bsd.o dos.o kpartx.o solaris.o unixware.o \ + gpt.o crc32.o lopart.o xstrncpy.o devmapper.o +endif + +EXEC = kpartx + +all: $(BUILD) + +prepare: + rm -f core *.o *.gz + +glibc: prepare $(OBJS) + $(CC) $(OBJS) -o $(EXEC) $(LDFLAGS) + $(STRIP) $(EXEC) + $(GZIP) $(EXEC).8 > $(EXEC).8.gz + +klibc: prepare $(OBJS) + $(CC) -static -o $(EXEC) $(CRT0) $(OBJS) $(KLIBC) $(LIBGCC) + $(STRIP) $(EXEC) + $(GZIP) $(EXEC).8 > $(EXEC).8.gz + +$(MULTIPATHLIB)-$(BUILD).a: + make -C $(multipathdir) BUILD=$(BUILD) + +install: + install -d $(DESTDIR)$(bindir) + install -m 755 $(EXEC) $(DESTDIR)$(bindir) + install -d $(DESTDIR)$(mandir) + install -m 644 $(EXEC).8.gz $(DESTDIR)$(mandir) + +uninstall: + rm -f $(DESTDIR)$(bindir)/$(EXEC) + rm -f $(DESTDIR)$(mandir)/$(EXEC).8.gz + +clean: + rm -f core *.o $(EXEC) *.gz diff --git a/kpartx/README 
b/kpartx/README new file mode 100644 index 0000000..e0680b1 --- /dev/null +++ b/kpartx/README @@ -0,0 +1,9 @@ +This version of partx is intented to be build +static against klibc. + +It creates partitions as device maps. + +With due respect to the original authors, + +have fun, +cvaroqui diff --git a/kpartx/bsd.c b/kpartx/bsd.c new file mode 100644 index 0000000..3ae2dc4 --- /dev/null +++ b/kpartx/bsd.c @@ -0,0 +1,83 @@ +#include "kpartx.h" +#include <stdio.h> + +#define BSD_DISKMAGIC (0x82564557UL) /* The disk magic number */ +#define XBSD_MAXPARTITIONS 16 +#define BSD_FS_UNUSED 0 + +struct bsd_disklabel { + unsigned int d_magic; /* the magic number */ + short int d_type; /* drive type */ + short int d_subtype; /* controller/d_type specific */ + char d_typename[16]; /* type name, e.g. "eagle" */ + char d_packname[16]; /* pack identifier */ + unsigned int d_secsize; /* # of bytes per sector */ + unsigned int d_nsectors; /* # of data sectors per track */ + unsigned int d_ntracks; /* # of tracks per cylinder */ + unsigned int d_ncylinders; /* # of data cylinders per unit */ + unsigned int d_secpercyl; /* # of data sectors per cylinder */ + unsigned int d_secperunit; /* # of data sectors per unit */ + unsigned short d_sparespertrack;/* # of spare sectors per track */ + unsigned short d_sparespercyl; /* # of spare sectors per cylinder */ + unsigned int d_acylinders; /* # of alt. 
cylinders per unit */ + unsigned short d_rpm; /* rotational speed */ + unsigned short d_interleave; /* hardware sector interleave */ + unsigned short d_trackskew; /* sector 0 skew, per track */ + unsigned short d_cylskew; /* sector 0 skew, per cylinder */ + unsigned int d_headswitch; /* head switch time, usec */ + unsigned int d_trkseek; /* track-to-track seek, usec */ + unsigned int d_flags; /* generic flags */ + unsigned int d_drivedata[5]; /* drive-type specific information */ + unsigned int d_spare[5]; /* reserved for future use */ + unsigned int d_magic2; /* the magic number (again) */ + unsigned short d_checksum; /* xor of data incl. partitions */ + + /* filesystem and partition information: */ + unsigned short d_npartitions; /* number of partitions in following */ + unsigned int d_bbsize; /* size of boot area at sn0, bytes */ + unsigned int d_sbsize; /* max size of fs superblock, bytes */ + struct bsd_partition { /* the partition table */ + unsigned int p_size; /* number of sectors in partition */ + unsigned int p_offset; /* starting sector */ + unsigned int p_fsize; /* filesystem basic fragment size */ + unsigned char p_fstype; /* filesystem type, see below */ + unsigned char p_frag; /* filesystem fragments per block */ + unsigned short p_cpg; /* filesystem cylinders per group */ + } d_partitions[XBSD_MAXPARTITIONS];/* actually may be more */ +}; + +int +read_bsd_pt(int fd, struct slice all, struct slice *sp, int ns) { + struct bsd_disklabel *l; + struct bsd_partition *p; + unsigned int offset = all.start; + int max_partitions; + char *bp; + int n = 0; + + bp = getblock(fd, offset+1); /* 1 sector suffices */ + if (bp == NULL) + return -1; + + l = (struct bsd_disklabel *) bp; + if (l->d_magic != BSD_DISKMAGIC) + return -1; + + max_partitions = 16; + if (l->d_npartitions < max_partitions) + max_partitions = l->d_npartitions; + for (p = l->d_partitions; p - l->d_partitions < max_partitions; p++) { + if (p->p_fstype == BSD_FS_UNUSED) + /* nothing */; + else if 
(n < ns) { + sp[n].start = p->p_offset; + sp[n].size = p->p_size; + n++; + } else { + fprintf(stderr, + "bsd_partition: too many slices\n"); + break; + } + } + return n; +} diff --git a/kpartx/byteorder.h b/kpartx/byteorder.h new file mode 100644 index 0000000..0f6ade1 --- /dev/null +++ b/kpartx/byteorder.h @@ -0,0 +1,15 @@ +#ifndef BYTEORDER_H_INCLUDED +#define BYTEORDER_H_INCLUDED + +#if defined (__s390__) || defined (__s390x__) +#define le32_to_cpu(x) ( \ + (*(((unsigned char *) &(x)))) + \ + (*(((unsigned char *) &(x))+1) << 8) + \ + (*(((unsigned char *) &(x))+2) << 16) + \ + (*(((unsigned char *) &(x))+3) << 24) \ + ) +#else +#define le32_to_cpu(x) (x) +#endif + +#endif /* BYTEORDER_H_INCLUDED */ diff --git a/kpartx/crc32.c b/kpartx/crc32.c new file mode 100644 index 0000000..42d803d --- /dev/null +++ b/kpartx/crc32.c @@ -0,0 +1,393 @@ +/* + * crc32.c + * This code is in the public domain; copyright abandoned. + * Liability for non-performance of this code is limited to the amount + * you paid for it. Since it is distributed for free, your refund will + * be very very small. If it breaks, you get to keep both pieces. + */ + +#include "crc32.h" + +#if __GNUC__ >= 3 /* 2.x has "attribute", but only 3.0 has "pure */ +#define attribute(x) __attribute__(x) +#else +#define attribute(x) +#endif + +/* + * There are multiple 16-bit CRC polynomials in common use, but this is + * *the* standard CRC-32 polynomial, first popularized by Ethernet. + * x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x^1+x^0 + */ +#define CRCPOLY_LE 0xedb88320 +#define CRCPOLY_BE 0x04c11db7 + +/* How many bits at a time to use. Requires a table of 4<<CRC_xx_BITS bytes. */ +/* For less performance-sensitive, use 4 */ +#define CRC_LE_BITS 8 +#define CRC_BE_BITS 8 + +/* + * Little-endian CRC computation. Used with serial bit streams sent + * lsbit-first. Be sure to use cpu_to_le32() to append the computed CRC. 
+ */ +#if CRC_LE_BITS > 8 || CRC_LE_BITS < 1 || CRC_LE_BITS & CRC_LE_BITS-1 +# error CRC_LE_BITS must be a power of 2 between 1 and 8 +#endif + +#if CRC_LE_BITS == 1 +/* + * In fact, the table-based code will work in this case, but it can be + * simplified by inlining the table in ?: form. + */ +#define crc32init_le() +#define crc32cleanup_le() +/** + * crc32_le() - Calculate bitwise little-endian Ethernet AUTODIN II CRC32 + * @crc - seed value for computation. ~0 for Ethernet, sometimes 0 for + * other uses, or the previous crc32 value if computing incrementally. + * @p - pointer to buffer over which CRC is run + * @len - length of buffer @p + * + */ +uint32_t attribute((pure)) crc32_le(uint32_t crc, unsigned char const *p, size_t len) +{ + int i; + while (len--) { + crc ^= *p++; + for (i = 0; i < 8; i++) + crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0); + } + return crc; +} +#else /* Table-based approach */ + +static uint32_t *crc32table_le; +/** + * crc32init_le() - allocate and initialize LE table data + * + * crc is the crc of the byte i; other entries are filled in based on the + * fact that crctable[i^j] = crctable[i] ^ crctable[j]. + * + */ +static int +crc32init_le(void) +{ + unsigned i, j; + uint32_t crc = 1; + + crc32table_le = + malloc((1 << CRC_LE_BITS) * sizeof(uint32_t)); + if (!crc32table_le) + return 1; + crc32table_le[0] = 0; + + for (i = 1 << (CRC_LE_BITS - 1); i; i >>= 1) { + crc = (crc >> 1) ^ ((crc & 1) ? CRCPOLY_LE : 0); + for (j = 0; j < 1 << CRC_LE_BITS; j += 2 * i) + crc32table_le[i + j] = crc ^ crc32table_le[j]; + } + return 0; +} + +/** + * crc32cleanup_le(): free LE table data + */ +static void +crc32cleanup_le(void) +{ + if (crc32table_le) free(crc32table_le); + crc32table_le = NULL; +} + +/** + * crc32_le() - Calculate bitwise little-endian Ethernet AUTODIN II CRC32 + * @crc - seed value for computation. ~0 for Ethernet, sometimes 0 for + * other uses, or the previous crc32 value if computing incrementally. 
+ * @p - pointer to buffer over which CRC is run + * @len - length of buffer @p + * + */ +uint32_t attribute((pure)) crc32_le(uint32_t crc, unsigned char const *p, size_t len) +{ + while (len--) { +# if CRC_LE_BITS == 8 + crc = (crc >> 8) ^ crc32table_le[(crc ^ *p++) & 255]; +# elif CRC_LE_BITS == 4 + crc ^= *p++; + crc = (crc >> 4) ^ crc32table_le[crc & 15]; + crc = (crc >> 4) ^ crc32table_le[crc & 15]; +# elif CRC_LE_BITS == 2 + crc ^= *p++; + crc = (crc >> 2) ^ crc32table_le[crc & 3]; + crc = (crc >> 2) ^ crc32table_le[crc & 3]; + crc = (crc >> 2) ^ crc32table_le[crc & 3]; + crc = (crc >> 2) ^ crc32table_le[crc & 3]; +# endif + } + return crc; +} +#endif + +/* + * Big-endian CRC computation. Used with serial bit streams sent + * msbit-first. Be sure to use cpu_to_be32() to append the computed CRC. + */ +#if CRC_BE_BITS > 8 || CRC_BE_BITS < 1 || CRC_BE_BITS & CRC_BE_BITS-1 +# error CRC_BE_BITS must be a power of 2 between 1 and 8 +#endif + +#if CRC_BE_BITS == 1 +/* + * In fact, the table-based code will work in this case, but it can be + * simplified by inlining the table in ?: form. + */ +#define crc32init_be() +#define crc32cleanup_be() + +/** + * crc32_be() - Calculate bitwise big-endian Ethernet AUTODIN II CRC32 + * @crc - seed value for computation. ~0 for Ethernet, sometimes 0 for + * other uses, or the previous crc32 value if computing incrementally. + * @p - pointer to buffer over which CRC is run + * @len - length of buffer @p + * + */ +uint32_t attribute((pure)) crc32_be(uint32_t crc, unsigned char const *p, size_t len) +{ + int i; + while (len--) { + crc ^= *p++ << 24; + for (i = 0; i < 8; i++) + crc = + (crc << 1) ^ ((crc & 0x80000000) ? 
CRCPOLY_BE : + 0); + } + return crc; +} + +#else /* Table-based approach */ +static uint32_t *crc32table_be; + +/** + * crc32init_be() - allocate and initialize BE table data + */ +static int +crc32init_be(void) +{ + unsigned i, j; + uint32_t crc = 0x80000000; + + crc32table_be = + malloc((1 << CRC_BE_BITS) * sizeof(uint32_t)); + if (!crc32table_be) + return 1; + crc32table_be[0] = 0; + + for (i = 1; i < 1 << CRC_BE_BITS; i <<= 1) { + crc = (crc << 1) ^ ((crc & 0x80000000) ? CRCPOLY_BE : 0); + for (j = 0; j < i; j++) + crc32table_be[i + j] = crc ^ crc32table_be[j]; + } + return 0; +} + +/** + * crc32cleanup_be(): free BE table data + */ +static void +crc32cleanup_be(void) +{ + if (crc32table_be) free(crc32table_be); + crc32table_be = NULL; +} + + +/** + * crc32_be() - Calculate bitwise big-endian Ethernet AUTODIN II CRC32 + * @crc - seed value for computation. ~0 for Ethernet, sometimes 0 for + * other uses, or the previous crc32 value if computing incrementally. + * @p - pointer to buffer over which CRC is run + * @len - length of buffer @p + * + */ +uint32_t attribute((pure)) crc32_be(uint32_t crc, unsigned char const *p, size_t len) +{ + while (len--) { +# if CRC_BE_BITS == 8 + crc = (crc << 8) ^ crc32table_be[(crc >> 24) ^ *p++]; +# elif CRC_BE_BITS == 4 + crc ^= *p++ << 24; + crc = (crc << 4) ^ crc32table_be[crc >> 28]; + crc = (crc << 4) ^ crc32table_be[crc >> 28]; +# elif CRC_BE_BITS == 2 + crc ^= *p++ << 24; + crc = (crc << 2) ^ crc32table_be[crc >> 30]; + crc = (crc << 2) ^ crc32table_be[crc >> 30]; + crc = (crc << 2) ^ crc32table_be[crc >> 30]; + crc = (crc << 2) ^ crc32table_be[crc >> 30]; +# endif + } + return crc; +} +#endif + +/* + * A brief CRC tutorial. + * + * A CRC is a long-division remainder. You add the CRC to the message, + * and the whole thing (message+CRC) is a multiple of the given + * CRC polynomial. 
To check the CRC, you can either check that the + * CRC matches the recomputed value, *or* you can check that the + * remainder computed on the message+CRC is 0. This latter approach + * is used by a lot of hardware implementations, and is why so many + * protocols put the end-of-frame flag after the CRC. + * + * It's actually the same long division you learned in school, except that + * - We're working in binary, so the digits are only 0 and 1, and + * - When dividing polynomials, there are no carries. Rather than add and + * subtract, we just xor. Thus, we tend to get a bit sloppy about + * the difference between adding and subtracting. + * + * A 32-bit CRC polynomial is actually 33 bits long. But since it's + * 33 bits long, bit 32 is always going to be set, so usually the CRC + * is written in hex with the most significant bit omitted. (If you're + * familiar with the IEEE 754 floating-point format, it's the same idea.) + * + * Note that a CRC is computed over a string of *bits*, so you have + * to decide on the endianness of the bits within each byte. To get + * the best error-detecting properties, this should correspond to the + * order they're actually sent. For example, standard RS-232 serial is + * little-endian; the most significant bit (sometimes used for parity) + * is sent last. And when appending a CRC word to a message, you should + * do it in the right order, matching the endianness. + * + * Just like with ordinary division, the remainder is always smaller than + * the divisor (the CRC polynomial) you're dividing by. Each step of the + * division, you take one more digit (bit) of the dividend and append it + * to the current remainder. Then you figure out the appropriate multiple + * of the divisor to subtract to being the remainder back into range. + * In binary, it's easy - it has to be either 0 or 1, and to make the + * XOR cancel, it's just a copy of bit 32 of the remainder. 
+ * + * When computing a CRC, we don't care about the quotient, so we can + * throw the quotient bit away, but subtract the appropriate multiple of + * the polynomial from the remainder and we're back to where we started, + * ready to process the next bit. + * + * A big-endian CRC written this way would be coded like: + * for (i = 0; i < input_bits; i++) { + * multiple = remainder & 0x80000000 ? CRCPOLY : 0; + * remainder = (remainder << 1 | next_input_bit()) ^ multiple; + * } + * Notice how, to get at bit 32 of the shifted remainder, we look + * at bit 31 of the remainder *before* shifting it. + * + * But also notice how the next_input_bit() bits we're shifting into + * the remainder don't actually affect any decision-making until + * 32 bits later. Thus, the first 32 cycles of this are pretty boring. + * Also, to add the CRC to a message, we need a 32-bit-long hole for it at + * the end, so we have to add 32 extra cycles shifting in zeros at the + * end of every message, + * + * So the standard trick is to rearrage merging in the next_input_bit() + * until the moment it's needed. Then the first 32 cycles can be precomputed, + * and merging in the final 32 zero bits to make room for the CRC can be + * skipped entirely. + * This changes the code to: + * for (i = 0; i < input_bits; i++) { + * remainder ^= next_input_bit() << 31; + * multiple = (remainder & 0x80000000) ? CRCPOLY : 0; + * remainder = (remainder << 1) ^ multiple; + * } + * With this optimization, the little-endian code is simpler: + * for (i = 0; i < input_bits; i++) { + * remainder ^= next_input_bit(); + * multiple = (remainder & 1) ? CRCPOLY : 0; + * remainder = (remainder >> 1) ^ multiple; + * } + * + * Note that the other details of endianness have been hidden in CRCPOLY + * (which must be bit-reversed) and next_input_bit(). 
+ * + * However, as long as next_input_bit is returning the bits in a sensible + * order, we can actually do the merging 8 or more bits at a time rather + * than one bit at a time: + * for (i = 0; i < input_bytes; i++) { + * remainder ^= next_input_byte() << 24; + * for (j = 0; j < 8; j++) { + * multiple = (remainder & 0x80000000) ? CRCPOLY : 0; + * remainder = (remainder << 1) ^ multiple; + * } + * } + * Or in little-endian: + * for (i = 0; i < input_bytes; i++) { + * remainder ^= next_input_byte(); + * for (j = 0; j < 8; j++) { + * multiple = (remainder & 1) ? CRCPOLY : 0; + * remainder = (remainder >> 1) ^ multiple; + * } + * } + * If the input is a multiple of 32 bits, you can even XOR in a 32-bit + * word at a time and increase the inner loop count to 32. + * + * You can also mix and match the two loop styles, for example doing the + * bulk of a message byte-at-a-time and adding bit-at-a-time processing + * for any fractional bytes at the end. + * + * The only remaining optimization is to the byte-at-a-time table method. + * Here, rather than just shifting one bit of the remainder to decide + * on the correct multiple to subtract, we can shift a byte at a time. + * This produces a 40-bit (rather than a 33-bit) intermediate remainder, + * but again the multiple of the polynomial to subtract depends only on + * the high bits, the high 8 bits in this case. + * + * The multiple we need in that case is the low 32 bits of a 40-bit + * value whose high 8 bits are given, and which is a multiple of the + * generator polynomial. This is simply the CRC-32 of the given + * one-byte message. + * + * Two more details: normally, appending zero bits to a message which + * is already a multiple of a polynomial produces a larger multiple of that + * polynomial. To enable a CRC to detect this condition, it's common to + * invert the CRC before appending it. This makes the remainder of the + * message+crc come out not as zero, but some fixed non-zero value. 
+ * + * The same problem applies to zero bits prepended to the message, and + * a similar solution is used. Instead of starting with a remainder of + * 0, an initial remainder of all ones is used. As long as you start + * the same way on decoding, it doesn't make a difference. + */ + + +/** + * init_crc32(): generates CRC32 tables + * + * On successful initialization, use count is increased. + * This guarantees that the library functions will stay resident + * in memory, and prevents someone from 'rmmod crc32' while + * a driver that needs it is still loaded. + * This also greatly simplifies drivers, as there's no need + * to call an initialization/cleanup function from each driver. + * Since crc32.o is a library module, there's no requirement + * that the user can unload it. + */ +int +init_crc32(void) +{ + int rc1, rc2, rc; + rc1 = crc32init_le(); + rc2 = crc32init_be(); + rc = rc1 || rc2; + return rc; +} + +/** + * cleanup_crc32(): frees crc32 data when no longer needed + */ +void +cleanup_crc32(void) +{ + crc32cleanup_le(); + crc32cleanup_be(); +} diff --git a/kpartx/crc32.h b/kpartx/crc32.h new file mode 100644 index 0000000..a4505b8 --- /dev/null +++ b/kpartx/crc32.h @@ -0,0 +1,19 @@ +/* + * crc32.h + */ +#ifndef _CRC32_H +#define _CRC32_H + +#include <inttypes.h> +#include <stdlib.h> + +extern int init_crc32(void); +extern void cleanup_crc32(void); +extern uint32_t crc32_le(uint32_t crc, unsigned char const *p, size_t len); +extern uint32_t crc32_be(uint32_t crc, unsigned char const *p, size_t len); + +#define crc32(seed, data, length) crc32_le(seed, (unsigned char const *)data, length) +#define ether_crc_le(length, data) crc32_le(~0, data, length) +#define ether_crc(length, data) crc32_be(~0, data, length) + +#endif /* _CRC32_H */ diff --git a/kpartx/devmapper.c b/kpartx/devmapper.c new file mode 100644 index 0000000..9db8b93 --- /dev/null +++ b/kpartx/devmapper.c @@ -0,0 +1,140 @@ +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include 
<libdevmapper.h> +#include <ctype.h> +#include <linux/kdev_t.h> + +extern int +dm_prereq (char * str, int x, int y, int z) +{ + int r = 1; + struct dm_task *dmt; + struct dm_versions *target; + struct dm_versions *last_target; + + if (!(dmt = dm_task_create(DM_DEVICE_LIST_VERSIONS))) + return 1; + + dm_task_no_open_count(dmt); + + if (!dm_task_run(dmt)) + goto out; + + target = dm_task_get_versions(dmt); + + /* Fetch targets and print 'em */ + do { + last_target = target; + + if (!strncmp(str, target->name, strlen(str)) && + /* dummy prereq on multipath version */ + target->version[0] >= x && + target->version[1] >= y && + target->version[2] >= z + ) + r = 0; + + target = (void *) target + target->next; + } while (last_target != target); + + out: + dm_task_destroy(dmt); + return r; +} + +extern int +dm_simplecmd (int task, const char *name) { + int r = 0; + struct dm_task *dmt; + + if (!(dmt = dm_task_create(task))) + return 0; + + if (!dm_task_set_name(dmt, name)) + goto out; + + dm_task_no_open_count(dmt); + + r = dm_task_run(dmt); + + out: + dm_task_destroy(dmt); + return r; +} + +extern int +dm_addmap (int task, const char *name, const char *target, + const char *params, unsigned long size) { + int r = 0; + struct dm_task *dmt; + + if (!(dmt = dm_task_create (task))) + return 0; + + if (!dm_task_set_name (dmt, name)) + goto addout; + + if (!dm_task_add_target (dmt, 0, size, target, params)) + goto addout; + + dm_task_no_open_count(dmt); + + r = dm_task_run (dmt); + + addout: + dm_task_destroy (dmt); + return r; +} + +extern int +dm_map_present (char * str) +{ + int r = 0; + struct dm_task *dmt; + struct dm_info info; + + if (!(dmt = dm_task_create(DM_DEVICE_INFO))) + return 0; + + if (!dm_task_set_name(dmt, str)) + goto out; + + dm_task_no_open_count(dmt); + + if (!dm_task_run(dmt)) + goto out; + + if (!dm_task_get_info(dmt, &info)) + goto out; + + if (info.exists) + r = 1; +out: + dm_task_destroy(dmt); + return r; +} + + +const char * +dm_mapname(int major, 
int minor) +{ + struct dm_task *dmt; + const char *mapname; + + if (!(dmt = dm_task_create(DM_DEVICE_INFO))) + return NULL; + + dm_task_no_open_count(dmt); + dm_task_set_major(dmt, major); + dm_task_set_minor(dmt, minor); + + if (!dm_task_run(dmt)) + goto out; + + mapname = strdup(dm_task_get_name(dmt)); +out: + dm_task_destroy(dmt); + return mapname; +} + diff --git a/kpartx/devmapper.h b/kpartx/devmapper.h new file mode 100644 index 0000000..ffd243f --- /dev/null +++ b/kpartx/devmapper.h @@ -0,0 +1,5 @@ +int dm_prereq (char *, int, int, int); +int dm_simplecmd (int, const char *); +int dm_addmap (int, const char *, const char *, const char *, unsigned long); +int dm_map_present (char *); +const char * dm_mapname(int major, int minor); diff --git a/kpartx/dos.c b/kpartx/dos.c new file mode 100644 index 0000000..2f4e8a9 --- /dev/null +++ b/kpartx/dos.c @@ -0,0 +1,112 @@ +#include "kpartx.h" +#include "byteorder.h" +#include <stdio.h> +#include "dos.h" + +static int +is_extended(int type) { + return (type == 5 || type == 0xf || type == 0x85); +} + +static int +read_extended_partition(int fd, struct partition *ep, + struct slice *sp, int ns) +{ + struct partition *p; + unsigned long start, here; + unsigned char *bp; + int loopct = 0; + int moretodo = 1; + int i, n=0; + + here = start = le32_to_cpu(ep->start_sect); + + while (moretodo) { + moretodo = 0; + if (++loopct > 100) + return n; + + bp = getblock(fd, here); + if (bp == NULL) + return n; + + if (bp[510] != 0x55 || bp[511] != 0xaa) + return n; + + p = (struct partition *) (bp + 0x1be); + + for (i=0; i<2; i++, p++) { + if (p->nr_sects == 0 || is_extended(p->sys_type)) + continue; + if (n < ns) { + sp[n].start = here + le32_to_cpu(p->start_sect); + sp[n].size = le32_to_cpu(p->nr_sects); + n++; + } else { + fprintf(stderr, + "dos_extd_partition: too many slices\n"); + return n; + } + loopct = 0; + } + + p -= 2; + for (i=0; i<2; i++, p++) { + if(p->nr_sects != 0 && is_extended(p->sys_type)) { + here = start + 
le32_to_cpu(p->start_sect); + moretodo = 1; + break; + } + } + } + return n; +} + +static int +is_gpt(int type) { + return (type == 0xEE); +} + +int +read_dos_pt(int fd, struct slice all, struct slice *sp, int ns) { + struct partition *p; + unsigned long offset = all.start; + int i, n=0; + unsigned char *bp; + + bp = getblock(fd, offset); + if (bp == NULL) + return -1; + + if (bp[510] != 0x55 || bp[511] != 0xaa) + return -1; + + p = (struct partition *) (bp + 0x1be); + for (i=0; i<4; i++) { + if (is_gpt(p->sys_type)) { + return 0; + } + p++; + } + p = (struct partition *) (bp + 0x1be); + for (i=0; i<4; i++) { + /* always add, even if zero length */ + if (n < ns) { + sp[n].start = le32_to_cpu(p->start_sect); + sp[n].size = le32_to_cpu(p->nr_sects); + n++; + } else { + fprintf(stderr, + "dos_partition: too many slices\n"); + break; + } + p++; + } + p = (struct partition *) (bp + 0x1be); + for (i=0; i<4; i++) { + if (is_extended(p->sys_type)) + n += read_extended_partition(fd, p, sp+n, ns-n); + p++; + } + return n; +} diff --git a/kpartx/dos.h b/kpartx/dos.h new file mode 100644 index 0000000..f45e7f6 --- /dev/null +++ b/kpartx/dos.h @@ -0,0 +1,13 @@ +#ifndef DOS_H_INCLUDED +#define DOS_H_INCLUDED + +struct partition { + unsigned char boot_ind; /* 0x80 - active */ + unsigned char bh, bs, bc; + unsigned char sys_type; + unsigned char eh, es, ec; + unsigned int start_sect; + unsigned int nr_sects; +} __attribute__((packed)); + +#endif /* DOS_H_INCLUDED */ diff --git a/kpartx/efi.h b/kpartx/efi.h new file mode 100644 index 0000000..1cbd961 --- /dev/null +++ b/kpartx/efi.h @@ -0,0 +1,58 @@ +/* + efi.[ch] - Manipulates EFI variables as exported in /proc/efi/vars + + Copyright (C) 2001 Dell Computer Corporation <Matt_Domsch@dell.com> + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later 
version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#ifndef EFI_H +#define EFI_H + +/* + * Extensible Firmware Interface + * Based on 'Extensible Firmware Interface Specification' + * version 1.02, 12 December, 2000 + */ +#include <stdint.h> +#include <string.h> + +typedef struct { + uint8_t b[16]; +} efi_guid_t; + +#define EFI_GUID(a,b,c,d0,d1,d2,d3,d4,d5,d6,d7) \ +((efi_guid_t) \ +{{ (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \ + (b) & 0xff, ((b) >> 8) & 0xff, \ + (c) & 0xff, ((c) >> 8) & 0xff, \ + (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }}) + + +/****************************************************** + * GUIDs + ******************************************************/ +#define NULL_GUID \ +EFI_GUID( 0x00000000, 0x0000, 0x0000, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00) + +static inline int +efi_guidcmp(efi_guid_t left, efi_guid_t right) +{ + return memcmp(&left, &right, sizeof (efi_guid_t)); +} + +typedef uint16_t efi_char16_t; /* UNICODE character */ + +#endif /* EFI_H */ diff --git a/kpartx/gpt.c b/kpartx/gpt.c new file mode 100644 index 0000000..dc846ca --- /dev/null +++ b/kpartx/gpt.c @@ -0,0 +1,638 @@ +/* + gpt.[ch] + + Copyright (C) 2000-2001 Dell Computer Corporation <Matt_Domsch@dell.com> + + EFI GUID Partition Table handling + Per Intel EFI Specification v1.02 + http://developer.intel.com/technology/efi/efi.htm + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the 
License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*/ + +#define _FILE_OFFSET_BITS 64 + +#include "gpt.h" +#include <stdio.h> +#include <string.h> +#include <stdlib.h> +#include <inttypes.h> +#include <sys/stat.h> +#include <sys/ioctl.h> +#include <fcntl.h> +#include <unistd.h> +#include <errno.h> +#include <endian.h> +#include <byteswap.h> +#include "crc32.h" + +#if BYTE_ORDER == LITTLE_ENDIAN +# define __le16_to_cpu(x) (x) +# define __le32_to_cpu(x) (x) +# define __le64_to_cpu(x) (x) +# define __cpu_to_le32(x) (x) +#elif BYTE_ORDER == BIG_ENDIAN +# define __le16_to_cpu(x) bswap_16(x) +# define __le32_to_cpu(x) bswap_32(x) +# define __le64_to_cpu(x) bswap_64(x) +# define __cpu_to_le32(x) bswap_32(x) +#endif + +#define BLKGETLASTSECT _IO(0x12,108) /* get last sector of block device */ +#define BLKGETSIZE _IO(0x12,96) /* return device size */ +#define BLKSSZGET _IO(0x12,104) /* get block device sector size */ +#define BLKGETSIZE64 _IOR(0x12,114,sizeof(uint64_t)) /* return device size in bytes (u64 *arg) */ + +struct blkdev_ioctl_param { + unsigned int block; + size_t content_length; + char * block_contents; +}; + +/** + * efi_crc32() - EFI version of crc32 function + * @buf: buffer to calculate crc32 of + * @len - length of buf + * + * Description: Returns EFI-style CRC32 value for @buf + * + * This function uses the little endian Ethernet polynomial + * but seeds the function with ~0, and xor's with ~0 at the end. + * Note, the EFI Specification, v1.02, has a reference to + * Dr. 
Dobbs Journal, May 1994 (actually it's in May 1992). + */ +static inline uint32_t +efi_crc32(const void *buf, unsigned long len) +{ + return (crc32(~0L, buf, len) ^ ~0L); +} + +/** + * is_pmbr_valid(): test Protective MBR for validity + * @mbr: pointer to a legacy mbr structure + * + * Description: Returns 1 if PMBR is valid, 0 otherwise. + * Validity depends on two things: + * 1) MSDOS signature is in the last two bytes of the MBR + * 2) One partition of type 0xEE is found + */ +static int +is_pmbr_valid(legacy_mbr *mbr) +{ + int i, found = 0, signature = 0; + if (!mbr) + return 0; + signature = (__le16_to_cpu(mbr->signature) == MSDOS_MBR_SIGNATURE); + for (i = 0; signature && i < 4; i++) { + if (mbr->partition[i].sys_type == + EFI_PMBR_OSTYPE_EFI_GPT) { + found = 1; + break; + } + } + return (signature && found); +} + + +/************************************************************ + * get_sector_size + * Requires: + * - filedes is an open file descriptor, suitable for reading + * Modifies: nothing + * Returns: + * sector size, or 512. + ************************************************************/ +static int +get_sector_size(int filedes) +{ + int rc, sector_size = 512; + + rc = ioctl(filedes, BLKSSZGET, &sector_size); + if (rc) + sector_size = 512; + return sector_size; +} + +/************************************************************ + * _get_num_sectors + * Requires: + * - filedes is an open file descriptor, suitable for reading + * Modifies: nothing + * Returns: + * Last LBA value on success + * 0 on error + * + * Try getting BLKGETSIZE64 and BLKSSZGET first, + * then BLKGETSIZE if necessary. + * Kernels 2.4.15-2.4.18 and 2.5.0-2.5.3 have a broken BLKGETSIZE64 + * which returns the number of 512-byte sectors, not the size of + * the disk in bytes. Fixed in kernels 2.4.18-pre8 and 2.5.4-pre3. 
+ ************************************************************/ +static uint64_t +_get_num_sectors(int filedes) +{ + unsigned long sectors=0; + int rc; +#if 0 + uint64_t bytes=0; + + rc = ioctl(filedes, BLKGETSIZE64, &bytes); + if (!rc) + return bytes / get_sector_size(filedes); +#endif + rc = ioctl(filedes, BLKGETSIZE, &sectors); + if (rc) + return 0; + + return sectors; +} + +/************************************************************ + * last_lba(): return number of last logical block of device + * + * @fd + * + * Description: returns Last LBA value on success, 0 on error. + * Notes: The value st_blocks gives the size of the file + * in 512-byte blocks, which is OK if + * EFI_BLOCK_SIZE_SHIFT == 9. + ************************************************************/ + +static uint64_t +last_lba(int filedes) +{ + int rc; + uint64_t sectors = 0; + struct stat s; + memset(&s, 0, sizeof (s)); + rc = fstat(filedes, &s); + if (rc == -1) { + fprintf(stderr, "last_lba() could not stat: %s\n", + strerror(errno)); + return 0; + } + + if (S_ISBLK(s.st_mode)) { + sectors = _get_num_sectors(filedes); + } else { + fprintf(stderr, + "last_lba(): I don't know how to handle files with mode %x\n", + s.st_mode); + sectors = 1; + } + + return sectors - 1; +} + + +static ssize_t +read_lastoddsector(int fd, uint64_t lba, void *buffer, size_t count) +{ + int rc; + struct blkdev_ioctl_param ioctl_param; + + if (!buffer) return 0; + + ioctl_param.block = 0; /* read the last sector */ + ioctl_param.content_length = count; + ioctl_param.block_contents = buffer; + + rc = ioctl(fd, BLKGETLASTSECT, &ioctl_param); + if (rc == -1) perror("read failed"); + + return !rc; +} + +static ssize_t +read_lba(int fd, uint64_t lba, void *buffer, size_t bytes) +{ + int sector_size = get_sector_size(fd); + off_t offset = lba * sector_size; + ssize_t bytesread; + + lseek(fd, offset, SEEK_SET); + bytesread = read(fd, buffer, bytes); + + /* Kludge. 
This is necessary to read/write the last + block of an odd-sized disk, until Linux 2.5.x kernel fixes. + This is only used by gpt.c, and only to read + one sector, so we don't have to be fancy. + */ + if (!bytesread && !(last_lba(fd) & 1) && lba == last_lba(fd)) { + bytesread = read_lastoddsector(fd, lba, buffer, bytes); + } + return bytesread; +} + +/** + * alloc_read_gpt_entries(): reads partition entries from disk + * @fd is an open file descriptor to the whole disk + * @gpt is a buffer into which the GPT will be put + * Description: Returns ptes on success, NULL on error. + * Allocates space for PTEs based on information found in @gpt. + * Notes: remember to free pte when you're done! + */ +static gpt_entry * +alloc_read_gpt_entries(int fd, gpt_header * gpt) +{ + gpt_entry *pte; + size_t count = __le32_to_cpu(gpt->num_partition_entries) * + __le32_to_cpu(gpt->sizeof_partition_entry); + + if (!count) return NULL; + + pte = (gpt_entry *)malloc(count); + if (!pte) + return NULL; + memset(pte, 0, count); + + if (!read_lba(fd, __le64_to_cpu(gpt->partition_entry_lba), pte, + count)) { + free(pte); + return NULL; + } + return pte; +} + +/** + * alloc_read_gpt_header(): Allocates GPT header, reads into it from disk + * @fd is an open file descriptor to the whole disk + * @lba is the Logical Block Address of the partition table + * + * Description: returns GPT header on success, NULL on error. Allocates + * and fills a GPT header starting at @ from @bdev. + * Note: remember to free gpt when finished with it. 
+ */ +static gpt_header * +alloc_read_gpt_header(int fd, uint64_t lba) +{ + gpt_header *gpt; + gpt = (gpt_header *) + malloc(sizeof (gpt_header)); + if (!gpt) + return NULL; + memset(gpt, 0, sizeof (*gpt)); + if (!read_lba(fd, lba, gpt, sizeof (gpt_header))) { + free(gpt); + return NULL; + } + + return gpt; +} + +/** + * is_gpt_valid() - tests one GPT header and PTEs for validity + * @fd is an open file descriptor to the whole disk + * @lba is the logical block address of the GPT header to test + * @gpt is a GPT header ptr, filled on return. + * @ptes is a PTEs ptr, filled on return. + * + * Description: returns 1 if valid, 0 on error. + * If valid, returns pointers to newly allocated GPT header and PTEs. + */ +static int +is_gpt_valid(int fd, uint64_t lba, + gpt_header ** gpt, gpt_entry ** ptes) +{ + int rc = 0; /* default to not valid */ + uint32_t crc, origcrc; + + if (!gpt || !ptes) + return 0; + if (!(*gpt = alloc_read_gpt_header(fd, lba))) + return 0; + + /* Check the GUID Partition Table signature */ + if (__le64_to_cpu((*gpt)->signature) != GPT_HEADER_SIGNATURE) { + /* + printf("GUID Partition Table Header signature is wrong: %" PRIx64" != %" PRIx64 "\n", + __le64_to_cpu((*gpt)->signature), GUID_PT_HEADER_SIGNATURE); + */ + free(*gpt); + *gpt = NULL; + return rc; + } + + /* Check the GUID Partition Table Header CRC */ + origcrc = __le32_to_cpu((*gpt)->header_crc32); + (*gpt)->header_crc32 = 0; + crc = efi_crc32(*gpt, __le32_to_cpu((*gpt)->header_size)); + if (crc != origcrc) { + // printf( "GPTH CRC check failed, %x != %x.\n", origcrc, crc); + (*gpt)->header_crc32 = __cpu_to_le32(origcrc); + free(*gpt); + *gpt = NULL; + return 0; + } + (*gpt)->header_crc32 = __cpu_to_le32(origcrc); + + /* Check that the my_lba entry points to the LBA + * that contains the GPT we read */ + if (__le64_to_cpu((*gpt)->my_lba) != lba) { + /* + printf( "my_lba % PRIx64 "x != lba %"PRIx64 "x.\n", + __le64_to_cpu((*gpt)->my_lba), lba); + */ + free(*gpt); + *gpt = NULL; + return 0; 
+ } + + if (!(*ptes = alloc_read_gpt_entries(fd, *gpt))) { + free(*gpt); + *gpt = NULL; + return 0; + } + + /* Check the GUID Partition Entry Array CRC */ + crc = efi_crc32(*ptes, + __le32_to_cpu((*gpt)->num_partition_entries) * + __le32_to_cpu((*gpt)->sizeof_partition_entry)); + if (crc != __le32_to_cpu((*gpt)->partition_entry_array_crc32)) { + // printf("GUID Partitition Entry Array CRC check failed.\n"); + free(*gpt); + *gpt = NULL; + free(*ptes); + *ptes = NULL; + return 0; + } + + /* We're done, all's well */ + return 1; +} +/** + * compare_gpts() - Search disk for valid GPT headers and PTEs + * @pgpt is the primary GPT header + * @agpt is the alternate GPT header + * @lastlba is the last LBA number + * Description: Returns nothing. Sanity checks pgpt and agpt fields + * and prints warnings on discrepancies. + * + */ +static void +compare_gpts(gpt_header *pgpt, gpt_header *agpt, uint64_t lastlba) +{ + int error_found = 0; + if (!pgpt || !agpt) + return; + if (__le64_to_cpu(pgpt->my_lba) != __le64_to_cpu(agpt->alternate_lba)) { + error_found++; + fprintf(stderr, + "GPT:Primary header LBA != Alt. header alternate_lba\n"); +#ifdef DEBUG + fprintf(stderr, "GPT:%" PRIx64 " != %" PRIx64 "\n", + __le64_to_cpu(pgpt->my_lba), + __le64_to_cpu(agpt->alternate_lba)); +#endif + } + if (__le64_to_cpu(pgpt->alternate_lba) != __le64_to_cpu(agpt->my_lba)) { + error_found++; + fprintf(stderr, + "GPT:Primary header alternate_lba != Alt. 
header my_lba\n"); +#ifdef DEBUG + fprintf(stderr, "GPT:%" PRIx64 " != %" PRIx64 "\n", + __le64_to_cpu(pgpt->alternate_lba), + __le64_to_cpu(agpt->my_lba)); +#endif + } + if (__le64_to_cpu(pgpt->first_usable_lba) != + __le64_to_cpu(agpt->first_usable_lba)) { + error_found++; + fprintf(stderr, "GPT:first_usable_lbas don't match.\n"); +#ifdef DEBUG + fprintf(stderr, "GPT:%" PRIx64 " != %" PRIx64 "\n", + __le64_to_cpu(pgpt->first_usable_lba), + __le64_to_cpu(agpt->first_usable_lba)); +#endif + } + if (__le64_to_cpu(pgpt->last_usable_lba) != + __le64_to_cpu(agpt->last_usable_lba)) { + error_found++; + fprintf(stderr, "GPT:last_usable_lbas don't match.\n"); +#ifdef DEBUG + fprintf(stderr, "GPT:%" PRIx64 " != %" PRIx64 "\n", + __le64_to_cpu(pgpt->last_usable_lba), + __le64_to_cpu(agpt->last_usable_lba)); +#endif + } + if (efi_guidcmp(pgpt->disk_guid, agpt->disk_guid)) { + error_found++; + fprintf(stderr, "GPT:disk_guids don't match.\n"); + } + if (__le32_to_cpu(pgpt->num_partition_entries) != + __le32_to_cpu(agpt->num_partition_entries)) { + error_found++; + fprintf(stderr, "GPT:num_partition_entries don't match: " + "0x%x != 0x%x\n", + __le32_to_cpu(pgpt->num_partition_entries), + __le32_to_cpu(agpt->num_partition_entries)); + } + if (__le32_to_cpu(pgpt->sizeof_partition_entry) != + __le32_to_cpu(agpt->sizeof_partition_entry)) { + error_found++; + fprintf(stderr, + "GPT:sizeof_partition_entry values don't match: " + "0x%x != 0x%x\n", + __le32_to_cpu(pgpt->sizeof_partition_entry), + __le32_to_cpu(agpt->sizeof_partition_entry)); + } + if (__le32_to_cpu(pgpt->partition_entry_array_crc32) != + __le32_to_cpu(agpt->partition_entry_array_crc32)) { + error_found++; + fprintf(stderr, + "GPT:partition_entry_array_crc32 values don't match: " + "0x%x != 0x%x\n", + __le32_to_cpu(pgpt->partition_entry_array_crc32), + __le32_to_cpu(agpt->partition_entry_array_crc32)); + } + if (__le64_to_cpu(pgpt->alternate_lba) != lastlba) { + error_found++; + fprintf(stderr, + "GPT:Primary header 
thinks Alt. header is not at the end of the disk.\n"); +#ifdef DEBUG + fprintf(stderr, "GPT:%" PRIx64 " != %" PRIx64 "\n", + __le64_to_cpu(pgpt->alternate_lba), lastlba); +#endif + } + + if (__le64_to_cpu(agpt->my_lba) != lastlba) { + error_found++; + fprintf(stderr, + "GPT:Alternate GPT header not at the end of the disk.\n"); +#ifdef DEBUG + fprintf(stderr, "GPT:%" PRIx64 " != %" PRIx64 "\n", + __le64_to_cpu(agpt->my_lba), lastlba); +#endif + } + + if (error_found) + fprintf(stderr, + "GPT: Use GNU Parted to correct GPT errors.\n"); + return; +} + +/** + * find_valid_gpt() - Search disk for valid GPT headers and PTEs + * @fd is an open file descriptor to the whole disk + * @gpt is a GPT header ptr, filled on return. + * @ptes is a PTEs ptr, filled on return. + * Description: Returns 1 if valid, 0 on error. + * If valid, returns pointers to newly allocated GPT header and PTEs. + * Validity depends on finding either the Primary GPT header and PTEs valid, + * or the Alternate GPT header and PTEs valid, and the PMBR valid. + */ +static int +find_valid_gpt(int fd, gpt_header ** gpt, gpt_entry ** ptes) +{ + extern int force_gpt; + int good_pgpt = 0, good_agpt = 0, good_pmbr = 0; + gpt_header *pgpt = NULL, *agpt = NULL; + gpt_entry *pptes = NULL, *aptes = NULL; + legacy_mbr *legacymbr = NULL; + uint64_t lastlba; + if (!gpt || !ptes) + return 0; + + lastlba = last_lba(fd); + good_pgpt = is_gpt_valid(fd, GPT_PRIMARY_PARTITION_TABLE_LBA, + &pgpt, &pptes); + if (good_pgpt) { + good_agpt = is_gpt_valid(fd, + __le64_to_cpu(pgpt->alternate_lba), + &agpt, &aptes); + if (!good_agpt) { + good_agpt = is_gpt_valid(fd, lastlba, + &agpt, &aptes); + } + } + else { + good_agpt = is_gpt_valid(fd, lastlba, + &agpt, &aptes); + } + + /* The obviously unsuccessful case */ + if (!good_pgpt && !good_agpt) { + goto fail; + } + + /* This will be added to the EFI Spec. per Intel after v1.02. 
*/ + legacymbr = malloc(sizeof (*legacymbr)); + if (legacymbr) { + memset(legacymbr, 0, sizeof (*legacymbr)); + read_lba(fd, 0, (uint8_t *) legacymbr, + sizeof (*legacymbr)); + good_pmbr = is_pmbr_valid(legacymbr); + free(legacymbr); + legacymbr=NULL; + } + + /* Failure due to bad PMBR */ + if ((good_pgpt || good_agpt) && !good_pmbr && !force_gpt) { + fprintf(stderr, + " Warning: Disk has a valid GPT signature " + "but invalid PMBR.\n" + " Assuming this disk is *not* a GPT disk anymore.\n" + " Use gpt kernel option to override. " + "Use GNU Parted to correct disk.\n"); + goto fail; + } + + /* Would fail due to bad PMBR, but force GPT anyhow */ + if ((good_pgpt || good_agpt) && !good_pmbr && force_gpt) { + fprintf(stderr, + " Warning: Disk has a valid GPT signature but " + "invalid PMBR.\n" + " Use GNU Parted to correct disk.\n" + " gpt option taken, disk treated as GPT.\n"); + } + + compare_gpts(pgpt, agpt, lastlba); + + /* The good cases */ + if (good_pgpt && (good_pmbr || force_gpt)) { + *gpt = pgpt; + *ptes = pptes; + if (agpt) { free(agpt); agpt = NULL; } + if (aptes) { free(aptes); aptes = NULL; } + if (!good_agpt) { + fprintf(stderr, + "Alternate GPT is invalid, " + "using primary GPT.\n"); + } + return 1; + } + else if (good_agpt && (good_pmbr || force_gpt)) { + *gpt = agpt; + *ptes = aptes; + if (pgpt) { free(pgpt); pgpt = NULL; } + if (pptes) { free(pptes); pptes = NULL; } + fprintf(stderr, + "Primary GPT is invalid, using alternate GPT.\n"); + return 1; + } + + fail: + if (pgpt) { free(pgpt); pgpt=NULL; } + if (agpt) { free(agpt); agpt=NULL; } + if (pptes) { free(pptes); pptes=NULL; } + if (aptes) { free(aptes); aptes=NULL; } + *gpt = NULL; + *ptes = NULL; + return 0; +} + +/** + * read_gpt_pt() + * @fd + * @all - slice with start/size of whole disk + * + * 0 if this isn't our partition table + * number of partitions if successful + * + */ +int +read_gpt_pt (int fd, struct slice all, struct slice *sp, int ns) +{ + gpt_header *gpt = NULL; + gpt_entry *ptes 
= NULL; + uint32_t i; + int n = 0; + int last_used_index=-1; + + if (!find_valid_gpt (fd, &gpt, &ptes) || !gpt || !ptes) { + if (gpt) + free (gpt); + if (ptes) + free (ptes); + return 0; + } + + for (i = 0; i < __le32_to_cpu(gpt->num_partition_entries) && i < ns; i++) { + if (!efi_guidcmp (NULL_GUID, ptes[i].partition_type_guid)) { + sp[n].start = 0; + sp[n].size = 0; + n++; + } else { + sp[n].start = __le64_to_cpu(ptes[i].starting_lba); + sp[n].size = __le64_to_cpu(ptes[i].ending_lba) - + __le64_to_cpu(ptes[i].starting_lba) + 1; + last_used_index=n; + n++; + } + } + free (ptes); + free (gpt); + return last_used_index+1; +} diff --git a/kpartx/gpt.c.orig b/kpartx/gpt.c.orig new file mode 100644 index 0000000..d5e2dd5 --- /dev/null +++ b/kpartx/gpt.c.orig @@ -0,0 +1,625 @@ +/* + gpt.[ch] + + Copyright (C) 2000-2001 Dell Computer Corporation <Matt_Domsch@dell.com> + + EFI GUID Partition Table handling + Per Intel EFI Specification v1.02 + http://developer.intel.com/technology/efi/efi.htm + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*/ + +#define _FILE_OFFSET_BITS 64 + +#include "gpt.h" +#include <stdio.h> +#include <string.h> +#include <stdlib.h> +#include <inttypes.h> +#include <sys/stat.h> +#include <sys/ioctl.h> +#include <fcntl.h> +#include <unistd.h> +#include <errno.h> +#include <asm/byteorder.h> +#include "crc32.h" + +#define BLKGETLASTSECT _IO(0x12,108) /* get last sector of block device */ +#define BLKGETSIZE _IO(0x12,96) /* return device size */ +#define BLKSSZGET _IO(0x12,104) /* get block device sector size */ +#define BLKGETSIZE64 _IOR(0x12,114,sizeof(uint64_t)) /* return device size in bytes (u64 *arg) */ + +struct blkdev_ioctl_param { + unsigned int block; + size_t content_length; + char * block_contents; +}; + +/** + * efi_crc32() - EFI version of crc32 function + * @buf: buffer to calculate crc32 of + * @len - length of buf + * + * Description: Returns EFI-style CRC32 value for @buf + * + * This function uses the little endian Ethernet polynomial + * but seeds the function with ~0, and xor's with ~0 at the end. + * Note, the EFI Specification, v1.02, has a reference to + * Dr. Dobbs Journal, May 1994 (actually it's in May 1992). + */ +static inline uint32_t +efi_crc32(const void *buf, unsigned long len) +{ + return (crc32(~0L, buf, len) ^ ~0L); +} + +/** + * is_pmbr_valid(): test Protective MBR for validity + * @mbr: pointer to a legacy mbr structure + * + * Description: Returns 1 if PMBR is valid, 0 otherwise. 
+ * Validity depends on two things: + * 1) MSDOS signature is in the last two bytes of the MBR + * 2) One partition of type 0xEE is found + */ +static int +is_pmbr_valid(legacy_mbr *mbr) +{ + int i, found = 0, signature = 0; + if (!mbr) + return 0; + signature = (__le16_to_cpu(mbr->signature) == MSDOS_MBR_SIGNATURE); + for (i = 0; signature && i < 4; i++) { + if (mbr->partition[i].sys_type == + EFI_PMBR_OSTYPE_EFI_GPT) { + found = 1; + break; + } + } + return (signature && found); +} + + +/************************************************************ + * get_sector_size + * Requires: + * - filedes is an open file descriptor, suitable for reading + * Modifies: nothing + * Returns: + * sector size, or 512. + ************************************************************/ +static int +get_sector_size(int filedes) +{ + int rc, sector_size = 512; + + rc = ioctl(filedes, BLKSSZGET, &sector_size); + if (rc) + sector_size = 512; + return sector_size; +} + +/************************************************************ + * _get_num_sectors + * Requires: + * - filedes is an open file descriptor, suitable for reading + * Modifies: nothing + * Returns: + * Last LBA value on success + * 0 on error + * + * Try getting BLKGETSIZE64 and BLKSSZGET first, + * then BLKGETSIZE if necessary. + * Kernels 2.4.15-2.4.18 and 2.5.0-2.5.3 have a broken BLKGETSIZE64 + * which returns the number of 512-byte sectors, not the size of + * the disk in bytes. Fixed in kernels 2.4.18-pre8 and 2.5.4-pre3. 
+ ************************************************************/ +static uint64_t +_get_num_sectors(int filedes) +{ + unsigned long sectors=0; + int rc; +#if 0 + uint64_t bytes=0; + + rc = ioctl(filedes, BLKGETSIZE64, &bytes); + if (!rc) + return bytes / get_sector_size(filedes); +#endif + rc = ioctl(filedes, BLKGETSIZE, &sectors); + if (rc) + return 0; + + return sectors; +} + +/************************************************************ + * last_lba(): return number of last logical block of device + * + * @fd + * + * Description: returns Last LBA value on success, 0 on error. + * Notes: The value st_blocks gives the size of the file + * in 512-byte blocks, which is OK if + * EFI_BLOCK_SIZE_SHIFT == 9. + ************************************************************/ + +static uint64_t +last_lba(int filedes) +{ + int rc; + uint64_t sectors = 0; + struct stat s; + memset(&s, 0, sizeof (s)); + rc = fstat(filedes, &s); + if (rc == -1) { + fprintf(stderr, "last_lba() could not stat: %s\n", + strerror(errno)); + return 0; + } + + if (S_ISBLK(s.st_mode)) { + sectors = _get_num_sectors(filedes); + } else { + fprintf(stderr, + "last_lba(): I don't know how to handle files with mode %x\n", + s.st_mode); + sectors = 1; + } + + return sectors - 1; +} + + +static ssize_t +read_lastoddsector(int fd, uint64_t lba, void *buffer, size_t count) +{ + int rc; + struct blkdev_ioctl_param ioctl_param; + + if (!buffer) return 0; + + ioctl_param.block = 0; /* read the last sector */ + ioctl_param.content_length = count; + ioctl_param.block_contents = buffer; + + rc = ioctl(fd, BLKGETLASTSECT, &ioctl_param); + if (rc == -1) perror("read failed"); + + return !rc; +} + +static ssize_t +read_lba(int fd, uint64_t lba, void *buffer, size_t bytes) +{ + int sector_size = get_sector_size(fd); + off_t offset = lba * sector_size; + ssize_t bytesread; + + lseek(fd, offset, SEEK_SET); + bytesread = read(fd, buffer, bytes); + + /* Kludge.
This is necessary to read/write the last + block of an odd-sized disk, until Linux 2.5.x kernel fixes. + This is only used by gpt.c, and only to read + one sector, so we don't have to be fancy. + */ + if (!bytesread && !(last_lba(fd) & 1) && lba == last_lba(fd)) { + bytesread = read_lastoddsector(fd, lba, buffer, bytes); + } + return bytesread; +} + +/** + * alloc_read_gpt_entries(): reads partition entries from disk + * @fd is an open file descriptor to the whole disk + * @gpt is a buffer into which the GPT will be put + * Description: Returns ptes on success, NULL on error. + * Allocates space for PTEs based on information found in @gpt. + * Notes: remember to free pte when you're done! + */ +static gpt_entry * +alloc_read_gpt_entries(int fd, gpt_header * gpt) +{ + gpt_entry *pte; + size_t count = __le32_to_cpu(gpt->num_partition_entries) * + __le32_to_cpu(gpt->sizeof_partition_entry); + + if (!count) return NULL; + + pte = (gpt_entry *)malloc(count); + if (!pte) + return NULL; + memset(pte, 0, count); + + if (!read_lba(fd, __le64_to_cpu(gpt->partition_entry_lba), pte, + count)) { + free(pte); + return NULL; + } + return pte; +} + +/** + * alloc_read_gpt_header(): Allocates GPT header, reads into it from disk + * @fd is an open file descriptor to the whole disk + * @lba is the Logical Block Address of the partition table + * + * Description: returns GPT header on success, NULL on error. Allocates + * and fills a GPT header starting at @ from @bdev. + * Note: remember to free gpt when finished with it. 
+ */ +static gpt_header * +alloc_read_gpt_header(int fd, uint64_t lba) +{ + gpt_header *gpt; + gpt = (gpt_header *) + malloc(sizeof (gpt_header)); + if (!gpt) + return NULL; + memset(gpt, 0, sizeof (*gpt)); + if (!read_lba(fd, lba, gpt, sizeof (gpt_header))) { + free(gpt); + return NULL; + } + + return gpt; +} + +/** + * is_gpt_valid() - tests one GPT header and PTEs for validity + * @fd is an open file descriptor to the whole disk + * @lba is the logical block address of the GPT header to test + * @gpt is a GPT header ptr, filled on return. + * @ptes is a PTEs ptr, filled on return. + * + * Description: returns 1 if valid, 0 on error. + * If valid, returns pointers to newly allocated GPT header and PTEs. + */ +static int +is_gpt_valid(int fd, uint64_t lba, + gpt_header ** gpt, gpt_entry ** ptes) +{ + int rc = 0; /* default to not valid */ + uint32_t crc, origcrc; + + if (!gpt || !ptes) + return 0; + if (!(*gpt = alloc_read_gpt_header(fd, lba))) + return 0; + + /* Check the GUID Partition Table signature */ + if (__le64_to_cpu((*gpt)->signature) != GPT_HEADER_SIGNATURE) { + /* + printf("GUID Partition Table Header signature is wrong: %" PRIx64" != %" PRIx64 "\n", + __le64_to_cpu((*gpt)->signature), GUID_PT_HEADER_SIGNATURE); + */ + free(*gpt); + *gpt = NULL; + return rc; + } + + /* Check the GUID Partition Table Header CRC */ + origcrc = __le32_to_cpu((*gpt)->header_crc32); + (*gpt)->header_crc32 = 0; + crc = efi_crc32(*gpt, __le32_to_cpu((*gpt)->header_size)); + if (crc != origcrc) { + // printf( "GPTH CRC check failed, %x != %x.\n", origcrc, crc); + (*gpt)->header_crc32 = __cpu_to_le32(origcrc); + free(*gpt); + *gpt = NULL; + return 0; + } + (*gpt)->header_crc32 = __cpu_to_le32(origcrc); + + /* Check that the my_lba entry points to the LBA + * that contains the GPT we read */ + if (__le64_to_cpu((*gpt)->my_lba) != lba) { + /* + printf( "my_lba % PRIx64 "x != lba %"PRIx64 "x.\n", + __le64_to_cpu((*gpt)->my_lba), lba); + */ + free(*gpt); + *gpt = NULL; + return 0; 
+ } + + if (!(*ptes = alloc_read_gpt_entries(fd, *gpt))) { + free(*gpt); + *gpt = NULL; + return 0; + } + + /* Check the GUID Partition Entry Array CRC */ + crc = efi_crc32(*ptes, + __le32_to_cpu((*gpt)->num_partition_entries) * + __le32_to_cpu((*gpt)->sizeof_partition_entry)); + if (crc != __le32_to_cpu((*gpt)->partition_entry_array_crc32)) { + // printf("GUID Partitition Entry Array CRC check failed.\n"); + free(*gpt); + *gpt = NULL; + free(*ptes); + *ptes = NULL; + return 0; + } + + /* We're done, all's well */ + return 1; +} +/** + * compare_gpts() - Search disk for valid GPT headers and PTEs + * @pgpt is the primary GPT header + * @agpt is the alternate GPT header + * @lastlba is the last LBA number + * Description: Returns nothing. Sanity checks pgpt and agpt fields + * and prints warnings on discrepancies. + * + */ +static void +compare_gpts(gpt_header *pgpt, gpt_header *agpt, uint64_t lastlba) +{ + int error_found = 0; + if (!pgpt || !agpt) + return; + if (__le64_to_cpu(pgpt->my_lba) != __le64_to_cpu(agpt->alternate_lba)) { + error_found++; + fprintf(stderr, + "GPT:Primary header LBA != Alt. header alternate_lba\n"); +#ifdef DEBUG + fprintf(stderr, "GPT:%" PRIx64 " != %" PRIx64 "\n", + __le64_to_cpu(pgpt->my_lba), + __le64_to_cpu(agpt->alternate_lba)); +#endif + } + if (__le64_to_cpu(pgpt->alternate_lba) != __le64_to_cpu(agpt->my_lba)) { + error_found++; + fprintf(stderr, + "GPT:Primary header alternate_lba != Alt. 
header my_lba\n"); +#ifdef DEBUG + fprintf(stderr, "GPT:%" PRIx64 " != %" PRIx64 "\n", + __le64_to_cpu(pgpt->alternate_lba), + __le64_to_cpu(agpt->my_lba)); +#endif + } + if (__le64_to_cpu(pgpt->first_usable_lba) != + __le64_to_cpu(agpt->first_usable_lba)) { + error_found++; + fprintf(stderr, "GPT:first_usable_lbas don't match.\n"); +#ifdef DEBUG + fprintf(stderr, "GPT:%" PRIx64 " != %" PRIx64 "\n", + __le64_to_cpu(pgpt->first_usable_lba), + __le64_to_cpu(agpt->first_usable_lba)); +#endif + } + if (__le64_to_cpu(pgpt->last_usable_lba) != + __le64_to_cpu(agpt->last_usable_lba)) { + error_found++; + fprintf(stderr, "GPT:last_usable_lbas don't match.\n"); +#ifdef DEBUG + fprintf(stderr, "GPT:%" PRIx64 " != %" PRIx64 "\n", + __le64_to_cpu(pgpt->last_usable_lba), + __le64_to_cpu(agpt->last_usable_lba)); +#endif + } + if (efi_guidcmp(pgpt->disk_guid, agpt->disk_guid)) { + error_found++; + fprintf(stderr, "GPT:disk_guids don't match.\n"); + } + if (__le32_to_cpu(pgpt->num_partition_entries) != + __le32_to_cpu(agpt->num_partition_entries)) { + error_found++; + fprintf(stderr, "GPT:num_partition_entries don't match: " + "0x%x != 0x%x\n", + __le32_to_cpu(pgpt->num_partition_entries), + __le32_to_cpu(agpt->num_partition_entries)); + } + if (__le32_to_cpu(pgpt->sizeof_partition_entry) != + __le32_to_cpu(agpt->sizeof_partition_entry)) { + error_found++; + fprintf(stderr, + "GPT:sizeof_partition_entry values don't match: " + "0x%x != 0x%x\n", + __le32_to_cpu(pgpt->sizeof_partition_entry), + __le32_to_cpu(agpt->sizeof_partition_entry)); + } + if (__le32_to_cpu(pgpt->partition_entry_array_crc32) != + __le32_to_cpu(agpt->partition_entry_array_crc32)) { + error_found++; + fprintf(stderr, + "GPT:partition_entry_array_crc32 values don't match: " + "0x%x != 0x%x\n", + __le32_to_cpu(pgpt->partition_entry_array_crc32), + __le32_to_cpu(agpt->partition_entry_array_crc32)); + } + if (__le64_to_cpu(pgpt->alternate_lba) != lastlba) { + error_found++; + fprintf(stderr, + "GPT:Primary header 
thinks Alt. header is not at the end of the disk.\n"); +#ifdef DEBUG + fprintf(stderr, "GPT:%" PRIx64 " != %" PRIx64 "\n", + __le64_to_cpu(pgpt->alternate_lba), lastlba); +#endif + } + + if (__le64_to_cpu(agpt->my_lba) != lastlba) { + error_found++; + fprintf(stderr, + "GPT:Alternate GPT header not at the end of the disk.\n"); +#ifdef DEBUG + fprintf(stderr, "GPT:%" PRIx64 " != %" PRIx64 "\n", + __le64_to_cpu(agpt->my_lba), lastlba); +#endif + } + + if (error_found) + fprintf(stderr, + "GPT: Use GNU Parted to correct GPT errors.\n"); + return; +} + +/** + * find_valid_gpt() - Search disk for valid GPT headers and PTEs + * @fd is an open file descriptor to the whole disk + * @gpt is a GPT header ptr, filled on return. + * @ptes is a PTEs ptr, filled on return. + * Description: Returns 1 if valid, 0 on error. + * If valid, returns pointers to newly allocated GPT header and PTEs. + * Validity depends on finding either the Primary GPT header and PTEs valid, + * or the Alternate GPT header and PTEs valid, and the PMBR valid. + */ +static int +find_valid_gpt(int fd, gpt_header ** gpt, gpt_entry ** ptes) +{ + extern int force_gpt; + int good_pgpt = 0, good_agpt = 0, good_pmbr = 0; + gpt_header *pgpt = NULL, *agpt = NULL; + gpt_entry *pptes = NULL, *aptes = NULL; + legacy_mbr *legacymbr = NULL; + uint64_t lastlba; + if (!gpt || !ptes) + return 0; + + lastlba = last_lba(fd); + good_pgpt = is_gpt_valid(fd, GPT_PRIMARY_PARTITION_TABLE_LBA, + &pgpt, &pptes); + if (good_pgpt) { + good_agpt = is_gpt_valid(fd, + __le64_to_cpu(pgpt->alternate_lba), + &agpt, &aptes); + if (!good_agpt) { + good_agpt = is_gpt_valid(fd, lastlba, + &agpt, &aptes); + } + } + else { + good_agpt = is_gpt_valid(fd, lastlba, + &agpt, &aptes); + } + + /* The obviously unsuccessful case */ + if (!good_pgpt && !good_agpt) { + goto fail; + } + + /* This will be added to the EFI Spec. per Intel after v1.02. 
*/ + legacymbr = malloc(sizeof (*legacymbr)); + if (legacymbr) { + memset(legacymbr, 0, sizeof (*legacymbr)); + read_lba(fd, 0, (uint8_t *) legacymbr, + sizeof (*legacymbr)); + good_pmbr = is_pmbr_valid(legacymbr); + free(legacymbr); + legacymbr=NULL; + } + + /* Failure due to bad PMBR */ + if ((good_pgpt || good_agpt) && !good_pmbr && !force_gpt) { + fprintf(stderr, + " Warning: Disk has a valid GPT signature " + "but invalid PMBR.\n" + " Assuming this disk is *not* a GPT disk anymore.\n" + " Use gpt kernel option to override. " + "Use GNU Parted to correct disk.\n"); + goto fail; + } + + /* Would fail due to bad PMBR, but force GPT anyhow */ + if ((good_pgpt || good_agpt) && !good_pmbr && force_gpt) { + fprintf(stderr, + " Warning: Disk has a valid GPT signature but " + "invalid PMBR.\n" + " Use GNU Parted to correct disk.\n" + " gpt option taken, disk treated as GPT.\n"); + } + + compare_gpts(pgpt, agpt, lastlba); + + /* The good cases */ + if (good_pgpt && (good_pmbr || force_gpt)) { + *gpt = pgpt; + *ptes = pptes; + if (agpt) { free(agpt); agpt = NULL; } + if (aptes) { free(aptes); aptes = NULL; } + if (!good_agpt) { + fprintf(stderr, + "Alternate GPT is invalid, " + "using primary GPT.\n"); + } + return 1; + } + else if (good_agpt && (good_pmbr || force_gpt)) { + *gpt = agpt; + *ptes = aptes; + if (pgpt) { free(pgpt); pgpt = NULL; } + if (pptes) { free(pptes); pptes = NULL; } + fprintf(stderr, + "Primary GPT is invalid, using alternate GPT.\n"); + return 1; + } + + fail: + if (pgpt) { free(pgpt); pgpt=NULL; } + if (agpt) { free(agpt); agpt=NULL; } + if (pptes) { free(pptes); pptes=NULL; } + if (aptes) { free(aptes); aptes=NULL; } + *gpt = NULL; + *ptes = NULL; + return 0; +} + +/** + * read_gpt_pt() + * @fd + * @all - slice with start/size of whole disk + * + * 0 if this isn't our partition table + * number of partitions if successful + * + */ +int +read_gpt_pt (int fd, struct slice all, struct slice *sp, int ns) +{ + gpt_header *gpt = NULL; + gpt_entry *ptes 
= NULL; + uint32_t i; + int n = 0; + int last_used_index=-1; + + if (!find_valid_gpt (fd, &gpt, &ptes) || !gpt || !ptes) { + if (gpt) + free (gpt); + if (ptes) + free (ptes); + return 0; + } + + for (i = 0; i < __le32_to_cpu(gpt->num_partition_entries) && i < ns; i++) { + if (!efi_guidcmp (NULL_GUID, ptes[i].partition_type_guid)) { + sp[n].start = 0; + sp[n].size = 0; + n++; + } else { + sp[n].start = __le64_to_cpu(ptes[i].starting_lba); + sp[n].size = __le64_to_cpu(ptes[i].ending_lba) - + __le64_to_cpu(ptes[i].starting_lba) + 1; + last_used_index=n; + n++; + } + } + free (ptes); + free (gpt); + return last_used_index+1; +} diff --git a/kpartx/gpt.h b/kpartx/gpt.h new file mode 100644 index 0000000..a073b42 --- /dev/null +++ b/kpartx/gpt.h @@ -0,0 +1,131 @@ +/* + gpt.[ch] + + Copyright (C) 2000-2001 Dell Computer Corporation <Matt_Domsch@dell.com> + + EFI GUID Partition Table handling + Per Intel EFI Specification v1.02 + http://developer.intel.com/technology/efi/efi.htm + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*/ + +#ifndef _GPT_H +#define _GPT_H + + +#include <inttypes.h> +#include "kpartx.h" +#include "dos.h" +#include "efi.h" + +#define EFI_PMBR_OSTYPE_EFI 0xEF +#define EFI_PMBR_OSTYPE_EFI_GPT 0xEE +#define MSDOS_MBR_SIGNATURE 0xaa55 +#define GPT_BLOCK_SIZE 512 + +#define GPT_HEADER_SIGNATURE 0x5452415020494645ULL +#define GPT_HEADER_REVISION_V1_02 0x00010200 +#define GPT_HEADER_REVISION_V1_00 0x00010000 +#define GPT_HEADER_REVISION_V0_99 0x00009900 +#define GPT_PRIMARY_PARTITION_TABLE_LBA 1 + +typedef struct _gpt_header { + uint64_t signature; + uint32_t revision; + uint32_t header_size; + uint32_t header_crc32; + uint32_t reserved1; + uint64_t my_lba; + uint64_t alternate_lba; + uint64_t first_usable_lba; + uint64_t last_usable_lba; + efi_guid_t disk_guid; + uint64_t partition_entry_lba; + uint32_t num_partition_entries; + uint32_t sizeof_partition_entry; + uint32_t partition_entry_array_crc32; + uint8_t reserved2[GPT_BLOCK_SIZE - 92]; +} __attribute__ ((packed)) gpt_header; + +typedef struct _gpt_entry_attributes { + uint64_t required_to_function:1; + uint64_t reserved:47; + uint64_t type_guid_specific:16; +} __attribute__ ((packed)) gpt_entry_attributes; + +typedef struct _gpt_entry { + efi_guid_t partition_type_guid; + efi_guid_t unique_partition_guid; + uint64_t starting_lba; + uint64_t ending_lba; + gpt_entry_attributes attributes; + efi_char16_t partition_name[72 / sizeof(efi_char16_t)]; +} __attribute__ ((packed)) gpt_entry; + + +/* + These values are only defaults. The actual on-disk structures + may define different sizes, so use those unless creating a new GPT disk! 
+*/ + +#define GPT_DEFAULT_RESERVED_PARTITION_ENTRY_ARRAY_SIZE 16384 +/* + Number of actual partition entries should be calculated + as: +*/ +#define GPT_DEFAULT_RESERVED_PARTITION_ENTRIES \ + (GPT_DEFAULT_RESERVED_PARTITION_ENTRY_ARRAY_SIZE / \ + sizeof(gpt_entry)) + + +/* Protected Master Boot Record & Legacy MBR share same structure */ +/* Needs to be packed because the u16s force misalignment. */ + +typedef struct _legacy_mbr { + uint8_t bootcode[440]; + uint32_t unique_mbr_signature; + uint16_t unknown; + struct partition partition[4]; + uint16_t signature; +} __attribute__ ((packed)) legacy_mbr; + + +#define EFI_GPT_PRIMARY_PARTITION_TABLE_LBA 1 + +/* Functions */ +int read_gpt_pt (int fd, struct slice all, struct slice *sp, int ns); + + +#endif + +/* + * Overrides for Emacs so that we follow Linus's tabbing style. + * Emacs will notice this stuff at the end of the file and automatically + * adjust the settings for this buffer only. This must remain at the end + * of the file. + * --------------------------------------------------------------------------- + * Local variables: + * c-indent-level: 4 + * c-brace-imaginary-offset: 0 + * c-brace-offset: -4 + * c-argdecl-indent: 4 + * c-label-offset: -4 + * c-continued-statement-offset: 4 + * c-continued-brace-offset: 0 + * indent-tabs-mode: nil + * tab-width: 8 + * End: + */ diff --git a/kpartx/kpartx.8 b/kpartx/kpartx.8 new file mode 100644 index 0000000..259ce3f --- /dev/null +++ b/kpartx/kpartx.8 @@ -0,0 +1,39 @@ +.TH KPARTX 8 "February 2004" "" "Linux Administrator's Manual" +.SH NAME +kpartx \- Create device maps from partition tables +.SH SYNOPSIS +.B kpartx +.RB [\| \-a\ \c +.BR |\ -d\ |\ -l \|] +.RB [\| \-v \|] +.RB wholedisk +.SH DESCRIPTION +This tool, derived from util-linux' partx, reads partition +tables on specified device and create device maps over partitions +segments detected. It is called from hotplug upon device maps +creation and deletion. 
+.SH OPTIONS +.TP +.B \-a +Add partition mappings +.TP +.B \-d +Delete partition mappings +.TP +.B \-l +List partition mappings that would be added -a +.TP +.B \-p +set device name-partition number delimiter +.TP +.B \-v +Operate verbosely +.SH "SEE ALSO" +.BR multipath (8) +.BR multipathd (8) +.BR hotplug (8) +.SH "AUTHORS" +This man page was assembled By Patrick Caulfield +for the Debian project. From documentation provided +by the multipath author Christophe Varoqui, <christophe.varoqui@free.fr> and others. + diff --git a/kpartx/kpartx.c b/kpartx/kpartx.c new file mode 100644 index 0000000..5d714bf --- /dev/null +++ b/kpartx/kpartx.c @@ -0,0 +1,514 @@ +/* + * Given a block device and a partition table type, + * try to parse the partition table, and list the + * contents. Optionally add or remove partitions. + * + * Read wholedisk and add all partitions: + * kpartx [-a|-d|-l] [-v] wholedisk + * + * aeb, 2000-03-21 + * cva, 2002-10-26 + */ + +#include <stdio.h> +#include <fcntl.h> +#include <errno.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <sys/stat.h> +#include <sys/types.h> +#include <ctype.h> +#include <libdevmapper.h> +#include <linux/kdev_t.h> + +#include "devmapper.h" +#include "crc32.h" +#include "lopart.h" +#include "kpartx.h" + +#define SIZE(a) (sizeof(a)/sizeof((a)[0])) + +#define READ_SIZE 1024 +#define MAXTYPES 64 +#define MAXSLICES 256 +#define DM_TARGET "linear" +#define LO_NAME_SIZE 64 +#define PARTNAME_SIZE 128 +#define DELIM_SIZE 8 + +struct slice slices[MAXSLICES]; + +enum action { LIST, ADD, DELETE }; + +struct pt { + char *type; + ptreader *fn; +} pts[MAXTYPES]; + +int ptct = 0; + +static void +addpts(char *t, ptreader f) +{ + if (ptct >= MAXTYPES) { + fprintf(stderr, "addpts: too many types\n"); + exit(1); + } + pts[ptct].type = t; + pts[ptct].fn = f; + ptct++; +} + +static void +initpts(void) +{ + addpts("gpt", read_gpt_pt); + addpts("dos", read_dos_pt); + addpts("bsd", read_bsd_pt); + addpts("solaris", 
read_solaris_pt); + addpts("unixware", read_unixware_pt); +} + +static char short_opts[] = "ladgvnp:t:"; + +/* Used in gpt.c */ +int force_gpt=0; + +static int +usage(void) { + printf("usage : kpartx [-a|-d|-l] [-v] wholedisk\n"); + printf("\t-a add partition devmappings\n"); + printf("\t-d del partition devmappings\n"); + printf("\t-l list partitions devmappings that would be added by -a\n"); + printf("\t-p set device name-partition number delimiter\n"); + printf("\t-v verbose\n"); + return 1; +} + +static void +set_delimiter (char * device, char * delimiter) +{ + char * p = device; + + while (*(p++) != 0x0) + continue; + + if (isdigit(*(p - 2))) + *delimiter = 'p'; +} + +static void +strip_slash (char * device) +{ + char * p = device; + + while (*(p++) != 0x0) { + + if (*p == '/') + *p = '!'; + } +} + +static int +find_devname_offset (char * device) +{ + char *p, *q = NULL; + + p = device; + + while (*p++) + if (*p == '/') + q = p; + + return (int)(q - device) + 1; +} + +static char * +get_hotplug_device(void) +{ + unsigned int major, minor, off, len; + const char *mapname; + char *devname = NULL; + char *device = NULL; + char *var = NULL; + struct stat buf; + + var = getenv("ACTION"); + + if (!var || strcmp(var, "add")) + return NULL; + + /* Get dm mapname for hotpluged device. */ + if (!(devname = getenv("DEVNAME"))) + return NULL; + + if (stat(devname, &buf)) + return NULL; + + major = (unsigned int)MAJOR(buf.st_rdev); + minor = (unsigned int)MINOR(buf.st_rdev); + + if (!(mapname = dm_mapname(major, minor))) /* Not dm device. */ + return NULL; + + off = find_devname_offset(devname); + len = strlen(mapname); + + /* Dirname + mapname + \0 */ + if (!(device = (char *)malloc(sizeof(char) * (off + len + 1)))) + return NULL; + + /* Create new device name. 
*/ + snprintf(device, off + 1, "%s", devname); + snprintf(device + off, len + 1, "%s", mapname); + + if (strlen(device) != (off + len)) + return NULL; + + return device; +} + +int +main(int argc, char **argv){ + int fd, i, j, k, n, op, off, arg; + struct slice all; + struct pt *ptp; + enum action what = LIST; + char *p, *type, *diskdevice, *device, *progname; + int lower, upper; + int verbose = 0; + char partname[PARTNAME_SIZE], params[PARTNAME_SIZE + 16]; + char * loopdev = NULL; + char * delim = NULL; + int loopro = 0; + int hotplug = 0; + struct stat buf; + + initpts(); + init_crc32(); + + lower = upper = 0; + type = device = diskdevice = NULL; + memset(&all, 0, sizeof(all)); + memset(&partname, 0, sizeof(partname)); + + /* Check whether hotplug mode. */ + progname = strrchr(argv[0], '/'); + + if (!progname) + progname = argv[0]; + else + progname++; + + if (!strcmp(progname, "kpartx.dev")) { /* Hotplug mode */ + hotplug = 1; + + /* Setup for original kpartx variables */ + if (!(device = get_hotplug_device())) + exit(1); + + diskdevice = device; + what = ADD; + } else if (argc < 2) { + usage(); + exit(1); + } + + while ((arg = getopt(argc, argv, short_opts)) != EOF) switch(arg) { + case 'g': + force_gpt=1; + break; + case 't': + type = optarg; + break; + case 'v': + verbose = 1; + break; + case 'n': + p = optarg; + lower = atoi(p); + if ((p[1] == '-') && p[2]) + upper = atoi(p+2); + else + upper = lower; + break; + case 'p': + delim = optarg; + break; + case 'l': + what = LIST; + break; + case 'a': + what = ADD; + break; + case 'd': + what = DELETE; + break; + default: + usage(); + exit(1); + } + + if (dm_prereq(DM_TARGET, 0, 0, 0) && (what == ADD || what == DELETE)) { + fprintf(stderr, "device mapper prerequisites not met\n"); + exit(1); + } + + if (hotplug) { + /* already got [disk]device */ + } else if (optind == argc-2) { + device = argv[optind]; + diskdevice = argv[optind+1]; + } else if (optind == argc-1) { + diskdevice = device = argv[optind]; + } else { 
+ usage(); + exit(1); + } + + if (stat(device, &buf)) { + printf("failed to stat() %s\n", device); + exit (1); + } + + if (S_ISREG (buf.st_mode)) { + loopdev = malloc(LO_NAME_SIZE * sizeof(char)); + + if (!loopdev) + exit(1); + + /* already looped file ? */ + loopdev = find_loop_by_file(device); + + if (!loopdev && what == DELETE) + exit (0); + + if (!loopdev) { + loopdev = find_unused_loop_device(); + + if (set_loop(loopdev, device, 0, &loopro)) { + fprintf(stderr, "can't set up loop\n"); + exit (1); + } + } + device = loopdev; + } + + if (delim == NULL) { + delim = malloc(DELIM_SIZE); + memset(delim, 0, DELIM_SIZE); + set_delimiter(device, delim); + } + + off = find_devname_offset(device); + fd = open(device, O_RDONLY); + + if (fd == -1) { + perror(device); + exit(1); + } + if (!lower) + lower = 1; + + /* add/remove partitions to the kernel devmapper tables */ + for (i = 0; i < ptct; i++) { + ptp = &pts[i]; + + if (type && strcmp(type, ptp->type) > 0) + continue; + + /* here we get partitions */ + n = ptp->fn(fd, all, slices, SIZE(slices)); + +#ifdef DEBUG + if (n >= 0) + printf("%s: %d slices\n", ptp->type, n); +#endif + + if (n > 0) + close(fd); + else + continue; + + /* + * test for overlap, as in the case of an extended partition + * zero their size to avoid mapping + */ + for (j=0; j<n; j++) { + for (k=j+1; k<n; k++) { + if (slices[k].start > slices[j].start && + slices[k].start < slices[j].start + + slices[j].size) + slices[j].size = 0; + } + } + + switch(what) { + case LIST: + for (j = 0; j < n; j++) { + if (slices[j].size == 0) + continue; + + printf("%s%s%d : 0 %lu %s %lu\n", + device + off, delim, j+1, + (unsigned long) slices[j].size, device, + (unsigned long) slices[j].start); + } + break; + + case DELETE: + for (j = 0; j < n; j++) { + if (safe_sprintf(partname, "%s%s%d", + device + off , delim, j+1)) { + fprintf(stderr, "partname too small\n"); + exit(1); + } + strip_slash(partname); + + if (!slices[j].size || !dm_map_present(partname)) + continue; + 
+ if (!dm_simplecmd(DM_DEVICE_REMOVE, partname)) + continue; + + if (verbose) + printf("del devmap : %s\n", partname); + } + + if (S_ISREG (buf.st_mode)) { + if (del_loop(device)) { + if (verbose) + printf("can't del loop : %s\n", + device); + exit(1); + } + printf("loop deleted : %s\n", device); + } + break; + + case ADD: + for (j=0; j<n; j++) { + if (slices[j].size == 0) + continue; + + if (safe_sprintf(partname, "%s%s%d", + device + off , delim, j+1)) { + fprintf(stderr, "partname too small\n"); + exit(1); + } + strip_slash(partname); + + if (safe_sprintf(params, "%s %lu", device, + (unsigned long)slices[j].start)) { + fprintf(stderr, "params too small\n"); + exit(1); + } + + op = (dm_map_present(partname) ? + DM_DEVICE_RELOAD : DM_DEVICE_CREATE); + + dm_addmap(op, partname, DM_TARGET, params, + slices[j].size); + + if (op == DM_DEVICE_RELOAD) + dm_simplecmd(DM_DEVICE_RESUME, + partname); + + if (verbose) + printf("add map %s : 0 %lu %s %s\n", + partname, slices[j].size, + DM_TARGET, params); + } + break; + + default: + break; + + } + if (n > 0) + break; + } + return 0; +} + +void * +xmalloc (size_t size) { + void *t; + + if (size == 0) + return NULL; + + t = malloc (size); + + if (t == NULL) { + fprintf(stderr, "Out of memory\n"); + exit(1); + } + + return t; +} + +/* + * sseek: seek to specified sector + */ +#if !defined (__alpha__) && !defined (__ia64__) && !defined (__x86_64__) \ + && !defined (__s390x__) +#include <linux/unistd.h> /* _syscall */ +static +_syscall5(int, _llseek, uint, fd, ulong, hi, ulong, lo, + long long *, res, uint, wh); +#endif + +static int +sseek(int fd, unsigned int secnr) { + long long in, out; + in = ((long long) secnr << 9); + out = 1; + +#if !defined (__alpha__) && !defined (__ia64__) && !defined (__x86_64__) \ + && !defined (__s390x__) + if (_llseek (fd, in>>32, in & 0xffffffff, &out, SEEK_SET) != 0 + || out != in) +#else + if ((out = lseek(fd, in, SEEK_SET)) != in) +#endif + { + fprintf(stderr, "llseek error\n"); + return -1; + 
} + return 0; +} + +static +struct block { + unsigned int secnr; + char *block; + struct block *next; +} *blockhead; + +char * +getblock (int fd, unsigned int secnr) { + struct block *bp; + + for (bp = blockhead; bp; bp = bp->next) + + if (bp->secnr == secnr) + return bp->block; + + if (sseek(fd, secnr)) + return NULL; + + bp = xmalloc(sizeof(struct block)); + bp->secnr = secnr; + bp->next = blockhead; + blockhead = bp; + bp->block = (char *) xmalloc(READ_SIZE); + + if (read(fd, bp->block, READ_SIZE) != READ_SIZE) { + fprintf(stderr, "read error, sector %d\n", secnr); + bp->block = NULL; + } + + return bp->block; +} diff --git a/kpartx/kpartx.h b/kpartx/kpartx.h new file mode 100644 index 0000000..8108021 --- /dev/null +++ b/kpartx/kpartx.h @@ -0,0 +1,42 @@ +#ifndef _KPARTX_H +#define _KPARTX_H + +/* + * For each partition type there is a routine that takes + * a block device and a range, and returns the list of + * slices found there in the supplied array SP that can + * hold NS entries. The return value is the number of + * entries stored, or -1 if the appropriate type is not + * present. + */ + +#define likely(x) __builtin_expect(!!(x), 1) +#define unlikely(x) __builtin_expect(!!(x), 0) + +#define safe_sprintf(var, format, args...) 
\ + snprintf(var, sizeof(var), format, ##args) >= sizeof(var) + +/* + * units: 512 byte sectors + */ +struct slice { + unsigned long start; + unsigned long size; +}; + +typedef int (ptreader)(int fd, struct slice all, struct slice *sp, int ns); + +extern ptreader read_dos_pt; +extern ptreader read_bsd_pt; +extern ptreader read_solaris_pt; +extern ptreader read_unixware_pt; +extern ptreader read_gpt_pt; + +char *getblock(int fd, unsigned int secnr); + +static inline int +four2int(unsigned char *p) { + return p[0] + (p[1]<<8) + (p[2]<<16) + (p[3]<<24); +} + +#endif /* _KPARTX_H */ diff --git a/kpartx/lopart.c b/kpartx/lopart.c new file mode 100644 index 0000000..26b0ec1 --- /dev/null +++ b/kpartx/lopart.c @@ -0,0 +1,294 @@ +/* Taken from Ted's losetup.c - Mitch <m.dsouza@mrc-apu.cam.ac.uk> */ +/* Added vfs mount options - aeb - 960223 */ +/* Removed lomount - aeb - 960224 */ + +/* 1999-02-22 Arkadiusz Miśkiewicz <misiek@pld.ORG.PL> + * - added Native Language Support + * Sun Mar 21 1999 - Arnaldo Carvalho de Melo <acme@conectiva.com.br> + * - fixed strerr(errno) in gettext calls + */ + +#define PROC_DEVICES "/proc/devices" + +/* + * losetup.c - setup and control loop devices + */ + +#include "kpartx.h" +#include <stdio.h> +#include <string.h> +#include <ctype.h> +#include <fcntl.h> +#include <errno.h> +#include <stdlib.h> +#include <unistd.h> +#include <sys/ioctl.h> +#include <sys/stat.h> +#include <sys/mman.h> +#include <sysmacros.h> +#include <linux/loop.h> + +#include "lopart.h" +#include "xstrncpy.h" + +#if !defined (__alpha__) && !defined (__ia64__) && !defined (__x86_64__) \ + && !defined (__s390x__) +#define int2ptr(x) ((void *) ((int) x)) +#else +#define int2ptr(x) ((void *) ((long) x)) +#endif + +static char * +xstrdup (const char *s) +{ + char *t; + + if (s == NULL) + return NULL; + + t = strdup (s); + + if (t == NULL) { + fprintf(stderr, "not enough memory"); + exit(1); + } + + return t; +} + +extern int +is_loop_device (const char *device) +{ + struct
stat statbuf; + int loopmajor; +#if 1 + loopmajor = 7; +#else + FILE *procdev; + char line[100], *cp; + + loopmajor = 0; + + if ((procdev = fopen(PROC_DEVICES, "r")) != NULL) { + + while (fgets (line, sizeof(line), procdev)) { + + if ((cp = strstr (line, " loop\n")) != NULL) { + *cp='\0'; + loopmajor=atoi(line); + break; + } + } + + fclose(procdev); + } +#endif + return (loopmajor && stat(device, &statbuf) == 0 && + S_ISBLK(statbuf.st_mode) && + major(statbuf.st_rdev) == loopmajor); +} + +#define SIZE(a) (sizeof(a)/sizeof(a[0])) + +extern char * +find_loop_by_file (const char * filename) +{ + char dev[20]; + char *loop_formats[] = { "/dev/loop%d", "/dev/loop/%d" }; + int i, j, fd; + struct stat statbuf; + struct loop_info loopinfo; + + for (j = 0; j < SIZE(loop_formats); j++) { + + for (i = 0; i < 256; i++) { + sprintf (dev, loop_formats[j], i); + + if (stat (dev, &statbuf) != 0 || + !S_ISBLK(statbuf.st_mode)) + continue; + + fd = open (dev, O_RDONLY); + + if (fd < 0) + break; + + if (ioctl (fd, LOOP_GET_STATUS, &loopinfo) != 0) { + close (fd); + continue; + } + + if (0 == strcmp(filename, loopinfo.lo_name)) { + close (fd); + return xstrdup(dev); /*found */ + } + + close (fd); + continue; + } + } + return NULL; +} + +extern char * +find_unused_loop_device (void) +{ + /* Just creating a device, say in /tmp, is probably a bad idea - + people might have problems with backup or so. + So, we just try /dev/loop[0-7]. 
*/ + + char dev[20]; + char *loop_formats[] = { "/dev/loop%d", "/dev/loop/%d" }; + int i, j, fd, somedev = 0, someloop = 0, loop_known = 0; + struct stat statbuf; + struct loop_info loopinfo; + FILE *procdev; + + for (j = 0; j < SIZE(loop_formats); j++) { + + for(i = 0; i < 256; i++) { + sprintf(dev, loop_formats[j], i); + + if (stat (dev, &statbuf) == 0 && S_ISBLK(statbuf.st_mode)) { + somedev++; + fd = open (dev, O_RDONLY); + + if (fd >= 0) { + + if(ioctl (fd, LOOP_GET_STATUS, &loopinfo) == 0) + someloop++; /* in use */ + + else if (errno == ENXIO) { + close (fd); + return xstrdup(dev);/* probably free */ + } + + close (fd); + } + + /* continue trying as long as devices exist */ + continue; + } + break; + } + } + + /* Nothing found. Why not? */ + if ((procdev = fopen(PROC_DEVICES, "r")) != NULL) { + char line[100]; + + while (fgets (line, sizeof(line), procdev)) + + if (strstr (line, " loop\n")) { + loop_known = 1; + break; + } + + fclose(procdev); + + if (!loop_known) + loop_known = -1; + } + + if (!somedev) + fprintf(stderr, "mount: could not find any device /dev/loop#"); + + else if (!someloop) { + + if (loop_known == 1) + fprintf(stderr, + "mount: Could not find any loop device.\n" + " Maybe /dev/loop# has a wrong major number?"); + + else if (loop_known == -1) + fprintf(stderr, + "mount: Could not find any loop device, and, according to %s,\n" + " this kernel does not know about the loop device.\n" + " (If so, then recompile or `insmod loop.o'.)", + PROC_DEVICES); + + else + fprintf(stderr, + "mount: Could not find any loop device. Maybe this kernel does not know\n" + " about the loop device (then recompile or `insmod loop.o'), or\n" + " maybe /dev/loop# has the wrong major number?"); + + } else + fprintf(stderr, "mount: could not find any free loop device"); + + return 0; +} + +extern int +set_loop (const char *device, const char *file, int offset, int *loopro) +{ + struct loop_info loopinfo; + int fd, ffd, mode; + + mode = (*loopro ? 
O_RDONLY : O_RDWR); + + if ((ffd = open (file, mode)) < 0) { + + if (!*loopro && errno == EROFS) + ffd = open (file, mode = O_RDONLY); + + if (ffd < 0) { + perror (file); + return 1; + } + } + + if ((fd = open (device, mode)) < 0) { + perror (device); + return 1; + } + + *loopro = (mode == O_RDONLY); + memset (&loopinfo, 0, sizeof (loopinfo)); + + xstrncpy (loopinfo.lo_name, file, LO_NAME_SIZE); + loopinfo.lo_offset = offset; + loopinfo.lo_encrypt_type = LO_CRYPT_NONE; + loopinfo.lo_encrypt_key_size = 0; + + if (ioctl (fd, LOOP_SET_FD, int2ptr(ffd)) < 0) { + perror ("ioctl: LOOP_SET_FD"); + close (fd); + close (ffd); + return 1; + } + + if (ioctl (fd, LOOP_SET_STATUS, &loopinfo) < 0) { + (void) ioctl (fd, LOOP_CLR_FD, 0); + perror ("ioctl: LOOP_SET_STATUS"); + close (fd); + close (ffd); + return 1; + } + + close (fd); + close (ffd); + return 0; +} + +extern int +del_loop (const char *device) +{ + int fd; + + if ((fd = open (device, O_RDONLY)) < 0) { + int errsv = errno; + fprintf(stderr, "loop: can't delete device %s: %s\n", + device, strerror (errsv)); + return 1; + } + + if (ioctl (fd, LOOP_CLR_FD, 0) < 0) { + perror ("ioctl: LOOP_CLR_FD"); + close (fd); + return 1; + } + + close (fd); + return 0; +} diff --git a/kpartx/lopart.h b/kpartx/lopart.h new file mode 100644 index 0000000..a512353 --- /dev/null +++ b/kpartx/lopart.h @@ -0,0 +1,6 @@ +extern int verbose; +extern int set_loop (const char *, const char *, int, int *); +extern int del_loop (const char *); +extern int is_loop_device (const char *); +extern char * find_unused_loop_device (void); +extern char * find_loop_by_file (const char *); diff --git a/kpartx/solaris.c b/kpartx/solaris.c new file mode 100644 index 0000000..e3000e9 --- /dev/null +++ b/kpartx/solaris.c @@ -0,0 +1,71 @@ +#include "kpartx.h" +#include <stdio.h> +#include <sys/types.h> +#include <time.h> /* time_t */ + +#define SOLARIS_X86_NUMSLICE 8 +#define SOLARIS_X86_VTOC_SANE (0x600DDEEEUL) + +//typedef int daddr_t; /* or long - check */ + 
+struct solaris_x86_slice { + unsigned short s_tag; /* ID tag of partition */ + unsigned short s_flag; /* permission flags */ + daddr_t s_start; /* start sector no of partition */ + long s_size; /* # of blocks in partition */ +}; + +struct solaris_x86_vtoc { + unsigned long v_bootinfo[3]; /* info for mboot */ + unsigned long v_sanity; /* to verify vtoc sanity */ + unsigned long v_version; /* layout version */ + char v_volume[8]; /* volume name */ + unsigned short v_sectorsz; /* sector size in bytes */ + unsigned short v_nparts; /* number of partitions */ + unsigned long v_reserved[10]; /* free space */ + struct solaris_x86_slice + v_slice[SOLARIS_X86_NUMSLICE]; /* slice headers */ + time_t timestamp[SOLARIS_X86_NUMSLICE]; /* timestamp */ + char v_asciilabel[128]; /* for compatibility */ +}; + +int +read_solaris_pt(int fd, struct slice all, struct slice *sp, int ns) { + struct solaris_x86_vtoc *v; + struct solaris_x86_slice *s; + unsigned int offset = all.start; + int i, n; + char *bp; + + bp = getblock(fd, offset+1); /* 1 sector suffices */ + if (bp == NULL) + return -1; + + v = (struct solaris_x86_vtoc *) bp; + if(v->v_sanity != SOLARIS_X86_VTOC_SANE) + return -1; + + if(v->v_version != 1) { + fprintf(stderr, "Cannot handle solaris version %ld vtoc\n", + v->v_version); + return 0; + } + + for(i=0, n=0; i<SOLARIS_X86_NUMSLICE; i++) { + s = &v->v_slice[i]; + + if (s->s_size == 0) + continue; + if (n < ns) { + sp[n].start = offset + s->s_start; + sp[n].size = s->s_size; + n++; + } else { + fprintf(stderr, + "solaris_x86_partition: too many slices\n"); + break; + } + } + return n; +} + diff --git a/kpartx/sysmacros.h b/kpartx/sysmacros.h new file mode 100644 index 0000000..171b33d --- /dev/null +++ b/kpartx/sysmacros.h @@ -0,0 +1,9 @@ +/* versions to be used with > 16-bit dev_t - leave unused for now */ + +#ifndef major +#define major(dev) ((dev) >> 8) +#endif + +#ifndef minor +#define minor(dev) ((dev) & 0xff) +#endif diff --git a/kpartx/unixware.c b/kpartx/unixware.c
new file mode 100644 index 0000000..41cc957 --- /dev/null +++ b/kpartx/unixware.c @@ -0,0 +1,83 @@ +#include "kpartx.h" +#include <stdio.h> + +#define UNIXWARE_FS_UNUSED 0 +#define UNIXWARE_NUMSLICE 16 +#define UNIXWARE_DISKMAGIC (0xCA5E600D) +#define UNIXWARE_DISKMAGIC2 (0x600DDEEE) + +struct unixware_slice { + unsigned short s_label; /* label */ + unsigned short s_flags; /* permission flags */ + unsigned int start_sect; /* starting sector */ + unsigned int nr_sects; /* number of sectors in slice */ +}; + +struct unixware_disklabel { + unsigned int d_type; /* drive type */ + unsigned char d_magic[4]; /* the magic number */ + unsigned int d_version; /* version number */ + char d_serial[12]; /* serial number of the device */ + unsigned int d_ncylinders; /* # of data cylinders per device */ + unsigned int d_ntracks; /* # of tracks per cylinder */ + unsigned int d_nsectors; /* # of data sectors per track */ + unsigned int d_secsize; /* # of bytes per sector */ + unsigned int d_part_start; /* # of first sector of this partition */ + unsigned int d_unknown1[12]; /* ? */ + unsigned int d_alt_tbl; /* byte offset of alternate table */ + unsigned int d_alt_len; /* byte length of alternate table */ + unsigned int d_phys_cyl; /* # of physical cylinders per device */ + unsigned int d_phys_trk; /* # of physical tracks per cylinder */ + unsigned int d_phys_sec; /* # of physical sectors per track */ + unsigned int d_phys_bytes; /* # of physical bytes per sector */ + unsigned int d_unknown2; /* ? */ + unsigned int d_unknown3; /* ? */ + unsigned int d_pad[8]; /* pad */ + + struct unixware_vtoc { + unsigned char v_magic[4]; /* the magic number */ + unsigned int v_version; /* version number */ + char v_name[8]; /* volume name */ + unsigned short v_nslices; /* # of slices */ + unsigned short v_unknown1; /* ? 
*/ + unsigned int v_reserved[10]; /* reserved */ + struct unixware_slice + v_slice[UNIXWARE_NUMSLICE]; /* slice headers */ + } vtoc; + +}; /* 408 */ + +int +read_unixware_pt(int fd, struct slice all, struct slice *sp, int ns) { + struct unixware_disklabel *l; + struct unixware_slice *p; + unsigned int offset = all.start; + char *bp; + int n = 0; + + bp = getblock(fd, offset+29); /* 1 sector suffices */ + if (bp == NULL) + return -1; + + l = (struct unixware_disklabel *) bp; + if (four2int(l->d_magic) != UNIXWARE_DISKMAGIC || + four2int(l->vtoc.v_magic) != UNIXWARE_DISKMAGIC2) + return -1; + + p = &l->vtoc.v_slice[1]; /* slice 0 is the whole disk. */ + while (p - &l->vtoc.v_slice[0] < UNIXWARE_NUMSLICE) { + if (p->s_label == UNIXWARE_FS_UNUSED) + /* nothing */; + else if (n < ns) { + sp[n].start = p->start_sect; + sp[n].size = p->nr_sects; + n++; + } else { + fprintf(stderr, + "unixware_partition: too many slices\n"); + break; + } + p++; + } + return n; +} diff --git a/kpartx/xstrncpy.c b/kpartx/xstrncpy.c new file mode 100644 index 0000000..7975426 --- /dev/null +++ b/kpartx/xstrncpy.c @@ -0,0 +1,10 @@ +/* NUL-terminated version of strncpy() */ +#include <string.h> +#include "xstrncpy.h" + +/* caller guarantees n > 0 */ +void +xstrncpy(char *dest, const char *src, size_t n) { + strncpy(dest, src, n-1); + dest[n-1] = 0; +} diff --git a/kpartx/xstrncpy.h b/kpartx/xstrncpy.h new file mode 100644 index 0000000..05c8fa2 --- /dev/null +++ b/kpartx/xstrncpy.h @@ -0,0 +1 @@ +extern void xstrncpy(char *dest, const char *src, size_t n); diff --git a/libcheckers/Makefile b/libcheckers/Makefile new file mode 100644 index 0000000..7539ce3 --- /dev/null +++ b/libcheckers/Makefile @@ -0,0 +1,27 @@ +# Makefile +# +# Copyright (C) 2003 Christophe Varoqui, <christophe.varoqui@free.fr> +# +BUILD = glibc + +include ../Makefile.inc + +OBJS = readsector0.o tur.o selector.o emc_clariion.o + +all: $(BUILD) + +prepare: + rm -f core *.o *.gz + +klibc: prepare $(OBJS) + ar rs 
libcheckers-klibc.a *.o + +glibc: prepare $(OBJS) + ar rs libcheckers-glibc.a *.o + +install: + +uninstall: + +clean: + rm -f core *.a *.o *.gz diff --git a/libcheckers/checkers.h b/libcheckers/checkers.h new file mode 100644 index 0000000..a66d894 --- /dev/null +++ b/libcheckers/checkers.h @@ -0,0 +1,26 @@ +#ifndef _CHECKERS_H +#define _CHECKERS_H + +#define CHECKER_NAME_SIZE 16 +#define DEVNODE_SIZE 256 +#define MAX_CHECKER_MSG_SIZE 256 + +enum checkers { + CHECKER_RESERVED, + TUR, + READSECTOR0, + EMC_CLARIION +}; + +#define MSG(a) if (msg != NULL) \ + snprintf(msg, MAX_CHECKER_MSG_SIZE, "%s\n", a); + +int get_checker_id (char *); +void *get_checker_addr (int); +int get_checker_name (char *, int); + +int emc_clariion (int fd, char * msg, void ** ctxt); +int readsector0 (int fd, char * msg, void ** ctxt); +int tur (int fd, char * msg, void ** ctxt); + +#endif /* _CHECKERS_H */ diff --git a/libcheckers/emc_clariion.c b/libcheckers/emc_clariion.c new file mode 100644 index 0000000..790168b --- /dev/null +++ b/libcheckers/emc_clariion.c @@ -0,0 +1,153 @@ +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <unistd.h> +#include <fcntl.h> +#include <sys/ioctl.h> +#include <errno.h> + +#include "path_state.h" +#include "checkers.h" + +#include "../libmultipath/sg_include.h" + +#define INQUIRY_CMD 0x12 +#define INQUIRY_CMDLEN 6 +#define HEAVY_CHECK_COUNT 10 + +struct emc_clariion_checker_context { + int run_count; + char wwn[16]; + unsigned wwn_set; +}; + +int emc_clariion(int fd, char *msg, void **context) +{ + unsigned char sense_buffer[256] = { 0, }; + unsigned char sb[128] = { 0, }; + unsigned char inqCmdBlk[INQUIRY_CMDLEN] = {INQUIRY_CMD, 1, 0xC0, 0, + sizeof(sb), 0}; + struct sg_io_hdr io_hdr; + struct emc_clariion_checker_context * ctxt = NULL; + int ret; + + /* + * caller passed in a context : use its address + */ + if (context) + ctxt = (struct emc_clariion_checker_context *) (*context); + + /* + 
* passed in context is uninitialized or volatile context : + * initialize it + */ + if (!ctxt) { + ctxt = malloc(sizeof(struct emc_clariion_checker_context)); + + if (!ctxt) { + MSG("cannot allocate context"); + return -1; + } + memset(ctxt, 0, sizeof(struct emc_clariion_checker_context)); + if (context) + *context = ctxt; + } + ctxt->run_count++; + + if ((ctxt->run_count % HEAVY_CHECK_COUNT) == 0) { + ctxt->run_count = 0; + /* do stuff */ + } + + if (fd <= 0) { + MSG("no usable fd"); + ret = -1; + goto out; + } + memset(&io_hdr, 0, sizeof (struct sg_io_hdr)); + io_hdr.interface_id = 'S'; + io_hdr.cmd_len = sizeof (inqCmdBlk); + io_hdr.mx_sb_len = sizeof (sb); + io_hdr.dxfer_direction = SG_DXFER_FROM_DEV; + io_hdr.dxfer_len = sizeof (sense_buffer); + io_hdr.dxferp = sense_buffer; + io_hdr.cmdp = inqCmdBlk; + io_hdr.sbp = sb; + io_hdr.timeout = 60000; + io_hdr.pack_id = 0; + if (ioctl(fd, SG_IO, &io_hdr) < 0) { + MSG("emc_clariion_checker: sending query command failed"); + ret = PATH_DOWN; + goto out; + } + if (io_hdr.info & SG_INFO_OK_MASK) { + MSG("emc_clariion_checker: query command indicates error"); + ret = PATH_DOWN; + goto out; + } + if (/* Verify the code page - right page & revision */ + sense_buffer[1] != 0xc0 || sense_buffer[9] != 0x00) { + MSG("emc_clariion_checker: Path unit report page in unknown format"); + ret = PATH_DOWN; + goto out; + } + + if ( /* Effective initiator type */ + sense_buffer[27] != 0x03 + /* Failover mode should be set to 1 */ + || (sense_buffer[28] & 0x07) != 0x04 + /* Arraycommpath should be set to 1 */ + || (sense_buffer[30] & 0x04) != 0x04) { + MSG("emc_clariion_checker: Path not correctly configured for failover"); + ret = PATH_DOWN; + goto out; + } + + if ( /* LUN operations should indicate normal operations */ + sense_buffer[48] != 0x00) { + MSG("emc_clariion_checker: Path not available for normal operations"); + ret = PATH_SHAKY; + goto out; + } + +#if 0 + /* This is not actually an error as the failover to this group + *
_would_ bind the path */ + if ( /* LUN should at least be bound somewhere */ + sense_buffer[4] != 0x00) { + ret = PATH_UP; + goto out; + } +#endif + + /* + * store the LUN WWN there and compare that it indeed did not + * change in between, to protect against the path suddenly + * pointing somewhere else. + */ + if (context && ctxt->wwn_set) { + if (memcmp(ctxt->wwn, &sense_buffer[10], 16) != 0) { + MSG("emc_clariion_checker: Logical Unit WWN has changed!"); + ret = PATH_DOWN; + goto out; + } + } else { + memcpy(ctxt->wwn, &sense_buffer[10], 16); + ctxt->wwn_set = 1; + } + + + MSG("emc_clariion_checker: Path healthy"); + ret = PATH_UP; +out: + /* + * caller told us he doesn't want to keep the context : + * free it + */ + if (!context) + free(ctxt); + + return(ret); +} diff --git a/libcheckers/path_state.h b/libcheckers/path_state.h new file mode 100644 index 0000000..ffdc09c --- /dev/null +++ b/libcheckers/path_state.h @@ -0,0 +1,4 @@ +#define PATH_UNCHECKED 0 +#define PATH_DOWN 1 +#define PATH_UP 2 +#define PATH_SHAKY 3 diff --git a/libcheckers/readsector0.c b/libcheckers/readsector0.c new file mode 100644 index 0000000..50ab467 --- /dev/null +++ b/libcheckers/readsector0.c @@ -0,0 +1,140 @@ +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <unistd.h> +#include <fcntl.h> +#include <sys/ioctl.h> +#include <errno.h> + +#include "path_state.h" +#include "checkers.h" + +#include "../libmultipath/sg_include.h" + +#define SENSE_BUFF_LEN 32 +#define DEF_TIMEOUT 60000 + +#define MSG_READSECTOR0_UP "readsector0 checker reports path is up" +#define MSG_READSECTOR0_DOWN "readsector0 checker reports path is down" + +struct readsector0_checker_context { + void * dummy; +}; + +static int +sg_read (int sg_fd, unsigned char * buff) +{ + /* defaults */ + int blocks = 1; + long long start_block = 0; + int bs = 512; + int cdbsz = 10; + int * diop = NULL; + + unsigned char rdCmd[cdbsz]; + unsigned char 
senseBuff[SENSE_BUFF_LEN]; + struct sg_io_hdr io_hdr; + int res; + int rd_opcode[] = {0x8, 0x28, 0xa8, 0x88}; + int sz_ind; + + memset(rdCmd, 0, cdbsz); + sz_ind = 1; + rdCmd[0] = rd_opcode[sz_ind]; + rdCmd[2] = (unsigned char)((start_block >> 24) & 0xff); + rdCmd[3] = (unsigned char)((start_block >> 16) & 0xff); + rdCmd[4] = (unsigned char)((start_block >> 8) & 0xff); + rdCmd[5] = (unsigned char)(start_block & 0xff); + rdCmd[7] = (unsigned char)((blocks >> 8) & 0xff); + rdCmd[8] = (unsigned char)(blocks & 0xff); + + memset(&io_hdr, 0, sizeof(struct sg_io_hdr)); + io_hdr.interface_id = 'S'; + io_hdr.cmd_len = cdbsz; + io_hdr.cmdp = rdCmd; + io_hdr.dxfer_direction = SG_DXFER_FROM_DEV; + io_hdr.dxfer_len = bs * blocks; + io_hdr.dxferp = buff; + io_hdr.mx_sb_len = SENSE_BUFF_LEN; + io_hdr.sbp = senseBuff; + io_hdr.timeout = DEF_TIMEOUT; + io_hdr.pack_id = (int)start_block; + if (diop && *diop) + io_hdr.flags |= SG_FLAG_DIRECT_IO; + + while (((res = ioctl(sg_fd, SG_IO, &io_hdr)) < 0) && (EINTR == errno)); + + if (res < 0) { + if (ENOMEM == errno) { + return PATH_UP; + } + return PATH_DOWN; + } + + if ((0 == io_hdr.status) && + (0 == io_hdr.host_status) && + (0 == io_hdr.driver_status)) { + return PATH_UP; + } else { + return PATH_DOWN; + } +} + +extern int +readsector0 (int fd, char *msg, void **context) +{ + char buf[512]; + struct readsector0_checker_context * ctxt = NULL; + int ret; + + /* + * caller passed in a context : use its address + */ + if (context) + ctxt = (struct readsector0_checker_context *) (*context); + + /* + * passed in context is uninitialized or volatile context : + * initialize it + */ + if (!ctxt) { + ctxt = malloc(sizeof(struct readsector0_checker_context)); + + if (!ctxt) { + MSG("cannot allocate context"); + return -1; + } + memset(ctxt, 0, sizeof(struct readsector0_checker_context)); + if (context) + *context = ctxt; + } + if (fd <= 0) { + MSG("no usable fd"); + ret = -1; + goto out; + } + ret = sg_read(fd, &buf[0]); + + switch (ret) + { +
case PATH_DOWN: + MSG(MSG_READSECTOR0_DOWN); + break; + case PATH_UP: + MSG(MSG_READSECTOR0_UP); + break; + default: + break; + } +out: + /* + * caller told us he doesn't want to keep the context : + * free it + */ + if (!context) + free(ctxt); + + return ret; +} diff --git a/libcheckers/selector.c b/libcheckers/selector.c new file mode 100644 index 0000000..e759258 --- /dev/null +++ b/libcheckers/selector.c @@ -0,0 +1,64 @@ +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include "checkers.h" + +extern int +get_checker_id (char * str) +{ + if (0 == strncmp(str, "tur", 3)) + return TUR; + if (0 == strncmp(str, "readsector0", 11)) + return READSECTOR0; + if (0 == strncmp(str, "emc_clariion", 12)) + return EMC_CLARIION; + return -1; +} + +extern void * +get_checker_addr (int id) +{ + int (*checker) (int, char *, void **); + + switch (id) { + case TUR: + checker = &tur; + break; + case READSECTOR0: + checker = &readsector0; + break; + case EMC_CLARIION: + checker = &emc_clariion; + break; + default: + checker = NULL; + break; + } + return checker; +} + +extern int +get_checker_name (char * str, int id) +{ + char * s; + + switch (id) { + case TUR: + s = "tur"; + break; + case READSECTOR0: + s = "readsector0"; + break; + case EMC_CLARIION: + s = "emc_clariion"; + break; + default: + s = "undefined"; + break; + } + if (snprintf(str, CHECKER_NAME_SIZE, "%s", s) >= CHECKER_NAME_SIZE) { + fprintf(stderr, "checker_name too small\n"); + return 1; + } + return 0; +} diff --git a/libcheckers/tur.c b/libcheckers/tur.c new file mode 100644 index 0000000..3c76f41 --- /dev/null +++ b/libcheckers/tur.c @@ -0,0 +1,100 @@ +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <unistd.h> +#include <fcntl.h> +#include <sys/ioctl.h> +#include <errno.h> + +#include "path_state.h" +#include "checkers.h" + +#include "../libmultipath/sg_include.h" + +#define TUR_CMD_LEN 6 +#define HEAVY_CHECK_COUNT 10 + +#define 
MSG_TUR_UP "tur checker reports path is up" +#define MSG_TUR_DOWN "tur checker reports path is down" + +struct tur_checker_context { + int run_count; +}; + + +extern int +tur (int fd, char *msg, void **context) +{ + unsigned char turCmdBlk[TUR_CMD_LEN] = { 0x00, 0, 0, 0, 0, 0 }; + struct sg_io_hdr io_hdr; + unsigned char sense_buffer[32]; + struct tur_checker_context * ctxt = NULL; + int ret; + + /* + * caller passed in a context : use its address + */ + if (context) + ctxt = (struct tur_checker_context *) (*context); + + /* + * passed in context is uninitialized or volatile context : + * initialize it + */ + if (!ctxt) { + ctxt = malloc(sizeof(struct tur_checker_context)); + + if (!ctxt) { + MSG("cannot allocate context"); + return -1; + } + memset(ctxt, 0, sizeof(struct tur_checker_context)); + if (context) + *context = ctxt; + } + ctxt->run_count++; + + if ((ctxt->run_count % HEAVY_CHECK_COUNT) == 0) { + ctxt->run_count = 0; + /* do stuff */ + } + if (fd <= 0) { + MSG("no usable fd"); + ret = -1; + goto out; + } + + memset(&io_hdr, 0, sizeof (struct sg_io_hdr)); + io_hdr.interface_id = 'S'; + io_hdr.cmd_len = sizeof (turCmdBlk); + io_hdr.mx_sb_len = sizeof (sense_buffer); + io_hdr.dxfer_direction = SG_DXFER_NONE; + io_hdr.cmdp = turCmdBlk; + io_hdr.sbp = sense_buffer; + io_hdr.timeout = 20000; + io_hdr.pack_id = 0; + if (ioctl(fd, SG_IO, &io_hdr) < 0) { + MSG(MSG_TUR_DOWN); + ret = PATH_DOWN; + goto out; + } + if (io_hdr.info & SG_INFO_OK_MASK) { + MSG(MSG_TUR_DOWN); + ret = PATH_DOWN; + goto out; + } + MSG(MSG_TUR_UP); + ret = PATH_UP; + +out: + /* + * caller told us he doesn't want to keep the context : + * free it + */ + if (!context) + free(ctxt); + + return(ret); +} diff --git a/libmultipath/Makefile b/libmultipath/Makefile new file mode 100644 index 0000000..778297d --- /dev/null +++ b/libmultipath/Makefile @@ -0,0 +1,36 @@ +# Makefile +# +# Copyright (C) 2003 Christophe Varoqui, <christophe.varoqui@free.fr> +# +BUILD = glibc + +include ../Makefile.inc +
+OBJS = memory.o parser.o vector.o devmapper.o callout.o \ + hwtable.o blacklist.o util.o dmparser.o config.o \ + structs.o cache.o discovery.o propsel.o dict.o \ + pgpolicies.o debug.o regex.o defaults.o uevent.o + +CFLAGS = -pipe -g -Wall -Wunused -Wstrict-prototypes + +ifeq ($(strip $(DAEMON)),1) + CFLAGS += -DDAEMON +endif + +all: $(BUILD) + +prepare: + rm -f core *.o *.gz + +klibc: prepare $(OBJS) + ar rs libmultipath-klibc.a *.o + +glibc: prepare $(OBJS) + ar rs libmultipath-glibc.a *.o + +install: + +uninstall: + +clean: + rm -f core *.a *.o *.gz diff --git a/libmultipath/blacklist.c b/libmultipath/blacklist.c new file mode 100644 index 0000000..4ba9ef4 --- /dev/null +++ b/libmultipath/blacklist.c @@ -0,0 +1,111 @@ +#include <stdio.h> + +#include "memory.h" +#include "vector.h" +#include "util.h" +#include "debug.h" +#include "regex.h" +#include "blacklist.h" + +static int +store_ble (vector blist, char * str) +{ + struct blentry * ble; + + if (!str) + return 0; + + ble = (struct blentry *)MALLOC(sizeof(struct blentry)); + + if (!ble) + goto out; + + ble->preg = MALLOC(sizeof(regex_t)); + + if (!ble->preg) + goto out1; + + ble->str = (char *)MALLOC(strlen(str) + 1); + + if (!ble->str) + goto out2; + + strcpy(ble->str, str); + + if (regcomp((regex_t *)ble->preg, ble->str, REG_EXTENDED|REG_NOSUB)) + goto out3; + + if (!vector_alloc_slot(blist)) + goto out3; + + vector_set_slot(blist, ble); + return 0; +out3: + FREE(ble->str); +out2: + FREE(ble->preg); +out1: + FREE(ble); +out: + return 1; +} + +int +setup_default_blist (vector blist) +{ + int r = 0; + + r += store_ble(blist, "(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"); + r += store_ble(blist, "hd[a-z]"); + r += store_ble(blist, "cciss!c[0-9]d[0-9]*"); + + return r; +} + +int +blacklist (vector blist, char * dev) +{ + int i; + struct blentry *ble; + + vector_foreach_slot (blist, ble, i) { + if (!regexec(ble->preg, dev, 0, NULL, 0)) { + condlog(3, "%s blacklisted", dev); + return 1; + } + } + return 0; +} + +int 
+store_regex (vector blist, char * regex) +{ + if (!blist) + return 1; + + if (!regex) + return 1; + + return store_ble(blist, regex); +} + +void +free_blacklist (vector blist) +{ + struct blentry * ble; + int i; + + if (!blist) + return; + + vector_foreach_slot (blist, ble, i) { + if (ble->str) + FREE(ble->str); + + if (ble->preg) + FREE(ble->preg); + + FREE(ble); + } + vector_free(blist); +} diff --git a/libmultipath/blacklist.h b/libmultipath/blacklist.h new file mode 100644 index 0000000..1b0fc05 --- /dev/null +++ b/libmultipath/blacklist.h @@ -0,0 +1,16 @@ +#ifndef _BLACKLIST_H +#define _BLACKLIST_H + +#define BLIST_ENTRY_SIZE 255 + +struct blentry { + char * str; + void * preg; +}; + +int setup_default_blist (vector blist); +int blacklist (vector blist, char * dev); +int store_regex (vector blist, char * regex); +void free_blacklist (vector blist); + +#endif /* _BLACKLIST_H */ diff --git a/libmultipath/cache.c b/libmultipath/cache.c new file mode 100644 index 0000000..de5bb17 --- /dev/null +++ b/libmultipath/cache.c @@ -0,0 +1,129 @@ +#include <stdio.h> +#include <unistd.h> +#include <fcntl.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <string.h> +#include <time.h> + +#include "vector.h" +#include "structs.h" +#include "debug.h" +#include "cache.h" + +static void +revoke_cache_info(struct path * pp) +{ + pp->fd = 0; +} + +static int +lock_fd (int fd, int flag) +{ + struct flock fl; + + fl.l_type = flag; + fl.l_whence = 0; + fl.l_start = 0; + fl.l_len = 0; + + alarm(MAX_WAIT); + + if (fcntl(fd, F_SETLKW, &fl) == -1) { + condlog(0, "can't take a write lease on cache file\n"); + return 1; + } + alarm(0); + return 0; +} + +int +cache_load (vector pathvec) +{ + int fd; + int r = 1; + off_t record_len; + struct path record; + struct path * pp; + + fd = open(CACHE_FILE, O_RDONLY); + + if (fd < 0) + return 1; + + if (lock_fd(fd, F_RDLCK)) + goto out; + + record_len = sizeof(struct path); + + while (read(fd, &record, record_len)) { + pp = alloc_path(); + 
+ if (!pp) + goto out; + + if (!vector_alloc_slot(pathvec)) { + free_path(pp); + goto out; + } + vector_set_slot(pathvec, pp); + memcpy(pp, &record, record_len); + revoke_cache_info(pp); + } + r = 0; + lock_fd(fd, F_UNLCK); +out: + close(fd); + return r; +} + +int +cache_dump (vector pathvec) +{ + int i; + int fd; + int r = 1; + off_t record_len; + struct path * pp; + + fd = open(CACHE_TMPFILE, O_RDWR|O_CREAT, 0600); + + if (fd < 0) + return 1; + + if (lock_fd(fd, F_WRLCK)) + goto out; + + ftruncate(fd, 0); + record_len = sizeof(struct path); + + vector_foreach_slot (pathvec, pp, i) { + if (write(fd, pp, record_len) < record_len) + goto out1; + } + rename(CACHE_TMPFILE, CACHE_FILE); + r = 0; +out1: + lock_fd(fd, F_UNLCK); +out: + close(fd); + return r; +} + +int +cache_cold (int expire) +{ + time_t t; + struct stat s; + + if (time(&t) < 0) + return 1; + + if(stat(CACHE_FILE, &s)) + return 1; + + if ((t - s.st_mtime) < expire) + return 0; + + return 1; +} diff --git a/libmultipath/cache.h b/libmultipath/cache.h new file mode 100644 index 0000000..aafdb4c --- /dev/null +++ b/libmultipath/cache.h @@ -0,0 +1,8 @@ +#define CACHE_FILE "/var/cache/multipath/.multipath.cache" +#define CACHE_TMPFILE "/var/cache/multipath/.multipath.cache.tmp" +#define CACHE_EXPIRE 5 +#define MAX_WAIT 5 + +int cache_load (vector pathvec); +int cache_dump (vector pathvec); +int cache_cold (int expire); diff --git a/libmultipath/callout.c b/libmultipath/callout.c new file mode 100644 index 0000000..f891a69 --- /dev/null +++ b/libmultipath/callout.c @@ -0,0 +1,105 @@ +#include <stdio.h> +#include <sys/stat.h> +#include <string.h> +#include <unistd.h> +#include <sys/types.h> +#include <stdlib.h> +#include <sys/wait.h> +#include <errno.h> + +#define PROGRAM_SIZE 100 +#define FIELD_PROGRAM + +#define strfieldcpy(to, from) \ +do { \ + to[sizeof(to)-1] = '\0'; \ + strncpy(to, from, sizeof(to)-1); \ +} while (0) + +int execute_program(char *path, char *value, int len) +{ + int retval; + int count; + 
int status; + int fds[2]; + pid_t pid; + char *pos; + char arg[PROGRAM_SIZE]; + char *argv[sizeof(arg) / 2]; + int i; + + i = 0; + + if (strchr(path, ' ')) { + strfieldcpy(arg, path); + pos = arg; + while (pos != NULL) { + if (pos[0] == '\'') { + /* don't separate if in apostrophes */ + pos++; + argv[i] = strsep(&pos, "\'"); + while (pos[0] == ' ') + pos++; + } else { + argv[i] = strsep(&pos, " "); + } + i++; + } + } else { + argv[i++] = path; + } + argv[i] = NULL; + + retval = pipe(fds); + + if (retval != 0) + return -1; + + + pid = fork(); + + switch(pid) { + case 0: + /* child */ + close(STDOUT_FILENO); + + /* dup write side of pipe to STDOUT */ + dup(fds[1]); + + retval = execv(argv[0], argv); + + exit(-1); + case -1: + return -1; + default: + /* parent reads from fds[0] */ + close(fds[1]); + retval = 0; + i = 0; + while (1) { + count = read(fds[0], value + i, len - i-1); + if (count <= 0) + break; + + i += count; + if (i >= len-1) { + retval = -1; + break; + } + } + + if (count < 0) + retval = -1; + + if (i > 0 && value[i-1] == '\n') + i--; + value[i] = '\0'; + + wait(&status); + close(fds[0]); + + if (!WIFEXITED(status) || (WEXITSTATUS(status) != 0)) + retval = -1; + } + return retval; +} diff --git a/libmultipath/callout.h b/libmultipath/callout.h new file mode 100644 index 0000000..a6a731d --- /dev/null +++ b/libmultipath/callout.h @@ -0,0 +1 @@ +int execute_program(char *, char *, int); diff --git a/libmultipath/config.c b/libmultipath/config.c new file mode 100644 index 0000000..1de363d --- /dev/null +++ b/libmultipath/config.c @@ -0,0 +1,464 @@ +#include <stdio.h> +#include <string.h> + +#include "memory.h" +#include "util.h" +#include "debug.h" +#include "parser.h" +#include "dict.h" +#include "hwtable.h" +#include "vector.h" +#include "blacklist.h" +#include "defaults.h" +#include "config.h" + +#include "../libcheckers/checkers.h" + +/* + * helper function to draw a list of callout binaries found in the config file + */ +extern int +push_callout(char * 
callout) +{ + int i; + char * bin; + char * p; + + /* + * purge command line arguments + */ + p = callout; + + while (*p != ' ' && *p != '\0') + p++; + + if (!conf->binvec) + conf->binvec = vector_alloc(); + + + if (!conf->binvec) + return 1; + + /* + * if this callout is already stored in binvec, don't store it twice + */ + vector_foreach_slot (conf->binvec, bin, i) + if (memcmp(bin, callout, p - callout) == 0) + return 0; + + /* + * else, store it + */ + bin = MALLOC((p - callout) + 1); + + if (!bin) + return 1; + + strncpy(bin, callout, p - callout); + + if (!vector_alloc_slot(conf->binvec)) + return 1; + + vector_set_slot(conf->binvec, bin); + + return 0; +} + +struct hwentry * +find_hwe (vector hwtable, char * vendor, char * product) +{ + int i; + struct hwentry * hwe; + + vector_foreach_slot (hwtable, hwe, i) { + if (strcmp_chomp(hwe->vendor, vendor) == 0 && + (hwe->product[0] == '*' || + strcmp_chomp(hwe->product, product) == 0)) + return hwe; + } + return NULL; +} + +extern struct mpentry * +find_mpe (char * wwid) +{ + int i; + struct mpentry * mpe; + + if (!wwid) + return NULL; + + vector_foreach_slot (conf->mptable, mpe, i) + if (mpe->wwid && strcmp(mpe->wwid, wwid) == 0) + return mpe; + + return NULL; +} + +extern char * +get_mpe_wwid (char * alias) +{ + int i; + struct mpentry * mpe; + + if (!alias) + return NULL; + + vector_foreach_slot (conf->mptable, mpe, i) + if (mpe->alias && strcmp(mpe->alias, alias) == 0) + return mpe->wwid; + + return NULL; +} + +void +free_hwe (struct hwentry * hwe) +{ + if (!hwe) + return; + + if (hwe->vendor) + FREE(hwe->vendor); + + if (hwe->product) + FREE(hwe->product); + + if (hwe->selector) + FREE(hwe->selector); + + if (hwe->getuid) + FREE(hwe->getuid); + + if (hwe->getprio) + FREE(hwe->getprio); + + if (hwe->features) + FREE(hwe->features); + + if (hwe->hwhandler) + FREE(hwe->hwhandler); + + FREE(hwe); +} + +void +free_hwtable (vector hwtable) +{ + int i; + struct hwentry * hwe; + + if (!hwtable) + return; + + 
vector_foreach_slot (hwtable, hwe, i) + free_hwe(hwe); + + vector_free(hwtable); +} + +void +free_mpe (struct mpentry * mpe) +{ + if (!mpe) + return; + + if (mpe->wwid) + FREE(mpe->wwid); + + if (mpe->selector) + FREE(mpe->selector); + + if (mpe->getuid) + FREE(mpe->getuid); + + if (mpe->alias) + FREE(mpe->alias); + + FREE(mpe); +} + +void +free_mptable (vector mptable) +{ + int i; + struct mpentry * mpe; + + if (!mptable) + return; + + vector_foreach_slot (mptable, mpe, i) + free_mpe(mpe); + + vector_free(mptable); +} + +static struct hwentry * +alloc_hwe (void) +{ + return (struct hwentry *)MALLOC(sizeof(struct hwentry)); +} + +static char * +set_param_str(char * str) +{ + char * dst; + int len; + + if (!str) + return NULL; + + len = strlen(str); + + if (!len) + return NULL; + + dst = (char *)MALLOC(len + 1); + + if (!dst) + return NULL; + + strcpy(dst, str); + return dst; +} + +int +store_hwe (vector hwtable, char * vendor, char * product, int pgp, + char * getuid) +{ + struct hwentry * hwe; + + hwe = alloc_hwe(); + + if (!hwe) + return 1; + + hwe->vendor = set_param_str(vendor); + + if (!hwe->vendor) + goto out; + + hwe->product = set_param_str(product); + + if (!hwe->product) + goto out; + + if (pgp) + hwe->pgpolicy = pgp; + + if (getuid) { + hwe->getuid = set_param_str(getuid); + push_callout(getuid); + } else { + hwe->getuid = set_default(DEFAULT_GETUID); + push_callout(DEFAULT_GETUID); + } + + if (!hwe->getuid) + goto out; + + if (!vector_alloc_slot(hwtable)) + goto out; + + vector_set_slot(hwtable, hwe); + return 0; +out: + free_hwe(hwe); + return 1; +} + +int +store_hwe_ext (vector hwtable, char * vendor, char * product, int pgp, + char * getuid, char * getprio, char * hwhandler, + char * features, char * checker) +{ + struct hwentry * hwe; + + hwe = alloc_hwe(); + + if (!hwe) + return 1; + + hwe->vendor = set_param_str(vendor); + + if (!hwe->vendor) + goto out; + + hwe->product = set_param_str(product); + + if (!hwe->product) + goto out; + + if (pgp) + 
hwe->pgpolicy = pgp; + + if (getuid) { + hwe->getuid = set_param_str(getuid); + push_callout(getuid); + } else { + hwe->getuid = set_default(DEFAULT_GETUID); + push_callout(DEFAULT_GETUID); + } + + if (!hwe->getuid) + goto out; + + if (getprio) { + hwe->getprio = set_param_str(getprio); + push_callout(getprio); + } else + hwe->getprio = NULL; + + if (hwhandler) + hwe->hwhandler = set_param_str(hwhandler); + else + hwe->hwhandler = set_default(DEFAULT_HWHANDLER); + + if (!hwe->hwhandler) + goto out; + + if (features) + hwe->features = set_param_str(features); + else + hwe->features = set_default(DEFAULT_FEATURES); + + if (!hwe->features) + goto out; + + if (checker) + hwe->checker_index = get_checker_id(checker); + else + hwe->checker_index = get_checker_id(DEFAULT_CHECKER); + + if (!vector_alloc_slot(hwtable)) + goto out; + + vector_set_slot(hwtable, hwe); + return 0; +out: + free_hwe(hwe); + return 1; +} + +struct config * +alloc_config (void) +{ + return (struct config *)MALLOC(sizeof(struct config)); +} + +void +free_config (struct config * conf) +{ + if (!conf) + return; + + if (conf->dev) + FREE(conf->dev); + + if (conf->multipath) + FREE(conf->multipath); + + if (conf->udev_dir) + FREE(conf->udev_dir); + + if (conf->default_selector) + FREE(conf->default_selector); + + if (conf->default_getuid) + FREE(conf->default_getuid); + + if (conf->default_getprio) + FREE(conf->default_getprio); + + if (conf->default_features) + FREE(conf->default_features); + + if (conf->default_hwhandler) + FREE(conf->default_hwhandler); + + free_blacklist(conf->blist); + free_mptable(conf->mptable); + free_hwtable(conf->hwtable); + free_strvec(conf->binvec); + + FREE(conf); +} + +int +load_config (char * file) +{ + conf = alloc_config(); + + if (!conf) + return 1; + + /* + * internal defaults + */ + conf->verbosity = 2; + conf->signal = 1; /* 1 == Send a signal to multipathd */ + conf->dev_type = DEV_NONE; + conf->minio = 1000; + + /* + * read the config file + */ + if 
(filepresent(file)) { + if (init_data(file, init_keywords)) { + condlog(0, "error parsing config file"); + goto out; + } + } + + /* + * fill the voids left in the config file + */ + if (conf->hwtable == NULL) { + conf->hwtable = vector_alloc(); + + if (!conf->hwtable) + goto out; + + if (setup_default_hwtable(conf->hwtable)) + goto out; + } + if (conf->blist == NULL) { + conf->blist = vector_alloc(); + + if (!conf->blist) + goto out; + + if (setup_default_blist(conf->blist)) + goto out; + } + if (conf->mptable == NULL) { + conf->mptable = vector_alloc(); + + if (!conf->mptable) + goto out; + } + if (conf->default_selector == NULL) + conf->default_selector = set_default(DEFAULT_SELECTOR); + + if (conf->udev_dir == NULL) + conf->udev_dir = set_default(DEFAULT_UDEVDIR); + + if (conf->default_getuid == NULL) + conf->default_getuid = set_default(DEFAULT_GETUID); + + if (conf->default_features == NULL) + conf->default_features = set_default(DEFAULT_FEATURES); + + if (conf->default_hwhandler == NULL) + conf->default_hwhandler = set_default(DEFAULT_HWHANDLER); + + if (!conf->default_selector || !conf->udev_dir || + !conf->default_getuid || !conf->default_features || + !conf->default_hwhandler) + goto out; + + return 0; +out: + free_config(conf); + return 1; +} + diff --git a/libmultipath/config.h b/libmultipath/config.h new file mode 100644 index 0000000..455ee25 --- /dev/null +++ b/libmultipath/config.h @@ -0,0 +1,90 @@ +#ifndef _CONFIG_H +#define _CONFIG_H + +#ifndef _VECTOR_H +#include "vector.h" +#endif + +enum devtypes { + DEV_NONE, + DEV_DEVT, + DEV_DEVNODE, + DEV_DEVMAP +}; + +struct hwentry { + int selector_args; + int pgpolicy; + int checker_index; + + char * vendor; + char * product; + char * selector; + char * getuid; + char * getprio; + char * features; + char * hwhandler; +}; + +struct mpentry { + int selector_args; + int pgpolicy; + + char * wwid; + char * selector; + char * getuid; + char * alias; +}; + +struct config { + int verbosity; + int dry_run; + int 
list; + int signal; + int pgpolicy_flag; + int with_sysfs; + int default_selector_args; + int default_pgpolicy; + int dev_type; + int minio; + int checkint; + + char * dev; + char * multipath; + char * udev_dir; + char * default_selector; + char * default_getuid; + char * default_getprio; + char * default_features; + char * default_hwhandler; + + vector mptable; + vector hwtable; + vector blist; + vector binvec; +}; + +struct config * conf; + +extern int push_callout(char * callout); + +struct hwentry * find_hwe (vector hwtable, char * vendor, char * product); +struct mpentry * find_mpe (char * wwid); +char * get_mpe_wwid (char * alias); + +void free_hwe (struct hwentry * hwe); +void free_hwtable (vector hwtable); +void free_mpe (struct mpentry * mpe); +void free_mptable (vector mptable); + +int store_hwe (vector hwtable, char * vendor, char * product, int pgp, + char * getuid); +int store_hwe_ext (vector hwtable, char * vendor, char * product, int pgp, + char * getuid, char * getprio, char * hwhandler, + char * features, char * checker); + +int load_config (char * file); +struct config * alloc_config (void); +void free_config (struct config * conf); + +#endif diff --git a/libmultipath/debug.c b/libmultipath/debug.c new file mode 100644 index 0000000..dc69d6f --- /dev/null +++ b/libmultipath/debug.c @@ -0,0 +1,24 @@ +#include <stdio.h> +#include <stdlib.h> +#include <stdarg.h> + +#include "config.h" + +void condlog (int prio, char * fmt, ...) +{ + va_list ap; + int thres; + + if (!conf) + thres = 0; + else + thres = conf->verbosity; + + va_start(ap, fmt); + + if (prio <= thres) { + vfprintf(stdout, fmt, ap); + fprintf(stdout, "\n"); + } + va_end(ap); +} diff --git a/libmultipath/debug.h b/libmultipath/debug.h new file mode 100644 index 0000000..727a9fd --- /dev/null +++ b/libmultipath/debug.h @@ -0,0 +1,8 @@ +void condlog (int prio, char * fmt, ...); + +#if DAEMON +#include <pthread.h> +#include "../multipathd/log_pthread.h" +#define condlog(prio, fmt, args...) 
\ + log_safe(prio + 3, fmt, ##args) +#endif diff --git a/libmultipath/defaults.c b/libmultipath/defaults.c new file mode 100644 index 0000000..3b8ecff --- /dev/null +++ b/libmultipath/defaults.c @@ -0,0 +1,20 @@ +#include <string.h> + +#include "memory.h" + +char * +set_default (char * str) +{ + int len; + char * p; + + len = strlen(str); + p = MALLOC(len + 1); + + if (!p) + return NULL; + + strncat(p, str, len); + + return p; +} diff --git a/libmultipath/defaults.h b/libmultipath/defaults.h new file mode 100644 index 0000000..9a8178d --- /dev/null +++ b/libmultipath/defaults.h @@ -0,0 +1,13 @@ +#define DEFAULT_GETUID "/sbin/scsi_id -g -u -s /block/%n" +#define DEFAULT_UDEVDIR "/dev" +#define DEFAULT_SELECTOR "round-robin 0" +#define DEFAULT_FEATURES "0" +#define DEFAULT_HWHANDLER "0" +#define DEFAULT_CHECKER "readsector0" + +#define DEFAULT_TARGET "multipath" +#define DEFAULT_PIDFILE "/var/run/multipathd.pid" +#define DEFAULT_RUNFILE "/var/run/multipath.run" +#define DEFAULT_CONFIGFILE "/etc/multipath.conf" + +char * set_default (char * str); diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c new file mode 100644 index 0000000..98739ac --- /dev/null +++ b/libmultipath/devmapper.c @@ -0,0 +1,485 @@ +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <libdevmapper.h> +#include <ctype.h> +#include <linux/kdev_t.h> + +#include "vector.h" +#include "structs.h" +#include "debug.h" +#include "memory.h" + +extern int +dm_prereq (char * str, int x, int y, int z) +{ + int r = 1; + struct dm_task *dmt; + struct dm_versions *target; + struct dm_versions *last_target; + + if (!(dmt = dm_task_create(DM_DEVICE_LIST_VERSIONS))) + return 1; + + dm_task_no_open_count(dmt); + + if (!dm_task_run(dmt)) + goto out; + + target = dm_task_get_versions(dmt); + + /* Fetch targets and print 'em */ + do { + last_target = target; + + if (!strncmp(str, target->name, strlen(str)) && + /* dummy prereq on multipath version */ + target->version[0] >= x && + 
target->version[1] >= y && + target->version[2] >= z + ) + r = 0; + + target = (void *) target + target->next; + } while (last_target != target); + + out: + dm_task_destroy(dmt); + return r; +} + +extern int +dm_simplecmd (int task, const char *name) { + int r = 0; + struct dm_task *dmt; + + if (!(dmt = dm_task_create (task))) + return 0; + + if (!dm_task_set_name (dmt, name)) + goto out; + + dm_task_no_open_count(dmt); + + r = dm_task_run (dmt); + + out: + dm_task_destroy (dmt); + return r; +} + +extern int +dm_addmap (int task, const char *name, const char *target, + const char *params, unsigned long size) { + int r = 0; + struct dm_task *dmt; + + if (!(dmt = dm_task_create (task))) + return 0; + + if (!dm_task_set_name (dmt, name)) + goto addout; + + if (!dm_task_add_target (dmt, 0, size, target, params)) + goto addout; + + dm_task_no_open_count(dmt); + + r = dm_task_run (dmt); + + addout: + dm_task_destroy (dmt); + return r; +} + +extern int +dm_map_present (char * str) +{ + int r = 0; + struct dm_task *dmt; + struct dm_info info; + + if (!(dmt = dm_task_create(DM_DEVICE_INFO))) + return 0; + + if (!dm_task_set_name(dmt, str)) + goto out; + + dm_task_no_open_count(dmt); + + if (!dm_task_run(dmt)) + goto out; + + if (!dm_task_get_info(dmt, &info)) + goto out; + + if (info.exists) + r = 1; +out: + dm_task_destroy(dmt); + return r; +} + +extern int +dm_get_map(char * name, unsigned long * size, char * outparams) +{ + int r = 1; + struct dm_task *dmt; + void *next = NULL; + uint64_t start, length; + char *target_type = NULL; + char *params = NULL; + + if (!(dmt = dm_task_create(DM_DEVICE_TABLE))) + return 1; + + if (!dm_task_set_name(dmt, name)) + goto out; + + dm_task_no_open_count(dmt); + + if (!dm_task_run(dmt)) + goto out; + + /* Fetch 1st target */ + next = dm_get_next_target(dmt, next, &start, &length, + &target_type, &params); + + if (size) + *size = length; + + if (snprintf(outparams, PARAMS_SIZE, "%s", params) <= PARAMS_SIZE) + r = 0; +out: + 
dm_task_destroy(dmt); + return r; +} + +extern int +dm_get_status(char * name, char * outstatus) +{ + int r = 1; + struct dm_task *dmt; + void *next = NULL; + uint64_t start, length; + char *target_type; + char *status; + + if (!(dmt = dm_task_create(DM_DEVICE_STATUS))) + return 1; + + if (!dm_task_set_name(dmt, name)) + goto out; + + dm_task_no_open_count(dmt); + + if (!dm_task_run(dmt)) + goto out; + + /* Fetch 1st target */ + next = dm_get_next_target(dmt, next, &start, &length, + &target_type, &status); + + if (snprintf(outstatus, PARAMS_SIZE, "%s", status) <= PARAMS_SIZE) + r = 0; +out: + dm_task_destroy(dmt); + return r; +} + +extern int +dm_type(char * name, char * type) +{ + int r = 0; + struct dm_task *dmt; + void *next = NULL; + uint64_t start, length; + char *target_type = NULL; + char *params; + + if (!(dmt = dm_task_create(DM_DEVICE_TABLE))) + return 0; + + if (!dm_task_set_name(dmt, name)) + goto out; + + dm_task_no_open_count(dmt); + + if (!dm_task_run(dmt)) + goto out; + + /* Fetch 1st target */ + next = dm_get_next_target(dmt, next, &start, &length, + &target_type, &params); + + if (0 == strcmp(target_type, type)) + r = 1; + +out: + dm_task_destroy(dmt); + return r; +} + +int +dm_get_opencount (char * mapname) +{ + int r = -1; + struct dm_task *dmt; + struct dm_info info; + + if (!(dmt = dm_task_create(DM_DEVICE_INFO))) + return 0; + + if (!dm_task_set_name(dmt, mapname)) + goto out; + + if (!dm_task_run(dmt)) + goto out; + + if (!dm_task_get_info(dmt, &info)) + goto out; + + r = info.open_count; +out: + dm_task_destroy(dmt); + return r; +} + +extern int +dm_flush_maps (char * type) +{ + int r = 0; + struct dm_task *dmt; + struct dm_names *names; + unsigned next = 0; + + if (!(dmt = dm_task_create (DM_DEVICE_LIST))) + return 0; + + dm_task_no_open_count(dmt); + + if (!dm_task_run (dmt)) + goto out; + + if (!(names = dm_task_get_names (dmt))) + goto out; + + if (!names->dev) + goto out; + + do { + if (dm_type(names->name, type) && + 
dm_get_opencount(names->name) == 0 && + !dm_simplecmd(DM_DEVICE_REMOVE, names->name)) + r++; + + next = names->next; + names = (void *) names + next; + } while (next); + + out: + dm_task_destroy (dmt); + return r; +} + +int +dm_fail_path(char * mapname, char * path) +{ + int r = 1; + struct dm_task *dmt; + char str[32]; + + if (!(dmt = dm_task_create(DM_DEVICE_TARGET_MSG))) + return 1; + + if (!dm_task_set_name(dmt, mapname)) + goto out; + + if (!dm_task_set_sector(dmt, 0)) + goto out; + + if (snprintf(str, 32, "fail_path %s\n", path) > 32) + goto out; + + if (!dm_task_set_message(dmt, str)) + goto out; + + dm_task_no_open_count(dmt); + + if (!dm_task_run(dmt)) + goto out; + + r = 0; +out: + dm_task_destroy(dmt); + return r; +} + +int +dm_reinstate(char * mapname, char * path) +{ + int r = 1; + struct dm_task *dmt; + char str[32]; + + if (!(dmt = dm_task_create(DM_DEVICE_TARGET_MSG))) + return 1; + + if (!dm_task_set_name(dmt, mapname)) + goto out; + + if (!dm_task_set_sector(dmt, 0)) + goto out; + + if (snprintf(str, 32, "reinstate_path %s\n", path) > 32) + goto out; + + if (!dm_task_set_message(dmt, str)) + goto out; + + dm_task_no_open_count(dmt); + + if (!dm_task_run(dmt)) + goto out; + + r = 0; +out: + dm_task_destroy(dmt); + return r; +} + +int +dm_switchgroup(char * mapname, int index) +{ + int r = 0; + struct dm_task *dmt; + char str[24]; + + if (!(dmt = dm_task_create(DM_DEVICE_TARGET_MSG))) + return 0; + + if (!dm_task_set_name(dmt, mapname)) + goto out; + + if (!dm_task_set_sector(dmt, 0)) + goto out; + + snprintf(str, 24, "switch_group %i\n", index); + condlog(3, "message %s 0 %s", mapname, str); + + if (!dm_task_set_message(dmt, str)) + goto out; + + dm_task_no_open_count(dmt); + + if (!dm_task_run(dmt)) + goto out; + + r = 1; + + out: + dm_task_destroy(dmt); + + return r; +} + +int +dm_get_maps (vector mp, char * type) +{ + struct multipath * mpp; + int r = 1; + struct dm_task *dmt; + struct dm_names *names; + unsigned next = 0; + + if (!type || !mp) 
+ return 1; + + if (!(dmt = dm_task_create(DM_DEVICE_LIST))) + return 1; + + dm_task_no_open_count(dmt); + + if (!dm_task_run(dmt)) + goto out; + + if (!(names = dm_task_get_names(dmt))) + goto out; + + if (!names->dev) { + r = 0; /* this is perfectly valid */ + goto out; + } + + do { + if (dm_type(names->name, type)) { + mpp = (struct multipath *) + MALLOC(sizeof(struct multipath)); + + if (!mpp) + goto out; + + if (dm_get_map(names->name, &mpp->size, mpp->params)) + goto out1; + + if (dm_get_status(names->name, mpp->status)) + goto out1; + + mpp->alias = MALLOC(strlen(names->name) + 1); + + if (!mpp->alias) + goto out1; + + strncat(mpp->alias, names->name, strlen(names->name)); + + if (!vector_alloc_slot(mp)) + goto out1; + + vector_set_slot(mp, mpp); + mpp = NULL; + } + next = names->next; + names = (void *) names + next; + } while (next); + + r = 0; + goto out; +out1: + free_multipath(mpp, KEEP_PATHS); +out: + dm_task_destroy (dmt); + return r; +} + +int +dm_geteventnr (char *name) +{ + struct dm_task *dmt; + struct dm_info info; + + if (!(dmt = dm_task_create(DM_DEVICE_INFO))) + return 0; + + if (!dm_task_set_name(dmt, name)) + goto out; + + dm_task_no_open_count(dmt); + + if (!dm_task_run(dmt)) + goto out; + + if (!dm_task_get_info(dmt, &info)) { + info.event_nr = 0; + goto out; + } + + if (!info.exists) { + info.event_nr = 0; + goto out; + } + +out: + dm_task_destroy(dmt); + + return info.event_nr; +} diff --git a/libmultipath/devmapper.h b/libmultipath/devmapper.h new file mode 100644 index 0000000..207ad4f --- /dev/null +++ b/libmultipath/devmapper.h @@ -0,0 +1,13 @@ +int dm_prereq (char *, int, int, int); +int dm_simplecmd (int, const char *); +int dm_addmap (int, const char *, const char *, const char *, unsigned long); +int dm_map_present (char *); +int dm_get_map(char *, unsigned long *, char *); +int dm_get_status(char *, char *); +int dm_type(char *, char *); +int dm_flush_maps (char *); +int dm_fail_path(char * mapname, char * path); +int 
dm_reinstate(char * mapname, char * path); +int dm_switchgroup(char * mapname, int index); +int dm_get_maps (vector mp, char * type); +int dm_geteventnr (char *name); diff --git a/libmultipath/dict.c b/libmultipath/dict.c new file mode 100644 index 0000000..2c1e8ba --- /dev/null +++ b/libmultipath/dict.c @@ -0,0 +1,492 @@ +#include "vector.h" +#include "hwtable.h" +#include "structs.h" +#include "parser.h" +#include "config.h" +#include "debug.h" +#include "memory.h" +#include "pgpolicies.h" +#include "blacklist.h" + +#include "../libcheckers/checkers.h" + +/* + * default block handlers + */ +static int +multipath_tool_handler(vector strvec) +{ + conf->multipath = set_value(strvec); + + if (!conf->multipath) + return 1; + + return push_callout(conf->multipath); +} + +static int +polling_interval_handler(vector strvec) +{ + char * buff; + + buff = VECTOR_SLOT(strvec, 1); + conf->checkint = atoi(buff); + + return 0; +} + +static int +udev_dir_handler(vector strvec) +{ + conf->udev_dir = set_value(strvec); + + if (!conf->udev_dir) + return 1; + + return 0; +} + +static int +def_selector_handler(vector strvec) +{ + conf->default_selector = set_value(strvec); + + if (!conf->default_selector) + return 1; + + return 0; +} + +static int +def_pgpolicy_handler(vector strvec) +{ + char * buff; + + buff = set_value(strvec); + + if (!buff) + return 1; + + conf->default_pgpolicy = get_pgpolicy_id(buff); + FREE(buff); + + return 0; +} + +static int +def_getuid_callout_handler(vector strvec) +{ + conf->default_getuid = set_value(strvec); + + if (!conf->default_getuid) + return 1; + + return push_callout(conf->default_getuid); +} + +static int +def_prio_callout_handler(vector strvec) +{ + conf->default_getprio = set_value(strvec); + + if (!conf->default_getprio) + return 1; + + if (!strncmp(conf->default_getprio, "none", 4)) { + FREE(conf->default_getprio); + conf->default_getprio = NULL; + return 0; + } + + return push_callout(conf->default_getprio); +} + +static int 
+def_features_handler(vector strvec) +{ + conf->default_features = set_value(strvec); + + if (!conf->default_features) + return 1; + + return 0; +} + +static int +def_minio_handler(vector strvec) +{ + char * buff; + + buff = set_value(strvec); + + if (!buff) + return 1; + + conf->minio = atoi(buff); + FREE(buff); + + return 0; +} + +/* + * blacklist block handlers + */ +static int +blacklist_handler(vector strvec) +{ + conf->blist = vector_alloc(); + + if (!conf->blist) + return 1; + + return 0; +} + +static int +ble_handler(vector strvec) +{ + char * buff; + int ret; + + buff = set_value(strvec); + + if (!buff) + return 1; + + ret = store_regex(conf->blist, buff); + FREE(buff); + + return ret; +} + +/* + * devices block handlers + */ +static int +devices_handler(vector strvec) +{ + conf->hwtable = vector_alloc(); + + if (!conf->hwtable) + return 1; + + return 0; +} + +static int +device_handler(vector strvec) +{ + struct hwentry * hwe; + + hwe = (struct hwentry *)MALLOC(sizeof(struct hwentry)); + + if (!hwe) + return 1; + + if (!vector_alloc_slot(conf->hwtable)) { + FREE(hwe); + return 1; + } + vector_set_slot(conf->hwtable, hwe); + + return 0; +} + +static int +vendor_handler(vector strvec) +{ + struct hwentry * hwe = VECTOR_LAST_SLOT(conf->hwtable); + + if (!hwe) + return 1; + + hwe->vendor = set_value(strvec); + + if (!hwe->vendor) + return 1; + + return 0; +} + +static int +product_handler(vector strvec) +{ + struct hwentry * hwe = VECTOR_LAST_SLOT(conf->hwtable); + + if (!hwe) + return 1; + + hwe->product = set_value(strvec); + + if (!hwe->product) + return 1; + + return 0; +} + +static int +hw_pgpolicy_handler(vector strvec) +{ + char * buff; + struct hwentry * hwe = VECTOR_LAST_SLOT(conf->hwtable); + + buff = set_value(strvec); + + if (!buff) + return 1; + + hwe->pgpolicy = get_pgpolicy_id(buff); + FREE(buff); + + return 0; +} + +static int +hw_getuid_callout_handler(vector strvec) +{ + struct hwentry * hwe = VECTOR_LAST_SLOT(conf->hwtable); + + hwe->getuid 
= set_value(strvec); + + if (!hwe->getuid) + return 1; + + return push_callout(hwe->getuid); +} + +static int +hw_selector_handler(vector strvec) +{ + struct hwentry * hwe = VECTOR_LAST_SLOT(conf->hwtable); + + if (!hwe) + return 1; + + hwe->selector = set_value(strvec); + + if (!hwe->selector) + return 1; + + return 0; +} + +static int +hw_path_checker_handler(vector strvec) +{ + char * buff; + struct hwentry * hwe = VECTOR_LAST_SLOT(conf->hwtable); + + if (!hwe) + return 1; + + buff = set_value(strvec); + + if (!buff) + return 1; + + hwe->checker_index = get_checker_id(buff); + FREE(buff); + + return 0; +} + +static int +hw_features_handler(vector strvec) +{ + struct hwentry * hwe = VECTOR_LAST_SLOT(conf->hwtable); + + if (!hwe) + return 1; + + hwe->features = set_value(strvec); + + if (!hwe->features) + return 1; + + return 0; +} + +static int +hw_handler_handler(vector strvec) +{ + struct hwentry * hwe = VECTOR_LAST_SLOT(conf->hwtable); + + if (!hwe) + return 1; + + hwe->hwhandler = set_value(strvec); + + if (!hwe->hwhandler) + return 1; + + return 0; +} + +static int +prio_callout_handler(vector strvec) +{ + struct hwentry * hwe = VECTOR_LAST_SLOT(conf->hwtable); + + if (!hwe) + return 1; + + hwe->getprio = set_value(strvec); + + if (!hwe->getprio) + return 1; + + if (!strncmp(hwe->getprio, "none", 4)) { + FREE(hwe->getprio); + hwe->getprio = NULL; + return 0; + } + + return push_callout(hwe->getprio); +} + +/* + * multipaths block handlers + */ +static int +multipaths_handler(vector strvec) +{ + conf->mptable = vector_alloc(); + + if (!conf->mptable) + return 1; + + return 0; +} + +static int +multipath_handler(vector strvec) +{ + struct mpentry * mpe; + + mpe = (struct mpentry *)MALLOC(sizeof(struct mpentry)); + + if (!mpe) + return 1; + + if (!vector_alloc_slot(conf->mptable)) { + FREE(mpe); + return 1; + } + vector_set_slot(conf->mptable, mpe); + + return 0; +} + +static int +wwid_handler(vector strvec) +{ + struct mpentry * mpe = 
VECTOR_LAST_SLOT(conf->mptable); + + if (!mpe) + return 1; + + mpe->wwid = set_value(strvec); + + if (!mpe->wwid) + return 1; + + return 0; +} + +static int +alias_handler(vector strvec) +{ + struct mpentry * mpe = VECTOR_LAST_SLOT(conf->mptable); + + if (!mpe) + return 1; + + mpe->alias = set_value(strvec); + + if (!mpe->alias) + return 1; + + return 0; +} + +static int +mp_pgpolicy_handler(vector strvec) +{ + char * buff; + struct mpentry * mpe = VECTOR_LAST_SLOT(conf->mptable); + + if (!mpe) + return 1; + + buff = set_value(strvec); + + if (!buff) + return 1; + + mpe->pgpolicy = get_pgpolicy_id(buff); + FREE(buff); + + return 0; +} + +static int +mp_selector_handler(vector strvec) +{ + struct mpentry * mpe = VECTOR_LAST_SLOT(conf->mptable); + + if (!mpe) + return 1; + + mpe->selector = set_value(strvec); + + if (!mpe->selector) + return 1; + + return 0; +} + +vector +init_keywords(void) +{ + keywords = vector_alloc(); + + install_keyword_root("defaults", NULL); + install_keyword("polling_interval", &polling_interval_handler); + install_keyword("multipath_tool", &multipath_tool_handler); + install_keyword("udev_dir", &udev_dir_handler); + install_keyword("default_selector", &def_selector_handler); + install_keyword("default_path_grouping_policy", &def_pgpolicy_handler); + install_keyword("default_getuid_callout", &def_getuid_callout_handler); + install_keyword("default_prio_callout", &def_prio_callout_handler); + install_keyword("default_features", &def_features_handler); + install_keyword("rr_min_io", &def_minio_handler); + + install_keyword_root("devnode_blacklist", &blacklist_handler); + install_keyword("devnode", &ble_handler); + install_keyword("wwid", &ble_handler); + + install_keyword_root("devices", &devices_handler); + install_keyword("device", &device_handler); + install_sublevel(); + install_keyword("vendor", &vendor_handler); + install_keyword("product", &product_handler); + install_keyword("path_grouping_policy", &hw_pgpolicy_handler); + 
install_keyword("getuid_callout", &hw_getuid_callout_handler); + install_keyword("path_selector", &hw_selector_handler); + install_keyword("path_checker", &hw_path_checker_handler); + install_keyword("features", &hw_features_handler); + install_keyword("hardware_handler", &hw_handler_handler); + install_keyword("prio_callout", &prio_callout_handler); + install_sublevel_end(); + + install_keyword_root("multipaths", &multipaths_handler); + install_keyword("multipath", &multipath_handler); + install_sublevel(); + install_keyword("wwid", &wwid_handler); + install_keyword("alias", &alias_handler); + install_keyword("path_grouping_policy", &mp_pgpolicy_handler); + install_keyword("path_selector", &mp_selector_handler); + install_sublevel_end(); + + return keywords; +} diff --git a/libmultipath/dict.h b/libmultipath/dict.h new file mode 100644 index 0000000..ac35edc --- /dev/null +++ b/libmultipath/dict.h @@ -0,0 +1,10 @@ +#ifndef _DICT_H +#define _DICT_H + +#ifndef _VECTOR_H +#include "vector.h" +#endif + +vector init_keywords(void); + +#endif /* _DICT_H */ diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c new file mode 100644 index 0000000..87860eb --- /dev/null +++ b/libmultipath/discovery.c @@ -0,0 +1,531 @@ +#include <stdio.h> +#include <unistd.h> +#include <fcntl.h> +#include <sys/ioctl.h> + +#include <sysfs/dlist.h> +#include <sysfs/libsysfs.h> + +#include "vector.h" +#include "memory.h" +#include "blacklist.h" +#include "util.h" +#include "structs.h" +#include "callout.h" +#include "config.h" +#include "debug.h" +#include "propsel.h" +#include "sg_include.h" +#include "discovery.h" + +#define readattr(a,b) \ + sysfs_read_attribute_value(a, b, sizeof(b)) + +int +store_pathinfo (vector pathvec, vector hwtable, char * devname) +{ + struct path * pp; + + pp = alloc_path(); + + if (!pp) + return 1; + + if(safe_sprintf(pp->dev, "%s", devname)) { + fprintf(stderr, "pp->dev too small\n"); + goto out; + } + if (store_path(pathvec, pp)) + goto out; + + 
pathinfo(pp, hwtable, DI_ALL); + + return 0; +out: + free_path(pp); + return 1; +} +int +path_discovery (vector pathvec, struct config * conf, int flag) +{ + struct sysfs_directory * sdir; + struct sysfs_directory * devp; + char path[FILE_NAME_SIZE]; + struct path * pp; + int r = 1; + + if(safe_sprintf(path, "%s/block", sysfs_path)) { + fprintf(stderr, "path too small\n"); + exit(1); + } + sdir = sysfs_open_directory(path); + sysfs_read_directory(sdir); + + dlist_for_each_data(sdir->subdirs, devp, struct sysfs_directory) { + if (blacklist(conf->blist, devp->name)) + continue; + + if(safe_sprintf(path, "%s/block/%s/device", sysfs_path, + devp->name)) { + fprintf(stderr, "path too small\n"); + exit(1); + } + + if (!filepresent(path)) + continue; + + pp = find_path_by_dev(pathvec, devp->name); + + if (!pp) { + /* + * new path : alloc, store and fetch info + */ + if (store_pathinfo(pathvec, conf->hwtable, devp->name)) + goto out; + } else { + /* + * path already known : + * refresh only what the caller wants + */ + pathinfo(pp, conf->hwtable, flag); + } + } + r = 0; +out: + sysfs_close_directory(sdir); + return r; +} + +#define declare_sysfs_get_str(fname, fmt) \ +extern int \ +sysfs_get_##fname (char * sysfs_path, char * dev, char * buff, int len) \ +{ \ + char attr_path[SYSFS_PATH_SIZE]; \ + char attr_buff[SYSFS_PATH_SIZE]; \ + int attr_len; \ +\ + if(safe_sprintf(attr_path, fmt, sysfs_path, dev)) \ + return 1; \ + if (0 > sysfs_read_attribute_value(attr_path, attr_buff, sizeof(attr_buff))) \ + return 1; \ +\ + attr_len = strlen(attr_buff); \ + if (attr_len < 2 || attr_len - 1 > len) \ + return 1; \ +\ + strncpy(buff, attr_buff, attr_len - 1); \ + buff[attr_len - 1] = '\0'; \ + return 0; \ +} + +declare_sysfs_get_str(vendor, "%s/block/%s/device/vendor"); +declare_sysfs_get_str(model, "%s/block/%s/device/model"); +declare_sysfs_get_str(rev, "%s/block/%s/device/rev"); +declare_sysfs_get_str(dev, "%s/block/%s/dev"); + +#define declare_sysfs_get_val(fname, fmt) \ +extern 
unsigned long \ +sysfs_get_##fname (char * sysfs_path, char * dev) \ +{ \ + char attr_path[SYSFS_PATH_SIZE]; \ + char attr_buff[SYSFS_PATH_SIZE]; \ +\ + if(safe_sprintf(attr_path, fmt, sysfs_path, dev)) \ + return 0; \ + if (0 > sysfs_read_attribute_value(attr_path, attr_buff, sizeof(attr_buff))) \ + return 0; \ +\ + return strtoul(attr_buff, NULL, 0); \ +} + +declare_sysfs_get_val(size, "%s/block/%s/size"); + +static int +opennode (char * dev, int mode) +{ + char devpath[FILE_NAME_SIZE]; + int fd; + + if (safe_sprintf(devpath, "%s/%s", conf->udev_dir, dev)) { + fprintf(stderr, "devpath too small\n"); + return -1; + } + fd = open(devpath, mode); + + if (fd <= 0) + condlog(0, "open(%s) failed", devpath); + + return fd; +} + +#if 0 +int +get_claimed(int fd) +{ + /* + * FIXME : O_EXCL always fails ? + */ + return 0; +} +#endif + +extern int +devt2devname (char *devname, char *devt) +{ + struct sysfs_directory * sdir; + struct sysfs_directory * devp; + char block_path[FILE_NAME_SIZE]; + char attr_path[FILE_NAME_SIZE]; + char attr_value[16]; + int len; + + if(safe_sprintf(block_path, "%s/block", sysfs_path)) { + fprintf(stderr, "block_path too small\n"); + exit(1); + } + sdir = sysfs_open_directory(block_path); + sysfs_read_directory(sdir); + + dlist_for_each_data (sdir->subdirs, devp, struct sysfs_directory) { + if(safe_sprintf(attr_path, "%s/%s/dev", + block_path, devp->name)) { + fprintf(stderr, "attr_path too small\n"); + exit(1); + } + sysfs_read_attribute_value(attr_path, attr_value, + sizeof(attr_value)); + + len = strlen(attr_value); + + /* discard newline */ + if (len > 1) len--; + + if (strlen(devt) == len && + strncmp(attr_value, devt, len) == 0) { + if(safe_sprintf(attr_path, "%s/%s", + block_path, devp->name)) { + fprintf(stderr, "attr_path too small\n"); + exit(1); + } + sysfs_get_name_from_path(attr_path, devname, + FILE_NAME_SIZE); + sysfs_close_directory(sdir); + return 0; + } + } + sysfs_close_directory(sdir); + return 1; +} + +static int +do_inq(int 
sg_fd, int cmddt, int evpd, unsigned int pg_op, + void *resp, int mx_resp_len, int noisy) +{ + unsigned char inqCmdBlk[INQUIRY_CMDLEN] = + { INQUIRY_CMD, 0, 0, 0, 0, 0 }; + unsigned char sense_b[SENSE_BUFF_LEN]; + struct sg_io_hdr io_hdr; + + if (cmddt) + inqCmdBlk[1] |= 2; + if (evpd) + inqCmdBlk[1] |= 1; + inqCmdBlk[2] = (unsigned char) pg_op; + inqCmdBlk[3] = (unsigned char)((mx_resp_len >> 8) & 0xff); + inqCmdBlk[4] = (unsigned char) (mx_resp_len & 0xff); + memset(&io_hdr, 0, sizeof (struct sg_io_hdr)); + io_hdr.interface_id = 'S'; + io_hdr.cmd_len = sizeof (inqCmdBlk); + io_hdr.mx_sb_len = sizeof (sense_b); + io_hdr.dxfer_direction = SG_DXFER_FROM_DEV; + io_hdr.dxfer_len = mx_resp_len; + io_hdr.dxferp = resp; + io_hdr.cmdp = inqCmdBlk; + io_hdr.sbp = sense_b; + io_hdr.timeout = DEF_TIMEOUT; + + if (ioctl(sg_fd, SG_IO, &io_hdr) < 0) + return -1; + + /* treat SG_ERR here to get rid of sg_err.[ch] */ + io_hdr.status &= 0x7e; + if ((0 == io_hdr.status) && (0 == io_hdr.host_status) && + (0 == io_hdr.driver_status)) + return 0; + if ((SCSI_CHECK_CONDITION == io_hdr.status) || + (SCSI_COMMAND_TERMINATED == io_hdr.status) || + (SG_ERR_DRIVER_SENSE == (0xf & io_hdr.driver_status))) { + if (io_hdr.sbp && (io_hdr.sb_len_wr > 2)) { + int sense_key; + unsigned char * sense_buffer = io_hdr.sbp; + if (sense_buffer[0] & 0x2) + sense_key = sense_buffer[1] & 0xf; + else + sense_key = sense_buffer[2] & 0xf; + if(RECOVERED_ERROR == sense_key) + return 0; + } + } + return -1; +} + +int +get_serial (char * str, int fd) +{ + int len; + char buff[MX_ALLOC_LEN + 1]; + + if (fd < 0) + return 0; + + if (0 == do_inq(fd, 0, 1, 0x80, buff, MX_ALLOC_LEN, 0)) { + len = buff[3]; + if (len > 0) { + memcpy(str, buff + 4, len); + buff[len] = '\0'; + } + return 1; + } + return 0; +} + +extern int +sysfs_pathinfo(struct path * curpath) +{ + char attr_path[FILE_NAME_SIZE]; + char attr_buff[FILE_NAME_SIZE]; + + if (sysfs_get_vendor(sysfs_path, curpath->dev, + curpath->vendor_id, SCSI_VENDOR_SIZE)) + 
return 1; + condlog(3, "vendor = %s", curpath->vendor_id); + + if (sysfs_get_model(sysfs_path, curpath->dev, + curpath->product_id, SCSI_PRODUCT_SIZE)) + return 1; + condlog(3, "product = %s", curpath->product_id); + + if (sysfs_get_rev(sysfs_path, curpath->dev, + curpath->rev, SCSI_REV_SIZE)) + return 1; + condlog(3, "rev = %s", curpath->rev); + + if (sysfs_get_dev(sysfs_path, curpath->dev, + curpath->dev_t, BLK_DEV_SIZE)) + return 1; + condlog(3, "dev_t = %s", curpath->dev_t); + + curpath->size = sysfs_get_size(sysfs_path, curpath->dev); + + if (curpath->size == 0) + return 1; + condlog(3, "size = %lu", curpath->size); + + /* + * host / bus / target / lun + */ + if(safe_sprintf(attr_path, "%s/block/%s/device", + sysfs_path, curpath->dev)) { + fprintf(stderr, "attr_path too small\n"); + return 1; + } + if (0 > sysfs_get_link(attr_path, attr_buff, sizeof(attr_buff))) + return 1; + basename(attr_buff, attr_path); + sscanf(attr_path, "%i:%i:%i:%i", + &curpath->sg_id.host_no, + &curpath->sg_id.channel, + &curpath->sg_id.scsi_id, + &curpath->sg_id.lun); + condlog(3, "h:b:t:l = %i:%i:%i:%i", + curpath->sg_id.host_no, + curpath->sg_id.channel, + curpath->sg_id.scsi_id, + curpath->sg_id.lun); + + /* + * target node name + */ + if(safe_sprintf(attr_path, + "%s/class/fc_transport/target%i:%i:%i/node_name", + sysfs_path, + curpath->sg_id.host_no, + curpath->sg_id.channel, + curpath->sg_id.scsi_id)) { + fprintf(stderr, "attr_path too small\n"); + return 1; + } + if (0 <= readattr(attr_path, attr_buff) && strlen(attr_buff) > 0) + strncpy(curpath->tgt_node_name, attr_buff, + strlen(attr_buff) - 1); + condlog(3, "tgt_node_name = %s", curpath->tgt_node_name); + + return 0; +} + +static int +apply_format (char * string, char * cmd, struct path * pp) +{ + char * pos; + char * dst; + char * p; + int len; + int myfree; + + if (!string) + return 1; + + if (!cmd) + return 1; + + dst = cmd; + + if (!dst) + return 1; + + p = dst; + pos = strchr(string, '%'); + myfree = CALLOUT_MAX_SIZE; 
+ + if (!pos) { + strcpy(dst, string); + return 0; + } + + len = (int) (pos - string) + 1; + myfree -= len; + + if (myfree < 2) + return 1; + + snprintf(p, len, "%s", string); + p += len - 1; + pos++; + + switch (*pos) { + case 'n': + len = strlen(pp->dev) + 1; + myfree -= len; + + if (myfree < 2) + return 1; + + snprintf(p, len, "%s", pp->dev); + p += len - 1; + break; + case 'd': + len = strlen(pp->dev_t) + 1; + myfree -= len; + + if (myfree < 2) + return 1; + + snprintf(p, len, "%s", pp->dev_t); + p += len - 1; + break; + default: + break; + } + pos++; + + if (!*pos) + return 0; + + len = strlen(pos) + 1; + myfree -= len; + + if (myfree < 2) + return 1; + + snprintf(p, len, "%s", pos); + condlog(3, "reformated callout = %s", dst); + return 0; +} + +extern int +pathinfo (struct path *pp, vector hwtable, int mask) +{ + char buff[CALLOUT_MAX_SIZE]; + char prio[16]; + + condlog(3, "===== path %s =====", pp->dev); + + /* + * fetch info available in sysfs + */ + if (mask & DI_SYSFS && sysfs_pathinfo(pp)) + return 1; + + /* + * then those not available through sysfs + */ + if (pp->fd <= 0) + pp->fd = opennode(pp->dev, O_RDONLY); + + if (pp->fd <= 0) + return 1; + + if (mask & DI_SERIAL) { + get_serial(pp->serial, pp->fd); + condlog(3, "serial = %s", pp->serial); + } +#if 0 + if (mask & DI_CLAIMED) { + pp->claimed = get_claimed(pp->fd); + condlog(3, "claimed = %i", pp->claimed); + } +#endif + + /* get and store hwe pointer */ + pp->hwe = find_hwe(hwtable, pp->vendor_id, pp->product_id); + + /* + * get path state, no message collection, no context + */ + select_checkfn(pp); + + if (mask & DI_CHECKER) { + pp->state = pp->checkfn(pp->fd, NULL, NULL); + condlog(3, "state = %i", pp->state); + } + + /* + * get path prio + */ + if (mask & DI_PRIO) { + select_getprio(pp); + + if (!pp->getprio) { + pp->priority = 1; + } else if (apply_format(pp->getprio, &buff[0], pp)) { + condlog(0, "error formatting prio callout command"); + pp->priority = -1; + } else if 
(execute_program(buff, prio, 16)) { + condlog(0, "error calling out %s", buff); + pp->priority = -1; + } else + pp->priority = atoi(prio); + + condlog(3, "prio = %u", pp->priority); + } + + /* + * get path uid + */ + if (mask & DI_WWID && !strlen(pp->wwid)) { + select_getuid(pp); + + if (apply_format(pp->getuid, &buff[0], pp)) { + condlog(0, "error formatting uid callout command"); + memset(pp->wwid, 0, WWID_SIZE); + } else if (execute_program(buff, pp->wwid, WWID_SIZE)) { + condlog(0, "error calling out %s", buff); + memset(pp->wwid, 0, WWID_SIZE); + } + condlog(3, "uid = %s (callout)", pp->wwid); + } + else if (strlen(pp->wwid)) + condlog(3, "uid = %s (cache)", pp->wwid); + + return 0; +} diff --git a/libmultipath/discovery.h b/libmultipath/discovery.h new file mode 100644 index 0000000..86e23bc --- /dev/null +++ b/libmultipath/discovery.h @@ -0,0 +1,57 @@ +#ifndef DISCOVERY_H +#define DISCOVERY_H + +#define SYSFS_PATH_SIZE 255 +#define INQUIRY_CMDLEN 6 +#define INQUIRY_CMD 0x12 +#define SENSE_BUFF_LEN 32 +#define DEF_TIMEOUT 60000 +#define RECOVERED_ERROR 0x01 +#define MX_ALLOC_LEN 255 +#define TUR_CMD_LEN 6 + +#ifndef BLKGETSIZE +#define BLKGETSIZE _IO(0x12,96) +#endif + +/* + * exerpt from sg_err.h + */ +#define SCSI_CHECK_CONDITION 0x2 +#define SCSI_COMMAND_TERMINATED 0x22 +#define SG_ERR_DRIVER_SENSE 0x08 + +int sysfs_get_vendor (char * sysfs_path, char * dev, char * buff, int len); +int sysfs_get_model (char * sysfs_path, char * dev, char * buff, int len); +int sysfs_get_rev (char * sysfs_path, char * dev, char * buff, int len); +int sysfs_get_dev (char * sysfs_path, char * dev, char * buff, int len); + +unsigned long sysfs_get_size (char * sysfs_path, char * dev); +int path_discovery (vector pathvec, struct config * conf, int flag); + +void basename (char *, char *); +int get_serial (char * buff, int fd); +int do_tur (char *); +int devt2devname (char *, char *); +int pathinfo (struct path *, vector hwtable, int mask); +int store_pathinfo (vector pathvec, 
vector hwtable, char * devname); + + +#if 0 +int get_claimed(int fd); +#endif + +/* + * discovery bitmask + */ +#define DI_SYSFS 1 +#define DI_SERIAL 2 +#define DI_CLAIMED 4 +#define DI_CHECKER 8 +#define DI_PRIO 16 +#define DI_WWID 32 + +#define DI_ALL (DI_SYSFS | DI_SERIAL | DI_CLAIMED | DI_CHECKER | \ + DI_PRIO | DI_WWID) + +#endif /* DISCOVERY_H */ diff --git a/libmultipath/dmparser.c b/libmultipath/dmparser.c new file mode 100644 index 0000000..de37790 --- /dev/null +++ b/libmultipath/dmparser.c @@ -0,0 +1,443 @@ +/* + * Christophe Varoqui (2004) + * This code is GPLv2, see license file + * + */ +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +#include "vector.h" +#include "memory.h" +#include "structs.h" +#include "util.h" +#include "debug.h" + +#define WORD_SIZE 64 + +static int +get_word (char * sentence, char ** word) +{ + char * p; + int len; + int skip = 0; + + while (*sentence == ' ') { + sentence++; + skip++; + } + if (*sentence == '\0') + return 0; + + p = sentence; + + while (*p != ' ' && *p != '\0') + p++; + + len = (int) (p - sentence); + + if (!word) + return skip + len; + + *word = MALLOC(len + 1); + + if (!*word) { + condlog(0, "get_word : oom\n"); + return 0; + } + strncpy(*word, sentence, len); + condlog(4, "*word = %s, len = %i", *word, len); + + if (*p == '\0') + return 0; + + return skip + len; +} + +static int +merge_words (char ** dst, char * word, int space) +{ + char * p; + int len; + + len = strlen(*dst) + strlen(word) + space; + *dst = REALLOC(*dst, len + 1); + + if (!*dst) + return 1; + + p = *dst; + + while (*p != '\0') + p++; + + while (space) { + *p = ' '; + p++; + space--; + } + strncpy(p, word, strlen(word) + 1); + + return 0; +} + +extern int +disassemble_map (vector pathvec, char * params, struct multipath * mpp) +{ + char * word; + char * p; + int i, j, k; + int num_features = 0; + int num_hwhandler = 0; + int num_pg = 0; + int num_pg_args = 0; + int num_paths = 0; + int num_paths_args = 0; + struct path * pp; 
+ struct pathgroup * pgp; + + p = params; + + /* + * features + */ + p += get_word(p, &mpp->features); + + if (!mpp->features) + return 1; + + num_features = atoi(mpp->features); + + for (i = 0; i < num_features; i++) { + p += get_word(p, &word); + + if (!word) + return 1; + + if (merge_words(&mpp->features, word, 1)) { + FREE(word); + return 1; + } + FREE(word); + } + + /* + * hwhandler + */ + p += get_word(p, &mpp->hwhandler); + + if (!mpp->hwhandler) + return 1; + + num_hwhandler = atoi(mpp->hwhandler); + + for (i = 0; i < num_hwhandler; i++) { + p += get_word(p, &word); + + if (!word) + return 1; + + if (merge_words(&mpp->hwhandler, word, 1)) { + FREE(word); + return 1; + } + FREE(word); + } + + /* + * nb of path groups + */ + p += get_word(p, &word); + + if (!word) + return 1; + + num_pg = atoi(word); + FREE(word); + + if (num_pg > 0 && !mpp->pg) + mpp->pg = vector_alloc(); + + if (!mpp->pg) + return 1; + /* + * first pg to try + */ + p += get_word(p, &word); + + if (!word) + goto out; + + mpp->nextpg = atoi(word); + FREE(word); + + for (i = 0; i < num_pg; i++) { + /* + * selector + */ + + if (!mpp->selector) { + p += get_word(p, &mpp->selector); + + if (!mpp->selector) + goto out; + + /* + * selector args + */ + p += get_word(p, &word); + + if (!word) + goto out; + + num_pg_args = atoi(word); + + if (merge_words(&mpp->selector, word, 1)) { + FREE(word); + goto out1; + } + FREE(word); + } else { + p += get_word(p, NULL); + p += get_word(p, NULL); + } + + for (j = 0; j < num_pg_args; j++) + p += get_word(p, NULL); + + /* + * paths + */ + pgp = alloc_pathgroup(); + + if (!pgp) + goto out; + + if (store_pathgroup(mpp->pg, pgp)) + goto out; + + p += get_word(p, &word); + + if (!word) + goto out; + + num_paths = atoi(word); + FREE(word); + + p += get_word(p, &word); + + if (!word) + goto out; + + num_paths_args = atoi(word); + FREE(word); + + for (j = 0; j < num_paths; j++) { + pp = NULL; + p += get_word(p, &word); + + if (!word) + goto out; + + if (pathvec) + pp = 
find_path_by_devt(pathvec, word); + + if (!pp) { + pp = alloc_path(); + + if (!pp) + goto out1; + + strncpy(pp->dev_t, word, BLK_DEV_SIZE); + } + FREE(word); + + if (store_path(pgp->paths, pp)) + goto out; + + pgp->id ^= (long)pp; + + if (!strlen(mpp->wwid)) + strncpy(mpp->wwid, pp->wwid, WWID_SIZE); + + for (k = 0; k < num_paths_args; k++) + p += get_word(p, NULL); + } + } + return 0; +out1: + FREE(word); +out: + free_pgvec(mpp->pg, KEEP_PATHS); + return 1; +} + +extern int +disassemble_status (char * params, struct multipath * mpp) +{ + char * word; + char * p; + int i, j; + int num_feature_args; + int num_hwhandler_args; + int num_pg; + int num_pg_args; + int num_paths; + struct path * pp; + struct pathgroup * pgp; + + p = params; + + /* + * features + */ + p += get_word(p, &word); + + if (!word) + return 1; + + num_feature_args = atoi(word); + FREE(word); + + for (i = 0; i < num_feature_args; i++) { + if (i == 1) { + p += get_word(p, &word); + + if (!word) + return 1; + + mpp->queuedio = atoi(word); + FREE(word); + continue; + } + /* unknown */ + p += get_word(p, NULL); + } + /* + * hwhandler + */ + p += get_word(p, &word); + + if (!word) + return 1; + + num_hwhandler_args = atoi(word); + FREE(word); + + for (i = 0; i < num_hwhandler_args; i++) + p += get_word(p, NULL); + + /* + * nb of path groups + */ + p += get_word(p, &word); + + if (!word) + return 1; + + num_pg = atoi(word); + FREE(word); + + /* + * next pg to try + */ + p += get_word(p, NULL); + + if (VECTOR_SIZE(mpp->pg) < num_pg) + return 1; + + for (i = 0; i < num_pg; i++) { + pgp = VECTOR_SLOT(mpp->pg, i); + /* + * PG status + */ + p += get_word(p, &word); + + if (!word) + return 1; + + switch (*word) { + case 'D': + pgp->status = PGSTATE_DISABLED; + break; + case 'A': + pgp->status = PGSTATE_ACTIVE; + break; + case 'E': + pgp->status = PGSTATE_ENABLED; + break; + default: + pgp->status = PGSTATE_RESERVED; + break; + } + FREE(word); + + /* + * undef ? 
+ */ + p += get_word(p, NULL); + + p += get_word(p, &word); + + if (!word) + return 1; + + num_paths = atoi(word); + FREE(word); + + p += get_word(p, &word); + + if (!word) + return 1; + + num_pg_args = atoi(word); + FREE(word); + + if (VECTOR_SIZE(pgp->paths) < num_paths) + return 1; + + for (j = 0; j < num_paths; j++) { + pp = VECTOR_SLOT(pgp->paths, j); + /* + * path + */ + p += get_word(p, NULL); + + /* + * path status + */ + p += get_word(p, &word); + + if (!word) + return 1; + + switch (*word) { + case 'F': + pp->dmstate = PSTATE_FAILED; + break; + case 'A': + pp->dmstate = PSTATE_ACTIVE; + break; + default: + break; + } + FREE(word); + /* + * fail count + */ + p += get_word(p, &word); + + if (!word) + return 1; + + pp->failcount = atoi(word); + FREE(word); + } + /* + * selector args + */ + for (j = 0; j < num_pg_args; j++) + p += get_word(p, NULL); + } + return 0; +} diff --git a/libmultipath/dmparser.h b/libmultipath/dmparser.h new file mode 100644 index 0000000..48ee25a --- /dev/null +++ b/libmultipath/dmparser.h @@ -0,0 +1,2 @@ +int disassemble_map (vector, char *, struct multipath *); +int disassemble_status (char *, struct multipath *); diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c new file mode 100644 index 0000000..0382c6f --- /dev/null +++ b/libmultipath/hwtable.c @@ -0,0 +1,47 @@ +#include <stdio.h> + +#include "vector.h" +#include "defaults.h" +#include "structs.h" +#include "config.h" +#include "pgpolicies.h" + +extern int +setup_default_hwtable (vector hw) +{ + int r = 0; + + r += store_hwe(hw, "3PARdata", "VV", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "COMPAQ", "HSV110 (C)COMPAQ", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "COMPAQ", "MSA1000", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "COMPAQ", "MSA1000 VOLUME", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "DDN", "SAN DataDirector", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "DEC", "HSG80", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "EMC", 
"SYMMETRIX", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "FSC", "CentricStor", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "HITACHI", "DF400", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "HITACHI", "DF500", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "HITACHI", "DF600", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "HP", "HSV110", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "HP", "A6189A", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "HP", "OPEN-", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "IBM", "ProFibre 4000R", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "NETAPP", "LUN", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "SGI", "TP9100", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "SGI", "TP9300", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "STK", "OPENstorage D280", GROUP_BY_SERIAL, DEFAULT_GETUID); + r += store_hwe(hw, "SUN", "StorEdge 3510", MULTIBUS, DEFAULT_GETUID); + r += store_hwe(hw, "SUN", "T4", MULTIBUS, DEFAULT_GETUID); + + r += store_hwe_ext(hw, "DGC", "*", GROUP_BY_PRIO, DEFAULT_GETUID, + "/sbin/pp_emc /dev/%n", "1 emc", "0", "emc_clariion"); + r += store_hwe_ext(hw, "IBM", "3542", GROUP_BY_SERIAL, DEFAULT_GETUID, + NULL, "0", "0", "tur"); + r += store_hwe_ext(hw, "SGI", "TP9400", MULTIBUS, DEFAULT_GETUID, + NULL, "0", "0", "tur"); + r += store_hwe_ext(hw, "SGI", "TP9500", FAILOVER, DEFAULT_GETUID, + NULL, "0", "0", "tur"); + + return r; +} + diff --git a/libmultipath/hwtable.h b/libmultipath/hwtable.h new file mode 100644 index 0000000..13c5701 --- /dev/null +++ b/libmultipath/hwtable.h @@ -0,0 +1,6 @@ +#ifndef _HWTABLE_H +#define _HWTABLE_H + +int setup_default_hwtable (vector hw); + +#endif /* _HWTABLE_H */ diff --git a/libmultipath/memory.c b/libmultipath/memory.c new file mode 100644 index 0000000..e846b87 --- /dev/null +++ b/libmultipath/memory.c @@ -0,0 +1,432 @@ +/* + * Part: Memory management framework. This framework is used to + * find any memory leak. 
+ * + * Version: $Id: memory.c,v 1.1.11 2005/03/01 01:22:13 acassen Exp $ + * + * Authors: Alexandre Cassen, <acassen@linux-vs.org> + * Jan Holmberg, <jan@artech.net> + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + * See the GNU General Public License for more details. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + * Copyright (C) 2001-2005 Alexandre Cassen, <acassen@linux-vs.org> + */ + +#include "memory.h" + +/* Global var */ +unsigned long mem_allocated; /* Total memory used in Bytes */ + +void * +xalloc(unsigned long size) +{ + void *mem; + if ((mem = malloc(size))) + mem_allocated += size; + return mem; +} + +void * +zalloc(unsigned long size) +{ + void *mem; + if ((mem = malloc(size))) { + memset(mem, 0, size); + mem_allocated += size; + } + return mem; +} + +void +xfree(void *p) +{ + mem_allocated -= sizeof (p); + free(p); + p = NULL; +} + +/* + * Memory management. in debug mode, + * help finding eventual memory leak. + * Allocation memory types manipulated are : + * + * +type+--------meaning--------+ + * ! 0 ! Free slot ! + * ! 1 ! Overrun ! + * ! 2 ! free null ! + * ! 3 ! realloc null ! + * ! 4 ! Not previus allocated ! + * ! 8 ! Last free list ! + * ! 9 ! Allocated ! + * +----+-----------------------+ + * + * global variabel debug bit 9 ( 512 ) used to + * flag some memory error. 
+ * + */ + +#ifdef _DEBUG_ + +typedef struct { + int type; + int line; + char *func; + char *file; + void *ptr; + unsigned long size; + long csum; +} MEMCHECK; + +/* Last free pointers */ +static MEMCHECK free_list[256]; + +static MEMCHECK alloc_list[MAX_ALLOC_LIST]; +static int number_alloc_list = 0; +static int n = 0; /* Alloc list pointer */ +static int f = 0; /* Free list pointer */ + +char * +dbg_malloc(unsigned long size, char *file, char *function, int line) +{ + void *buf; + int i = 0; + long check; + + buf = zalloc(size + sizeof (long)); + + check = 0xa5a5 + size; + *(long *) ((char *) buf + size) = check; + + while (i < number_alloc_list) { + if (alloc_list[i].type == 0) + break; + i++; + } + + if (i == number_alloc_list) + number_alloc_list++; + + assert(number_alloc_list < MAX_ALLOC_LIST); + + alloc_list[i].ptr = buf; + alloc_list[i].size = size; + alloc_list[i].file = file; + alloc_list[i].func = function; + alloc_list[i].line = line; + alloc_list[i].csum = check; + alloc_list[i].type = 9; + + if (debug & 1) + printf("zalloc[%3d:%3d], %p, %4ld at %s, %3d, %s\n", + i, number_alloc_list, buf, size, file, line, + function); + + n++; + return buf; +} + + + +/* Display a buffer into a HEXA formated output */ +static void +dump_buffer(char *buff, int count) +{ + int i, j, c; + int printnext = 1; + + if (count % 16) + c = count + (16 - count % 16); + else + c = count; + + for (i = 0; i < c; i++) { + if (printnext) { + printnext--; + printf("%.4x ", i & 0xffff); + } + if (i < count) + printf("%3.2x", buff[i] & 0xff); + else + printf(" "); + if (!((i + 1) % 8)) { + if ((i + 1) % 16) + printf(" -"); + else { + printf(" "); + for (j = i - 15; j <= i; j++) + if (j < count) { + if ((buff[j] & 0xff) >= 0x20 + && (buff[j] & 0xff) <= 0x7e) + printf("%c", + buff[j] & 0xff); + else + printf("."); + } else + printf(" "); + printf("\n"); + printnext = 1; + } + } + } +} + +int +dbg_free(void *buffer, char *file, char *function, int line) +{ + int i = 0; + void *buf; + + /* 
If nullpointer remember */ + if (buffer == NULL) { + i = number_alloc_list++; + + assert(number_alloc_list < MAX_ALLOC_LIST); + + alloc_list[i].ptr = buffer; + alloc_list[i].size = 0; + alloc_list[i].file = file; + alloc_list[i].func = function; + alloc_list[i].line = line; + alloc_list[i].type = 2; + if (debug & 1) + printf("free NULL in %s, %3d, %s\n", file, + line, function); + + debug |= 512; /* Memory Error detect */ + + return n; + } else + buf = buffer; + + while (i < number_alloc_list) { + if (alloc_list[i].type == 9 && alloc_list[i].ptr == buf) { + if (* + ((long *) ((char *) alloc_list[i].ptr + + alloc_list[i].size)) == + alloc_list[i].csum) + alloc_list[i].type = 0; /* Release */ + else { + alloc_list[i].type = 1; /* Overrun */ + if (debug & 1) { + printf("free corrupt, buffer overrun [%3d:%3d], %p, %4ld at %s, %3d, %s\n", + i, number_alloc_list, + buf, alloc_list[i].size, file, + line, function); + dump_buffer(alloc_list[i].ptr, + alloc_list[i].size + sizeof (long)); + printf("Check_sum\n"); + dump_buffer((char *) &alloc_list[i].csum, + sizeof(long)); + + debug |= 512; /* Memory Error detect */ + } + } + break; + } + i++; + } + + /* Not found */ + if (i == number_alloc_list) { + printf("Free ERROR %p\n", buffer); + number_alloc_list++; + + assert(number_alloc_list < MAX_ALLOC_LIST); + + alloc_list[i].ptr = buf; + alloc_list[i].size = 0; + alloc_list[i].file = file; + alloc_list[i].func = function; + alloc_list[i].line = line; + alloc_list[i].type = 4; + debug |= 512; + + return n; + } + + if (buffer != NULL) + xfree(buffer); + + if (debug & 1) + printf("free [%3d:%3d], %p, %4ld at %s, %3d, %s\n", + i, number_alloc_list, buf, + alloc_list[i].size, file, line, function); + + free_list[f].file = file; + free_list[f].line = line; + free_list[f].func = function; + free_list[f].ptr = buffer; + free_list[f].type = 8; + free_list[f].csum = i; /* Using this field for row id */ + + f++; + f &= 255; + n--; + + return n; +} + +void +dbg_free_final(char *banner) +{ 
+ unsigned int sum = 0, overrun = 0, badptr = 0; + int i, j; + i = 0; + + printf("\n---[ Memory dump for (%s)]---\n\n", banner); + + while (i < number_alloc_list) { + switch (alloc_list[i].type) { + case 3: + badptr++; + printf + ("null pointer to realloc(nil,%ld)! at %s, %3d, %s\n", + alloc_list[i].size, alloc_list[i].file, + alloc_list[i].line, alloc_list[i].func); + break; + case 4: + badptr++; + printf + ("pointer not found in table to free(%p) [%3d:%3d], at %s, %3d, %s\n", + alloc_list[i].ptr, i, number_alloc_list, + alloc_list[i].file, alloc_list[i].line, + alloc_list[i].func); + for (j = 0; j < 256; j++) + if (free_list[j].ptr == alloc_list[i].ptr) + if (free_list[j].type == 8) + printf + (" -> pointer already released at [%3d:%3d], at %s, %3d, %s\n", + (int) free_list[j].csum, + number_alloc_list, + free_list[j].file, + free_list[j].line, + free_list[j].func); + break; + case 2: + badptr++; + printf("null pointer to free(nil)! at %s, %3d, %s\n", + alloc_list[i].file, alloc_list[i].line, + alloc_list[i].func); + break; + case 1: + overrun++; + printf("%p [%3d:%3d], %4ld buffer overrun!:\n", + alloc_list[i].ptr, i, number_alloc_list, + alloc_list[i].size); + printf(" --> source of malloc: %s, %3d, %s\n", + alloc_list[i].file, alloc_list[i].line, + alloc_list[i].func); + break; + case 9: + sum += alloc_list[i].size; + printf("%p [%3d:%3d], %4ld not released!:\n", + alloc_list[i].ptr, i, number_alloc_list, + alloc_list[i].size); + printf(" --> source of malloc: %s, %3d, %s\n", + alloc_list[i].file, alloc_list[i].line, + alloc_list[i].func); + break; + } + i++; + } + + printf("\n\n---[ Memory dump summary for (%s) ]---\n", banner); + printf("Total number of bytes not freed...: %d\n", sum); + printf("Number of entries not freed.......: %d\n", n); + printf("Maximum allocated entries.........: %d\n", number_alloc_list); + printf("Number of bad entries.............: %d\n", badptr); + printf("Number of buffer overrun..........: %d\n\n", overrun); + + if (sum || n || 
badptr || overrun) + printf("=> Program seems to have some memory problem !!!\n\n"); + else + printf("=> Program seems to be memory allocation safe...\n\n"); +} + +void * +dbg_realloc(void *buffer, unsigned long size, char *file, char *function, + int line) +{ + int i = 0; + void *buf, *buf2; + long check; + + if (buffer == NULL) { + printf("realloc %p %s, %3d %s\n", buffer, file, line, function); + i = number_alloc_list++; + + assert(number_alloc_list < MAX_ALLOC_LIST); + + alloc_list[i].ptr = NULL; + alloc_list[i].size = 0; + alloc_list[i].file = file; + alloc_list[i].func = function; + alloc_list[i].line = line; + alloc_list[i].type = 3; + return dbg_malloc(size, file, function, line); + } + + buf = buffer; + + while (i < number_alloc_list) { + if (alloc_list[i].ptr == buf) { + buf = alloc_list[i].ptr; + break; + } + i++; + } + + /* not found */ + if (i == number_alloc_list) { + printf("realloc ERROR no matching zalloc %p \n", buffer); + number_alloc_list++; + + assert(number_alloc_list < MAX_ALLOC_LIST); + + alloc_list[i].ptr = buf; + alloc_list[i].size = 0; + alloc_list[i].file = file; + alloc_list[i].func = function; + alloc_list[i].line = line; + alloc_list[i].type = 9; + debug |= 512; /* Memory Error detect */ + return NULL; + } + + buf2 = ((char *) buf) + alloc_list[i].size; + + if (*(long *) (buf2) != alloc_list[i].csum) { + alloc_list[i].type = 1; + debug |= 512; /* Memory Error detect */ + } + buf = realloc(buffer, size + sizeof (long)); + + check = 0xa5a5 + size; + *(long *) ((char *) buf + size) = check; + alloc_list[i].csum = check; + + if (debug & 1) + printf("realloc [%3d:%3d] %p, %4ld %s %d %s -> %p %4ld %s %d %s\n", + i, number_alloc_list, alloc_list[i].ptr, + alloc_list[i].size, file, line, function, buf, size, + alloc_list[i].file, alloc_list[i].line, + alloc_list[i].func); + + alloc_list[i].ptr = buf; + alloc_list[i].size = size; + alloc_list[i].file = file; + alloc_list[i].line = line; + alloc_list[i].func = function; + + return buf; +} + 
+#endif
diff --git a/libmultipath/memory.h b/libmultipath/memory.h
new file mode 100644
index 0000000..55b58bb
--- /dev/null
+++ b/libmultipath/memory.h
@@ -0,0 +1,72 @@
+/*
+ * Part: memory.c include file.
+ *
+ * Version: $Id: memory.h,v 1.1.11 2005/03/01 01:22:13 acassen Exp $
+ *
+ * Authors: Alexandre Cassen, <acassen@linux-vs.org>
+ * Jan Holmberg, <jan@artech.net>
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License for more details.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ * Copyright (C) 2001-2005 Alexandre Cassen, <acassen@linux-vs.org>
+ */
+
+#ifndef _MEMORY_H
+#define _MEMORY_H
+
+/* system includes */
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+#include <assert.h>
+
+/* extern types */
+extern unsigned long mem_allocated;
+extern void *xalloc(unsigned long size);
+extern void *zalloc(unsigned long size);
+extern void xfree(void *p);
+
+/* Global alloc macro */
+#define ALLOC(n) (xalloc(n))
+
+/* Local defines */
+#ifdef _DEBUG_
+
+int debug;
+
+#define MAX_ALLOC_LIST 2048
+
+#define MALLOC(n) ( dbg_malloc((n), \
+		(__FILE__), (char *)(__FUNCTION__), (__LINE__)) )
+#define FREE(b) ( dbg_free((b), \
+		(__FILE__), (char *)(__FUNCTION__), (__LINE__)) )
+#define REALLOC(b,n) ( dbg_realloc((b), (n), \
+		(__FILE__), (char *)(__FUNCTION__), (__LINE__)) )
+
+/* Memory debug prototypes defs */
+extern char *dbg_malloc(unsigned long, char *, char *, int);
+extern int dbg_free(void *, char *, char *, int);
+extern void *dbg_realloc(void *, unsigned long, char *, char *, int);
+extern void dbg_free_final(char *);
+
+#else
+
+#define MALLOC(n) (zalloc(n))
+#define FREE(p) (xfree(p))
+#define REALLOC(p,n) (realloc((p),(n)))
+
+#endif
+
+/* Common defines */
+#define FREE_PTR(P) if((P)) FREE((P));
+
+#endif
diff --git a/libmultipath/parser.c b/libmultipath/parser.c
new file mode 100644
index 0000000..ebb14cd
--- /dev/null
+++ b/libmultipath/parser.c
@@ -0,0 +1,400 @@
+/*
+ * Part: Configuration file parser/reader. Place into the dynamic
+ * data structure representation the conf file
+ *
+ * Version: $Id: parser.c,v 1.0.3 2003/05/11 02:28:03 acassen Exp $
+ *
+ * Author: Alexandre Cassen, <acassen@linux-vs.org>
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License for more details.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */ + +#include <syslog.h> + +#include "parser.h" +#include "memory.h" + +/* local vars */ +static int sublevel = 0; + +int +keyword_alloc(vector keywords, char *string, int (*handler) (vector)) +{ + struct keyword *keyword; + + keyword = (struct keyword *) MALLOC(sizeof (struct keyword)); + + if (!keyword) + return 1; + + if (!vector_alloc_slot(keywords)) { + FREE(keyword); + return 1; + } + keyword->string = string; + keyword->handler = handler; + + vector_set_slot(keywords, keyword); + + return 0; +} + +int +install_keyword_root(char *string, int (*handler) (vector)) +{ + return keyword_alloc(keywords, string, handler); +} + +void +install_sublevel(void) +{ + sublevel++; +} + +void +install_sublevel_end(void) +{ + sublevel--; +} + +int +install_keyword(char *string, int (*handler) (vector)) +{ + int i = 0; + struct keyword *keyword; + + /* fetch last keyword */ + keyword = VECTOR_SLOT(keywords, VECTOR_SIZE(keywords) - 1); + + /* position to last sub level */ + for (i = 0; i < sublevel; i++) + keyword = + VECTOR_SLOT(keyword->sub, VECTOR_SIZE(keyword->sub) - 1); + + /* First sub level allocation */ + if (!keyword->sub) + keyword->sub = vector_alloc(); + + if (!keyword->sub) + return 1; + + /* add new sub keyword */ + return keyword_alloc(keyword->sub, string, handler); +} + +void +free_keywords(vector keywords) +{ + struct keyword *keyword; + int i; + + for (i = 0; i < VECTOR_SIZE(keywords); i++) { + keyword = VECTOR_SLOT(keywords, i); + if (keyword->sub) + free_keywords(keyword->sub); + FREE(keyword); + } + vector_free(keywords); +} + +vector +alloc_strvec(char *string) +{ + char *cp, *start, *token; + int strlen; + int in_string; + vector strvec; + + if (!string) + return NULL; + + cp = string; + + /* Skip white spaces */ + while (isspace((int) *cp) && *cp != '\0') + cp++; + + /* Return if there is only white spaces */ + if (*cp == '\0') + return NULL; + + /* Return if string begin with a comment */ + if (*cp == '!' 
|| *cp == '#') + return NULL; + + /* Create a vector and alloc each command piece */ + strvec = vector_alloc(); + + if (!strvec) + return NULL; + + in_string = 0; + while (1) { + if (!vector_alloc_slot(strvec)) + goto out; + + start = cp; + if (*cp == '"') { + cp++; + token = MALLOC(2); + + if (!token) + goto out; + + *(token) = '"'; + *(token + 1) = '\0'; + if (in_string) + in_string = 0; + else + in_string = 1; + + } else { + while ((in_string || !isspace((int) *cp)) && *cp + != '\0' && *cp != '"') + cp++; + strlen = cp - start; + token = MALLOC(strlen + 1); + + if (!token) + goto out; + + memcpy(token, start, strlen); + *(token + strlen) = '\0'; + } + vector_set_slot(strvec, token); + + while (isspace((int) *cp) && *cp != '\0') + cp++; + if (*cp == '\0' || *cp == '!' || *cp == '#') + return strvec; + } +out: + vector_free(strvec); + return NULL; +} + +int +read_line(char *buf, int size) +{ + int ch; + int count = 0; + + while ((ch = fgetc(stream)) != EOF && (int) ch != '\n' + && (int) ch != '\r') { + if (count < size) + buf[count] = (int) ch; + else + break; + count++; + } + return (ch == EOF) ? 
0 : 1; +} + +vector +read_value_block(void) +{ + char *buf; + int i; + char *str = NULL; + char *dup; + vector vec = NULL; + vector elements = vector_alloc(); + + buf = (char *) MALLOC(MAXBUF); + + if (!buf) + return NULL; + + if (!elements) + goto out; + + while (read_line(buf, MAXBUF)) { + vec = alloc_strvec(buf); + if (vec) { + str = VECTOR_SLOT(vec, 0); + if (!strcmp(str, EOB)) { + free_strvec(vec); + break; + } + + if (VECTOR_SIZE(vec)) + for (i = 0; i < VECTOR_SIZE(vec); i++) { + str = VECTOR_SLOT(vec, i); + dup = (char *) MALLOC(strlen(str) + 1); + memcpy(dup, str, strlen(str)); + + if (!vector_alloc_slot(elements)) + goto out1; + + vector_set_slot(elements, dup); + } + free_strvec(vec); + } + memset(buf, 0, MAXBUF); + } + FREE(buf); + return elements; +out1: + FREE(dup); +out: + FREE(buf); + return NULL; +} + +int +alloc_value_block(vector strvec, void (*alloc_func) (vector)) +{ + char *buf; + char *str = NULL; + vector vec = NULL; + + buf = (char *) MALLOC(MAXBUF); + + if (!buf) + return 1; + + while (read_line(buf, MAXBUF)) { + vec = alloc_strvec(buf); + if (vec) { + str = VECTOR_SLOT(vec, 0); + if (!strcmp(str, EOB)) { + free_strvec(vec); + break; + } + + if (VECTOR_SIZE(vec)) + (*alloc_func) (vec); + + free_strvec(vec); + } + memset(buf, 0, MAXBUF); + } + FREE(buf); + return 0; +} + +void * +set_value(vector strvec) +{ + char *str = VECTOR_SLOT(strvec, 1); + int size = strlen(str); + int i = 0; + int len = 0; + char *alloc = NULL; + char *tmp; + + if (*str == '"') { + for (i = 2; i < VECTOR_SIZE(strvec); i++) { + str = VECTOR_SLOT(strvec, i); + len += strlen(str); + if (!alloc) + alloc = + (char *) MALLOC(sizeof (char *) * + (len + 1)); + else { + alloc = + REALLOC(alloc, sizeof (char *) * (len + 1)); + tmp = VECTOR_SLOT(strvec, i-1); + if (*str != '"' && *tmp != '"') + strncat(alloc, " ", 1); + } + + if (i != VECTOR_SIZE(strvec)-1) + strncat(alloc, str, strlen(str)); + } + } else { + alloc = MALLOC(sizeof (char *) * (size + 1)); + memcpy(alloc, str, 
size); + } + return alloc; +} + +/* non-recursive configuration stream handler */ +static int kw_level = 0; +int +process_stream(vector keywords) +{ + int i; + int r = 0; + struct keyword *keyword; + char *str; + char *buf; + vector strvec; + + buf = MALLOC(MAXBUF); + + if (!buf) + return 1; + + while (read_line(buf, MAXBUF)) { + strvec = alloc_strvec(buf); + memset(buf,0, MAXBUF); + + if (!strvec) + continue; + + str = VECTOR_SLOT(strvec, 0); + + if (!strcmp(str, EOB) && kw_level > 0) { + free_strvec(strvec); + break; + } + + for (i = 0; i < VECTOR_SIZE(keywords); i++) { + keyword = VECTOR_SLOT(keywords, i); + + if (!strcmp(keyword->string, str)) { + if (keyword->handler) + r += (*keyword->handler) (strvec); + + if (keyword->sub) { + kw_level++; + r += process_stream(keyword->sub); + kw_level--; + } + break; + } + } + + free_strvec(strvec); + } + + FREE(buf); + return r; +} + +/* Data initialization */ +int +init_data(char *conf_file, vector (*init_keywords) (void)) +{ + int r; + + stream = fopen(conf_file, "r"); + if (!stream) { + syslog(LOG_WARNING, "Configuration file open problem"); + return 1; + } + + /* Init Keywords structure */ + (*init_keywords) (); + +/* Dump configuration * + vector_dump(keywords); + dump_keywords(keywords, 0); +*/ + + /* Stream handling */ + r = process_stream(keywords); + fclose(stream); + free_keywords(keywords); + + return r; +} diff --git a/libmultipath/parser.h b/libmultipath/parser.h new file mode 100644 index 0000000..7a6dd18 --- /dev/null +++ b/libmultipath/parser.h @@ -0,0 +1,73 @@ +/* + * Soft: Keepalived is a failover program for the LVS project + * <www.linuxvirtualserver.org>. It monitor & manipulate + * a loadbalanced server pool using multi-layer checks. + * + * Part: cfreader.c include file. 
+ * + * Version: $Id: parser.h,v 1.0.3 2003/05/11 02:28:03 acassen Exp $ + * + * Author: Alexandre Cassen, <acassen@linux-vs.org> + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + * See the GNU General Public License for more details. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifndef _PARSER_H +#define _PARSER_H + +/* system includes */ +#include <stdlib.h> +#include <stdio.h> +#include <string.h> +#include <stdint.h> +#include <syslog.h> +#include <ctype.h> + +/* local includes */ +#include "vector.h" + +/* Global definitions */ +#define EOB "}" +#define MAXBUF 1024 + +/* keyword definition */ +struct keyword { + char *string; + int (*handler) (vector); + vector sub; +}; + +/* Reloading helpers */ +#define SET_RELOAD (reload = 1) +#define UNSET_RELOAD (reload = 0) +#define RELOAD_DELAY 5 + +/* global var exported */ +vector keywords; +FILE *stream; + +/* Prototypes */ +extern int keyword_alloc(vector keywords, char *string, int (*handler) (vector)); +extern int install_keyword_root(char *string, int (*handler) (vector)); +extern void install_sublevel(void); +extern void install_sublevel_end(void); +extern int install_keyword(char *string, int (*handler) (vector)); +extern void dump_keywords(vector keydump, int level); +extern void free_keywords(vector keywords); +extern vector alloc_strvec(char *string); +extern int read_line(char *buf, int size); +extern vector read_value_block(void); +extern int alloc_value_block(vector strvec, void (*alloc_func) (vector)); +extern void *set_value(vector strvec); +extern int process_stream(vector keywords); +extern int init_data(char *conf_file, vector
(*init_keywords) (void)); + +#endif diff --git a/libmultipath/pgpolicies.c b/libmultipath/pgpolicies.c new file mode 100644 index 0000000..10e4515 --- /dev/null +++ b/libmultipath/pgpolicies.c @@ -0,0 +1,346 @@ +/* + * Here we define the path grouping policies + */ +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +#include "util.h" +#include "memory.h" +#include "vector.h" +#include "structs.h" +#include "pgpolicies.h" + +#include "../libcheckers/path_state.h" + +extern int +get_pgpolicy_id (char * str) +{ + if (0 == strncmp(str, "failover", 8)) + return FAILOVER; + if (0 == strncmp(str, "multibus", 8)) + return MULTIBUS; + if (0 == strncmp(str, "group_by_serial", 15)) + return GROUP_BY_SERIAL; + if (0 == strncmp(str, "group_by_prio", 13)) + return GROUP_BY_PRIO; + if (0 == strncmp(str, "group_by_node_name", 18)) + return GROUP_BY_NODE_NAME; + + return -1; +} + +extern void +get_pgpolicy_name (char * buff, int id) +{ + char * s; + + switch (id) { + case FAILOVER: + s = "failover"; + break; + case MULTIBUS: + s = "multibus"; + break; + case GROUP_BY_SERIAL: + s = "group_by_serial"; + break; + case GROUP_BY_PRIO: + s = "group_by_prio"; + break; + case GROUP_BY_NODE_NAME: + s = "group_by_node_name"; + break; + default: + s = "undefined"; + break; + } + if(safe_snprintf(buff, POLICY_NAME_SIZE, "%s", s)) { + fprintf(stderr, "get_pgpolicy_name: buff too small\n"); + exit(1); + } +} + +/* + * One path group per unique tgt_node_name present in the path vector + */ +extern int +group_by_node_name (struct multipath * mp) { + int i, j; + int * bitmap; + struct path * pp; + struct pathgroup * pgp; + struct path * pp2; + + if (!mp->pg) + mp->pg = vector_alloc(); + + if (!mp->pg) + return 1; + + /* init the bitmap */ + bitmap = (int *)MALLOC(VECTOR_SIZE(mp->paths) * sizeof (int)); + + if (!bitmap) + goto out; + + for (i = 0; i < VECTOR_SIZE(mp->paths); i++) { + + if (bitmap[i]) + continue; + + pp = VECTOR_SLOT(mp->paths, i); + + /* here, we really got a new pg */ 
+ pgp = alloc_pathgroup(); + + if (!pgp) + goto out1; + + if (store_pathgroup(mp->pg, pgp)) + goto out1; + + /* feed the first path */ + if (store_path(pgp->paths, pp)) + goto out1; + + bitmap[i] = 1; + + for (j = i + 1; j < VECTOR_SIZE(mp->paths); j++) { + + if (bitmap[j]) + continue; + + pp2 = VECTOR_SLOT(mp->paths, j); + + if (!strncmp(pp->tgt_node_name, pp2->tgt_node_name, + NODE_NAME_SIZE)) { + if (store_path(pgp->paths, pp2)) + goto out1; + + bitmap[j] = 1; + } + } + } + FREE(bitmap); + free_pathvec(mp->paths, KEEP_PATHS); + mp->paths = NULL; + return 0; +out1: + FREE(bitmap); +out: + free_pgvec(mp->pg, KEEP_PATHS); + return 1; +} + +/* + * One path group per unique serial number present in the path vector + */ +extern int +group_by_serial (struct multipath * mp) { + int i, j; + int * bitmap; + struct path * pp; + struct pathgroup * pgp; + struct path * pp2; + + if (!mp->pg) + mp->pg = vector_alloc(); + + if (!mp->pg) + return 1; + + /* init the bitmap */ + bitmap = (int *)MALLOC(VECTOR_SIZE(mp->paths) * sizeof (int)); + + if (!bitmap) + goto out; + + for (i = 0; i < VECTOR_SIZE(mp->paths); i++) { + + if (bitmap[i]) + continue; + + pp = VECTOR_SLOT(mp->paths, i); + + /* here, we really got a new pg */ + pgp = alloc_pathgroup(); + + if (!pgp) + goto out1; + + if (store_pathgroup(mp->pg, pgp)) + goto out1; + + /* feed the first path */ + if (store_path(pgp->paths, pp)) + goto out1; + + bitmap[i] = 1; + + for (j = i + 1; j < VECTOR_SIZE(mp->paths); j++) { + + if (bitmap[j]) + continue; + + pp2 = VECTOR_SLOT(mp->paths, j); + + if (0 == strcmp(pp->serial, pp2->serial)) { + if (store_path(pgp->paths, pp2)) + goto out1; + + bitmap[j] = 1; + } + } + } + FREE(bitmap); + free_pathvec(mp->paths, KEEP_PATHS); + mp->paths = NULL; + return 0; +out1: + FREE(bitmap); +out: + free_pgvec(mp->pg, KEEP_PATHS); + return 1; +} + +extern int +one_path_per_group (struct multipath * mp) +{ + int i; + struct path * pp; + struct pathgroup * pgp; + + if (!mp->pg) + mp->pg = 
vector_alloc(); + + if (!mp->pg) + return 1; + + for (i = 0; i < VECTOR_SIZE(mp->paths); i++) { + pp = VECTOR_SLOT(mp->paths, i); + pgp = alloc_pathgroup(); + + if (!pgp) + goto out; + + if (store_pathgroup(mp->pg, pgp)) + goto out; + + if (store_path(pgp->paths, pp)) + goto out; + } + free_pathvec(mp->paths, KEEP_PATHS); + mp->paths = NULL; + return 0; +out: + free_pgvec(mp->pg, KEEP_PATHS); + return 1; +} + +extern int +one_group (struct multipath * mp) /* aka multibus */ +{ + struct pathgroup * pgp; + + if (VECTOR_SIZE(mp->paths) < 0) + return 0; + + if (!mp->pg) + mp->pg = vector_alloc(); + + if (!mp->pg) + return 1; + + pgp = alloc_pathgroup(); + + if (!pgp) + goto out; + + vector_free(pgp->paths); + pgp->paths = mp->paths; + mp->paths = NULL; + + if (store_pathgroup(mp->pg, pgp)) + goto out; + + return 0; +out: + free_pgvec(mp->pg, KEEP_PATHS); + return 1; +} + +extern int +group_by_prio (struct multipath * mp) +{ + int i; + unsigned int prio; + struct path * pp; + struct pathgroup * pgp; + + if (!mp->pg) + mp->pg = vector_alloc(); + + if (!mp->pg) + return 1; + + while (VECTOR_SIZE(mp->paths) > 0) { + pp = VECTOR_SLOT(mp->paths, 0); + prio = pp->priority; + + /* + * Find the position to insert the new path group. All groups + * are ordered by the priority value (higher value first). + */ + vector_foreach_slot(mp->pg, pgp, i) { + pp = VECTOR_SLOT(pgp->paths, 0); + + if (prio > pp->priority) + break; + } + + /* + * Initialize the new path group. + */ + pgp = alloc_pathgroup(); + + if (!pgp) + goto out; + + if (store_path(pgp->paths, VECTOR_SLOT(mp->paths, 0))) + goto out; + + vector_del_slot(mp->paths, 0); + + /* + * Store the new path group into the vector.
+ */ + if (i < VECTOR_SIZE(mp->pg)) { + if (!vector_insert_slot(mp->pg, i, pgp)) + goto out; + } else { + if (store_pathgroup(mp->pg, pgp)) + goto out; + } + + /* + * add the other paths with the same prio + */ + vector_foreach_slot(mp->paths, pp, i) { + if (pp->priority == prio) { + if (store_path(pgp->paths, pp)) + goto out; + + vector_del_slot(mp->paths, i); + i--; + } + } + } + free_pathvec(mp->paths, KEEP_PATHS); + mp->paths = NULL; + return 0; +out: + free_pgvec(mp->pg, KEEP_PATHS); + return 1; + +} diff --git a/libmultipath/pgpolicies.h b/libmultipath/pgpolicies.h new file mode 100644 index 0000000..e0a1c65 --- /dev/null +++ b/libmultipath/pgpolicies.h @@ -0,0 +1,34 @@ +#ifndef _PGPOLICIES_H +#define _PGPOLICIES_H + +#if 0 +#ifndef _MAIN_H +#include "main.h" +#endif +#endif + +#define POLICY_NAME_SIZE 32 + +/* Storage controller capabilities */ +enum iopolicies { + IOPOLICY_RESERVED, + FAILOVER, + MULTIBUS, + GROUP_BY_SERIAL, + GROUP_BY_PRIO, + GROUP_BY_NODE_NAME +}; + +int get_pgpolicy_id(char *); +void get_pgpolicy_name (char *, int); + +/* + * policies + */ +int one_path_per_group(struct multipath *); +int one_group(struct multipath *); +int group_by_serial(struct multipath *); +int group_by_prio(struct multipath *); +int group_by_node_name(struct multipath *); + +#endif diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c new file mode 100644 index 0000000..ce2409b --- /dev/null +++ b/libmultipath/propsel.c @@ -0,0 +1,151 @@ +#include <stdio.h> + +#include "vector.h" +#include "structs.h" +#include "config.h" +#include "debug.h" +#include "pgpolicies.h" + +#include "../libcheckers/checkers.h" + +/* + * selectors : + * traverse the configuration layers from most specific to most generic + * stop at first explicit setting found + */ +extern int +select_pgpolicy (struct multipath * mp) +{ + struct path * pp; + char pgpolicy_name[POLICY_NAME_SIZE]; + + pp = VECTOR_SLOT(mp->paths, 0); + + if (conf->pgpolicy_flag > 0) { + mp->pgpolicy =
conf->pgpolicy_flag; + get_pgpolicy_name(pgpolicy_name, mp->pgpolicy); + condlog(3, "pgpolicy = %s (cmd line flag)", pgpolicy_name); + return 0; + } + if (mp->mpe && mp->mpe->pgpolicy > 0) { + mp->pgpolicy = mp->mpe->pgpolicy; + get_pgpolicy_name(pgpolicy_name, mp->pgpolicy); + condlog(3, "pgpolicy = %s (LUN setting)", pgpolicy_name); + return 0; + } + if (mp->hwe && mp->hwe->pgpolicy > 0) { + mp->pgpolicy = mp->hwe->pgpolicy; + get_pgpolicy_name(pgpolicy_name, mp->pgpolicy); + condlog(3, "pgpolicy = %s (controler setting)", pgpolicy_name); + return 0; + } + if (conf->default_pgpolicy > 0) { + mp->pgpolicy = conf->default_pgpolicy; + get_pgpolicy_name(pgpolicy_name, mp->pgpolicy); + condlog(3, "pgpolicy = %s (config file default)", pgpolicy_name); + return 0; + } + mp->pgpolicy = FAILOVER; + get_pgpolicy_name(pgpolicy_name, FAILOVER); + condlog(3, "pgpolicy = %s (internal default)", pgpolicy_name); + return 0; +} + +extern int +select_selector (struct multipath * mp) +{ + if (mp->mpe && mp->mpe->selector) { + mp->selector = mp->mpe->selector; + condlog(3, "selector = %s (LUN setting)", mp->selector); + return 0; + } + if (mp->hwe && mp->hwe->selector) { + mp->selector = mp->hwe->selector; + condlog(3, "selector = %s (controler setting)", mp->selector); + return 0; + } + mp->selector = conf->default_selector; + condlog(3, "selector = %s (internal default)", mp->selector); + return 0; +} + +extern int +select_alias (struct multipath * mp) +{ + if (mp->mpe && mp->mpe->alias) + mp->alias = mp->mpe->alias; + else + mp->alias = mp->wwid; + + return 0; +} + +extern int +select_features (struct multipath * mp) +{ + if (mp->hwe && mp->hwe->features) { + mp->features = mp->hwe->features; + condlog(3, "features = %s (controler setting)", mp->features); + return 0; + } + mp->features = conf->default_features; + condlog(3, "features = %s (internal default)", mp->features); + return 0; +} + +extern int +select_hwhandler (struct multipath * mp) +{ + if (mp->hwe && 
mp->hwe->hwhandler) { + mp->hwhandler = mp->hwe->hwhandler; + condlog(3, "hwhandler = %s (controler setting)", mp->hwhandler); + return 0; + } + mp->hwhandler = conf->default_hwhandler; + condlog(3, "hwhandler = %s (internal default)", mp->hwhandler); + return 0; +} + +extern int +select_checkfn(struct path *pp) +{ + char checker_name[CHECKER_NAME_SIZE]; + + if (pp->hwe && pp->hwe->checker_index > 0) { + get_checker_name(checker_name, pp->hwe->checker_index); + condlog(3, "path checker = %s (controler setting)", checker_name); + pp->checkfn = get_checker_addr(pp->hwe->checker_index); + return 0; + } + pp->checkfn = &readsector0; + get_checker_name(checker_name, READSECTOR0); + condlog(3, "path checker = %s (internal default)", checker_name); + return 0; +} + +extern int +select_getuid (struct path * pp) +{ + if (pp->hwe && pp->hwe->getuid) { + pp->getuid = pp->hwe->getuid; + condlog(3, "getuid = %s (controler setting)", pp->getuid); + return 0; + } + pp->getuid = conf->default_getuid; + condlog(3, "getuid = %s (internal default)", pp->getuid); + return 0; +} + +extern int +select_getprio (struct path * pp) +{ + if (pp->hwe && pp->hwe->getprio) { + pp->getprio = pp->hwe->getprio; + condlog(3, "getprio = %s (controler setting)", pp->getprio); + return 0; + } + pp->getprio = conf->default_getprio; + condlog(3, "getprio = %s (internal default)", pp->getprio); + return 0; +} + diff --git a/libmultipath/propsel.h b/libmultipath/propsel.h new file mode 100644 index 0000000..9d5366e --- /dev/null +++ b/libmultipath/propsel.h @@ -0,0 +1,9 @@ +int select_pgpolicy (struct multipath * mp); +int select_selector (struct multipath * mp); +int select_alias (struct multipath * mp); +int select_features (struct multipath * mp); +int select_hwhandler (struct multipath * mp); +int select_checkfn(struct path *pp); +int select_getuid (struct path * pp); +int select_getprio (struct path * pp); + diff --git a/libmultipath/regex.c b/libmultipath/regex.c new file mode 100644 index 
0000000..3311b50 --- /dev/null +++ b/libmultipath/regex.c @@ -0,0 +1,4030 @@ +/* Extended regular expression matching and search library, + version 0.12. + (Implements POSIX draft P10003.2/D11.2, except for + internationalization features.) + + Copyright (C) 1993 Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2, or (at your option) + any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. */ + +#define _GNU_SOURCE + +#include <sys/types.h> +#include <stdlib.h> +#include <string.h> + +#ifndef bcmp +#define bcmp(s1, s2, n) memcmp ((s1), (s2), (n)) +#endif +#ifndef bcopy +#define bcopy(s, d, n) memcpy ((d), (s), (n)) +#endif +#ifndef bzero +#define bzero(s, n) memset ((s), 0, (n)) +#endif + +/* Define the syntax stuff for \<, \>, etc. 
*/ + +#ifndef Sword +#define Sword 1 +#endif + +#define CHAR_SET_SIZE 256 + +static char re_syntax_table[CHAR_SET_SIZE]; + +static void init_syntax_once(void) +{ + register int c; + static int done = 0; + + if (done) + return; + + bzero(re_syntax_table, sizeof re_syntax_table); + + for (c = 'a'; c <= 'z'; c++) + re_syntax_table[c] = Sword; + + for (c = 'A'; c <= 'Z'; c++) + re_syntax_table[c] = Sword; + + for (c = '0'; c <= '9'; c++) + re_syntax_table[c] = Sword; + + re_syntax_table['_'] = Sword; + + done = 1; +} + +#define SYNTAX(c) re_syntax_table[c] + +#include "regex.h" +#include <ctype.h> + +#ifdef isblank +#define ISBLANK(c) (isascii (c) && isblank (c)) +#else +#define ISBLANK(c) ((c) == ' ' || (c) == '\t') +#endif +#ifdef isgraph +#define ISGRAPH(c) (isascii (c) && isgraph (c)) +#else +#define ISGRAPH(c) (isascii (c) && isprint (c) && !isspace (c)) +#endif + +#define ISPRINT(c) (isascii (c) && isprint (c)) +#define ISDIGIT(c) (isascii (c) && isdigit (c)) +#define ISALNUM(c) (isascii (c) && isalnum (c)) +#define ISALPHA(c) (isascii (c) && isalpha (c)) +#define ISCNTRL(c) (isascii (c) && iscntrl (c)) +#define ISLOWER(c) (isascii (c) && islower (c)) +#define ISPUNCT(c) (isascii (c) && ispunct (c)) +#define ISSPACE(c) (isascii (c) && isspace (c)) +#define ISUPPER(c) (isascii (c) && isupper (c)) +#define ISXDIGIT(c) (isascii (c) && isxdigit (c)) + +#undef SIGN_EXTEND_CHAR +#define SIGN_EXTEND_CHAR(c) ((signed char) (c)) + +#ifndef alloca +#ifdef __GNUC__ +#define alloca __builtin_alloca +#endif /* not __GNUC__ */ +#endif /* not alloca */ + +#define REGEX_ALLOCATE alloca + +/* Assumes a `char *destination' variable. */ +#define REGEX_REALLOCATE(source, osize, nsize) \ + (destination = (char *) alloca (nsize), \ + bcopy (source, destination, osize), \ + destination) + +/* True if `size1' is non-NULL and PTR is pointing anywhere inside + `string1' or just past its end. This works if PTR is NULL, which is + a good thing. 
*/ +#define FIRST_STRING_P(ptr) \ + (size1 && string1 <= (ptr) && (ptr) <= string1 + size1) + +/* (Re)Allocate N items of type T using malloc, or fail. */ +#define TALLOC(n, t) ((t *) malloc ((n) * sizeof (t))) +#define RETALLOC(addr, n, t) ((addr) = (t *) realloc (addr, (n) * sizeof (t))) +#define REGEX_TALLOC(n, t) ((t *) REGEX_ALLOCATE ((n) * sizeof (t))) + +#define BYTEWIDTH 8 /* In bits. */ + +#define STREQ(s1, s2) ((strcmp (s1, s2) == 0)) + +#define MAX(a, b) ((a) > (b) ? (a) : (b)) +#define MIN(a, b) ((a) < (b) ? (a) : (b)) + +typedef char boolean; +#define false 0 +#define true 1 + +typedef enum { + no_op = 0, + exactn = 1, + anychar, + charset, + charset_not, + start_memory, + stop_memory, + duplicate, + begline, + endline, + begbuf, + endbuf, + jump, + jump_past_alt, + on_failure_jump, + on_failure_keep_string_jump, + pop_failure_jump, + maybe_pop_jump, + dummy_failure_jump, + push_dummy_failure, + succeed_n, + jump_n, + set_number_at, + wordchar, + notwordchar, + wordbeg, + wordend, + wordbound, + notwordbound +} re_opcode_t; + +#define STORE_NUMBER(destination, number) \ + do { \ + (destination)[0] = (number) & 0377; \ + (destination)[1] = (number) >> 8; \ + } while (0) + +#define STORE_NUMBER_AND_INCR(destination, number) \ + do { \ + STORE_NUMBER (destination, number); \ + (destination) += 2; \ + } while (0) + +#define EXTRACT_NUMBER(destination, source) \ + do { \ + (destination) = *(source) & 0377; \ + (destination) += SIGN_EXTEND_CHAR (*((source) + 1)) << 8; \ + } while (0) + +#define EXTRACT_NUMBER_AND_INCR(destination, source) \ + do { \ + EXTRACT_NUMBER (destination, source); \ + (source) += 2; \ + } while (0) + +#undef assert +#define assert(e) + +#define DEBUG_STATEMENT(e) +#define DEBUG_PRINT1(x) +#define DEBUG_PRINT2(x1, x2) +#define DEBUG_PRINT3(x1, x2, x3) +#define DEBUG_PRINT4(x1, x2, x3, x4) +#define DEBUG_PRINT_COMPILED_PATTERN(p, s, e) +#define DEBUG_PRINT_DOUBLE_STRING(w, s1, sz1, s2, sz2) + +reg_syntax_t re_syntax_options = 
RE_SYNTAX_EMACS; +reg_syntax_t re_set_syntax(syntax) +reg_syntax_t syntax; +{ + reg_syntax_t ret = re_syntax_options; + + re_syntax_options = syntax; + return ret; +} + +/* This table gives an error message for each of the error codes listed + in regex.h. Obviously the order here has to be same as there. */ + +static const char *re_error_msg[] = { NULL, /* REG_NOERROR */ + "No match", /* REG_NOMATCH */ + "Invalid regular expression", /* REG_BADPAT */ + "Invalid collation character", /* REG_ECOLLATE */ + "Invalid character class name", /* REG_ECTYPE */ + "Trailing backslash", /* REG_EESCAPE */ + "Invalid back reference", /* REG_ESUBREG */ + "Unmatched [ or [^", /* REG_EBRACK */ + "Unmatched ( or \\(", /* REG_EPAREN */ + "Unmatched \\{", /* REG_EBRACE */ + "Invalid content of \\{\\}", /* REG_BADBR */ + "Invalid range end", /* REG_ERANGE */ + "Memory exhausted", /* REG_ESPACE */ + "Invalid preceding regular expression", /* REG_BADRPT */ + "Premature end of regular expression", /* REG_EEND */ + "Regular expression too big", /* REG_ESIZE */ + "Unmatched ) or \\)", /* REG_ERPAREN */ +}; + +/* Subroutine declarations and macros for regex_compile. 
*/ + +static reg_errcode_t regex_compile (const char *pattern, size_t size, + reg_syntax_t syntax, + struct re_pattern_buffer * bufp); + +static void store_op1 (re_opcode_t op, unsigned char *loc, int arg); + +static void store_op2 (re_opcode_t op, unsigned char *loc, int arg1, int arg2); + +static void insert_op1 (re_opcode_t op, unsigned char *loc, int arg, + unsigned char *end); + +static void insert_op2 (re_opcode_t op, unsigned char *loc, int arg1, int arg2, + unsigned char *end); + +static boolean at_begline_loc_p (const char *pattern, const char *p, + reg_syntax_t syntax); + +static boolean at_endline_loc_p (const char *p, const char *pend, + reg_syntax_t syntax); + +static reg_errcode_t compile_range (const char **p_ptr, const char *pend, + char *translate, reg_syntax_t syntax, + unsigned char *b); + +/* Fetch the next character in the uncompiled pattern---translating it + if necessary. Also cast from a signed character in the constant + string passed to us by the user to an unsigned char that we can use + as an array index (in, e.g., `translate'). */ +#define PATFETCH(c) \ + do {if (p == pend) return REG_EEND; \ + c = (unsigned char) *p++; \ + if (translate) c = translate[c]; \ + } while (0) + +/* Fetch the next character in the uncompiled pattern, with no + translation. */ +#define PATFETCH_RAW(c) \ + do {if (p == pend) return REG_EEND; \ + c = (unsigned char) *p++; \ + } while (0) + +/* Go backwards one character in the pattern. */ +#define PATUNFETCH p-- + + +/* If `translate' is non-null, return translate[D], else just D. We + cast the subscript to translate because some data is declared as + `char *', to avoid warnings when a string constant is passed. But + when we use a character as a subscript we must make it unsigned. */ +#define TRANSLATE(d) (translate ? translate[(unsigned char) (d)] : (d)) + + +/* Macros for outputting the compiled pattern into `buffer'. */ + +/* If the buffer isn't allocated when it comes in, use this. 
*/ +#define INIT_BUF_SIZE 32 + +/* Make sure we have at least N more bytes of space in buffer. */ +#define GET_BUFFER_SPACE(n) \ + while (b - bufp->buffer + (n) > bufp->allocated) \ + EXTEND_BUFFER () + +/* Make sure we have one more byte of buffer space and then add C to it. */ +#define BUF_PUSH(c) \ + do { \ + GET_BUFFER_SPACE (1); \ + *b++ = (unsigned char) (c); \ + } while (0) + + +/* Ensure we have two more bytes of buffer space and then append C1 and C2. */ +#define BUF_PUSH_2(c1, c2) \ + do { \ + GET_BUFFER_SPACE (2); \ + *b++ = (unsigned char) (c1); \ + *b++ = (unsigned char) (c2); \ + } while (0) + + +/* As with BUF_PUSH_2, except for three bytes. */ +#define BUF_PUSH_3(c1, c2, c3) \ + do { \ + GET_BUFFER_SPACE (3); \ + *b++ = (unsigned char) (c1); \ + *b++ = (unsigned char) (c2); \ + *b++ = (unsigned char) (c3); \ + } while (0) + + +/* Store a jump with opcode OP at LOC to location TO. We store a + relative address offset by the three bytes the jump itself occupies. */ +#define STORE_JUMP(op, loc, to) \ + store_op1 (op, loc, (int)((to) - (loc) - 3)) + +/* Likewise, for a two-argument jump. */ +#define STORE_JUMP2(op, loc, to, arg) \ + store_op2 (op, loc, (int)((to) - (loc) - 3), arg) + +/* Like `STORE_JUMP', but for inserting. Assume `b' is the buffer end. */ +#define INSERT_JUMP(op, loc, to) \ + insert_op1 (op, loc, (int)((to) - (loc) - 3), b) + +/* Like `STORE_JUMP2', but for inserting. Assume `b' is the buffer end. */ +#define INSERT_JUMP2(op, loc, to, arg) \ + insert_op2 (op, loc, (int)((to) - (loc) - 3), arg, b) + + +/* This is not an arbitrary limit: the arguments which represent offsets + into the pattern are two bytes long. So if 2^16 bytes turns out to + be too small, many things would have to change. */ +#define MAX_BUF_SIZE (1L << 16) +#define REALLOC realloc + +/* Extend the buffer by twice its current size via realloc and + reset the pointers that pointed into the old block to point to the + correct places in the new one. 
If extending the buffer results in it + being larger than MAX_BUF_SIZE, then flag memory exhausted. */ +#define EXTEND_BUFFER() \ + do { \ + unsigned char *old_buffer = bufp->buffer; \ + if (bufp->allocated == MAX_BUF_SIZE) \ + return REG_ESIZE; \ + bufp->allocated <<= 1; \ + if (bufp->allocated > MAX_BUF_SIZE) \ + bufp->allocated = MAX_BUF_SIZE; \ + bufp->buffer = (unsigned char *) REALLOC(bufp->buffer, bufp->allocated);\ + if (bufp->buffer == NULL) \ + return REG_ESPACE; \ + /* If the buffer moved, move all the pointers into it. */ \ + if (old_buffer != bufp->buffer) \ + { \ + b = (b - old_buffer) + bufp->buffer; \ + begalt = (begalt - old_buffer) + bufp->buffer; \ + if (fixup_alt_jump) \ + fixup_alt_jump = (fixup_alt_jump - old_buffer) + bufp->buffer;\ + if (laststart) \ + laststart = (laststart - old_buffer) + bufp->buffer; \ + if (pending_exact) \ + pending_exact = (pending_exact - old_buffer) + bufp->buffer; \ + } \ + } while (0) + + +/* Since we have one byte reserved for the register number argument to + {start,stop}_memory, the maximum number of groups we can report + things about is what fits in that byte. */ +#define MAX_REGNUM 255 + +/* But patterns can have more than `MAX_REGNUM' registers. We just + ignore the excess. */ +typedef unsigned regnum_t; + + +/* Macros for the compile stack. */ + +/* Since offsets can go either forwards or backwards, this type needs to + be able to hold values from -(MAX_BUF_SIZE - 1) to MAX_BUF_SIZE - 1. */ +/* int may be not enough when sizeof(int) == 2 */ +typedef long pattern_offset_t; + +typedef struct { + pattern_offset_t begalt_offset; + pattern_offset_t fixup_alt_jump; + pattern_offset_t inner_group_offset; + pattern_offset_t laststart_offset; + regnum_t regnum; +} compile_stack_elt_t; + + +typedef struct { + compile_stack_elt_t *stack; + unsigned size; + unsigned avail; /* Offset of next open position. 
*/ +} compile_stack_type; + + +#define INIT_COMPILE_STACK_SIZE 32 + +#define COMPILE_STACK_EMPTY (compile_stack.avail == 0) +#define COMPILE_STACK_FULL (compile_stack.avail == compile_stack.size) + +/* The next available element. */ +#define COMPILE_STACK_TOP (compile_stack.stack[compile_stack.avail]) + + +/* Set the bit for character C in a list. */ +#define SET_LIST_BIT(c) \ + (b[((unsigned char) (c)) / BYTEWIDTH] \ + |= 1 << (((unsigned char) c) % BYTEWIDTH)) + + +/* Get the next unsigned number in the uncompiled pattern. */ +#define GET_UNSIGNED_NUMBER(num) \ + { if (p != pend) \ + { \ + PATFETCH (c); \ + while (ISDIGIT (c)) \ + { \ + if (num < 0) \ + num = 0; \ + num = num * 10 + c - '0'; \ + if (p == pend) \ + break; \ + PATFETCH (c); \ + } \ + } \ + } + +#define CHAR_CLASS_MAX_LENGTH 6 /* Namely, `xdigit'. */ + +#define IS_CHAR_CLASS(string) \ + (STREQ (string, "alpha") || STREQ (string, "upper") \ + || STREQ (string, "lower") || STREQ (string, "digit") \ + || STREQ (string, "alnum") || STREQ (string, "xdigit") \ + || STREQ (string, "space") || STREQ (string, "print") \ + || STREQ (string, "punct") || STREQ (string, "graph") \ + || STREQ (string, "cntrl") || STREQ (string, "blank")) + +static boolean group_in_compile_stack (compile_stack_type + compile_stack, regnum_t regnum); + +/* `regex_compile' compiles PATTERN (of length SIZE) according to SYNTAX. + Returns one of error codes defined in `regex.h', or zero for success */ + +static reg_errcode_t regex_compile(pattern, size, syntax, bufp) +const char *pattern; +size_t size; +reg_syntax_t syntax; +struct re_pattern_buffer *bufp; +{ + /* We fetch characters from PATTERN here. Even though PATTERN is + `char *' (i.e., signed), we declare these variables as unsigned, so + they can be reliably used as array indices. */ + register unsigned char c, c1; + + /* A random temporary spot in PATTERN. */ + const char *p1; + + /* Points to the end of the buffer, where we should append.
*/ + register unsigned char *b; + + /* Keeps track of unclosed groups. */ + compile_stack_type compile_stack; + + /* Points to the current (ending) position in the pattern. */ + const char *p = pattern; + const char *pend = pattern + size; + + /* How to translate the characters in the pattern. */ + char *translate = bufp->translate; + + /* Address of the count-byte of the most recently inserted `exactn' + command. This makes it possible to tell if a new exact-match + character can be added to that command or if the character requires + a new `exactn' command. */ + unsigned char *pending_exact = 0; + + /* Address of start of the most recently finished expression. + This tells, e.g., postfix * where to find the start of its + operand. Reset at the beginning of groups and alternatives. */ + unsigned char *laststart = 0; + + /* Address of beginning of regexp, or inside of last group. */ + unsigned char *begalt; + + /* Place in the uncompiled pattern (i.e., the {) to + which to go back if the interval is invalid. */ + const char *beg_interval; + + /* Address of the place where a forward jump should go to the end of + the containing expression. Each alternative of an `or' -- except the + last -- ends with a forward jump of this sort. */ + unsigned char *fixup_alt_jump = 0; + + /* Counts open-groups as they are encountered. Remembered for the + matching close-group on the compile stack, so the same register + number is put in the stop_memory as the start_memory. */ + regnum_t regnum = 0; + + /* Initialize the compile stack. */ + compile_stack.stack = + TALLOC(INIT_COMPILE_STACK_SIZE, compile_stack_elt_t); + if (compile_stack.stack == NULL) + return REG_ESPACE; + + compile_stack.size = INIT_COMPILE_STACK_SIZE; + compile_stack.avail = 0; + + /* Initialize the pattern buffer. 
*/ + bufp->syntax = syntax; + bufp->fastmap_accurate = 0; + bufp->not_bol = bufp->not_eol = 0; + + /* Set `used' to zero, so that if we return an error, the pattern + printer (for debugging) will think there's no pattern. We reset it + at the end. */ + bufp->used = 0; + + /* Always count groups, whether or not bufp->no_sub is set. */ + bufp->re_nsub = 0; + + /* Initialize the syntax table. */ + init_syntax_once(); + + if (bufp->allocated == 0) { + if (bufp->buffer) { + RETALLOC(bufp->buffer, INIT_BUF_SIZE, + unsigned char); + } else { /* Caller did not allocate a buffer. Do it for them. */ + bufp->buffer = + TALLOC(INIT_BUF_SIZE, unsigned char); + } + if (!bufp->buffer) + return REG_ESPACE; + + bufp->allocated = INIT_BUF_SIZE; + } + + begalt = b = bufp->buffer; + + /* Loop through the uncompiled pattern until we're at the end. */ + while (p != pend) { + PATFETCH(c); + + switch (c) { + case '^': + { + if (p == pattern + 1 || + syntax & RE_CONTEXT_INDEP_ANCHORS || + at_begline_loc_p(pattern, p, syntax)) + BUF_PUSH(begline); + else + goto normal_char; + } + break; + + case '$': + { + if (p == pend || + syntax & RE_CONTEXT_INDEP_ANCHORS || + at_endline_loc_p(p, pend, syntax)) + BUF_PUSH(endline); + else + goto normal_char; + } + break; + + case '+': + + case '?': + if ((syntax & RE_BK_PLUS_QM) || + (syntax & RE_LIMITED_OPS)) + goto normal_char; + handle_plus: + + case '*': + /* If there is no previous pattern... */ + if (!laststart) { + if (syntax & RE_CONTEXT_INVALID_OPS) + return REG_BADRPT; + else if (!(syntax & RE_CONTEXT_INDEP_OPS)) + goto normal_char; + } + + { + /* Are we optimizing this jump? */ + boolean keep_string_p = false; + + /* 1 means zero (many) matches is allowed. 
*/ + char zero_times_ok = 0, many_times_ok = 0; + + for (;;) { + zero_times_ok |= c != '+'; + many_times_ok |= c != '?'; + + if (p == pend) + break; + + PATFETCH(c); + + if (c == '*' || (!(syntax & RE_BK_PLUS_QM) && + (c == '+' || c == '?'))); + + else if (syntax & RE_BK_PLUS_QM && c == '\\') { + if (p == pend) + return REG_EESCAPE; + + PATFETCH(c1); + if (!(c1 == '+' || c1 == '?')) { + PATUNFETCH; + PATUNFETCH; + break; + } + + c = c1; + } else { + PATUNFETCH; + break; + } + } + + if (!laststart) + break; + + if (many_times_ok) { + assert(p - 1 > pattern); + + /* Allocate the space for the jump. */ + GET_BUFFER_SPACE(3); + + if (TRANSLATE(*(p - 2)) == TRANSLATE('.') && + zero_times_ok && p < pend && + TRANSLATE(*p) == TRANSLATE('\n') && + !(syntax & RE_DOT_NEWLINE)) { + /* We have .*\n. */ + STORE_JUMP(jump, b, laststart); + keep_string_p = true; + } else + STORE_JUMP(maybe_pop_jump, b, + laststart - 3); + + b += 3; + } + + GET_BUFFER_SPACE(3); + INSERT_JUMP(keep_string_p ? + on_failure_keep_string_jump : + on_failure_jump, laststart, + b + 3); + pending_exact = 0; + b += 3; + + if (!zero_times_ok) { + GET_BUFFER_SPACE(3); + INSERT_JUMP(dummy_failure_jump, + laststart, + laststart + 6); + b += 3; + } + } + break; + + + case '.': + laststart = b; + BUF_PUSH(anychar); + break; + + case '[': + { + boolean had_char_class = false; + + if (p == pend) + return REG_EBRACK; + + GET_BUFFER_SPACE(34); + + laststart = b; + + /* We test `*p == '^' twice, instead of using an if + statement, so we only need one BUF_PUSH. */ + BUF_PUSH(*p == '^' ? charset_not : charset); + if (*p == '^') + p++; + + p1 = p; + + /* Push the number of bytes in the bitmap. */ + BUF_PUSH((1 << BYTEWIDTH) / BYTEWIDTH); + + /* Clear the whole map. */ + bzero(b, (1 << BYTEWIDTH) / BYTEWIDTH); + + if ((re_opcode_t) b[-2] == charset_not + && (syntax & RE_HAT_LISTS_NOT_NEWLINE)) + SET_LIST_BIT('\n'); + + /* Read in characters and ranges, setting map bits. 
*/ + for (;;) { + if (p == pend) + return REG_EBRACK; + + PATFETCH(c); + + if ((syntax & RE_BACKSLASH_ESCAPE_IN_LISTS) && + c == '\\') { + if (p == pend) + return REG_EESCAPE; + + PATFETCH(c1); + SET_LIST_BIT(c1); + continue; + } + + if (c == ']' && p != p1 + 1) + break; + + if (had_char_class && c == '-' && *p != ']') + return REG_ERANGE; + + if (c == '-' && !(p - 2 >= pattern && + p[-2] == '[') && !(p - 3 >= pattern && + p[-3] == '[' && p[-2] == '^') && + *p != ']') { + reg_errcode_t ret = + compile_range(&p, pend, translate, + syntax, b); + if (ret != REG_NOERROR) + return ret; + } + + else if (p[0] == '-' && p[1] != ']') { + reg_errcode_t ret; + + /* Move past the `-'. */ + PATFETCH(c1); + + ret = compile_range(&p, pend, translate, + syntax, b); + if (ret != REG_NOERROR) + return ret; + } + + else if (syntax & RE_CHAR_CLASSES && + c == '[' && *p == ':') { + char str[CHAR_CLASS_MAX_LENGTH + 1]; + + PATFETCH(c); + c1 = 0; + + /* If pattern is `[[:'. */ + if (p == pend) + return REG_EBRACK; + + for (;;) { + PATFETCH(c); + if (c == ':' || c == ']' || + p == pend || c1 == + CHAR_CLASS_MAX_LENGTH) + break; + str[c1++] = c; + } + str[c1] = '\0'; + + if (c == ':' && *p == ']') { + int ch; + boolean is_alnum = + STREQ(str, "alnum"); + boolean is_alpha = + STREQ(str, "alpha"); + boolean is_blank = + STREQ(str, "blank"); + boolean is_cntrl = + STREQ(str, "cntrl"); + boolean is_digit = + STREQ(str, "digit"); + boolean is_graph = + STREQ(str, "graph"); + boolean is_lower = + STREQ(str, "lower"); + boolean is_print = + STREQ(str, "print"); + boolean is_punct = + STREQ(str, "punct"); + boolean is_space = + STREQ(str, "space"); + boolean is_upper = + STREQ(str, "upper"); + boolean is_xdigit = + STREQ(str, "xdigit"); + + if (!IS_CHAR_CLASS(str)) + return REG_ECTYPE; + + PATFETCH(c); + + if (p == pend) + return REG_EBRACK; + + for (ch = 0; ch < 1 << + BYTEWIDTH; ch++) { + if ((is_alnum && + ISALNUM(ch)) || + (is_alpha && + ISALPHA(ch)) || + (is_blank && + ISBLANK(ch)) || + 
(is_cntrl && + ISCNTRL(ch)) || + (is_digit && + ISDIGIT(ch)) || + (is_graph && + ISGRAPH(ch)) || + (is_lower && + ISLOWER(ch)) || + (is_print && + ISPRINT(ch)) || + (is_punct && + ISPUNCT(ch)) || + (is_space && + ISSPACE(ch)) || + (is_upper && + ISUPPER(ch)) || + (is_xdigit && + ISXDIGIT(ch))) + SET_LIST_BIT(ch); + } + had_char_class = + true; + } else { + c1++; + while (c1--) + PATUNFETCH; + SET_LIST_BIT('['); + SET_LIST_BIT(':'); + had_char_class = false; + } + } else { + had_char_class = false; + SET_LIST_BIT(c); + } + } + + while ((int) b[-1] > 0 + && b[b[-1] - 1] == 0) + b[-1]--; + b += b[-1]; + } + break; + + case '(': + if (syntax & RE_NO_BK_PARENS) + goto handle_open; + else + goto normal_char; + + + case ')': + if (syntax & RE_NO_BK_PARENS) + goto handle_close; + else + goto normal_char; + + + case '\n': + if (syntax & RE_NEWLINE_ALT) + goto handle_alt; + else + goto normal_char; + + + case '|': + if (syntax & RE_NO_BK_VBAR) + goto handle_alt; + else + goto normal_char; + + + case '{': + if (syntax & RE_INTERVALS + && syntax & RE_NO_BK_BRACES) + goto handle_interval; + else + goto normal_char; + + + case '\\': + if (p == pend) + return REG_EESCAPE; + + PATFETCH_RAW(c); + + switch (c) { + case '(': + if (syntax & RE_NO_BK_PARENS) + goto normal_backslash; + + handle_open: + bufp->re_nsub++; + regnum++; + + if (COMPILE_STACK_FULL) { + RETALLOC(compile_stack.stack, + compile_stack.size << 1, + compile_stack_elt_t); + if (compile_stack.stack == NULL) + return REG_ESPACE; + + compile_stack.size <<= 1; + } + + COMPILE_STACK_TOP.begalt_offset = + begalt - bufp->buffer; + COMPILE_STACK_TOP.fixup_alt_jump = + fixup_alt_jump ? 
fixup_alt_jump - + bufp->buffer + 1 : 0; + COMPILE_STACK_TOP.laststart_offset = + b - bufp->buffer; + COMPILE_STACK_TOP.regnum = regnum; + + if (regnum <= MAX_REGNUM) { + COMPILE_STACK_TOP.inner_group_offset = + b - bufp->buffer + 2; + BUF_PUSH_3(start_memory, regnum, 0); + } + + compile_stack.avail++; + + fixup_alt_jump = 0; + laststart = 0; + begalt = b; + pending_exact = 0; + break; + + case ')': + if (syntax & RE_NO_BK_PARENS) + goto normal_backslash; + + if (COMPILE_STACK_EMPTY) { + if (syntax & RE_UNMATCHED_RIGHT_PAREN_ORD) + goto normal_backslash; + else + return REG_ERPAREN; + } + + handle_close: + if (fixup_alt_jump) { + BUF_PUSH(push_dummy_failure); + STORE_JUMP(jump_past_alt, + fixup_alt_jump, b - 1); + } + + if (COMPILE_STACK_EMPTY) { + if (syntax & RE_UNMATCHED_RIGHT_PAREN_ORD) + goto normal_char; + else + return REG_ERPAREN; + } + + assert(compile_stack.avail != 0); + { + regnum_t this_group_regnum; + + compile_stack.avail--; + begalt = bufp->buffer + + COMPILE_STACK_TOP.begalt_offset; + fixup_alt_jump = + COMPILE_STACK_TOP.fixup_alt_jump ? + bufp->buffer + COMPILE_STACK_TOP. + fixup_alt_jump - 1 : 0; + laststart = bufp->buffer + + COMPILE_STACK_TOP.laststart_offset; + this_group_regnum = COMPILE_STACK_TOP.regnum; + pending_exact = 0; + + if (this_group_regnum <= MAX_REGNUM) { + unsigned char + *inner_group_loc = bufp->buffer + + COMPILE_STACK_TOP. + inner_group_offset; + + *inner_group_loc = regnum - + this_group_regnum; + BUF_PUSH_3(stop_memory, + this_group_regnum, + regnum - this_group_regnum); + } + } + break; + + + case '|': /* `\|'. 
*/ + if (syntax & RE_LIMITED_OPS || syntax & RE_NO_BK_VBAR) + goto normal_backslash; + handle_alt: + if (syntax & RE_LIMITED_OPS) + goto normal_char; + + GET_BUFFER_SPACE(3); + INSERT_JUMP(on_failure_jump, begalt, b + 6); + pending_exact = 0; + b += 3; + + if (fixup_alt_jump) + STORE_JUMP(jump_past_alt, fixup_alt_jump, b); + + fixup_alt_jump = b; + GET_BUFFER_SPACE(3); + b += 3; + + laststart = 0; + begalt = b; + break; + + + case '{': + /* If \{ is a literal. */ + if (!(syntax & RE_INTERVALS) || ((syntax & RE_INTERVALS) + && (syntax & RE_NO_BK_BRACES)) + || (p - 2 == pattern && p == pend)) + goto normal_backslash; + + handle_interval: + { + int lower_bound = -1, upper_bound = -1; + beg_interval = p - 1; + + if (p == pend) { + if (syntax & RE_NO_BK_BRACES) + goto unfetch_interval; + else + return REG_EBRACE; + } + + GET_UNSIGNED_NUMBER(lower_bound); + + if (c == ',') { + GET_UNSIGNED_NUMBER(upper_bound); + if (upper_bound < 0) + upper_bound = RE_DUP_MAX; + } else + upper_bound = lower_bound; + + if (lower_bound < 0 || upper_bound > RE_DUP_MAX + || lower_bound > upper_bound) { + if (syntax & RE_NO_BK_BRACES) + goto unfetch_interval; + else + return REG_BADBR; + } + + if (!(syntax & RE_NO_BK_BRACES)) { + if (c != '\\') + return REG_EBRACE; + + PATFETCH(c); + } + + if (c != '}') { + if (syntax & RE_NO_BK_BRACES) + goto unfetch_interval; + else + return REG_BADBR; + } + + if (!laststart) { + if (syntax & RE_CONTEXT_INVALID_OPS) + return REG_BADRPT; + else if (syntax & RE_CONTEXT_INDEP_OPS) + laststart = b; + else + goto unfetch_interval; + } + + if (upper_bound == 0) { + GET_BUFFER_SPACE(3); + INSERT_JUMP(jump, laststart, b + 3); + b += 3; + } + + else { + unsigned nbytes = + 10 + (upper_bound > 1) * 10; + + GET_BUFFER_SPACE(nbytes); + + INSERT_JUMP2(succeed_n, laststart, + b + 5 + (upper_bound > + 1) * 5, lower_bound); + b += 5; + + insert_op2(set_number_at, laststart, 5, + lower_bound, b); + b += 5; + + if (upper_bound > 1) { + STORE_JUMP2(jump_n, b, + laststart + 5, 
+ upper_bound - 1); + b += 5; + + insert_op2(set_number_at, + laststart, + b - laststart, + upper_bound - 1, b); + b += 5; + } + } + pending_exact = 0; + beg_interval = NULL; + } + break; + + unfetch_interval: + assert(beg_interval); + p = beg_interval; + beg_interval = NULL; + + /* normal_char and normal_backslash need `c'. */ + PATFETCH(c); + + if (!(syntax & RE_NO_BK_BRACES)) { + if (p > pattern && p[-1] == '\\') + goto normal_backslash; + } + goto normal_char; + + case 'w': + if (re_syntax_options & RE_NO_GNU_OPS) + goto normal_char; + laststart = b; + BUF_PUSH(wordchar); + break; + + + case 'W': + if (re_syntax_options & RE_NO_GNU_OPS) + goto normal_char; + laststart = b; + BUF_PUSH(notwordchar); + break; + + + case '<': + if (re_syntax_options & RE_NO_GNU_OPS) + goto normal_char; + BUF_PUSH(wordbeg); + break; + + case '>': + if (re_syntax_options & RE_NO_GNU_OPS) + goto normal_char; + BUF_PUSH(wordend); + break; + + case 'b': + if (re_syntax_options & RE_NO_GNU_OPS) + goto normal_char; + BUF_PUSH(wordbound); + break; + + case 'B': + if (re_syntax_options & RE_NO_GNU_OPS) + goto normal_char; + BUF_PUSH(notwordbound); + break; + + case '`': + if (re_syntax_options & RE_NO_GNU_OPS) + goto normal_char; + BUF_PUSH(begbuf); + break; + + case '\'': + if (re_syntax_options & RE_NO_GNU_OPS) + goto normal_char; + BUF_PUSH(endbuf); + break; + + case '1': + case '2': + case '3': + case '4': + case '5': + case '6': + case '7': + case '8': + case '9': + if (syntax & RE_NO_BK_REFS) + goto normal_char; + + c1 = c - '0'; + + if (c1 > regnum) + return REG_ESUBREG; + + /* Can't back reference to a subexpression if inside of it. 
*/ + if (group_in_compile_stack + (compile_stack, (regnum_t) c1)) + goto normal_char; + + laststart = b; + BUF_PUSH_2(duplicate, c1); + break; + + + case '+': + case '?': + if (syntax & RE_BK_PLUS_QM) + goto handle_plus; + else + goto normal_backslash; + + default: + normal_backslash: + /* You might think it would be useful for \ to mean + not to translate; but if we don't translate it + it will never match anything. */ + c = TRANSLATE(c); + goto normal_char; + } + break; + + + default: + /* Expects the character in `c'. */ + normal_char: + /* If no exactn currently being built. */ + if (!pending_exact + /* If last exactn not at current position. */ + || pending_exact + *pending_exact + 1 != b + /* We have only one byte following the exactn for the count. */ + || *pending_exact == (1 << BYTEWIDTH) - 1 + /* If followed by a repetition operator. */ + || *p == '*' || *p == '^' + || ((syntax & RE_BK_PLUS_QM) + ? *p == '\\' && (p[1] == '+' + || p[1] == '?') + : (*p == '+' || *p == '?')) + || ((syntax & RE_INTERVALS) + && ((syntax & RE_NO_BK_BRACES) + ? *p == '{' + : (p[0] == '\\' && p[1] == '{')))) { + /* Start building a new exactn. */ + + laststart = b; + + BUF_PUSH_2(exactn, 0); + pending_exact = b - 1; + } + + BUF_PUSH(c); + (*pending_exact)++; + break; + } /* switch (c) */ + } /* while p != pend */ + + + /* Through the pattern now. */ + + if (fixup_alt_jump) + STORE_JUMP(jump_past_alt, fixup_alt_jump, b); + + if (!COMPILE_STACK_EMPTY) + return REG_EPAREN; + + free(compile_stack.stack); + + /* We have succeeded; set the length of the buffer. */ + bufp->used = b - bufp->buffer; + + return REG_NOERROR; +} /* regex_compile */ + +/* Subroutines for `regex_compile'. */ + +/* Store OP at LOC followed by two-byte integer parameter ARG. */ + +static void store_op1(op, loc, arg) +re_opcode_t op; +unsigned char *loc; +int arg; +{ + *loc = (unsigned char) op; + STORE_NUMBER(loc + 1, arg); +} + + +/* Like `store_op1', but for two two-byte parameters ARG1 and ARG2. 
*/ + +static void store_op2(op, loc, arg1, arg2) +re_opcode_t op; +unsigned char *loc; +int arg1, arg2; +{ + *loc = (unsigned char) op; + STORE_NUMBER(loc + 1, arg1); + STORE_NUMBER(loc + 3, arg2); +} + + +/* Copy the bytes from LOC to END to open up three bytes of space at LOC + for OP followed by two-byte integer parameter ARG. */ + +static void insert_op1(op, loc, arg, end) +re_opcode_t op; +unsigned char *loc; +int arg; +unsigned char *end; +{ + register unsigned char *pfrom = end; + register unsigned char *pto = end + 3; + + while (pfrom != loc) + *--pto = *--pfrom; + + store_op1(op, loc, arg); +} + + +/* Like `insert_op1', but for two two-byte parameters ARG1 and ARG2. */ + +static void insert_op2(op, loc, arg1, arg2, end) +re_opcode_t op; +unsigned char *loc; +int arg1, arg2; +unsigned char *end; +{ + register unsigned char *pfrom = end; + register unsigned char *pto = end + 5; + + while (pfrom != loc) + *--pto = *--pfrom; + + store_op2(op, loc, arg1, arg2); +} + + +/* P points to just after a ^ in PATTERN. Return true if that ^ comes + after an alternative or a begin-subexpression. We assume there is at + least one character before the ^. */ + +static boolean at_begline_loc_p(pattern, p, syntax) +const char *pattern, *p; +reg_syntax_t syntax; +{ + const char *prev = p - 2; + boolean prev_prev_backslash = prev > pattern && prev[-1] == '\\'; + + return + /* After a subexpression? */ + (*prev == '(' + && (syntax & RE_NO_BK_PARENS || prev_prev_backslash)) + /* After an alternative? */ + || (*prev == '|' + && (syntax & RE_NO_BK_VBAR || prev_prev_backslash)); +} + + +/* The dual of at_begline_loc_p. This one is for $. We assume there is + at least one character after the $, i.e., `P < PEND'. */ + +static boolean at_endline_loc_p(p, pend, syntax) +const char *p, *pend; +reg_syntax_t syntax; +{ + const char *next = p; + boolean next_backslash = *next == '\\'; + const char *next_next = p + 1 < pend ? p + 1 : NULL; + + return + /* Before a subexpression? 
*/ + (syntax & RE_NO_BK_PARENS ? *next == ')' + : next_backslash && next_next && *next_next == ')') + /* Before an alternative? */ + || (syntax & RE_NO_BK_VBAR ? *next == '|' + : next_backslash && next_next && *next_next == '|'); +} + + +/* Returns true if REGNUM is in one of COMPILE_STACK's elements and + false if it's not. */ + +static boolean group_in_compile_stack(compile_stack, regnum) +compile_stack_type compile_stack; +regnum_t regnum; +{ + int this_element; + + for (this_element = compile_stack.avail - 1; + this_element >= 0; this_element--) + if (compile_stack.stack[this_element].regnum == regnum) + return true; + + return false; +} + + +/* Read the ending character of a range (in a bracket expression) from the + uncompiled pattern *P_PTR (which ends at PEND). We assume the + starting character is in `P[-2]'. (`P[-1]' is the character `-'.) + Then we set the translation of all bits between the starting and + ending characters (inclusive) in the compiled pattern B. + + Return an error code. + + We use these short variable names so we can use the same macros as + `regex_compile' itself. */ + +static reg_errcode_t compile_range(p_ptr, pend, translate, syntax, b) +const char **p_ptr, *pend; +char *translate; +reg_syntax_t syntax; +unsigned char *b; +{ + unsigned this_char; + + const char *p = *p_ptr; + int range_start, range_end; + + if (p == pend) + return REG_ERANGE; + + /* Even though the pattern is a signed `char *', we need to fetch + with unsigned char *'s; if the high bit of the pattern character + is set, the range endpoints will be negative if we fetch using a + signed char *. + + We also want to fetch the endpoints without translating them; the + appropriate translation is done in the bit-setting loop below. */ + range_start = ((unsigned char *) p)[-2]; + range_end = ((unsigned char *) p)[0]; + + /* Have to increment the pointer into the pattern string, so the + caller isn't still at the ending character. 
*/ + (*p_ptr)++; + + /* If the start is after the end, the range is empty. */ + if (range_start > range_end) + return syntax & RE_NO_EMPTY_RANGES ? REG_ERANGE : + REG_NOERROR; + + /* Here we see why `this_char' has to be larger than an `unsigned + char' -- the range is inclusive, so if `range_end' == 0xff + (assuming 8-bit characters), we would otherwise go into an infinite + loop, since all characters are <= 0xff. */ + for (this_char = range_start; this_char <= range_end; this_char++) { + SET_LIST_BIT(TRANSLATE(this_char)); + } + return REG_NOERROR; +} + +/* Failure stack declarations and macros; both re_compile_fastmap and + re_match_2 use a failure stack. These have to be macros because of + REGEX_ALLOCATE. */ + + +/* Number of failure points for which to initially allocate space + when matching. If this number is exceeded, we allocate more + space, so it is not a hard limit. */ +#define INIT_FAILURE_ALLOC 5 + +/* Roughly the maximum number of failure points on the stack. It would be + exactly that if we always used MAX_FAILURE_ITEMS items each time we failed. + This is a variable only so users of regex can assign to it; we never + change it ourselves. */ +int re_max_failures = 2000; + +typedef const unsigned char *fail_stack_elt_t; + +typedef struct { + fail_stack_elt_t *stack; + unsigned size; + unsigned avail; /* Offset of next open position. */ +} fail_stack_type; + +#define FAIL_STACK_EMPTY() (fail_stack.avail == 0) +#define FAIL_STACK_PTR_EMPTY() (fail_stack_ptr->avail == 0) +#define FAIL_STACK_FULL() (fail_stack.avail == fail_stack.size) +#define FAIL_STACK_TOP() (fail_stack.stack[fail_stack.avail]) + + +/* Initialize `fail_stack'. Do `return -2' if the alloc fails. 
*/ + +#define INIT_FAIL_STACK() \ + do { \ + fail_stack.stack = (fail_stack_elt_t *) \ + REGEX_ALLOCATE (INIT_FAILURE_ALLOC * sizeof (fail_stack_elt_t)); \ + \ + if (fail_stack.stack == NULL) \ + return -2; \ + \ + fail_stack.size = INIT_FAILURE_ALLOC; \ + fail_stack.avail = 0; \ + } while (0) + + +/* Double the size of FAIL_STACK, up to approximately `re_max_failures' items. + + Return 1 if succeeds, and 0 if either ran out of memory + allocating space for it or it was already too large. + + REGEX_REALLOCATE requires `destination' be declared. */ + +#define DOUBLE_FAIL_STACK(fail_stack) \ + ((fail_stack).size > re_max_failures * MAX_FAILURE_ITEMS \ + ? 0 \ + : ((fail_stack).stack = (fail_stack_elt_t *) \ + REGEX_REALLOCATE ((fail_stack).stack, \ + (fail_stack).size * sizeof (fail_stack_elt_t), \ + ((fail_stack).size << 1) * sizeof (fail_stack_elt_t)), \ + \ + (fail_stack).stack == NULL \ + ? 0 \ + : ((fail_stack).size <<= 1, \ + 1))) + + +/* Push PATTERN_OP on FAIL_STACK. + + Return 1 if was able to do so and 0 if ran out of memory allocating + space to do so. */ +#define PUSH_PATTERN_OP(pattern_op, fail_stack) \ + ((FAIL_STACK_FULL () \ + && !DOUBLE_FAIL_STACK (fail_stack)) \ + ? 0 \ + : ((fail_stack).stack[(fail_stack).avail++] = pattern_op, \ + 1)) + +/* This pushes an item onto the failure stack. Must be a four-byte + value. Assumes the variable `fail_stack'. Probably should only + be called from within `PUSH_FAILURE_POINT'. */ +#define PUSH_FAILURE_ITEM(item) \ + fail_stack.stack[fail_stack.avail++] = (fail_stack_elt_t) item + +/* The complement operation. Assumes `fail_stack' is nonempty. */ +#define POP_FAILURE_ITEM() fail_stack.stack[--fail_stack.avail] + +/* Used to omit pushing failure point id's when we're not debugging. */ +#define DEBUG_PUSH(item) +#define DEBUG_POP(item_addr) + + +/* Push the information about the state we will need + if we ever fail back to it. 
+ + Requires variables fail_stack, regstart, regend, reg_info, and + num_regs be declared. DOUBLE_FAIL_STACK requires `destination' be + declared. + + Does `return FAILURE_CODE' if runs out of memory. */ + +#define PUSH_FAILURE_POINT(pattern_place, string_place, failure_code) \ + do { \ + char *destination; \ + /* Must be int, so when we don't save any registers, the arithmetic \ + of 0 + -1 isn't done as unsigned. */ \ + /* Can't be int, since there is not a shred of a guarantee that int \ + is wide enough to hold a value of something to which pointer can \ + be assigned */ \ + s_reg_t this_reg; \ + \ + DEBUG_STATEMENT (failure_id++); \ + DEBUG_STATEMENT (nfailure_points_pushed++); \ + DEBUG_PRINT2 ("\nPUSH_FAILURE_POINT #%u:\n", failure_id); \ + DEBUG_PRINT2 (" Before push, next avail: %d\n", (fail_stack).avail);\ + DEBUG_PRINT2 (" size: %d\n", (fail_stack).size);\ + \ + DEBUG_PRINT2 (" slots needed: %d\n", NUM_FAILURE_ITEMS); \ + DEBUG_PRINT2 (" available: %d\n", REMAINING_AVAIL_SLOTS); \ + \ + /* Ensure we have enough space allocated for what we will push. */ \ + while (REMAINING_AVAIL_SLOTS < NUM_FAILURE_ITEMS) \ + { \ + if (!DOUBLE_FAIL_STACK (fail_stack)) \ + return failure_code; \ + \ + DEBUG_PRINT2 ("\n Doubled stack; size now: %d\n", \ + (fail_stack).size); \ + DEBUG_PRINT2 (" slots available: %d\n", REMAINING_AVAIL_SLOTS);\ + } + +#define PUSH_FAILURE_POINT2(pattern_place, string_place, failure_code) \ + /* Push the info, starting with the registers. 
*/ \ + DEBUG_PRINT1 ("\n"); \ + \ + PUSH_FAILURE_POINT_LOOP (); \ + \ + DEBUG_PRINT2 (" Pushing low active reg: %d\n", lowest_active_reg);\ + PUSH_FAILURE_ITEM (lowest_active_reg); \ + \ + DEBUG_PRINT2 (" Pushing high active reg: %d\n", highest_active_reg);\ + PUSH_FAILURE_ITEM (highest_active_reg); \ + \ + DEBUG_PRINT2 (" Pushing pattern 0x%x: ", pattern_place); \ + DEBUG_PRINT_COMPILED_PATTERN (bufp, pattern_place, pend); \ + PUSH_FAILURE_ITEM (pattern_place); \ + \ + DEBUG_PRINT2 (" Pushing string 0x%x: `", string_place); \ + DEBUG_PRINT_DOUBLE_STRING (string_place, string1, size1, string2, \ + size2); \ + DEBUG_PRINT1 ("'\n"); \ + PUSH_FAILURE_ITEM (string_place); \ + \ + DEBUG_PRINT2 (" Pushing failure id: %u\n", failure_id); \ + DEBUG_PUSH (failure_id); \ + } while (0) + +/* Pulled out of PUSH_FAILURE_POINT() to shorten the definition + of that macro. (for VAX C) */ +#define PUSH_FAILURE_POINT_LOOP() \ + for (this_reg = lowest_active_reg; this_reg <= highest_active_reg; \ + this_reg++) \ + { \ + DEBUG_PRINT2 (" Pushing reg: %d\n", this_reg); \ + DEBUG_STATEMENT (num_regs_pushed++); \ + \ + DEBUG_PRINT2 (" start: 0x%x\n", regstart[this_reg]); \ + PUSH_FAILURE_ITEM (regstart[this_reg]); \ + \ + DEBUG_PRINT2 (" end: 0x%x\n", regend[this_reg]); \ + PUSH_FAILURE_ITEM (regend[this_reg]); \ + \ + DEBUG_PRINT2 (" info: 0x%x\n ", reg_info[this_reg]); \ + DEBUG_PRINT2 (" match_null=%d", \ + REG_MATCH_NULL_STRING_P (reg_info[this_reg])); \ + DEBUG_PRINT2 (" active=%d", IS_ACTIVE (reg_info[this_reg])); \ + DEBUG_PRINT2 (" matched_something=%d", \ + MATCHED_SOMETHING (reg_info[this_reg])); \ + DEBUG_PRINT2 (" ever_matched=%d", \ + EVER_MATCHED_SOMETHING (reg_info[this_reg])); \ + DEBUG_PRINT1 ("\n"); \ + PUSH_FAILURE_ITEM (reg_info[this_reg].word); \ + } + +/* This is the number of items that are pushed and popped on the stack + for each register. */ +#define NUM_REG_ITEMS 3 + +/* Individual items aside from the registers. 
*/ +#define NUM_NONREG_ITEMS 4 + +/* We push at most this many items on the stack. */ +#define MAX_FAILURE_ITEMS ((num_regs - 1) * NUM_REG_ITEMS + NUM_NONREG_ITEMS) + +/* We actually push this many items. */ +#define NUM_FAILURE_ITEMS \ + ((highest_active_reg - lowest_active_reg + 1) * NUM_REG_ITEMS \ + + NUM_NONREG_ITEMS) + +/* How many items can still be added to the stack without overflowing it. */ +#define REMAINING_AVAIL_SLOTS ((fail_stack).size - (fail_stack).avail) + + +/* Pops what PUSH_FAILURE_POINT pushes. + + We restore into the parameters, all of which should be lvalues: + STR -- the saved data position. + PAT -- the saved pattern position. + LOW_REG, HIGH_REG -- the lowest and highest active registers. + REGSTART, REGEND -- arrays of string positions. + REG_INFO -- array of information about each subexpression. + + Also assumes the variables `fail_stack' and (if debugging), `bufp', + `pend', `string1', `size1', `string2', and `size2'. */ + +#define POP_FAILURE_POINT(str, pat, low_reg, high_reg, regstart, regend, reg_info)\ +{ \ + DEBUG_STATEMENT (fail_stack_elt_t failure_id;) \ + s_reg_t this_reg; \ + const unsigned char *string_temp; \ + \ + assert (!FAIL_STACK_EMPTY ()); \ + \ + /* Remove failure points and point to how many regs pushed. */ \ + DEBUG_PRINT1 ("POP_FAILURE_POINT:\n"); \ + DEBUG_PRINT2 (" Before pop, next avail: %d\n", fail_stack.avail); \ + DEBUG_PRINT2 (" size: %d\n", fail_stack.size); \ + \ + assert (fail_stack.avail >= NUM_NONREG_ITEMS); \ + \ + DEBUG_POP (&failure_id); \ + DEBUG_PRINT2 (" Popping failure id: %u\n", failure_id); \ + \ + /* If the saved string location is NULL, it came from an \ + on_failure_keep_string_jump opcode, and we want to throw away the \ + saved NULL, thus retaining our current position in the string. 
*/ \ + string_temp = POP_FAILURE_ITEM (); \ + if (string_temp != NULL) \ + str = (const char *) string_temp; \ + \ + DEBUG_PRINT2 (" Popping string 0x%x: `", str); \ + DEBUG_PRINT_DOUBLE_STRING (str, string1, size1, string2, size2); \ + DEBUG_PRINT1 ("'\n"); \ + \ + pat = (unsigned char *) POP_FAILURE_ITEM (); \ + DEBUG_PRINT2 (" Popping pattern 0x%x: ", pat); \ + DEBUG_PRINT_COMPILED_PATTERN (bufp, pat, pend); \ + \ + POP_FAILURE_POINT2 (low_reg, high_reg, regstart, regend, reg_info); + +/* Pulled out of POP_FAILURE_POINT() to shorten the definition + of that macro. (for MSC 5.1) */ +#define POP_FAILURE_POINT2(low_reg, high_reg, regstart, regend, reg_info) \ + \ + /* Restore register info. */ \ + high_reg = (active_reg_t) POP_FAILURE_ITEM (); \ + DEBUG_PRINT2 (" Popping high active reg: %d\n", high_reg); \ + \ + low_reg = (active_reg_t) POP_FAILURE_ITEM (); \ + DEBUG_PRINT2 (" Popping low active reg: %d\n", low_reg); \ + \ + for (this_reg = high_reg; this_reg >= low_reg; this_reg--) \ + { \ + DEBUG_PRINT2 (" Popping reg: %d\n", this_reg); \ + \ + reg_info[this_reg].word = POP_FAILURE_ITEM (); \ + DEBUG_PRINT2 (" info: 0x%x\n", reg_info[this_reg]); \ + \ + regend[this_reg] = (const char *) POP_FAILURE_ITEM (); \ + DEBUG_PRINT2 (" end: 0x%x\n", regend[this_reg]); \ + \ + regstart[this_reg] = (const char *) POP_FAILURE_ITEM (); \ + DEBUG_PRINT2 (" start: 0x%x\n", regstart[this_reg]); \ + } \ + \ + DEBUG_STATEMENT (nfailure_points_popped++); \ +} /* POP_FAILURE_POINT */ + +/* re_compile_fastmap computes a ``fastmap'' for the compiled pattern in + BUFP. A fastmap records which of the (1 << BYTEWIDTH) possible + characters can start a string that matches the pattern. This fastmap + is used by re_search to skip quickly over impossible starting points. + + The caller must supply the address of a (1 << BYTEWIDTH)-byte data + area as BUFP->fastmap. + + We set the `fastmap', `fastmap_accurate', and `can_be_null' fields in + the pattern buffer. 
+ + Returns 0 if we succeed, -2 if an internal error. */ + +int re_compile_fastmap(bufp) +struct re_pattern_buffer *bufp; +{ + int j, k; + fail_stack_type fail_stack; + char *destination; + /* We don't push any register information onto the failure stack. */ + unsigned num_regs = 0; + + register char *fastmap = bufp->fastmap; + unsigned char *pattern = bufp->buffer; + const unsigned char *p = pattern; + register unsigned char *pend = pattern + bufp->used; + + /* Assume that each path through the pattern can be null until + proven otherwise. We set this false at the bottom of switch + statement, to which we get only if a particular path doesn't + match the empty string. */ + boolean path_can_be_null = true; + + /* We aren't doing a `succeed_n' to begin with. */ + boolean succeed_n_p = false; + + assert(fastmap != NULL && p != NULL); + + INIT_FAIL_STACK(); + bzero(fastmap, 1 << BYTEWIDTH); /* Assume nothing's valid. */ + bufp->fastmap_accurate = 1; /* It will be when we're done. */ + bufp->can_be_null = 0; + + while (p != pend || !FAIL_STACK_EMPTY()) { + if (p == pend) { + bufp->can_be_null |= path_can_be_null; + + /* Reset for next path. */ + path_can_be_null = true; + + p = fail_stack.stack[--fail_stack.avail]; + } + + /* We should never be about to go beyond the end of the pattern. */ + assert(p < pend); + + switch ((re_opcode_t) * p++) { + + /* I guess the idea here is to simply not bother with a fastmap + if a backreference is used, since it's too hard to figure out + the fastmap for the corresponding group. Setting + `can_be_null' stops `re_search_2' from using the fastmap, so + that is all we do. */ + case duplicate: + bufp->can_be_null = 1; + return 0; + + + /* Following are the cases which match a character. These end + with `break'. 
*/ + + case exactn: + fastmap[p[1]] = 1; + break; + + + case charset: + for (j = *p++ * BYTEWIDTH - 1; j >= 0; j--) + if (p[j / BYTEWIDTH] & + (1 << (j % BYTEWIDTH))) + fastmap[j] = 1; + break; + + + case charset_not: + /* Chars beyond end of map must be allowed. */ + for (j = *p * BYTEWIDTH; j < (1 << BYTEWIDTH); j++) + fastmap[j] = 1; + + for (j = *p++ * BYTEWIDTH - 1; j >= 0; j--) + if (! + (p[j / BYTEWIDTH] & + (1 << (j % BYTEWIDTH)))) + fastmap[j] = 1; + break; + + + case wordchar: + for (j = 0; j < (1 << BYTEWIDTH); j++) + if (SYNTAX(j) == Sword) + fastmap[j] = 1; + break; + + + case notwordchar: + for (j = 0; j < (1 << BYTEWIDTH); j++) + if (SYNTAX(j) != Sword) + fastmap[j] = 1; + break; + + + case anychar: + /* `.' matches anything ... */ + for (j = 0; j < (1 << BYTEWIDTH); j++) + fastmap[j] = 1; + + /* ... except perhaps newline. */ + if (!(bufp->syntax & RE_DOT_NEWLINE)) + fastmap['\n'] = 0; + + /* Return if we have already set `can_be_null'; if we have, + then the fastmap is irrelevant. Something's wrong here. */ + else if (bufp->can_be_null) + return 0; + + /* Otherwise, have to check alternative paths. */ + break; + + case no_op: + case begline: + case endline: + case begbuf: + case endbuf: + case wordbound: + case notwordbound: + case wordbeg: + case wordend: + case push_dummy_failure: + continue; + + + case jump_n: + case pop_failure_jump: + case maybe_pop_jump: + case jump: + case jump_past_alt: + case dummy_failure_jump: + EXTRACT_NUMBER_AND_INCR(j, p); + p += j; + if (j > 0) + continue; + + /* Jump backward implies we just went through the body of a + loop and matched nothing. Opcode jumped to should be + `on_failure_jump' or `succeed_n'. Just treat it like an + ordinary jump. For a * loop, it has pushed its failure + point already; if so, discard that as redundant. 
*/ + if ((re_opcode_t) * p != on_failure_jump + && (re_opcode_t) * p != succeed_n) + continue; + + p++; + EXTRACT_NUMBER_AND_INCR(j, p); + p += j; + + /* If what's on the stack is where we are now, pop it. */ + if (!FAIL_STACK_EMPTY() + && fail_stack.stack[fail_stack.avail - 1] == p) + fail_stack.avail--; + + continue; + + + case on_failure_jump: + case on_failure_keep_string_jump: + handle_on_failure_jump: + EXTRACT_NUMBER_AND_INCR(j, p); + + /* For some patterns, e.g., `(a?)?', `p+j' here points to the + end of the pattern. We don't want to push such a point, + since when we restore it above, entering the switch will + increment `p' past the end of the pattern. We don't need + to push such a point since we obviously won't find any more + fastmap entries beyond `pend'. Such a pattern can match + the null string, though. */ + if (p + j < pend) { + if (!PUSH_PATTERN_OP(p + j, fail_stack)) + return -2; + } else + bufp->can_be_null = 1; + + if (succeed_n_p) { + EXTRACT_NUMBER_AND_INCR(k, p); /* Skip the n. */ + succeed_n_p = false; + } + + continue; + + + case succeed_n: + /* Get to the number of times to succeed. */ + p += 2; + + /* Increment p past the n for when k != 0. */ + EXTRACT_NUMBER_AND_INCR(k, p); + if (k == 0) { + p -= 4; + succeed_n_p = true; /* Spaghetti code alert. */ + goto handle_on_failure_jump; + } + continue; + + + case set_number_at: + p += 4; + continue; + + + case start_memory: + case stop_memory: + p += 2; + continue; + + + default: + abort(); /* We have listed all the cases. */ + } /* switch *p++ */ + + /* Getting here means we have found the possible starting + characters for one path of the pattern -- and that the empty + string does not match. We need not follow this path further. + Instead, look at the next alternative (remembered on the + stack), or quit if no more. The test at the top of the loop + does these things. 
*/ + path_can_be_null = false; + p = pend; + } /* while p */ + + /* Set `can_be_null' for the last path (also the first path, if the + pattern is empty). */ + bufp->can_be_null |= path_can_be_null; + return 0; +} /* re_compile_fastmap */ + +/* Set REGS to hold NUM_REGS registers, storing them in STARTS and + ENDS. Subsequent matches using PATTERN_BUFFER and REGS will use + this memory for recording register information. STARTS and ENDS + must be allocated using the malloc library routine, and must each + be at least NUM_REGS * sizeof (regoff_t) bytes long. + + If NUM_REGS == 0, then subsequent matches should allocate their own + register data. + + Unless this function is called, the first search or match using + PATTERN_BUFFER will allocate its own register data, without + freeing the old data. */ + +void re_set_registers(bufp, regs, num_regs, starts, ends) +struct re_pattern_buffer *bufp; +struct re_registers *regs; +unsigned num_regs; +regoff_t *starts, *ends; +{ + if (num_regs) { + bufp->regs_allocated = REGS_REALLOCATE; + regs->num_regs = num_regs; + regs->start = starts; + regs->end = ends; + } else { + bufp->regs_allocated = REGS_UNALLOCATED; + regs->num_regs = 0; + regs->start = regs->end = 0; + } +} + +/* Searching routines. */ + +/* Like re_search_2, below, but only one string is specified, and + doesn't let you say where to stop matching. */ + +int re_search(bufp, string, size, startpos, range, regs) +struct re_pattern_buffer *bufp; +const char *string; +int size, startpos, range; +struct re_registers *regs; +{ + return re_search_2(bufp, NULL, 0, string, size, startpos, range, + regs, size); +} + + +/* Using the compiled pattern in BUFP->buffer, first tries to match the + virtual concatenation of STRING1 and STRING2, starting first at index + STARTPOS, then at STARTPOS + 1, and so on. + + STRING1 and STRING2 have length SIZE1 and SIZE2, respectively. + + RANGE is how far to scan while trying to match. 
RANGE = 0 means try + only at STARTPOS; in general, the last start tried is STARTPOS + + RANGE. + + In REGS, return the indices of the virtual concatenation of STRING1 + and STRING2 that matched the entire BUFP->buffer and its contained + subexpressions. + + Do not consider matching one past the index STOP in the virtual + concatenation of STRING1 and STRING2. + + We return either the position in the strings at which the match was + found, -1 if no match, or -2 if error (such as failure + stack overflow). */ + +int +re_search_2(bufp, string1, size1, string2, size2, startpos, range, regs, + stop) +struct re_pattern_buffer *bufp; +const char *string1, *string2; +int size1, size2; +int startpos; +int range; +struct re_registers *regs; +int stop; +{ + int val; + register char *fastmap = bufp->fastmap; + register char *translate = bufp->translate; + int total_size = size1 + size2; + int endpos = startpos + range; + + /* Check for out-of-range STARTPOS. */ + if (startpos < 0 || startpos > total_size) + return -1; + + /* Fix up RANGE if it might eventually take us outside + the virtual concatenation of STRING1 and STRING2. */ + if (endpos < -1) + range = -1 - startpos; + else if (endpos > total_size) + range = total_size - startpos; + + /* If the search isn't to be a backwards one, don't waste time in a + search for a pattern that must be anchored. */ + if (bufp->used > 0 && (re_opcode_t) bufp->buffer[0] == begbuf + && range > 0) { + if (startpos > 0) + return -1; + else + range = 1; + } + + /* Update the fastmap now if not correct already. */ + if (fastmap && !bufp->fastmap_accurate) + if (re_compile_fastmap(bufp) == -2) + return -2; + + /* Loop through the string, looking for a place to start matching. */ + for (;;) { + /* If a fastmap is supplied, skip quickly over characters that + cannot be the start of a match. If the pattern can match the + null string, however, we don't need to skip characters; we want + the first null string. 
*/ + if (fastmap && startpos < total_size && !bufp->can_be_null) { + if (range > 0) { /* Searching forwards. */ + register const char *d; + register int lim = 0; + int irange = range; + + if (startpos < size1 + && startpos + range >= size1) + lim = range - (size1 - startpos); + + d = (startpos >= + size1 ? string2 - size1 : string1) + + startpos; + + /* Written out as an if-else to avoid testing `translate' + inside the loop. */ + if (translate) + while (range > lim + && !fastmap[(unsigned char) + translate[(unsigned char) *d++]]) + range--; + else + while (range > lim + && !fastmap[(unsigned char) + *d++]) + range--; + + startpos += irange - range; + } else { /* Searching backwards. */ + + register char c = (size1 == 0 + || startpos >= + size1 ? string2[startpos + - size1] + : string1[startpos]); + + if (!fastmap[(unsigned char) TRANSLATE(c)]) + goto advance; + } + } + + /* If can't match the null string, and that's all we have left, fail. */ + if (range >= 0 && startpos == total_size && fastmap + && !bufp->can_be_null) + return -1; + + val = re_match_2(bufp, string1, size1, string2, size2, + startpos, regs, stop); + if (val >= 0) + return startpos; + + if (val == -2) + return -2; + + advance: + if (!range) + break; + else if (range > 0) { + range--; + startpos++; + } else { + range++; + startpos--; + } + } + return -1; +} /* re_search_2 */ + +/* Structure for per-register (a.k.a. per-group) information. + This must not be longer than one word, because we push this value + onto the failure stack. Other register information, such as the + starting and ending positions (which are addresses), and the list of + inner groups (which is a bits list) are maintained in separate + variables. + + We are making a (strictly speaking) nonportable assumption here: that + the compiler will pack our bit fields into something that fits into + the type of `word', i.e., is something that fits into one item on the + failure stack. */ + +/* Declarations and macros for re_match_2. 
*/ + +typedef union { + fail_stack_elt_t word; + struct { + /* This field is one if this group can match the empty string, + zero if not. If not yet determined, `MATCH_NULL_UNSET_VALUE'. */ +#define MATCH_NULL_UNSET_VALUE 3 + unsigned match_null_string_p:2; + unsigned is_active:1; + unsigned matched_something:1; + unsigned ever_matched_something:1; + } bits; +} register_info_type; + +#define REG_MATCH_NULL_STRING_P(R) ((R).bits.match_null_string_p) +#define IS_ACTIVE(R) ((R).bits.is_active) +#define MATCHED_SOMETHING(R) ((R).bits.matched_something) +#define EVER_MATCHED_SOMETHING(R) ((R).bits.ever_matched_something) + +static boolean group_match_null_string_p (unsigned char **p, + unsigned char *end, + register_info_type * + reg_info); + +static boolean alt_match_null_string_p (unsigned char *p, unsigned char *end, + register_info_type * reg_info); + +static boolean common_op_match_null_string_p (unsigned char **p, + unsigned char *end, + register_info_type * reg_info); + +static int bcmp_translate (const char *s1, const char *s2, + int len, char *translate); + +/* Call this when have matched a real character; it sets `matched' flags + for the subexpressions which we are currently inside. Also records + that those subexprs have matched. */ +#define SET_REGS_MATCHED() \ + do \ + { \ + active_reg_t r; \ + for (r = lowest_active_reg; r <= highest_active_reg; r++) \ + { \ + MATCHED_SOMETHING (reg_info[r]) \ + = EVER_MATCHED_SOMETHING (reg_info[r]) \ + = 1; \ + } \ + } \ + while (0) + + +/* This converts PTR, a pointer into one of the search strings `string1' + and `string2' into an offset from the beginning of that string. */ +#define POINTER_TO_OFFSET(ptr) \ + (FIRST_STRING_P (ptr) ? (ptr) - string1 : (ptr) - string2 + size1) + +/* Registers are set to a sentinel when they haven't yet matched. */ +#define REG_UNSET_VALUE ((char *) -1) +#define REG_UNSET(e) ((e) == REG_UNSET_VALUE) + + +/* Macros for dealing with the split strings in re_match_2. 
*/ + +#define MATCHING_IN_FIRST_STRING (dend == end_match_1) + +/* Call before fetching a character with *d. This switches over to + string2 if necessary. */ +#define PREFETCH() \ + while (d == dend) \ + { \ + /* End of string2 => fail. */ \ + if (dend == end_match_2) \ + goto fail; \ + /* End of string1 => advance to string2. */ \ + d = string2; \ + dend = end_match_2; \ + } + + +/* Test if at very beginning or at very end of the virtual concatenation + of `string1' and `string2'. If only one string, it's `string2'. */ +#define AT_STRINGS_BEG(d) ((d) == (size1 ? string1 : string2) || !size2) +#define AT_STRINGS_END(d) ((d) == end2) + + +/* Test if D points to a character which is word-constituent. We have + two special cases to check for: if past the end of string1, look at + the first character in string2; and if before the beginning of + string2, look at the last character in string1. */ +#define WORDCHAR_P(d) \ + (SYNTAX ((d) == end1 ? *string2 \ + : (d) == string2 - 1 ? *(end1 - 1) : *(d)) \ + == Sword) + +/* Test if the character before D and the one at D differ with respect + to being word-constituent. */ +#define AT_WORD_BOUNDARY(d) \ + (AT_STRINGS_BEG (d) || AT_STRINGS_END (d) \ + || WORDCHAR_P (d - 1) != WORDCHAR_P (d)) + + +/* Free everything we malloc. */ +#define FREE_VARIABLES() alloca (0) + +/* These values must meet several constraints. They must not be valid + register values; since we have a limit of 255 registers (because + we use only one byte in the pattern for the register number), we can + use numbers larger than 255. They must differ by 1, because of + NUM_FAILURE_ITEMS above. And the value for the lowest register must + be larger than the value for the highest register, so we do not try + to actually save any registers when none are active. */ +#define NO_HIGHEST_ACTIVE_REG (1 << BYTEWIDTH) +#define NO_LOWEST_ACTIVE_REG (NO_HIGHEST_ACTIVE_REG + 1) + +/* Matching routines. 
*/ + +/* re_match is like re_match_2 except it takes only a single string. */ + +int re_match(bufp, string, size, pos, regs) +struct re_pattern_buffer *bufp; +const char *string; +int size, pos; +struct re_registers *regs; +{ + return re_match_2(bufp, NULL, 0, string, size, pos, regs, size); +} + +/* re_match_2 matches the compiled pattern in BUFP against the + (virtual) concatenation of STRING1 and STRING2 (of length SIZE1 + and SIZE2, respectively). We start matching at POS, and stop + matching at STOP. + + If REGS is non-null and the `no_sub' field of BUFP is zero, we + store offsets for the substring each group matched in REGS. See the + documentation for exactly how many groups we fill. + + We return -1 if no match, -2 if an internal error (such as the + failure stack overflowing). Otherwise, we return the length of the + matched substring. */ + +int re_match_2(bufp, string1, size1, string2, size2, pos, regs, stop) +struct re_pattern_buffer *bufp; +const char *string1, *string2; +int size1, size2; +int pos; +struct re_registers *regs; +int stop; +{ + /* General temporaries. */ + int mcnt; + unsigned char *p1; + + /* Just past the end of the corresponding string. */ + const char *end1, *end2; + + /* Pointers into string1 and string2, just past the last characters in + each to consider matching. */ + const char *end_match_1, *end_match_2; + + /* Where we are in the data, and the end of the current string. */ + const char *d, *dend; + + /* Where we are in the pattern, and the end of the pattern. */ + unsigned char *p = bufp->buffer; + register unsigned char *pend = p + bufp->used; + + /* We use this to map every character in the string. */ + char *translate = bufp->translate; + + /* Failure point stack. Each place that can handle a failure further + down the line pushes a failure point on this stack. 
It consists of + regstart, regend, and reg_info for all registers corresponding to + the subexpressions we're currently inside, plus the number of such + registers, and, finally, two char *'s. The first char * is where + to resume scanning the pattern; the second one is where to resume + scanning the strings. If the latter is zero, the failure point is + a ``dummy''; if a failure happens and the failure point is a dummy, + it gets discarded and the next one is tried. */ + fail_stack_type fail_stack; + + /* We fill all the registers internally, independent of what we + return, for use in backreferences. The number here includes + an element for register zero. */ + size_t num_regs = bufp->re_nsub + 1; + + /* The currently active registers. */ + active_reg_t lowest_active_reg = NO_LOWEST_ACTIVE_REG; + active_reg_t highest_active_reg = NO_HIGHEST_ACTIVE_REG; + + /* Information on the contents of registers. These are pointers into + the input strings; they record just what was matched (on this + attempt) by a subexpression part of the pattern, that is, the + regnum-th regstart pointer points to where in the string we began + matching and the regnum-th regend points to right after where we + stopped matching the regnum-th subexpression. (The zeroth register + keeps track of what the whole pattern matches.) */ + const char **regstart = 0, **regend = 0; + + /* If a group that's operated upon by a repetition operator fails to + match anything, then the register for its start will need to be + restored because it will have been set to wherever in the string we + are when we last see its open-group operator. Similarly for a + register's end. */ + const char **old_regstart = 0, **old_regend = 0; + + /* The is_active field of reg_info helps us keep track of which (possibly + nested) subexpressions we are currently in. 
The matched_something + field of reg_info[reg_num] helps us tell whether or not we have + matched any of the pattern so far this time through the reg_num-th + subexpression. These two fields get reset each time through any + loop their register is in. */ + register_info_type *reg_info = 0; + + /* The following record the register info as found in the above + variables when we find a match better than any we've seen before. + This happens as we backtrack through the failure points, which in + turn happens only if we have not yet matched the entire string. */ + unsigned best_regs_set = false; + const char **best_regstart = 0, **best_regend = 0; + + /* Logically, this is `best_regend[0]'. But we don't want to have to + allocate space for that if we're not allocating space for anything + else (see below). Also, we never need info about register 0 for + any of the other register vectors, and it seems rather a kludge to + treat `best_regend' differently than the rest. So we keep track of + the end of the best match so far in a separate variable. We + initialize this to NULL so that when we backtrack the first time + and need to test it, it's not garbage. */ + const char *match_end = NULL; + + /* Used when we pop values we don't care about. */ + const char **reg_dummy = 0; + register_info_type *reg_info_dummy = 0; + + DEBUG_PRINT1("\n\nEntering re_match_2.\n"); + + INIT_FAIL_STACK(); + + /* Do not bother to initialize all the register variables if there are + no groups in the pattern, as it takes a fair amount of time. If + there are groups, we include space for register 0 (the whole + pattern), even though we never use it, since it simplifies the + array indexing. We should fix this. 
*/ + if (bufp->re_nsub) { + regstart = REGEX_TALLOC(num_regs, const char *); + regend = REGEX_TALLOC(num_regs, const char *); + old_regstart = REGEX_TALLOC(num_regs, const char *); + old_regend = REGEX_TALLOC(num_regs, const char *); + best_regstart = REGEX_TALLOC(num_regs, const char *); + best_regend = REGEX_TALLOC(num_regs, const char *); + reg_info = REGEX_TALLOC(num_regs, register_info_type); + reg_dummy = REGEX_TALLOC(num_regs, const char *); + reg_info_dummy = + REGEX_TALLOC(num_regs, register_info_type); + + if (! + (regstart && regend && old_regstart && old_regend + && reg_info && best_regstart && best_regend + && reg_dummy && reg_info_dummy)) { + FREE_VARIABLES(); + return -2; + } + } + + /* The starting position is bogus. */ + if (pos < 0 || pos > size1 + size2) { + FREE_VARIABLES(); + return -1; + } + + /* Initialize subexpression text positions to -1 to mark ones that no + start_memory/stop_memory has been seen for. Also initialize the + register information struct. */ + for (mcnt = 1; mcnt < num_regs; mcnt++) { + regstart[mcnt] = regend[mcnt] + = old_regstart[mcnt] = old_regend[mcnt] = + REG_UNSET_VALUE; + + REG_MATCH_NULL_STRING_P(reg_info[mcnt]) = + MATCH_NULL_UNSET_VALUE; + IS_ACTIVE(reg_info[mcnt]) = 0; + MATCHED_SOMETHING(reg_info[mcnt]) = 0; + EVER_MATCHED_SOMETHING(reg_info[mcnt]) = 0; + } + + /* We move `string1' into `string2' if the latter's empty -- but not if + `string1' is null. */ + if (size2 == 0 && string1 != NULL) { + string2 = string1; + size2 = size1; + string1 = 0; + size1 = 0; + } + end1 = string1 + size1; + end2 = string2 + size2; + + /* Compute where to stop matching, within the two strings. */ + if (stop <= size1) { + end_match_1 = string1 + stop; + end_match_2 = string2; + } else { + end_match_1 = end1; + end_match_2 = string2 + stop - size1; + } + + /* `p' scans through the pattern as `d' scans through the data. + `dend' is the end of the input string that `d' points within. 
`d' + is advanced into the following input string whenever necessary, but + this happens before fetching; therefore, at the beginning of the + loop, `d' can be pointing at the end of a string, but it cannot + equal `string2'. */ + if (size1 > 0 && pos <= size1) { + d = string1 + pos; + dend = end_match_1; + } else { + d = string2 + pos - size1; + dend = end_match_2; + } + + DEBUG_PRINT1("The compiled pattern is: "); + DEBUG_PRINT_COMPILED_PATTERN(bufp, p, pend); + DEBUG_PRINT1("The string to match is: `"); + DEBUG_PRINT_DOUBLE_STRING(d, string1, size1, string2, size2); + DEBUG_PRINT1("'\n"); + + /* This loops over pattern commands. It exits by returning from the + function if the match is complete, or it drops through if the match + fails at this starting point in the input data. */ + for (;;) { + DEBUG_PRINT2("\n0x%x: ", p); + + if (p == pend) { /* End of pattern means we might have succeeded. */ + DEBUG_PRINT1("end of pattern ... "); + + /* If we haven't matched the entire string, and we want the + longest match, try backtracking. */ + if (d != end_match_2) { + DEBUG_PRINT1("backtracking.\n"); + + if (!FAIL_STACK_EMPTY()) { /* More failure points to try. */ + boolean same_str_p = + (FIRST_STRING_P(match_end) + == MATCHING_IN_FIRST_STRING); + + /* If exceeds best match so far, save it. */ + if (!best_regs_set + || (same_str_p + && d > match_end) + || (!same_str_p + && + !MATCHING_IN_FIRST_STRING)) + { + best_regs_set = true; + match_end = d; + + DEBUG_PRINT1 + ("\nSAVING match as best so far.\n"); + + for (mcnt = 1; + mcnt < num_regs; + mcnt++) { + best_regstart[mcnt] + = + regstart[mcnt]; + best_regend[mcnt] = + regend[mcnt]; + } + } + goto fail; + } + + /* If no failure points, don't restore garbage. */ + else if (best_regs_set) { + restore_best_regs: + /* Restore best match. It may happen that `dend == + end_match_1' while the restored d is in string2. 
+ For example, the pattern `x.*y.*z' against the + strings `x-' and `y-z-', if the two strings are + not consecutive in memory. */ + DEBUG_PRINT1 + ("Restoring best registers.\n"); + + d = match_end; + dend = ((d >= string1 && d <= end1) + ? end_match_1 : + end_match_2); + + for (mcnt = 1; mcnt < num_regs; + mcnt++) { + regstart[mcnt] = + best_regstart[mcnt]; + regend[mcnt] = + best_regend[mcnt]; + } + } + } + /* d != end_match_2 */ + DEBUG_PRINT1("Accepting match.\n"); + + /* If caller wants register contents data back, do it. */ + if (regs && !bufp->no_sub) { + /* Have the register data arrays been allocated? */ + if (bufp->regs_allocated == REGS_UNALLOCATED) { /* No. So allocate them with malloc. We need one + extra element beyond `num_regs' for the `-1' marker + GNU code uses. */ + regs->num_regs = + MAX(RE_NREGS, num_regs + 1); + regs->start = + TALLOC(regs->num_regs, + regoff_t); + regs->end = + TALLOC(regs->num_regs, + regoff_t); + if (regs->start == NULL + || regs->end == NULL) + return -2; + bufp->regs_allocated = + REGS_REALLOCATE; + } else if (bufp->regs_allocated == REGS_REALLOCATE) { /* Yes. If we need more elements than were already + allocated, reallocate them. If we need fewer, just + leave it alone. */ + if (regs->num_regs < num_regs + 1) { + regs->num_regs = + num_regs + 1; + RETALLOC(regs->start, + regs->num_regs, + regoff_t); + RETALLOC(regs->end, + regs->num_regs, + regoff_t); + if (regs->start == NULL + || regs->end == NULL) + return -2; + } + } else { + /* These braces fend off a "empty body in an else-statement" + warning under GCC when assert expands to nothing. */ + assert(bufp->regs_allocated == + REGS_FIXED); + } + + /* Convert the pointer data in `regstart' and `regend' to + indices. Register zero has to be set differently, + since we haven't kept track of any info for it. */ + if (regs->num_regs > 0) { + regs->start[0] = pos; + regs->end[0] = + (MATCHING_IN_FIRST_STRING ? 
d - + string1 : d - string2 + + size1); + } + + /* Go through the first `min (num_regs, regs->num_regs)' + registers, since that is all we initialized. */ + for (mcnt = 1; + mcnt < MIN(num_regs, regs->num_regs); + mcnt++) { + if (REG_UNSET(regstart[mcnt]) + || REG_UNSET(regend[mcnt])) + regs->start[mcnt] = + regs->end[mcnt] = -1; + else { + regs->start[mcnt] = + POINTER_TO_OFFSET + (regstart[mcnt]); + regs->end[mcnt] = + POINTER_TO_OFFSET + (regend[mcnt]); + } + } + + /* If the regs structure we return has more elements than + were in the pattern, set the extra elements to -1. If + we (re)allocated the registers, this is the case, + because we always allocate enough to have at least one + -1 at the end. */ + for (mcnt = num_regs; + mcnt < regs->num_regs; mcnt++) + regs->start[mcnt] = + regs->end[mcnt] = -1; + } + /* regs && !bufp->no_sub */ + FREE_VARIABLES(); + DEBUG_PRINT4 + ("%u failure points pushed, %u popped (%u remain).\n", + nfailure_points_pushed, + nfailure_points_popped, + nfailure_points_pushed - + nfailure_points_popped); + DEBUG_PRINT2("%u registers pushed.\n", + num_regs_pushed); + + mcnt = d - pos - (MATCHING_IN_FIRST_STRING + ? string1 : string2 - size1); + + DEBUG_PRINT2("Returning %d from re_match_2.\n", + mcnt); + + return mcnt; + } + + /* Otherwise match next pattern command. */ + switch ((re_opcode_t) * p++) { + /* Ignore these. Used to ignore the n of succeed_n's which + currently have n == 0. */ + case no_op: + DEBUG_PRINT1("EXECUTING no_op.\n"); + break; + + + /* Match the next n pattern characters exactly. The following + byte in the pattern defines n, and the n bytes after that + are the characters to match. */ + case exactn: + mcnt = *p++; + DEBUG_PRINT2("EXECUTING exactn %d.\n", mcnt); + + /* This is written out as an if-else so we don't waste time + testing `translate' inside the loop. 
*/ + if (translate) { + do { + PREFETCH(); + if (translate[(unsigned char) *d++] + != (char) *p++) + goto fail; + } + while (--mcnt); + } else { + do { + PREFETCH(); + if (*d++ != (char) *p++) + goto fail; + } + while (--mcnt); + } + SET_REGS_MATCHED(); + break; + + + /* Match any character except possibly a newline or a null. */ + case anychar: + DEBUG_PRINT1("EXECUTING anychar.\n"); + + PREFETCH(); + + if ((!(bufp->syntax & RE_DOT_NEWLINE) + && TRANSLATE(*d) == '\n') + || (bufp->syntax & RE_DOT_NOT_NULL + && TRANSLATE(*d) == '\000')) + goto fail; + + SET_REGS_MATCHED(); + DEBUG_PRINT2(" Matched `%d'.\n", *d); + d++; + break; + + + case charset: + case charset_not: + { + register unsigned char c; + boolean not = + (re_opcode_t) * (p - 1) == charset_not; + + DEBUG_PRINT2("EXECUTING charset%s.\n", + not ? "_not" : ""); + + PREFETCH(); + c = TRANSLATE(*d); /* The character to match. */ + + /* Cast to `unsigned' instead of `unsigned char' in case the + bit list is a full 32 bytes long. */ + if (c < (unsigned) (*p * BYTEWIDTH) + && p[1 + + c / BYTEWIDTH] & (1 << (c % + BYTEWIDTH))) + not = !not; + + p += 1 + *p; + + if (!not) + goto fail; + + SET_REGS_MATCHED(); + d++; + break; + } + + + /* The beginning of a group is represented by start_memory. + The arguments are the register number in the next byte, and the + number of groups inner to this one in the next. The text + matched within the group is recorded (in the internal + registers data structure) under the register number. */ + case start_memory: + DEBUG_PRINT3("EXECUTING start_memory %d (%d):\n", + *p, p[1]); + + /* Find out if this group can match the empty string. */ + p1 = p; /* To send to group_match_null_string_p. 
*/ + + if (REG_MATCH_NULL_STRING_P(reg_info[*p]) == + MATCH_NULL_UNSET_VALUE) + REG_MATCH_NULL_STRING_P(reg_info[*p]) + = group_match_null_string_p(&p1, pend, + reg_info); + + /* Save the position in the string where we were the last time + we were at this open-group operator in case the group is + operated upon by a repetition operator, e.g., with `(a*)*b' + against `ab'; then we want to ignore where we are now in + the string in case this attempt to match fails. */ + old_regstart[*p] = + REG_MATCH_NULL_STRING_P(reg_info[*p]) + ? REG_UNSET(regstart[*p]) ? d : regstart[*p] + : regstart[*p]; + DEBUG_PRINT2(" old_regstart: %d\n", + POINTER_TO_OFFSET(old_regstart[*p])); + + regstart[*p] = d; + DEBUG_PRINT2(" regstart: %d\n", + POINTER_TO_OFFSET(regstart[*p])); + + IS_ACTIVE(reg_info[*p]) = 1; + MATCHED_SOMETHING(reg_info[*p]) = 0; + + /* This is the new highest active register. */ + highest_active_reg = *p; + + /* If nothing was active before, this is the new lowest active + register. */ + if (lowest_active_reg == NO_LOWEST_ACTIVE_REG) + lowest_active_reg = *p; + + /* Move past the register number and inner group count. */ + p += 2; + break; + + + /* The stop_memory opcode represents the end of a group. Its + arguments are the same as start_memory's: the register + number, and the number of inner groups. */ + case stop_memory: + DEBUG_PRINT3("EXECUTING stop_memory %d (%d):\n", + *p, p[1]); + + /* We need to save the string position the last time we were at + this close-group operator in case the group is operated + upon by a repetition operator, e.g., with `((a*)*(b*)*)*' + against `aba'; then we want to ignore where we are now in + the string in case this attempt to match fails. */ + old_regend[*p] = + REG_MATCH_NULL_STRING_P(reg_info[*p]) + ? REG_UNSET(regend[*p]) ? 
d : regend[*p] + : regend[*p]; + DEBUG_PRINT2(" old_regend: %d\n", + POINTER_TO_OFFSET(old_regend[*p])); + + regend[*p] = d; + DEBUG_PRINT2(" regend: %d\n", + POINTER_TO_OFFSET(regend[*p])); + + /* This register isn't active anymore. */ + IS_ACTIVE(reg_info[*p]) = 0; + + /* If this was the only register active, nothing is active + anymore. */ + if (lowest_active_reg == highest_active_reg) { + lowest_active_reg = NO_LOWEST_ACTIVE_REG; + highest_active_reg = NO_HIGHEST_ACTIVE_REG; + } else { /* We must scan for the new highest active register, since + it isn't necessarily one less than now: consider + (a(b)c(d(e)f)g). When group 3 ends, after the f), the + new highest active register is 1. */ + unsigned char r = *p - 1; + while (r > 0 && !IS_ACTIVE(reg_info[r])) + r--; + + /* If we end up at register zero, that means that we saved + the registers as the result of an `on_failure_jump', not + a `start_memory', and we jumped to past the innermost + `stop_memory'. For example, in ((.)*) we save + registers 1 and 2 as a result of the *, but when we pop + back to the second ), we are at the stop_memory 1. + Thus, nothing is active. */ + if (r == 0) { + lowest_active_reg = + NO_LOWEST_ACTIVE_REG; + highest_active_reg = + NO_HIGHEST_ACTIVE_REG; + } else + highest_active_reg = r; + } + + /* If just failed to match something this time around with a + group that's operated on by a repetition operator, try to + force exit from the ``loop'', and restore the register + information for this group that we had before trying this + last match. 
*/ + if ((!MATCHED_SOMETHING(reg_info[*p]) + || (re_opcode_t) p[-3] == start_memory) + && (p + 2) < pend) { + boolean is_a_jump_n = false; + + p1 = p + 2; + mcnt = 0; + switch ((re_opcode_t) * p1++) { + case jump_n: + is_a_jump_n = true; + case pop_failure_jump: + case maybe_pop_jump: + case jump: + case dummy_failure_jump: + EXTRACT_NUMBER_AND_INCR(mcnt, p1); + if (is_a_jump_n) + p1 += 2; + break; + + default: + /* do nothing */ ; + } + p1 += mcnt; + + /* If the next operation is a jump backwards in the pattern + to an on_failure_jump right before the start_memory + corresponding to this stop_memory, exit from the loop + by forcing a failure after pushing on the stack the + on_failure_jump's jump in the pattern, and d. */ + if (mcnt < 0 + && (re_opcode_t) * p1 == + on_failure_jump + && (re_opcode_t) p1[3] == start_memory + && p1[4] == *p) { + /* If this group ever matched anything, then restore + what its registers were before trying this last + failed match, e.g., with `(a*)*b' against `ab' for + regstart[1], and, e.g., with `((a*)*(b*)*)*' + against `aba' for regend[3]. + + Also restore the registers for inner groups for, + e.g., `((a*)(b*))*' against `aba' (register 3 would + otherwise get trashed). */ + + if (EVER_MATCHED_SOMETHING + (reg_info[*p])) { + unsigned r; + + EVER_MATCHED_SOMETHING + (reg_info[*p]) = 0; + + /* Restore this and inner groups' (if any) registers. */ + for (r = *p; + r < *p + *(p + 1); + r++) { + regstart[r] = + old_regstart + [r]; + + /* xx why this test? */ + if ((s_reg_t) + old_regend[r] + >= + (s_reg_t) + regstart[r]) + regend[r] = + old_regend + [r]; + } + } + p1++; + EXTRACT_NUMBER_AND_INCR(mcnt, p1); + PUSH_FAILURE_POINT(p1 + mcnt, d, + -2); + PUSH_FAILURE_POINT2(p1 + mcnt, d, + -2); + + goto fail; + } + } + + /* Move past the register number and the inner group count. */ + p += 2; + break; + + + /* \<digit> has been turned into a `duplicate' command which is + followed by the numeric value of <digit> as the register number. 
*/ + case duplicate: + { + register const char *d2, *dend2; + int regno = *p++; /* Get which register to match against. */ + DEBUG_PRINT2("EXECUTING duplicate %d.\n", + regno); + + /* Can't back reference a group which we've never matched. */ + if (REG_UNSET(regstart[regno]) + || REG_UNSET(regend[regno])) + goto fail; + + /* Where in input to try to start matching. */ + d2 = regstart[regno]; + + /* Where to stop matching; if both the place to start and + the place to stop matching are in the same string, then + set to the place to stop, otherwise, for now have to use + the end of the first string. */ + + dend2 = ((FIRST_STRING_P(regstart[regno]) + == FIRST_STRING_P(regend[regno])) + ? regend[regno] : end_match_1); + for (;;) { + /* If necessary, advance to next segment in register + contents. */ + while (d2 == dend2) { + if (dend2 == end_match_2) + break; + if (dend2 == regend[regno]) + break; + + /* End of string1 => advance to string2. */ + d2 = string2; + dend2 = regend[regno]; + } + /* At end of register contents => success */ + if (d2 == dend2) + break; + + /* If necessary, advance to next segment in data. */ + PREFETCH(); + + /* How many characters left in this segment to match. */ + mcnt = dend - d; + + /* Want how many consecutive characters we can match in + one shot, so, if necessary, adjust the count. */ + if (mcnt > dend2 - d2) + mcnt = dend2 - d2; + + /* Compare that many; failure if mismatch, else move + past them. */ + if (translate + ? bcmp_translate(d, d2, mcnt, + translate) + : bcmp(d, d2, mcnt)) + goto fail; + d += mcnt, d2 += mcnt; + } + } + break; + + + /* begline matches the empty string at the beginning of the string + (unless `not_bol' is set in `bufp'), and, if + `newline_anchor' is set, after newlines. */ + case begline: + DEBUG_PRINT1("EXECUTING begline.\n"); + + if (AT_STRINGS_BEG(d)) { + if (!bufp->not_bol) + break; + } else if (d[-1] == '\n' && bufp->newline_anchor) { + break; + } + /* In all other cases, we fail. 
*/ + goto fail; + + + /* endline is the dual of begline. */ + case endline: + DEBUG_PRINT1("EXECUTING endline.\n"); + + if (AT_STRINGS_END(d)) { + if (!bufp->not_eol) + break; + } + + /* We have to ``prefetch'' the next character. */ + else if ((d == end1 ? *string2 : *d) == '\n' + && bufp->newline_anchor) { + break; + } + goto fail; + + + /* Match at the very beginning of the data. */ + case begbuf: + DEBUG_PRINT1("EXECUTING begbuf.\n"); + if (AT_STRINGS_BEG(d)) + break; + goto fail; + + + /* Match at the very end of the data. */ + case endbuf: + DEBUG_PRINT1("EXECUTING endbuf.\n"); + if (AT_STRINGS_END(d)) + break; + goto fail; + + + /* on_failure_keep_string_jump is used to optimize `.*\n'. It + pushes NULL as the value for the string on the stack. Then + `pop_failure_point' will keep the current value for the + string, instead of restoring it. To see why, consider + matching `foo\nbar' against `.*\n'. The .* matches the foo; + then the . fails against the \n. But the next thing we want + to do is match the \n against the \n; if we restored the + string value, we would be back at the foo. + + Because this is used only in specific cases, we don't need to + check all the things that `on_failure_jump' does, to make + sure the right things get saved on the stack. Hence we don't + share its code. The only reason to push anything on the + stack at all is that otherwise we would have to change + `anychar's code to do something besides goto fail in this + case; that seems worse than this. */ + case on_failure_keep_string_jump: + DEBUG_PRINT1 + ("EXECUTING on_failure_keep_string_jump"); + + EXTRACT_NUMBER_AND_INCR(mcnt, p); + DEBUG_PRINT3(" %d (to 0x%x):\n", mcnt, p + mcnt); + + PUSH_FAILURE_POINT(p + mcnt, NULL, -2); + PUSH_FAILURE_POINT2(p + mcnt, NULL, -2); + break; + + + /* Uses of on_failure_jump: + + Each alternative starts with an on_failure_jump that points + to the beginning of the next alternative. 
Each alternative + except the last ends with a jump that in effect jumps past + the rest of the alternatives. (They really jump to the + ending jump of the following alternative, because tensioning + these jumps is a hassle.) + + Repeats start with an on_failure_jump that points past both + the repetition text and either the following jump or + pop_failure_jump back to this on_failure_jump. */ + case on_failure_jump: + on_failure: + DEBUG_PRINT1("EXECUTING on_failure_jump"); + + EXTRACT_NUMBER_AND_INCR(mcnt, p); + DEBUG_PRINT3(" %d (to 0x%x)", mcnt, p + mcnt); + + /* If this on_failure_jump comes right before a group (i.e., + the original * applied to a group), save the information + for that group and all inner ones, so that if we fail back + to this point, the group's information will be correct. + For example, in \(a*\)*\1, we need the preceding group, + and in \(\(a*\)b*\)\2, we need the inner group. */ + + /* We can't use `p' to check ahead because we push + a failure point to `p + mcnt' after we do this. */ + p1 = p; + + /* We need to skip no_op's before we look for the + start_memory in case this on_failure_jump is happening as + the result of a completed succeed_n, as in \(a\)\{1,3\}b\1 + against aba. */ + while (p1 < pend && (re_opcode_t) * p1 == no_op) + p1++; + + if (p1 < pend + && (re_opcode_t) * p1 == start_memory) { + /* We have a new highest active register now. This will + get reset at the start_memory we are about to get to, + but we will have saved all the registers relevant to + this repetition op, as described above. */ + highest_active_reg = *(p1 + 1) + *(p1 + 2); + if (lowest_active_reg == + NO_LOWEST_ACTIVE_REG) + lowest_active_reg = *(p1 + 1); + } + + DEBUG_PRINT1(":\n"); + PUSH_FAILURE_POINT(p + mcnt, d, -2); + PUSH_FAILURE_POINT2(p + mcnt, d, -2); + break; + + + /* A smart repeat ends with `maybe_pop_jump'. + We change it to either `pop_failure_jump' or `jump'. 
*/ + case maybe_pop_jump: + EXTRACT_NUMBER_AND_INCR(mcnt, p); + DEBUG_PRINT2("EXECUTING maybe_pop_jump %d.\n", + mcnt); + { + register unsigned char *p2 = p; + + /* Compare the beginning of the repeat with what in the + pattern follows its end. If we can establish that there + is nothing that they would both match, i.e., that we + would have to backtrack because of (as in, e.g., `a*a') + then we can change to pop_failure_jump, because we'll + never have to backtrack. + + This is not true in the case of alternatives: in + `(a|ab)*' we do need to backtrack to the `ab' alternative + (e.g., if the string was `ab'). But instead of trying to + detect that here, the alternative has put on a dummy + failure point which is what we will end up popping. */ + + /* Skip over open/close-group commands. */ + while (p2 + 2 < pend + && ((re_opcode_t) * p2 == + stop_memory + || (re_opcode_t) * p2 == + start_memory)) + p2 += 3; /* Skip over args, too. */ + + /* If we're at the end of the pattern, we can change. */ + if (p2 == pend) { + /* Consider what happens when matching ":\(.*\)" + against ":/". I don't really understand this code + yet. */ + p[-3] = + (unsigned char) + pop_failure_jump; + DEBUG_PRINT1 + (" End of pattern: change to `pop_failure_jump'.\n"); + } + + else if ((re_opcode_t) * p2 == exactn + || (bufp->newline_anchor + && (re_opcode_t) * p2 == + endline)) { + register unsigned char c = + *p2 == + (unsigned char) endline ? '\n' + : p2[2]; + p1 = p + mcnt; + + /* p1[0] ... p1[2] are the `on_failure_jump' corresponding + to the `maybe_finalize_jump' of this case. Examine what + follows. 
*/ + if ((re_opcode_t) p1[3] == exactn + && p1[5] != c) { + p[-3] = + (unsigned char) + pop_failure_jump; + DEBUG_PRINT3 + (" %c != %c => pop_failure_jump.\n", + c, p1[5]); + } + + else if ((re_opcode_t) p1[3] == + charset + || (re_opcode_t) p1[3] == + charset_not) { + int not = + (re_opcode_t) p1[3] == + charset_not; + + if (c < + (unsigned char) (p1[4] + * + BYTEWIDTH) + && p1[5 + + c / + BYTEWIDTH] & (1 + << + (c + % + BYTEWIDTH))) + not = !not; + + /* `not' is equal to 1 if c would match, which means + that we can't change to pop_failure_jump. */ + if (!not) { + p[-3] = + (unsigned char) + pop_failure_jump; + DEBUG_PRINT1 + (" No match => pop_failure_jump.\n"); + } + } + } + } + p -= 2; /* Point at relative address again. */ + if ((re_opcode_t) p[-1] != pop_failure_jump) { + p[-1] = (unsigned char) jump; + DEBUG_PRINT1(" Match => jump.\n"); + goto unconditional_jump; + } + /* Note fall through. */ + + + /* The end of a simple repeat has a pop_failure_jump back to + its matching on_failure_jump, where the latter will push a + failure point. The pop_failure_jump takes off failure + points put on by this pop_failure_jump's matching + on_failure_jump; we got through the pattern to here from the + matching on_failure_jump, so didn't fail. */ + case pop_failure_jump: + { + /* We need to pass separate storage for the lowest and + highest registers, even though we don't care about the + actual values. Otherwise, we will restore only one + register from the stack, since lowest will == highest in + `pop_failure_point'. */ + active_reg_t dummy_low_reg, dummy_high_reg; + unsigned char *pdummy; + const char *sdummy; + + DEBUG_PRINT1 + ("EXECUTING pop_failure_jump.\n"); + POP_FAILURE_POINT(sdummy, pdummy, + dummy_low_reg, + dummy_high_reg, + reg_dummy, reg_dummy, + reg_info_dummy); + } + /* Note fall through. */ + + + /* Unconditionally jump (without popping any failure points). 
*/ + case jump: + unconditional_jump: + EXTRACT_NUMBER_AND_INCR(mcnt, p); /* Get the amount to jump. */ + DEBUG_PRINT2("EXECUTING jump %d ", mcnt); + p += mcnt; /* Do the jump. */ + DEBUG_PRINT2("(to 0x%x).\n", p); + break; + + + /* We need this opcode so we can detect where alternatives end + in `group_match_null_string_p' et al. */ + case jump_past_alt: + DEBUG_PRINT1("EXECUTING jump_past_alt.\n"); + goto unconditional_jump; + + + /* Normally, the on_failure_jump pushes a failure point, which + then gets popped at pop_failure_jump. We will end up at + pop_failure_jump, also, and with a pattern of, say, `a+', we + are skipping over the on_failure_jump, so we have to push + something meaningless for pop_failure_jump to pop. */ + case dummy_failure_jump: + DEBUG_PRINT1("EXECUTING dummy_failure_jump.\n"); + /* It doesn't matter what we push for the string here. What + the code at `fail' tests is the value for the pattern. */ + PUSH_FAILURE_POINT(0, 0, -2); + PUSH_FAILURE_POINT2(0, 0, -2); + goto unconditional_jump; + + + /* At the end of an alternative, we need to push a dummy failure + point in case we are followed by a `pop_failure_jump', because + we don't want the failure point for the alternative to be + popped. For example, matching `(a|ab)*' against `aab' + requires that we match the `ab' alternative. */ + case push_dummy_failure: + DEBUG_PRINT1("EXECUTING push_dummy_failure.\n"); + /* See comments just above at `dummy_failure_jump' about the + two zeroes. */ + PUSH_FAILURE_POINT(0, 0, -2); + PUSH_FAILURE_POINT2(0, 0, -2); + break; + + /* Have to succeed matching what follows at least n times. + After that, handle like `on_failure_jump'. */ + case succeed_n: + EXTRACT_NUMBER(mcnt, p + 2); + DEBUG_PRINT2("EXECUTING succeed_n %d.\n", mcnt); + + assert(mcnt >= 0); + /* Originally, this is how many times we HAVE to succeed. 
*/ + if (mcnt > 0) { + mcnt--; + p += 2; + STORE_NUMBER_AND_INCR(p, mcnt); + DEBUG_PRINT3(" Setting 0x%x to %d.\n", p, + mcnt); + } else if (mcnt == 0) { + DEBUG_PRINT2 + (" Setting two bytes from 0x%x to no_op.\n", + p + 2); + p[2] = (unsigned char) no_op; + p[3] = (unsigned char) no_op; + goto on_failure; + } + break; + + case jump_n: + EXTRACT_NUMBER(mcnt, p + 2); + DEBUG_PRINT2("EXECUTING jump_n %d.\n", mcnt); + + /* Originally, this is how many times we CAN jump. */ + if (mcnt) { + mcnt--; + STORE_NUMBER(p + 2, mcnt); + goto unconditional_jump; + } + /* If don't have to jump any more, skip over the rest of command. */ + else + p += 4; + break; + + case set_number_at: + { + DEBUG_PRINT1("EXECUTING set_number_at.\n"); + + EXTRACT_NUMBER_AND_INCR(mcnt, p); + p1 = p + mcnt; + EXTRACT_NUMBER_AND_INCR(mcnt, p); + DEBUG_PRINT3(" Setting 0x%x to %d.\n", p1, + mcnt); + STORE_NUMBER(p1, mcnt); + break; + } + + case wordbound: + DEBUG_PRINT1("EXECUTING wordbound.\n"); + if (AT_WORD_BOUNDARY(d)) + break; + goto fail; + + case notwordbound: + DEBUG_PRINT1("EXECUTING notwordbound.\n"); + if (AT_WORD_BOUNDARY(d)) + goto fail; + break; + + case wordbeg: + DEBUG_PRINT1("EXECUTING wordbeg.\n"); + if (WORDCHAR_P(d) + && (AT_STRINGS_BEG(d) || !WORDCHAR_P(d - 1))) + break; + goto fail; + + case wordend: + DEBUG_PRINT1("EXECUTING wordend.\n"); + if (!AT_STRINGS_BEG(d) && WORDCHAR_P(d - 1) + && (!WORDCHAR_P(d) || AT_STRINGS_END(d))) + break; + goto fail; + + case wordchar: + DEBUG_PRINT1("EXECUTING non-Emacs wordchar.\n"); + PREFETCH(); + if (!WORDCHAR_P(d)) + goto fail; + SET_REGS_MATCHED(); + d++; + break; + + case notwordchar: + DEBUG_PRINT1("EXECUTING non-Emacs notwordchar.\n"); + PREFETCH(); + if (WORDCHAR_P(d)) + goto fail; + SET_REGS_MATCHED(); + d++; + break; + + default: + abort(); + } + continue; /* Successfully executed one pattern command; keep going. */ + + + /* We goto here if a matching operation fails. 
*/ + fail: + if (!FAIL_STACK_EMPTY()) { /* A restart point is known. Restore to that state. */ + DEBUG_PRINT1("\nFAIL:\n"); + POP_FAILURE_POINT(d, p, + lowest_active_reg, + highest_active_reg, regstart, + regend, reg_info); + + /* If this failure point is a dummy, try the next one. */ + if (!p) + goto fail; + + /* If we failed to the end of the pattern, don't examine *p. */ + assert(p <= pend); + if (p < pend) { + boolean is_a_jump_n = false; + + /* If failed to a backwards jump that's part of a repetition + loop, need to pop this failure point and use the next one. */ + switch ((re_opcode_t) * p) { + case jump_n: + is_a_jump_n = true; + case maybe_pop_jump: + case pop_failure_jump: + case jump: + p1 = p + 1; + EXTRACT_NUMBER_AND_INCR(mcnt, p1); + p1 += mcnt; + + if ((is_a_jump_n + && (re_opcode_t) * p1 == + succeed_n) + || (!is_a_jump_n + && (re_opcode_t) * p1 == + on_failure_jump)) + goto fail; + break; + default: + /* do nothing */ ; + } + } + + if (d >= string1 && d <= end1) + dend = end_match_1; + } else + break; /* Matching at this starting point really fails. */ + } /* for (;;) */ + + if (best_regs_set) + goto restore_best_regs; + + FREE_VARIABLES(); + + return -1; /* Failure to match. */ +} /* re_match_2 */ + +/* Subroutine definitions for re_match_2. */ + + +/* We are passed P pointing to a register number after a start_memory. + + Return true if the pattern up to the corresponding stop_memory can + match the empty string, and false otherwise. + + If we find the matching stop_memory, sets P to point to one past its number. + Otherwise, sets P to an undefined byte less than or equal to END. + + We don't handle duplicates properly (yet). */ + +static boolean group_match_null_string_p(p, end, reg_info) +unsigned char **p, *end; +register_info_type *reg_info; +{ + int mcnt; + /* Point to after the args to the start_memory. 
*/ + unsigned char *p1 = *p + 2; + + while (p1 < end) { + /* Skip over opcodes that can match nothing, and return true or + false, as appropriate, when we get to one that can't, or to the + matching stop_memory. */ + + switch ((re_opcode_t) * p1) { + /* Could be either a loop or a series of alternatives. */ + case on_failure_jump: + p1++; + EXTRACT_NUMBER_AND_INCR(mcnt, p1); + + /* If the next operation is not a jump backwards in the + pattern. */ + + if (mcnt >= 0) { + /* Go through the on_failure_jumps of the alternatives, + seeing if any of the alternatives cannot match nothing. + The last alternative starts with only a jump, + whereas the rest start with on_failure_jump and end + with a jump, e.g., here is the pattern for `a|b|c': + + /on_failure_jump/0/6/exactn/1/a/jump_past_alt/0/6 + /on_failure_jump/0/6/exactn/1/b/jump_past_alt/0/3 + /exactn/1/c + + So, we have to first go through the first (n-1) + alternatives and then deal with the last one separately. */ + + + /* Deal with the first (n-1) alternatives, which start + with an on_failure_jump (see above) that jumps to right + past a jump_past_alt. */ + + while ((re_opcode_t) p1[mcnt - 3] == + jump_past_alt) { + /* `mcnt' holds how many bytes long the alternative + is, including the ending `jump_past_alt' and + its number. */ + + if (!alt_match_null_string_p + (p1, p1 + mcnt - 3, reg_info)) + return false; + + /* Move to right after this alternative, including the + jump_past_alt. */ + p1 += mcnt; + + /* Break if it's the beginning of an n-th alternative + that doesn't begin with an on_failure_jump. */ + if ((re_opcode_t) * p1 != + on_failure_jump) + break; + + /* Still have to check that it's not an n-th + alternative that starts with an on_failure_jump. */ + p1++; + EXTRACT_NUMBER_AND_INCR(mcnt, p1); + if ((re_opcode_t) p1[mcnt - 3] != + jump_past_alt) { + /* Get to the beginning of the n-th alternative. 
*/ + p1 -= 3; + break; + } + } + + /* Deal with the last alternative: go back and get number + of the `jump_past_alt' just before it. `mcnt' contains + the length of the alternative. */ + EXTRACT_NUMBER(mcnt, p1 - 2); + + if (!alt_match_null_string_p + (p1, p1 + mcnt, reg_info)) + return false; + + p1 += mcnt; /* Get past the n-th alternative. */ + } /* if mcnt > 0 */ + break; + + + case stop_memory: + assert(p1[1] == **p); + *p = p1 + 2; + return true; + + + default: + if (!common_op_match_null_string_p + (&p1, end, reg_info)) + return false; + } + } /* while p1 < end */ + + return false; +} /* group_match_null_string_p */ + + +/* Similar to group_match_null_string_p, but doesn't deal with alternatives: + It expects P to be the first byte of a single alternative and END one + byte past the last. The alternative can contain groups. */ + +static boolean alt_match_null_string_p(p, end, reg_info) +unsigned char *p, *end; +register_info_type *reg_info; +{ + int mcnt; + unsigned char *p1 = p; + + while (p1 < end) { + /* Skip over opcodes that can match nothing, and break when we get + to one that can't. */ + + switch ((re_opcode_t) * p1) { + /* It's a loop. */ + case on_failure_jump: + p1++; + EXTRACT_NUMBER_AND_INCR(mcnt, p1); + p1 += mcnt; + break; + + default: + if (!common_op_match_null_string_p + (&p1, end, reg_info)) + return false; + } + } /* while p1 < end */ + + return true; +} /* alt_match_null_string_p */ + + +/* Deals with the ops common to group_match_null_string_p and + alt_match_null_string_p. + + Sets P to one after the op and its arguments, if any. 
*/ + +static boolean common_op_match_null_string_p(p, end, reg_info) +unsigned char **p, *end; +register_info_type *reg_info; +{ + int mcnt; + boolean ret; + int reg_no; + unsigned char *p1 = *p; + + switch ((re_opcode_t) * p1++) { + case no_op: + case begline: + case endline: + case begbuf: + case endbuf: + case wordbeg: + case wordend: + case wordbound: + case notwordbound: + break; + + case start_memory: + reg_no = *p1; + assert(reg_no > 0 && reg_no <= MAX_REGNUM); + ret = group_match_null_string_p(&p1, end, reg_info); + + /* Have to set this here in case we're checking a group which + contains a group and a back reference to it. */ + + if (REG_MATCH_NULL_STRING_P(reg_info[reg_no]) == + MATCH_NULL_UNSET_VALUE) + REG_MATCH_NULL_STRING_P(reg_info[reg_no]) = ret; + + if (!ret) + return false; + break; + + /* If this is an optimized succeed_n for zero times, make the jump. */ + case jump: + EXTRACT_NUMBER_AND_INCR(mcnt, p1); + if (mcnt >= 0) + p1 += mcnt; + else + return false; + break; + + case succeed_n: + /* Get to the number of times to succeed. */ + p1 += 2; + EXTRACT_NUMBER_AND_INCR(mcnt, p1); + + if (mcnt == 0) { + p1 -= 4; + EXTRACT_NUMBER_AND_INCR(mcnt, p1); + p1 += mcnt; + } else + return false; + break; + + case duplicate: + if (!REG_MATCH_NULL_STRING_P(reg_info[*p1])) + return false; + break; + + case set_number_at: + p1 += 4; + + default: + /* All other opcodes mean we cannot match the empty string. */ + return false; + } + + *p = p1; + return true; +} /* common_op_match_null_string_p */ + + +/* Return zero if TRANSLATE[S1] and TRANSLATE[S2] are identical for LEN + bytes; nonzero otherwise. */ + +static int bcmp_translate(s1, s2, len, translate) +const char *s1, *s2; +register int len; +char *translate; +{ + register const unsigned char *p1 = (const unsigned char *) s1, + *p2 = (const unsigned char *) s2; + while (len) { + if (translate[*p1++] != translate[*p2++]) + return 1; + len--; + } + return 0; +} + +/* Entry points for GNU code. 
*/ + +/* re_compile_pattern is the GNU regular expression compiler: it + compiles PATTERN (of length SIZE) and puts the result in BUFP. + Returns 0 if the pattern was valid, otherwise an error string. + + Assumes the `allocated' (and perhaps `buffer') and `translate' fields + are set in BUFP on entry. + + We call regex_compile to do the actual compilation. */ + +const char *re_compile_pattern(pattern, length, bufp) +const char *pattern; +size_t length; +struct re_pattern_buffer *bufp; +{ + reg_errcode_t ret; + + /* GNU code is written to assume at least RE_NREGS registers will be set + (and at least one extra will be -1). */ + bufp->regs_allocated = REGS_UNALLOCATED; + + /* And GNU code determines whether or not to get register information + by passing null for the REGS argument to re_match, etc., not by + setting no_sub. */ + bufp->no_sub = 0; + + /* Match anchors at newline. */ + bufp->newline_anchor = 1; + + ret = regex_compile(pattern, length, re_syntax_options, bufp); + + return re_error_msg[(int) ret]; +} + +/* Entry points compatible with 4.2 BSD regex library. We don't define + them if this is an Emacs or POSIX compilation. */ + +/* POSIX.2 functions. Don't define these for Emacs. */ + +#if !NO_POSIX_COMPAT + +/* regcomp takes a regular expression as a string and compiles it. + + PREG is a regex_t *. We do not expect any fields to be initialized, + since POSIX says we shouldn't. Thus, we set + + `buffer' to the compiled pattern; + `used' to the length of the compiled pattern; + `syntax' to RE_SYNTAX_POSIX_EXTENDED if the + REG_EXTENDED bit in CFLAGS is set; otherwise, to + RE_SYNTAX_POSIX_BASIC; + `newline_anchor' to REG_NEWLINE being set in CFLAGS; + `fastmap' and `fastmap_accurate' to zero; + `re_nsub' to the number of subexpressions in PATTERN. + + PATTERN is the address of the pattern string. + + CFLAGS is a series of bits which affect compilation. + + If REG_EXTENDED is set, we use POSIX extended syntax; otherwise, we + use POSIX basic syntax. 
+ + If REG_NEWLINE is set, then . and [^...] don't match newline. + Also, regexec will try a match beginning after every newline. + + If REG_ICASE is set, then we consider upper- and lowercase + versions of letters to be equivalent when matching. + + If REG_NOSUB is set, then when PREG is passed to regexec, that + routine will report only success or failure, and nothing about the + registers. + + It returns 0 if it succeeds, nonzero if it doesn't. (See regex.h for + the return codes and their meanings.) */ + +int regcomp(preg, pattern, cflags) +regex_t *preg; +const char *pattern; +int cflags; +{ + reg_errcode_t ret; + reg_syntax_t syntax + = (cflags & REG_EXTENDED) ? + RE_SYNTAX_POSIX_EXTENDED : RE_SYNTAX_POSIX_BASIC; + + /* regex_compile will allocate the space for the compiled pattern. */ + preg->buffer = 0; + preg->allocated = 0; + preg->used = 0; + + /* Don't bother to use a fastmap when searching. This simplifies the + REG_NEWLINE case: if we used a fastmap, we'd have to put all the + characters after newlines into the fastmap. This way, we just try + every character. */ + preg->fastmap = 0; + + if (cflags & REG_ICASE) { + unsigned i; + + preg->translate = (char *) malloc(CHAR_SET_SIZE); + if (preg->translate == NULL) + return (int) REG_ESPACE; + + /* Map uppercase characters to corresponding lowercase ones. */ + for (i = 0; i < CHAR_SET_SIZE; i++) + preg->translate[i] = ISUPPER(i) ? tolower(i) : i; + } else + preg->translate = NULL; + + /* If REG_NEWLINE is set, newlines are treated differently. */ + if (cflags & REG_NEWLINE) { /* REG_NEWLINE implies neither . nor [^...] match newline. */ + syntax &= ~RE_DOT_NEWLINE; + syntax |= RE_HAT_LISTS_NOT_NEWLINE; + /* It also changes the matching behavior. */ + preg->newline_anchor = 1; + } else + preg->newline_anchor = 0; + + preg->no_sub = !!(cflags & REG_NOSUB); + + /* POSIX says a null character in the pattern terminates it, so we + can use strlen here in compiling the pattern.
*/ + ret = regex_compile(pattern, strlen(pattern), syntax, preg); + + /* POSIX doesn't distinguish between an unmatched open-group and an + unmatched close-group: both are REG_EPAREN. */ + if (ret == REG_ERPAREN) + ret = REG_EPAREN; + + return (int) ret; +} + + +/* regexec searches for a given pattern, specified by PREG, in the + string STRING. + + If NMATCH is zero or REG_NOSUB was set in the cflags argument to + `regcomp', we ignore PMATCH. Otherwise, we assume PMATCH has at + least NMATCH elements, and we set them to the offsets of the + corresponding matched substrings. + + EFLAGS specifies `execution flags' which affect matching: if + REG_NOTBOL is set, then ^ does not match at the beginning of the + string; if REG_NOTEOL is set, then $ does not match at the end. + + We return 0 if we find a match and REG_NOMATCH if not. */ + +int regexec(preg, string, nmatch, pmatch, eflags) +const regex_t *preg; +const char *string; +size_t nmatch; +regmatch_t pmatch[]; +int eflags; +{ + int ret; + struct re_registers regs; + regex_t private_preg; + int len = strlen(string); + boolean want_reg_info = !preg->no_sub && nmatch > 0; + + private_preg = *preg; + + private_preg.not_bol = !!(eflags & REG_NOTBOL); + private_preg.not_eol = !!(eflags & REG_NOTEOL); + + /* The user has told us exactly how many registers to return + information about, via `nmatch'. We have to pass that on to the + matching routines. */ + private_preg.regs_allocated = REGS_FIXED; + + if (want_reg_info) { + regs.num_regs = nmatch; + regs.start = TALLOC(nmatch, regoff_t); + regs.end = TALLOC(nmatch, regoff_t); + if (regs.start == NULL || regs.end == NULL) + return (int) REG_NOMATCH; + } + + /* Perform the searching operation. */ + ret = re_search(&private_preg, string, len, + /* start: */ 0, /* range: */ len, + want_reg_info ? &regs : (struct re_registers *) 0); + + /* Copy the register information to the POSIX structure.
*/ + if (want_reg_info) { + if (ret >= 0) { + unsigned r; + + for (r = 0; r < nmatch; r++) { + pmatch[r].rm_so = regs.start[r]; + pmatch[r].rm_eo = regs.end[r]; + } + } + + /* If we needed the temporary register info, free the space now. */ + free(regs.start); + free(regs.end); + } + + /* We want zero return to mean success, unlike `re_search'. */ + return ret >= 0 ? (int) REG_NOERROR : (int) REG_NOMATCH; +} + + +/* Returns a message corresponding to an error code, ERRCODE, returned + from either regcomp or regexec. We don't use PREG here. */ + +size_t regerror(errcode, preg, errbuf, errbuf_size) +int errcode; +const regex_t *preg; +char *errbuf; +size_t errbuf_size; +{ + const char *msg; + size_t msg_size; + + if (errcode < 0 + || errcode >= (sizeof(re_error_msg) / sizeof(re_error_msg[0]))) + /* Only error codes returned by the rest of the code should be passed + to this routine. If we are given anything else, or if other regex + code generates an invalid error code, then the program has a bug. + Dump core so we can fix it. */ + abort(); + + msg = re_error_msg[errcode]; + + /* POSIX doesn't require that we do anything in this case, but why + not be nice. */ + if (!msg) + msg = "Success"; + + msg_size = strlen(msg) + 1; /* Includes the null. */ + + if (errbuf_size != 0) { + if (msg_size > errbuf_size) { + strncpy(errbuf, msg, errbuf_size - 1); + errbuf[errbuf_size - 1] = 0; + } else + strcpy(errbuf, msg); + } + + return msg_size; +} + + +/* Free dynamically allocated space used by PREG. 
*/ + +void regfree(preg) +regex_t *preg; +{ + if (preg->buffer != NULL) + free(preg->buffer); + preg->buffer = NULL; + + preg->allocated = 0; + preg->used = 0; + + if (preg->fastmap != NULL) + free(preg->fastmap); + preg->fastmap = NULL; + preg->fastmap_accurate = 0; + + if (preg->translate != NULL) + free(preg->translate); + preg->translate = NULL; +} + +#endif /* !NO_POSIX_COMPAT */ diff --git a/libmultipath/regex.h b/libmultipath/regex.h new file mode 100644 index 0000000..4715250 --- /dev/null +++ b/libmultipath/regex.h @@ -0,0 +1,252 @@ +/* Definitions for data structures and routines for the regular + expression library, version 0.12. + + Copyright (C) 1985, 1989, 1990, 1991, 1992, 1993 + Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2, or (at your option) + any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
*/ + +#ifndef __REGEXP_LIBRARY_H__ +#define __REGEXP_LIBRARY_H__ + +typedef long s_reg_t; +typedef unsigned long active_reg_t; + +typedef unsigned long reg_syntax_t; + +#define RE_BACKSLASH_ESCAPE_IN_LISTS (1L) +#define RE_BK_PLUS_QM (RE_BACKSLASH_ESCAPE_IN_LISTS << 1) +#define RE_CHAR_CLASSES (RE_BK_PLUS_QM << 1) +#define RE_CONTEXT_INDEP_ANCHORS (RE_CHAR_CLASSES << 1) +#define RE_CONTEXT_INDEP_OPS (RE_CONTEXT_INDEP_ANCHORS << 1) +#define RE_CONTEXT_INVALID_OPS (RE_CONTEXT_INDEP_OPS << 1) +#define RE_DOT_NEWLINE (RE_CONTEXT_INVALID_OPS << 1) +#define RE_DOT_NOT_NULL (RE_DOT_NEWLINE << 1) +#define RE_HAT_LISTS_NOT_NEWLINE (RE_DOT_NOT_NULL << 1) +#define RE_INTERVALS (RE_HAT_LISTS_NOT_NEWLINE << 1) +#define RE_LIMITED_OPS (RE_INTERVALS << 1) +#define RE_NEWLINE_ALT (RE_LIMITED_OPS << 1) +#define RE_NO_BK_BRACES (RE_NEWLINE_ALT << 1) +#define RE_NO_BK_PARENS (RE_NO_BK_BRACES << 1) +#define RE_NO_BK_REFS (RE_NO_BK_PARENS << 1) +#define RE_NO_BK_VBAR (RE_NO_BK_REFS << 1) +#define RE_NO_EMPTY_RANGES (RE_NO_BK_VBAR << 1) +#define RE_UNMATCHED_RIGHT_PAREN_ORD (RE_NO_EMPTY_RANGES << 1) +#define RE_NO_GNU_OPS (RE_UNMATCHED_RIGHT_PAREN_ORD << 1) + +extern reg_syntax_t re_syntax_options; + +#define RE_SYNTAX_EMACS 0 + +#define RE_SYNTAX_AWK \ + (RE_BACKSLASH_ESCAPE_IN_LISTS | RE_DOT_NOT_NULL | \ + RE_NO_BK_PARENS | RE_NO_BK_REFS | \ + RE_NO_BK_VBAR | RE_NO_EMPTY_RANGES | \ + RE_DOT_NEWLINE | RE_CONTEXT_INDEP_ANCHORS | \ + RE_UNMATCHED_RIGHT_PAREN_ORD | RE_NO_GNU_OPS) + +#define RE_SYNTAX_GNU_AWK \ + ((RE_SYNTAX_POSIX_EXTENDED | RE_BACKSLASH_ESCAPE_IN_LISTS) \ + & ~(RE_DOT_NOT_NULL | RE_INTERVALS | RE_CONTEXT_INDEP_OPS)) + +#define RE_SYNTAX_POSIX_AWK \ + (RE_SYNTAX_POSIX_EXTENDED | RE_BACKSLASH_ESCAPE_IN_LISTS | \ + RE_INTERVALS | RE_NO_GNU_OPS) + +#define RE_SYNTAX_GREP \ + (RE_BK_PLUS_QM | RE_CHAR_CLASSES | \ + RE_HAT_LISTS_NOT_NEWLINE | RE_INTERVALS | \ + RE_NEWLINE_ALT) + +#define RE_SYNTAX_EGREP \ + (RE_CHAR_CLASSES | RE_CONTEXT_INDEP_ANCHORS | \ + RE_CONTEXT_INDEP_OPS
| RE_HAT_LISTS_NOT_NEWLINE | \ + RE_NEWLINE_ALT | RE_NO_BK_PARENS | \ + RE_NO_BK_VBAR) + +#define RE_SYNTAX_POSIX_EGREP \ + (RE_SYNTAX_EGREP | RE_INTERVALS | \ + RE_NO_BK_BRACES) + +#define RE_SYNTAX_ED RE_SYNTAX_POSIX_BASIC + +#define RE_SYNTAX_SED RE_SYNTAX_POSIX_BASIC + +#define _RE_SYNTAX_POSIX_COMMON \ + (RE_CHAR_CLASSES | RE_DOT_NEWLINE | \ + RE_DOT_NOT_NULL | RE_INTERVALS | \ + RE_NO_EMPTY_RANGES) + +#define RE_SYNTAX_POSIX_BASIC \ + (_RE_SYNTAX_POSIX_COMMON | RE_BK_PLUS_QM) + +#define RE_SYNTAX_POSIX_MINIMAL_BASIC \ + (_RE_SYNTAX_POSIX_COMMON | RE_LIMITED_OPS) + +#define RE_SYNTAX_POSIX_EXTENDED \ + (_RE_SYNTAX_POSIX_COMMON | RE_CONTEXT_INDEP_ANCHORS | \ + RE_CONTEXT_INDEP_OPS | RE_NO_BK_BRACES | \ + RE_NO_BK_PARENS | RE_NO_BK_VBAR | \ + RE_UNMATCHED_RIGHT_PAREN_ORD) + +#define RE_SYNTAX_POSIX_MINIMAL_EXTENDED \ + (_RE_SYNTAX_POSIX_COMMON | RE_CONTEXT_INDEP_ANCHORS | \ + RE_CONTEXT_INVALID_OPS | RE_NO_BK_BRACES | \ + RE_NO_BK_PARENS | RE_NO_BK_REFS | \ + RE_NO_BK_VBAR | RE_UNMATCHED_RIGHT_PAREN_ORD) + +/* Maximum number of duplicates an interval can allow */ +#define RE_DUP_MAX (0x7fff) + +/* POSIX 'cflags' bits */ +#define REG_EXTENDED 1 +#define REG_ICASE (REG_EXTENDED << 1) +#define REG_NEWLINE (REG_ICASE << 1) +#define REG_NOSUB (REG_NEWLINE << 1) + + +/* POSIX `eflags' bits */ +#define REG_NOTBOL 1 +#define REG_NOTEOL (1 << 1) + +/* If any error codes are removed, changed, or added, update the + `re_error_msg' table in regex.c. */ +typedef enum +{ + REG_NOERROR = 0, /* Success. */ + REG_NOMATCH, /* Didn't find a match (for regexec). */ + + /* POSIX regcomp return error codes */ + REG_BADPAT, /* Invalid pattern. */ + REG_ECOLLATE, /* Not implemented. */ + REG_ECTYPE, /* Invalid character class name. */ + REG_EESCAPE, /* Trailing backslash. */ + REG_ESUBREG, /* Invalid back reference. */ + REG_EBRACK, /* Unmatched left bracket. */ + REG_EPAREN, /* Parenthesis imbalance. */ + REG_EBRACE, /* Unmatched \{. */ + REG_BADBR, /* Invalid contents of \{\}. 
*/ + REG_ERANGE, /* Invalid range end. */ + REG_ESPACE, /* Ran out of memory. */ + REG_BADRPT, /* No preceding re for repetition op. */ + + /* Error codes we've added */ + REG_EEND, /* Premature end. */ + REG_ESIZE, /* Compiled pattern bigger than 2^16 bytes. */ + REG_ERPAREN /* Unmatched ) or \); not returned from regcomp. */ +} reg_errcode_t; + +#define REGS_UNALLOCATED 0 +#define REGS_REALLOCATE 1 +#define REGS_FIXED 2 + +/* This data structure represents a compiled pattern */ +struct re_pattern_buffer +{ + unsigned char *buffer; + unsigned long allocated; + unsigned long used; + reg_syntax_t syntax; + char *fastmap; + char *translate; + size_t re_nsub; + unsigned can_be_null : 1; + unsigned regs_allocated : 2; + unsigned fastmap_accurate : 1; + unsigned no_sub : 1; + unsigned not_bol : 1; + unsigned not_eol : 1; + unsigned newline_anchor : 1; +}; + +typedef struct re_pattern_buffer regex_t; + +/* search.c (search_buffer) in Emacs needs this one opcode value. It is + defined both in `regex.c' and here. */ +#define RE_EXACTN_VALUE 1 + +/* Type for byte offsets within the string. POSIX mandates this. */ +typedef int regoff_t; + + +/* This is the structure we store register match data in. See + regex.texinfo for a full description of what registers match. */ +struct re_registers +{ + unsigned num_regs; + regoff_t *start; + regoff_t *end; +}; + + +#ifndef RE_NREGS +#define RE_NREGS 30 +#endif + + +/* POSIX specification for registers. Aside from the different names than + `re_registers', POSIX uses an array of structures, instead of a + structure of arrays. */ +typedef struct +{ + regoff_t rm_so; /* Byte offset from string's start to substring's start. */ + regoff_t rm_eo; /* Byte offset from string's start to substring's end. */ +} regmatch_t; + +/* Declarations for routines. 
*/ + +extern reg_syntax_t re_set_syntax (reg_syntax_t syntax); + +extern const char *re_compile_pattern (const char *pattern, size_t length, + struct re_pattern_buffer *buffer); + +extern int re_compile_fastmap (struct re_pattern_buffer *buffer); + +extern int re_search (struct re_pattern_buffer *buffer, const char *string, + int length, int start, int range, + struct re_registers *regs); + +extern int re_search_2 (struct re_pattern_buffer *buffer, const char *string1, + int length1, const char *string2, int length2, + int start, int range, struct re_registers *regs, + int stop); + +extern int re_match (struct re_pattern_buffer *buffer, const char *string, + int length, int start, struct re_registers *regs); + +extern int re_match_2 (struct re_pattern_buffer *buffer, const char *string1, + int length1, const char *string2, int length2, + int start, struct re_registers *regs, int stop); + +extern void re_set_registers (struct re_pattern_buffer *buffer, + struct re_registers *regs, unsigned num_regs, + regoff_t *starts, regoff_t *ends); + +/* 4.2 bsd compatibility. */ +extern char *re_comp (const char *); +extern int re_exec (const char *); + +/* POSIX compatibility. 
*/ +extern int regcomp (regex_t *preg, const char *pattern, int cflags); + +extern int regexec (const regex_t *preg, const char *string, size_t nmatch, + regmatch_t pmatch[], int eflags); + +extern size_t regerror (int errcode, const regex_t *preg, char *errbuf, + size_t errbuf_size); + +extern void regfree (regex_t *preg); + +#endif /* not __REGEXP_LIBRARY_H__ */ diff --git a/libmultipath/sg_include.h b/libmultipath/sg_include.h new file mode 100644 index 0000000..3cb107a --- /dev/null +++ b/libmultipath/sg_include.h @@ -0,0 +1,2 @@ +#define __user +#include <scsi/sg.h> diff --git a/libmultipath/structs.c b/libmultipath/structs.c new file mode 100644 index 0000000..79783eb --- /dev/null +++ b/libmultipath/structs.c @@ -0,0 +1,213 @@ +#include <stdio.h> +#include <unistd.h> + +#include "memory.h" +#include "vector.h" +#include "util.h" +#include "structs.h" +#include "config.h" +#include "debug.h" + +struct path * +alloc_path (void) +{ + return (struct path *)MALLOC(sizeof(struct path)); +} + +void +free_path (struct path * pp) +{ + if (!pp) + return; + + if (pp->checker_context) + FREE(pp->checker_context); + + if (pp->fd > 0) + close(pp->fd); + + FREE(pp); +} + +void +free_pathvec (vector vec, int free_paths) +{ + int i; + struct path * pp; + + if (!vec) + return; + + if (free_paths) + vector_foreach_slot(vec, pp, i) + free_path(pp); + + vector_free(vec); +} + +struct pathgroup * +alloc_pathgroup (void) +{ + struct pathgroup * pgp; + + pgp = (struct pathgroup *)MALLOC(sizeof(struct pathgroup)); + + if (!pgp) + return NULL; + + pgp->paths = vector_alloc(); + + if (!pgp->paths) + FREE(pgp); + + return pgp; +} + +void +free_pathgroup (struct pathgroup * pgp, int free_paths) +{ + if (!pgp) + return; + + free_pathvec(pgp->paths, free_paths); + FREE(pgp); +} + +void +free_pgvec (vector pgvec, int free_paths) +{ + int i; + struct pathgroup * pgp; + + if (!pgvec) + return; + + vector_foreach_slot(pgvec, pgp, i) + free_pathgroup(pgp, free_paths); + + vector_free(pgvec); 
+} + +struct multipath * +alloc_multipath (void) +{ + return (struct multipath *)MALLOC(sizeof(struct multipath)); +} + +void +free_multipath (struct multipath * mpp, int free_paths) +{ + if (!mpp) + return; + + if (mpp->selector && + mpp->selector != conf->default_selector && + (!mpp->mpe || (mpp->mpe && mpp->selector != mpp->mpe->selector)) && + (!mpp->hwe || (mpp->hwe && mpp->selector != mpp->hwe->selector))) + FREE(mpp->selector); + + if (mpp->alias && + (!mpp->mpe || (mpp->mpe && mpp->alias != mpp->mpe->alias)) && + (mpp->wwid && mpp->alias != mpp->wwid)) + FREE(mpp->alias); + + if (mpp->features && + mpp->features != conf->default_features && + (!mpp->hwe || (mpp->hwe && mpp->features != mpp->hwe->features))) + FREE(mpp->features); + + if (mpp->hwhandler && + mpp->hwhandler != conf->default_hwhandler && + (!mpp->hwe || (mpp->hwe && mpp->hwhandler != mpp->hwe->hwhandler))) + FREE(mpp->hwhandler); + + free_pathvec(mpp->paths, free_paths); + free_pgvec(mpp->pg, free_paths); + FREE(mpp); +} + +void +free_multipathvec (vector mpvec, int free_paths) +{ + int i; + struct multipath * mpp; + + if (!mpvec) + return; + + vector_foreach_slot (mpvec, mpp, i) + free_multipath(mpp, free_paths); + + vector_free(mpvec); +} + +int +store_path (vector pathvec, struct path * pp) +{ + if (!vector_alloc_slot(pathvec)) + return 1; + + vector_set_slot(pathvec, pp); + + return 0; +} + +int +store_pathgroup (vector pgvec, struct pathgroup * pgp) +{ + if (!vector_alloc_slot(pgvec)) + return 1; + + vector_set_slot(pgvec, pgp); + + return 0; +} + +struct multipath * +find_mp (vector mp, char * alias) +{ + int i; + int len; + struct multipath * mpp; + + len = strlen(alias); + + if (!len) + return NULL; + + vector_foreach_slot (mp, mpp, i) { + if (strlen(mpp->alias) == len && + !strncmp(mpp->alias, alias, len)) + return mpp; + } + return NULL; +} + +struct path * +find_path_by_dev (vector pathvec, char * dev) +{ + int i; + struct path * pp; + + vector_foreach_slot (pathvec, pp, i) + if 
(!strcmp_chomp(pp->dev, dev)) + return pp; + + condlog(3, "path %s not found in pathvec\n", dev); + return NULL; +} + +struct path * +find_path_by_devt (vector pathvec, char * dev_t) +{ + int i; + struct path * pp; + + vector_foreach_slot (pathvec, pp, i) + if (!strcmp_chomp(pp->dev_t, dev_t)) + return pp; + + condlog(3, "path %s not found in pathvec\n", dev_t); + return NULL; +} + diff --git a/libmultipath/structs.h b/libmultipath/structs.h new file mode 100644 index 0000000..7b55c4a --- /dev/null +++ b/libmultipath/structs.h @@ -0,0 +1,131 @@ +#ifndef _STRUCTS_H +#define _STRUCTS_H + +#define WWID_SIZE 64 +#define SERIAL_SIZE 17 +#define NODE_NAME_SIZE 19 +#define PATH_STR_SIZE 16 +#define PARAMS_SIZE 1024 +#define FILE_NAME_SIZE 256 +#define CALLOUT_MAX_SIZE 128 +#define BLK_DEV_SIZE 33 + +#define SCSI_VENDOR_SIZE 9 +#define SCSI_PRODUCT_SIZE 17 +#define SCSI_REV_SIZE 5 + +#define KEEP_PATHS 0 +#define FREE_PATHS 1 + +enum pathstates { + PSTATE_RESERVED, + PSTATE_FAILED, + PSTATE_ACTIVE +}; + +enum pgstates { + PGSTATE_RESERVED, + PGSTATE_ENABLED, + PGSTATE_DISABLED, + PGSTATE_ACTIVE +}; + +struct scsi_idlun { + int dev_id; + int host_unique_id; + int host_no; +}; + +struct sg_id { + int host_no; + int channel; + int scsi_id; + int lun; + short h_cmd_per_lun; + short d_queue_depth; + int unused1; + int unused2; +}; + +struct scsi_dev { + char dev[FILE_NAME_SIZE]; + struct scsi_idlun scsi_id; + int host_no; +}; + +struct path { + char dev[FILE_NAME_SIZE]; + char dev_t[BLK_DEV_SIZE]; + struct scsi_idlun scsi_id; + struct sg_id sg_id; + char wwid[WWID_SIZE]; + char vendor_id[SCSI_VENDOR_SIZE]; + char product_id[SCSI_PRODUCT_SIZE]; + char rev[SCSI_REV_SIZE]; + char serial[SERIAL_SIZE]; + char tgt_node_name[NODE_NAME_SIZE]; + unsigned long size; + int state; + int dmstate; + int failcount; + int priority; + int claimed; + char * getuid; + char * getprio; + int (*checkfn) (int, char *, void **); + void * checker_context; + struct multipath * mpp; + int fd; + + /* 
configlet pointers */ + struct hwentry * hwe; +}; + +struct multipath { + char wwid[WWID_SIZE]; + int pgpolicy; + int nextpg; + int queuedio; + int action; + unsigned long size; + vector paths; + vector pg; + char params[PARAMS_SIZE]; + char status[PARAMS_SIZE]; + + /* configlet pointers */ + char * alias; + char * selector; + char * features; + char * hwhandler; + struct mpentry * mpe; + struct hwentry * hwe; +}; + +struct pathgroup { + long id; + int status; + int priority; + vector paths; +}; + +struct path * alloc_path (void); +struct pathgroup * alloc_pathgroup (void); +struct multipath * alloc_multipath (void); +void free_path (struct path *); +void free_pathvec (vector vec, int free_paths); +void free_pathgroup (struct pathgroup * pgp, int free_paths); +void free_pgvec (vector pgvec, int free_paths); +void free_multipath (struct multipath *, int free_paths); +void free_multipathvec (vector mpvec, int free_paths); + +int store_path (vector pathvec, struct path * pp); +int store_pathgroup (vector pgvec, struct pathgroup * pgp); + +struct multipath * find_mp (vector mp, char * alias); +struct path * find_path_by_devt (vector pathvec, char * devt); +struct path * find_path_by_dev (vector pathvec, char * dev); + +char sysfs_path[FILE_NAME_SIZE]; + +#endif diff --git a/libmultipath/uevent.c b/libmultipath/uevent.c new file mode 100644 index 0000000..1a887c0 --- /dev/null +++ b/libmultipath/uevent.c @@ -0,0 +1,135 @@ +/* + * uevent.c - trigger upon netlink uevents from the kernel + * + * Only kernels from version 2.6.10* on provide the uevent netlink socket. 
+ * Until the libc-kernel-headers are updated, you need to compile with: + * + * gcc -I /lib/modules/`uname -r`/build/include -o uevent_listen uevent_listen.c + * + * Copyright (C) 2004 Kay Sievers <kay.sievers@vrfy.org> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation version 2 of the License. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 675 Mass Ave, Cambridge, MA 02139, USA. + * + */ + +#include <unistd.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <fcntl.h> +#include <time.h> +#include <sys/socket.h> +#include <sys/user.h> +#include <asm/types.h> +#include <linux/netlink.h> + +#include "memory.h" +#include "debug.h" +#include "uevent.h" + +struct uevent * alloc_uevent (void) +{ + return (struct uevent *)MALLOC(sizeof(struct uevent)); +} + +int uevent_listen(int (*uev_trigger)(struct uevent *, void * trigger_data), + void * trigger_data) +{ + int sock; + struct sockaddr_nl snl; + int retval; + + memset(&snl, 0x00, sizeof(struct sockaddr_nl)); + snl.nl_family = AF_NETLINK; + snl.nl_pid = getpid(); + snl.nl_groups = 0xffffffff; + + sock = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_KOBJECT_UEVENT); + if (sock == -1) { + condlog(0, "error getting socket, exit\n"); + exit(1); + } + + retval = bind(sock, (struct sockaddr *) &snl, + sizeof(struct sockaddr_nl)); + if (retval < 0) { + condlog(0, "bind failed, exit\n"); + goto exit; + } + + while (1) { + static char buffer[HOTPLUG_BUFFER_SIZE + OBJECT_SIZE]; + int i; + char *pos; + size_t bufpos; + 
ssize_t buflen; + struct uevent *uev; + + buflen = recv(sock, &buffer, sizeof(buffer), 0); + if (buflen < 0) { + condlog(0, "error receiving message\n"); + continue; + } + + if ((size_t)buflen > sizeof(buffer)-1) + buflen = sizeof(buffer)-1; + + buffer[buflen] = '\0'; + uev = alloc_uevent(); + + if (!uev) { + condlog(1, "lost uevent, oom"); + continue; + } + + /* save start of payload */ + bufpos = strlen(buffer) + 1; + + /* action string */ + uev->action = buffer; + pos = strchr(buffer, '@'); + if (!pos) + continue; + pos[0] = '\0'; + + /* sysfs path */ + uev->devpath = &pos[1]; + + /* hotplug events have the environment attached - reconstruct envp[] */ + for (i = 0; (bufpos < (size_t)buflen) && (i < HOTPLUG_NUM_ENVP-1); i++) { + int keylen; + char *key; + + key = &buffer[bufpos]; + keylen = strlen(key); + uev->envp[i] = key; + bufpos += keylen + 1; + } + uev->envp[i] = NULL; + + condlog(3, "uevent '%s' from '%s'\n", uev->action, uev->devpath); + + /* print payload environment */ + for (i = 0; uev->envp[i] != NULL; i++) + condlog(3, "%s\n", uev->envp[i]); + + if (uev_trigger && uev_trigger(uev, trigger_data)) + condlog(0, "uevent trigger error"); + + } + +exit: + close(sock); + return 1; +} diff --git a/libmultipath/uevent.h b/libmultipath/uevent.h new file mode 100644 index 0000000..add5826 --- /dev/null +++ b/libmultipath/uevent.h @@ -0,0 +1,17 @@ +/* environment buffer, the kernel's size in lib/kobject_uevent.c should fit in */ +#define HOTPLUG_BUFFER_SIZE 1024 +#define HOTPLUG_NUM_ENVP 32 +#define OBJECT_SIZE 512 + +#ifndef NETLINK_KOBJECT_UEVENT +#define NETLINK_KOBJECT_UEVENT 15 +#endif + +struct uevent { + char *devpath; + char *action; + char *envp[HOTPLUG_NUM_ENVP]; +}; + +int uevent_listen(int (*store_uev)(struct uevent *, void * trigger_data), + void * trigger_data); diff --git a/libmultipath/util.c b/libmultipath/util.c new file mode 100644 index 0000000..8a3f790 --- /dev/null +++ b/libmultipath/util.c @@ -0,0 +1,51 @@ +#include <string.h> +#include 
<ctype.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <unistd.h> + +#define PARAMS_SIZE 255 + +int +strcmp_chomp(char *str1, char *str2) +{ + int i; + char s1[PARAMS_SIZE],s2[PARAMS_SIZE]; + + if(!str1 || !str2) + return 1; + + strncpy(s1, str1, PARAMS_SIZE); + strncpy(s2, str2, PARAMS_SIZE); + + for (i=strlen(s1)-1; i >=0 && isspace(s1[i]); --i) ; + s1[++i] = '\0'; + for (i=strlen(s2)-1; i >=0 && isspace(s2[i]); --i) ; + s2[++i] = '\0'; + + return(strcmp(s1,s2)); +} + +void +basename (char * str1, char * str2) +{ + char *p = str1 + (strlen(str1) - 1); + + while (*--p != '/' && p != str1) + continue; + + if (p != str1) + p++; + + strcpy(str2, p); +} + +int +filepresent (char * run) { + struct stat buf; + + if(!stat(run, &buf)) + return 1; + return 0; +} + diff --git a/libmultipath/util.h b/libmultipath/util.h new file mode 100644 index 0000000..f7f6fc3 --- /dev/null +++ b/libmultipath/util.h @@ -0,0 +1,13 @@ +#ifndef _UTIL_H +#define _UTIL_H + +int strcmp_chomp(char *, char *); +void basename (char * src, char * dst); +int filepresent (char * run); + +#define safe_sprintf(var, format, args...) \ + snprintf(var, sizeof(var), format, ##args) >= sizeof(var) +#define safe_snprintf(var, size, format, args...) \ + snprintf(var, size, format, ##args) >= size + +#endif /* _UTIL_H */ diff --git a/libmultipath/vector.c b/libmultipath/vector.c new file mode 100644 index 0000000..d0223b0 --- /dev/null +++ b/libmultipath/vector.c @@ -0,0 +1,146 @@ +/* + * Part: Vector structure manipulation. + * + * Version: $Id: vector.c,v 1.0.3 2003/05/11 02:28:03 acassen Exp $ + * + * Author: Alexandre Cassen, <acassen@linux-vs.org> + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + * See the GNU General Public License for more details. 
+ * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include "memory.h" +#include <stdlib.h> +#include "vector.h" + +/* + * Initialize vector struct. + * allocated 'size' slot elements then return vector. + */ +vector +vector_alloc(void) +{ + vector v = (vector) MALLOC(sizeof (struct _vector)); + return v; +} + +/* allocated one slot */ +void * +vector_alloc_slot(vector v) +{ + v->allocated += VECTOR_DEFAULT_SIZE; + if (v->slot) + v->slot = REALLOC(v->slot, sizeof (void *) * v->allocated); + else + v->slot = (void *) MALLOC(sizeof (void *) * v->allocated); + + if (!v->slot) + v->allocated -= VECTOR_DEFAULT_SIZE; + + return v->slot; +} + +void * +vector_insert_slot(vector v, int slot, void *value) +{ + int i; + + if (!vector_alloc_slot(v)) + return NULL; + + for (i = (v->allocated /VECTOR_DEFAULT_SIZE) - 2; i >= slot; i--) + v->slot[i + 1] = v->slot[i]; + + v->slot[slot] = value; + + return v->slot[slot]; +} + +int +find_slot(vector v, void * addr) +{ + int i; + + for (i = 0; i < (v->allocated / VECTOR_DEFAULT_SIZE); i++) + if (v->slot[i] == addr) + return i; + + return -1; +} + +void +vector_del_slot(vector v, int slot) +{ + int i; + + if (!v->allocated) + return; + + for (i = slot + 1; i < (v->allocated / VECTOR_DEFAULT_SIZE); i++) + v->slot[i-1] = v->slot[i]; + + v->allocated -= VECTOR_DEFAULT_SIZE; + + if (!v->allocated) + v->slot = NULL; + else + v->slot = REALLOC(v->slot, sizeof (void *) * v->allocated); +} + +void +vector_repack(vector v) +{ + int i; + + if (!v->allocated) + return; + + for (i = 0; i < (v->allocated / VECTOR_DEFAULT_SIZE); i++) + if (i > 0 && v->slot[i] == NULL) + vector_del_slot(v, i--); +} + +/* Free memory vector allocation */ +void +vector_free(vector v) +{ + if (!v) + return; + + if (v->slot) + FREE(v->slot); + + FREE(v); +} + 
+void +free_strvec(vector strvec) +{ + int i; + char *str; + + if (!strvec) + return; + + for (i = 0; i < VECTOR_SIZE(strvec); i++) + if ((str = VECTOR_SLOT(strvec, i)) != NULL) + FREE(str); + + vector_free(strvec); +} + +/* Set a vector slot value */ +void +vector_set_slot(vector v, void *value) +{ + unsigned int i = v->allocated - 1; + + v->slot[i] = value; +} diff --git a/libmultipath/vector.h b/libmultipath/vector.h new file mode 100644 index 0000000..294f0b1 --- /dev/null +++ b/libmultipath/vector.h @@ -0,0 +1,54 @@ +/* + * Soft: Keepalived is a failover program for the LVS project + * <www.linuxvirtualserver.org>. It monitor & manipulate + * a loadbalanced server pool using multi-layer checks. + * + * Part: vector.c include file. + * + * Version: $Id: vector.h,v 1.0.3 2003/05/11 02:28:03 acassen Exp $ + * + * Author: Alexandre Cassen, <acassen@linux-vs.org> + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + * See the GNU General Public License for more details. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#ifndef _VECTOR_H +#define _VECTOR_H + +/* vector definition */ +struct _vector { + unsigned int allocated; + void **slot; +}; +typedef struct _vector *vector; + +#define VECTOR_DEFAULT_SIZE 1 +#define VECTOR_SLOT(V,E) ((V)->slot[(E)]) +#define VECTOR_SIZE(V) ((V)->allocated) +#define VECTOR_LAST_SLOT(V) ((V)->slot[((V)->allocated - 1)]) + +#define vector_foreach_slot(v,p,i) \ + for (i = 0; i < (v)->allocated && ((p) = (v)->slot[i]); i++) + +/* Prototypes */ +extern vector vector_alloc(void); +extern void *vector_alloc_slot(vector v); +extern void vector_free(vector v); +extern void free_strvec(vector strvec); +extern void vector_set_slot(vector v, void *value); +extern void vector_del_slot(vector v, int slot); +extern void *vector_insert_slot(vector v, int slot, void *value); +int find_slot(vector v, void * addr); +extern void vector_repack(vector v); +extern void vector_dump(vector v); +extern void dump_strvec(vector strvec); + +#endif diff --git a/multipath-tools.spec.in b/multipath-tools.spec.in new file mode 100644 index 0000000..092091e --- /dev/null +++ b/multipath-tools.spec.in @@ -0,0 +1,56 @@ +%define _rpmdir rpms +%define _builddir . + +Summary: Tools to manage multipathed devices with the device-mapper. +Name: multipath-tools +Version: __VERSION__ +Release: 1 +License: GPL +Group: Utilities/System +URL: http://christophe.varoqui.free.fr +Source: /dev/null +BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-buildroot +Packager: Christophe Varoqui <christophe.varoqui@free.fr> +Prefix: / +Vendor: Starving Linux Artists (tm Brian O'Sullivan) +ExclusiveOS: linux + +%description +%{name} provides the tools to manage multipathed devices by +instructing the device-mapper multipath module what to do. 
The tools +are : +* multipath : scans the system for multipathed devices, assembles them + and updates the device-mapper's maps +* multipathd : waits for map events, then execs multipath +* devmap-name : provides a meaningful device name to udev for devmaps +* kpartx : maps linear devmaps upon device partitions, which makes + multipath maps partitionable + +%prep +mkdir -p %{buildroot} %{_rpmdir} + +%build +make + +%install +rm -rf %{buildroot} +make DESTDIR=%{buildroot} install + +%clean +rm -rf $RPM_BUILD_ROOT + +%files +%defattr(-,root,root,-) +%{prefix}/sbin/devmap_name +%{prefix}/sbin/multipath +%{prefix}/sbin/kpartx +%{prefix}/usr/share/man/man8/devmap_name.8.gz +%{prefix}/usr/share/man/man8/multipath.8.gz +%{prefix}/usr/bin/multipathd +%{prefix}/etc/hotplug.d/scsi/multipath.hotplug +%{prefix}/etc/init.d/multipathd + + +%changelog +* Sat May 14 2004 Christophe Varoqui +- Initial build. diff --git a/multipath.conf.annotated b/multipath.conf.annotated new file mode 100644 index 0000000..9f01655 --- /dev/null +++ b/multipath.conf.annotated @@ -0,0 +1,343 @@ +# +# name : defaults +# desc : multipath-tools default settings +# +defaults { + # + # name : multipath_tool + # scope : multipathd + # desc : the tool in charge of configuring the multipath device maps + # default : "/sbin/multipath -v 0 -S" + # + multipath_tool "/sbin/multipath -v 0 -S" + + # + # name : udev_dir + # desc : directory where udev creates its device nodes + # default : /udev + # + udev_dir /dev + + # + # name : polling_interval + # scope : multipathd + # desc : interval between two path checks in seconds + # default : 5 + # + polling_interval 10 + + # + # name : default_selector + # scope : multipath + # desc : the default path selector algorithm to use + # these algorithms are offered by the kernel multipath target + # values : "round-robin 0" + # default : "round-robin 0" + # + default_selector "round-robin 0" + + # + # name : default_path_grouping_policy + # scope : multipath + # desc : the 
default path grouping policy to apply to unspecified + # multipaths + # default : multibus + # + default_path_grouping_policy multibus + + # + # name : default_getuid_callout + # scope : multipath + # desc : the default program and args to callout to obtain a unique + # path identifier. Absolute path required + # default : /sbin/scsi_id -g -u -s + # + default_getuid_callout "/sbin/scsi_id -g -u -s /block/%n" + + # + # name : default_prio_callout + # scope : multipath + # desc : the default program and args to callout to obtain a path + # priority value. The ALUA bits in SPC-3 provide an + # exploitable prio value for example. "none" is a valid value + # default : (null) + # + #default_prio_callout "/bin/true" + + # + # name : rr_min_io + # scope : multipath + # desc : the number of IO to route to a path before switching + # to the next in the same path group + # default : 1000 + # + rr_min_io 100 +} + +# +# name : blacklist +# scope : multipath & multipathd +# desc : list of device names to discard as not multipath candidates +# default : cciss, fd, hd, md, dm, sr, scd, st, ram, raw, loop +# +blacklist { + wwid 26353900f02796769 + devnode "(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*" + devnode "hd[a-z][[0-9]*]" + devnode "cciss!c[0-9]d[0-9]*[p[0-9]*]" +} + +# +# name : multipaths +# scope : multipath & multipathd +# desc : list of multipaths finest-grained settings +# +multipaths { + # + # name : multipath + # scope : multipath & multipathd + # desc : container for settings that apply to one specific multipath + # + multipath { + # + # name : wwid + # scope : multipath & multipathd + # desc : index of the container + # + wwid 3600508b4000156d700012000000b0000 + + # + # name : alias + # scope : multipath + # desc : symbolic name for the multipath + # + alias yellow + + # + # name : path_grouping_policy + # scope : multipath + # desc : path grouping policy to apply to this multipath + # values : failover, multibus, group_by_serial + # default : failover + # + 
path_grouping_policy multibus + + # + # name : path_checker + # scope : multipathd + # desc : path checking algorithm to use to check path state + # values : readsector0, tur + # default : readsector0 + # + # path_checker readsector0 + + # + # name : path_selector + # desc : the path selector algorithm to use for this mpath + # these algorithms are offered by the kernel mpath target + # values : "round-robin 0" + # default : "round-robin 0" + # + path_selector "round-robin 0" + } + multipath { + wwid 1DEC_____321816758474 + alias red + } +} + +# +# name : devices +# scope : multipath & multipathd +# desc : list of per storage controller settings +# overrides default settings (device_maps block) +# overridden by per multipath settings (multipaths block) +# +devices { + # + # name : device + # scope : multipath & multipathd + # desc : settings for this specific storage controller + # + device { + # + # name : vendor, product + # scope : multipath & multipathd + # desc : index for the block + # + vendor "COMPAQ " + product "HSV110 (C)COMPAQ" + + # + # name : path_grouping_policy + # scope : multipath + # desc : path grouping policy to apply to multipath hosted + # by this storage controller + # values : failover = 1 path per priority group + # multibus = all valid paths in 1 priority + # group + # group_by_serial = 1 priority group per detected + # serial number + # default : failover + # + path_grouping_policy multibus + + # + # name : getuid_callout + # scope : multipath + # desc : the program and args to callout to obtain a unique + # path identifier. Absolute path required + # default : /sbin/scsi_id -g -u -s + # + getuid_callout "/sbin/scsi_id -g -u -s /block/%n" + + # + # name : prio_callout + # scope : multipath + # desc : the program and args to callout to obtain a path + # weight. Weights are summed for each path group to + # determine the next PG to use in case of failure. + # "none" is a valid value. 
# default : no callout, all paths equal + # + prio_callout "/sbin/pp_balance_units %d" + + # + # name : path_checker + # scope : multipathd + # desc : path checking algorithm to use to check path state + # values : readsector0, tur + # default : readsector0 + # + path_checker readsector0 + + # + # name : path_selector + # desc : the path selector algorithm to use for this mpath + # these algorithms are offered by the kernel mpath target + # values : "round-robin 0" + # default : "round-robin 0" + # + path_selector "round-robin 0" + } + device { + vendor "COMPAQ " + product "MSA1000 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "COMPAQ " + product "MSA1000 VOLUME " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "DEC " + product "HSG80 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "HP " + product "HSV100 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "3PARdata" + product "VV " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "DDN " + product "SAN DataDirector" + path_grouping_policy multibus + path_checker tur + } + device { + vendor "FSC " + product "CentricStor " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "HITACHI " + product "DF400 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "HITACHI " + product "DF500 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "HITACHI " + product "DF600 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "IBM " + product "ProFibre 4000R " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "IBM " + product "3542 " + path_grouping_policy group_by_serial + path_checker tur + } + device { + vendor "SGI " + product "TP9100 " + vendor "COMPAQ " + product "MSA1000 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "SGI " + product "TP9300 " + 
path_grouping_policy multibus + path_checker tur + } + device { + vendor "SGI " + product "TP9400 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "SGI " + product "TP9500 " + path_grouping_policy multibus + path_checker tur + } + device { + # all paths active but with a switchover latency + # LSI controllers + vendor "STK " + product "OPENstorage D280" + path_grouping_policy group_by_serial + path_checker tur + } + device { + # asymmetric array + vendor "SUN " + product "StorEdge 3510 " + path_grouping_policy multibus + path_checker tur + } + device { + # symmetric array + vendor "SUN " + product "T4 " + path_grouping_policy multibus + path_checker tur + } +} diff --git a/multipath.conf.synthetic b/multipath.conf.synthetic new file mode 100644 index 0000000..88210dc --- /dev/null +++ b/multipath.conf.synthetic @@ -0,0 +1,162 @@ +defaults { + multipath_tool "/sbin/multipath -v 0 -S" + udev_dir /dev + polling_interval 10 + default_selector "round-robin 0" + default_path_grouping_policy multibus + default_getuid_callout "/sbin/scsi_id -g -u -s /block/%n" + default_prio_callout "/bin/true" + default_features "0" + rr_min_io 100 +} +devnode_blacklist { + wwid 26353900f02796769 + devnode "(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*" + devnode "hd[a-z][[0-9]*]" + devnode "cciss!c[0-9]d[0-9]*[p[0-9]*]" +} +multipaths { + multipath { + wwid 3600508b4000156d700012000000b0000 + alias yellow + path_grouping_policy multibus + path_checker readsector0 + path_selector "round-robin 0" + } + multipath { + wwid 1DEC_____321816758474 + alias red + } +} +devices { + device { + vendor "COMPAQ " + product "HSV110 (C)COMPAQ" + path_grouping_policy multibus + getuid_callout "/sbin/scsi_id -g -u -s /block/%n" + path_checker readsector0 + path_selector "round-robin 0" + features "1 queue_if_no_path" + hardware_handler "0" + } + device { + vendor "COMPAQ " + product "MSA1000 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "COMPAQ " + 
product "MSA1000 VOLUME " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "DEC " + product "HSG80 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "HP " + product "HSV100 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "3PARdata" + product "VV " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "IBM " + product "3542 " + path_grouping_policy group_by_serial + path_checker tur + } + device { + vendor "DDN " + product "SAN DataDirector" + path_grouping_policy multibus + path_checker tur + } + device { + vendor "FSC " + product "CentricStor " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "HITACHI " + product "DF400 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "HITACHI " + product "DF500 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "HITACHI " + product "DF600 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "IBM " + product "ProFibre 4000R " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "SGI " + product "TP9100 " + vendor "COMPAQ " + product "MSA1000 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "SGI " + product "TP9300 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "SGI " + product "TP9400 " + path_grouping_policy multibus + path_checker tur + } + device { + vendor "SGI " + product "TP9500 " + path_grouping_policy multibus + path_checker tur + } + device { + # all paths active but with a switchover latency + # LSI controllers + vendor "STK " + product "OPENstorage D280" + path_grouping_policy group_by_serial + path_checker tur + } + device { + # asymmetric array + vendor "SUN " + product "StorEdge 3510 " + path_grouping_policy multibus + path_checker tur + } + device { + # symmetric array + vendor "SUN " + product "T4 " + path_grouping_policy multibus + 
path_checker tur + } +} diff --git a/multipath/01_udev b/multipath/01_udev new file mode 100755 index 0000000..0f68996 --- /dev/null +++ b/multipath/01_udev @@ -0,0 +1,50 @@ +#!/bin/sh +# +cp /sbin/udev $INITRDDIR/sbin/hotplug +cp /sbin/udevstart $INITRDDIR/sbin/ +cp /bin/mountpoint $INITRDDIR/bin/ +cp /bin/readlink $INITRDDIR/bin/ + +PROGS="/sbin/udev /sbin/udevstart /bin/mountpoint /bin/readlink" +LIBS=`ldd $PROGS | grep -v linux-gate.so | sort -u | \ +awk '{print $3}'` +for i in $LIBS +do + mkdir -p `dirname $INITRDDIR/$i` + cp $i $INITRDDIR/$i +done + +# +# config files +# +if [ -d /etc/dev.d ] +then + cp -a /etc/dev.d $INITRDDIR/etc/ +fi + +if [ -d /etc/udev ] +then + cp -a /etc/udev $INITRDDIR/etc/ +fi + +# +# run udev from initrd +# +cat <<EOF >| $INITRDDIR/scripts/10_udev.sh + +cd / +mount -nt proc proc proc +mount -nt sysfs sysfs sys +mount -nt tmpfs tmpfs dev || mount -nt ramfs ramfs dev +mount -nt tmpfs tmpfs tmp || mount -nt ramfs ramfs tmp + +#modprobe dm-mod +#modprobe dm-multipath +/sbin/udevstart + +umount -n tmp +umount -n sys +umount -n proc + +sleep 2 +EOF diff --git a/multipath/02_multipath b/multipath/02_multipath new file mode 100755 index 0000000..1a5d5a1 --- /dev/null +++ b/multipath/02_multipath @@ -0,0 +1,34 @@ +#!/bin/sh +# +# store the multipath tool in the initrd +# hotplug & udev will take care of calling it when appropriate +# this tool is statically linked against klibc : no additional libs +# +cp /sbin/multipath $INITRDDIR/sbin +cp /sbin/devmap_name $INITRDDIR/sbin +cp /sbin/kpartx $INITRDDIR/sbin + +# +# feed the dependencies too +# scsi_id is dynamically linked, so store the libs too +# +cp /sbin/scsi_id $INITRDDIR/sbin +cp /bin/mountpoint $INITRDDIR/bin + +PROGS="/sbin/scsi_id /bin/mountpoint" +LIBS=`ldd $PROGS | grep -v linux-gate.so | sort -u | \ +awk '{print $3}'` +for i in $LIBS +do + mkdir -p `dirname $INITRDDIR/$i` + cp $i $INITRDDIR/$i +done + +# +# config file ? 
+# +if [ -f /etc/multipath.conf ] +then + cp /etc/multipath.conf $INITRDDIR/etc/ +fi + diff --git a/multipath/Makefile b/multipath/Makefile new file mode 100644 index 0000000..382b616 --- /dev/null +++ b/multipath/Makefile @@ -0,0 +1,61 @@ +# Makefile +# +# Copyright (C) 2003 Christophe Varoqui, <christophe.varoqui@free.fr> +BUILD = glibc + +include ../Makefile.inc + +OBJS = main.o $(MULTIPATHLIB)-$(BUILD).a $(CHECKERSLIB)-$(BUILD).a + +CFLAGS = -pipe -g -Wall -Wunused -Wstrict-prototypes \ + -I$(multipathdir) -I$(checkersdir) + +ifeq ($(strip $(BUILD)),klibc) + OBJS += $(libdm) $(libsysfs) +else + LDFLAGS += -ldevmapper -lsysfs +endif + +EXEC = multipath + +all: $(BUILD) + +prepare: + make -C $(multipathdir) clean + rm -f core *.o *.gz + +glibc: prepare $(OBJS) + $(CC) $(OBJS) -o $(EXEC) $(LDFLAGS) + $(STRIP) $(EXEC) + $(GZIP) $(EXEC).8 > $(EXEC).8.gz + +klibc: prepare $(OBJS) + $(CC) -static -o $(EXEC) $(CRT0) $(OBJS) $(KLIBC) $(LIBGCC) + $(STRIP) $(EXEC) + $(GZIP) $(EXEC).8 > $(EXEC).8.gz + +$(CHECKERSLIB)-$(BUILD).a: + make -C $(checkersdir) BUILD=$(BUILD) $(BUILD) + +$(MULTIPATHLIB)-$(BUILD).a: + make -C $(multipathdir) BUILD=$(BUILD) $(BUILD) + +install: + install -d $(DESTDIR)$(bindir) + install -m 755 $(EXEC) $(DESTDIR)$(bindir)/ + install -d $(DESTDIR)/var/cache/multipath/ + install -d $(DESTDIR)/etc/dev.d/block/ + install -m 755 multipath.dev $(DESTDIR)/etc/dev.d/block/ + install -d $(DESTDIR)/etc/udev/rules.d + install -m 755 multipath.rules $(DESTDIR)/etc/udev/rules.d/ + install -d $(DESTDIR)$(mandir) + install -m 644 $(EXEC).8.gz $(DESTDIR)$(mandir) + +uninstall: + rm $(DESTDIR)/etc/dev.d/block/multipath.dev + rm $(DESTDIR)/etc/udev/rules.d/multipath.rules + rm $(DESTDIR)$(bindir)/$(EXEC) + rm $(DESTDIR)$(mandir)/$(EXEC).8.gz + +clean: + rm -f core *.o $(EXEC) *.gz diff --git a/multipath/dev_t.h b/multipath/dev_t.h new file mode 100644 index 0000000..90c64f3 --- /dev/null +++ b/multipath/dev_t.h @@ -0,0 +1,15 @@ +#define MINORBITS 20 +#define MINORMASK 
((1U << MINORBITS) - 1) + +#define MAJOR(dev) ((unsigned int) ((dev) >> MINORBITS)) +#define MINOR(dev) ((unsigned int) ((dev) & MINORMASK)) +#define MKDEV(ma,mi) (((ma) << MINORBITS) | (mi)) + +#define print_dev_t(buffer, dev) \ + sprintf((buffer), "%u:%u\n", MAJOR(dev), MINOR(dev)) + +#define format_dev_t(buffer, dev) \ + ({ \ + sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev)); \ + buffer; \ + }) diff --git a/multipath/main.c b/multipath/main.c new file mode 100644 index 0000000..52d3b8a --- /dev/null +++ b/multipath/main.c @@ -0,0 +1,991 @@ +/* + * Soft: multipath device mapper target autoconfig + * + * Version: $Id: main.h,v 0.0.1 2003/09/18 15:13:38 cvaroqui Exp $ + * + * Author: Copyright (C) 2003 Christophe Varoqui + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + * See the GNU General Public License for more details. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#include <stdio.h> +#include <stdlib.h> +#include <unistd.h> +#include <string.h> + +#include <parser.h> +#include <vector.h> +#include <memory.h> +#include <libdevmapper.h> +#include <devmapper.h> +#include <checkers.h> +#include <path_state.h> +#include <blacklist.h> +#include <hwtable.h> +#include <util.h> +#include <defaults.h> +#include <structs.h> +#include <dmparser.h> +#include <cache.h> +#include <config.h> +#include <propsel.h> +#include <discovery.h> +#include <debug.h> +#include <sysfs/libsysfs.h> + +#include "main.h" +#include "pgpolicies.h" +#include "dict.h" + +static char * +get_refwwid (vector pathvec) +{ + struct path * pp; + char buff[FILE_NAME_SIZE]; + char * refwwid; + + if (conf->dev_type == DEV_NONE) + return NULL; + + if (conf->dev_type == DEV_DEVNODE) { + condlog(3, "limited scope = %s", conf->dev); + basename(conf->dev, buff); + pp = find_path_by_dev(pathvec, buff); + + if (!pp) { + pp = alloc_path(); + + if (!pp) + return NULL; + + if (store_path(pathvec, pp)) { + free_path(pp); + return NULL; + } + strncpy(pp->dev, buff, FILE_NAME_SIZE); + if (pathinfo(pp, conf->hwtable, DI_SYSFS | DI_WWID)) + return NULL; + } + + refwwid = MALLOC(WWID_SIZE); + + if (!refwwid) + return NULL; + + memcpy(refwwid, pp->wwid, WWID_SIZE); + return refwwid; + } + + if (conf->dev_type == DEV_DEVT) { + condlog(3, "limited scope = %s", conf->dev); + pp = find_path_by_devt(pathvec, conf->dev); + + if (!pp) { + pp = alloc_path(); + + if (!pp) + return NULL; + + if (store_path(pathvec, pp)) { + free_path(pp); + return NULL; + } + devt2devname(conf->dev, buff); + + if(safe_sprintf(pp->dev, "%s", buff)) { + fprintf(stderr, "pp->dev too small\n"); + exit(1); + } + if (pathinfo(pp, conf->hwtable, DI_SYSFS | DI_WWID)) + return NULL; + } + + refwwid = MALLOC(WWID_SIZE); + + if (!refwwid) + return NULL; + + memcpy(refwwid, pp->wwid, WWID_SIZE); + return refwwid; + } + if (conf->dev_type == DEV_DEVMAP) { + condlog(3, "limited scope = %s", conf->dev); + /* + * may be 
an alias + */ + refwwid = get_mpe_wwid(conf->dev); + + if (refwwid) + return refwwid; + + /* + * or directly a wwid + */ + refwwid = MALLOC(WWID_SIZE); + + if (!refwwid) + return NULL; + + strncpy(refwwid, conf->dev, WWID_SIZE); + return refwwid; + } + return NULL; +} + +/* + * print_path styles + */ +#define PRINT_PATH_ALL 0 +#define PRINT_PATH_SHORT 1 + +static void +print_path (struct path * pp, int style) +{ + if (style != PRINT_PATH_SHORT && pp->wwid) + printf ("%s ", pp->wwid); + else + printf (" \\_ "); + + printf("%i:%i:%i:%i ", + pp->sg_id.host_no, + pp->sg_id.channel, + pp->sg_id.scsi_id, + pp->sg_id.lun); + + if (pp->dev) + printf("%-4s ", pp->dev); + + if (pp->dev_t) + printf("%-7s ", pp->dev_t); + + switch (pp->state) { + case PATH_UP: + printf("[ready ]"); + break; + case PATH_DOWN: + printf("[faulty]"); + break; + case PATH_SHAKY: + printf("[shaky ]"); + break; + default: + printf("[undef ]"); + break; + } + switch (pp->dmstate) { + case PSTATE_ACTIVE: + printf("[active]"); + break; + case PSTATE_FAILED: + printf("[failed]"); + break; + default: + break; + } + if (pp->claimed) + printf("[claimed]"); + + if (style != PRINT_PATH_SHORT && pp->product_id) + printf("[%.16s]", pp->product_id); + + fprintf(stdout, "\n"); +} + +static void +print_map (struct multipath * mpp) +{ + if (mpp->size && mpp->params) + printf("0 %lu %s %s\n", + mpp->size, DEFAULT_TARGET, mpp->params); + return; +} + +static void +print_all_paths (vector pathvec) +{ + int i; + struct path * pp; + + vector_foreach_slot (pathvec, pp, i) + print_path(pp, PRINT_PATH_ALL); +} + +static void +print_mp (struct multipath * mpp) +{ + int j, i; + struct path * pp = NULL; + struct pathgroup * pgp = NULL; + + if (mpp->action == ACT_NOTHING || conf->verbosity == 0) + return; + + if (conf->verbosity > 1) { + switch (mpp->action) { + case ACT_RELOAD: + printf("%s: ", ACT_RELOAD_STR); + break; + + case ACT_CREATE: + printf("%s: ", ACT_CREATE_STR); + break; + + case ACT_SWITCHPG: + printf("%s: ", 
ACT_SWITCHPG_STR); + break; + + default: + break; + } + } + + if (mpp->alias) + printf("%s", mpp->alias); + + if (conf->verbosity == 1) { + printf("\n"); + return; + } + if (strncmp(mpp->alias, mpp->wwid, WWID_SIZE)) + printf(" (%s)", mpp->wwid); + + printf("\n"); + + if (mpp->size < 2000) + printf("[size=%lu kB]", mpp->size / 2); + else if (mpp->size < (2000 * 1024)) + printf("[size=%lu MB]", mpp->size / 2 / 1024); + else if (mpp->size < (2000 * 1024 * 1024)) + printf("[size=%lu GB]", mpp->size / 2 / 1024 / 1024); + else + printf("[size=%lu TB]", mpp->size / 2 / 1024 / 1024 / 1024); + + if (mpp->features) + printf("[features=\"%s\"]", mpp->features); + + if (mpp->hwhandler) + printf("[hwhandler=\"%s\"]", mpp->hwhandler); + + fprintf(stdout, "\n"); + + if (!mpp->pg) + return; + + vector_foreach_slot (mpp->pg, pgp, j) { + printf("\\_ "); + + if (mpp->selector) + printf("%s ", mpp->selector); + + switch (pgp->status) { + case PGSTATE_ENABLED: + printf("[enabled]"); + break; + case PGSTATE_DISABLED: + printf("[disabled]"); + break; + case PGSTATE_ACTIVE: + printf("[active]"); + break; + default: + break; + } + if (mpp->nextpg && mpp->nextpg == j + 1) + printf("[first]"); + + printf("\n"); + + vector_foreach_slot (pgp->paths, pp, i) + print_path(pp, PRINT_PATH_SHORT); + } + printf("\n"); +} + +static int +filter_pathvec (vector pathvec, char * refwwid) +{ + int i; + struct path * pp; + + if (!refwwid || !strlen(refwwid)) + return 0; + + vector_foreach_slot (pathvec, pp, i) { + if (memcmp(pp->wwid, refwwid, WWID_SIZE) != 0) { + condlog(3, "skip path %s : out of scope", pp->dev); + free_path(pp); + vector_del_slot(pathvec, i); + i--; + } + } + return 0; +} + +/* + * Transforms the path group vector into a proper device map string + */ +int +assemble_map (struct multipath * mp) +{ + int i, j; + int shift, freechar; + char * p; + struct pathgroup * pgp; + struct path * pp; + + p = mp->params; + freechar = sizeof(mp->params); + + shift = snprintf(p, freechar, "%s %s %i %i", 
+ mp->features, mp->hwhandler, + VECTOR_SIZE(mp->pg), mp->nextpg); + + if (shift >= freechar) { + fprintf(stderr, "mp->params too small\n"); + return 1; + } + p += shift; + freechar -= shift; + + vector_foreach_slot (mp->pg, pgp, i) { + pgp = VECTOR_SLOT(mp->pg, i); + shift = snprintf(p, freechar, " %s %i 1", mp->selector, + VECTOR_SIZE(pgp->paths)); + if (shift >= freechar) { + fprintf(stderr, "mp->params too small\n"); + return 1; + } + p += shift; + freechar -= shift; + + vector_foreach_slot (pgp->paths, pp, j) { + shift = snprintf(p, freechar, " %s %d", + pp->dev_t, conf->minio); + if (shift >= freechar) { + fprintf(stderr, "mp->params too small\n"); + return 1; + } + p += shift; + freechar -= shift; + } + } + if (freechar < 1) { + fprintf(stderr, "mp->params too small\n"); + return 1; + } + snprintf(p, 1, "\n"); + + if (conf->verbosity > 2) + print_map(mp); + + return 0; +} + +static int +setup_map (struct multipath * mpp) +{ + struct path * pp; + struct pathgroup * pgp; + int i, j; + int highest = 0; + + /* + * don't bother if devmap size is unknown + */ + if (mpp->size <= 0) { + condlog(3, "%s devmap size is unknown", mpp->alias); + return 1; + } + + /* + * don't bother if a constituant path is claimed + * FIXME : claimed detection broken, always unclaimed for now + */ + vector_foreach_slot (mpp->paths, pp, i) { + if (pp->claimed) { + condlog(3, "%s claimed", pp->dev); + return 1; + } + } + + /* + * properties selectors + */ + select_pgpolicy(mpp); + select_selector(mpp); + select_features(mpp); + select_hwhandler(mpp); + + /* + * apply selected grouping policy to valid paths + */ + switch (mpp->pgpolicy) { + case MULTIBUS: + one_group(mpp); + break; + case FAILOVER: + one_path_per_group(mpp); + break; + case GROUP_BY_SERIAL: + group_by_serial(mpp); + break; + case GROUP_BY_PRIO: + group_by_prio(mpp); + break; + case GROUP_BY_NODE_NAME: + group_by_node_name(mpp); + break; + default: + break; + } + + if (mpp->pg == NULL) { + condlog(3, "pgpolicy failed to 
produce a pg vector"); + return 1; + } + + /* + * ponders each path group and determine highest prio pg + */ + mpp->nextpg = 1; + vector_foreach_slot (mpp->pg, pgp, i) { + vector_foreach_slot (pgp->paths, pp, j) { + pgp->id ^= (long)pp; + if (pp->state != PATH_DOWN) + pgp->priority += pp->priority; + } + if (pgp->priority > highest) { + highest = pgp->priority; + mpp->nextpg = i + 1; + } + } + + /* + * transform the mp->pg vector of vectors of paths + * into a mp->params strings to feed the device-mapper + */ + if (assemble_map(mpp)) { + condlog(3, "problem assembing map"); + return 1; + } + return 0; +} + +static int +pathcount (struct multipath * mpp, int state) +{ + struct pathgroup *pgp; + struct path *pp; + int i, j; + int count = 0; + + vector_foreach_slot (mpp->pg, pgp, i) + vector_foreach_slot (pgp->paths, pp, j) + if (pp->state == state) + count++; + return count; +} + +/* + * detect if a path is in the map we are about to create but not in the + * current one (triggers a valid reload) + * if a path is in the current map but not in the one we are about to create, + * don't reload : it may come back latter so save the reload burden + */ +static int +pgcmp2 (struct multipath * mpp, struct multipath * cmpp) +{ + int i, j, k, l; + struct pathgroup * pgp; + struct pathgroup * cpgp; + struct path * pp; + struct path * cpp; + int found = 0; + + vector_foreach_slot (mpp->pg, pgp, i) { + vector_foreach_slot (pgp->paths, pp, j) { + vector_foreach_slot (cmpp->pg, cpgp, k) { + vector_foreach_slot (cpgp->paths, cpp, l) { + if (pp == cpp) { + found = 1; + break; + } + } + if (found) + break; + } + if (found) + found = 0; + else + return 1; + } + } + return 0; +} + +static void +select_action (struct multipath * mpp, vector curmp) +{ + struct multipath * cmpp; + + cmpp = find_mp(curmp, mpp->alias); + + if (!cmpp) { + mpp->action = ACT_CREATE; + return; + } + if (pathcount(mpp, PATH_UP) == 0) { + condlog(3, "no good path"); + mpp->action = ACT_NOTHING; + return; + } + if 
(cmpp->size != mpp->size) { + condlog(3, "size different than current"); + mpp->action = ACT_RELOAD; + return; + } + if (strncmp(cmpp->features, mpp->features, + strlen(mpp->features))) { + condlog(3, "features different than current"); + mpp->action = ACT_RELOAD; + return; + } + if (strncmp(cmpp->hwhandler, mpp->hwhandler, + strlen(mpp->hwhandler))) { + condlog(3, "hwhandler different than current"); + mpp->action = ACT_RELOAD; + return; + } + if (strncmp(cmpp->selector, mpp->selector, + strlen(mpp->selector))) { + condlog(3, "selector different than current"); + mpp->action = ACT_RELOAD; + return; + } + if (VECTOR_SIZE(cmpp->pg) != VECTOR_SIZE(mpp->pg)) { + condlog(3, "different number of PG"); + mpp->action = ACT_RELOAD; + return; + } + if (pgcmp2(mpp, cmpp)) { + condlog(3, "different path group topology"); + mpp->action = ACT_RELOAD; + return; + } + if (cmpp->nextpg != mpp->nextpg) { + condlog(3, "nextpg different than current"); + mpp->action = ACT_SWITCHPG; + return; + } + mpp->action = ACT_NOTHING; + return; +} + +static int +reinstate_paths (struct multipath * mpp) +{ + int i, j; + struct pathgroup * pgp; + struct path * pp; + + vector_foreach_slot (mpp->pg, pgp, i) { + vector_foreach_slot (pgp->paths, pp, j) { + if (pp->state != PATH_UP && + (pgp->status == PGSTATE_DISABLED || + pgp->status == PGSTATE_ACTIVE)) + continue; + + if (pp->dmstate == PSTATE_FAILED) { + if (dm_reinstate(mpp->alias, pp->dev_t)) + condlog(0, "error reinstating %s", + pp->dev); + } + } + } + return 0; +} + +static int +domap (struct multipath * mpp) +{ + int op = ACT_NOTHING; + int r = 0; + + print_mp(mpp); + + /* + * last chance to quit before touching the devmaps + */ + if (conf->dry_run || mpp->action == ACT_NOTHING) + return 0; + + if (mpp->action == ACT_SWITCHPG) { + dm_switchgroup(mpp->alias, mpp->nextpg); + /* + * we may have avoided reinstating paths because there where in + * active or disabled PG. Now that the topology has changed, + * retry. 
+ */ + reinstate_paths(mpp); + return 0; + } + if (mpp->action == ACT_CREATE) + op = DM_DEVICE_CREATE; + + if (mpp->action == ACT_RELOAD) + op = DM_DEVICE_RELOAD; + + + /* + * device mapper creation or updating + * here we know we'll have garbage on stderr from + * libdevmapper. so shut it down temporarily. + */ + dm_log_init_verbose(0); + + r = dm_addmap(op, mpp->alias, DEFAULT_TARGET, mpp->params, mpp->size); + + if (r == 0) + dm_simplecmd(DM_DEVICE_REMOVE, mpp->alias); + else if (op == DM_DEVICE_RELOAD) + dm_simplecmd(DM_DEVICE_RESUME, mpp->alias); + + /* + * PG order is random, so we need to set the primary one + * upon create or reload + */ + dm_switchgroup(mpp->alias, mpp->nextpg); + + dm_log_init_verbose(1); + + return r; +} + +static int +coalesce_paths (vector curmp, vector pathvec) +{ + int k, i; + char empty_buff[WWID_SIZE]; + struct multipath * mpp; + struct path * pp1; + struct path * pp2; + + memset(empty_buff, 0, WWID_SIZE); + + vector_foreach_slot (pathvec, pp1, k) { + /* skip this path for some reason */ + + /* 1. if path has no unique id or wwid blacklisted */ + if (memcmp(empty_buff, pp1->wwid, WWID_SIZE) == 0 || + blacklist(conf->blist, pp1->wwid)) + continue; + + /* 2. 
if path already coalesced */ + if (pp1->mpp) + continue; + + /* + * at this point, we know we really got a new mp + */ + mpp = alloc_multipath(); + + if (!mpp) + return 1; + + mpp->mpe = find_mpe(pp1->wwid); + mpp->hwe = pp1->hwe; + select_alias(mpp); + + pp1->mpp = mpp; + strcpy(mpp->wwid, pp1->wwid); + mpp->size = pp1->size; + mpp->paths = vector_alloc(); + + if (pp1->priority < 0) + mpp->action = ACT_NOTHING; + + if (!mpp->paths) + return 1; + + if (store_path(mpp->paths, pp1)) + return 1; + + for (i = k + 1; i < VECTOR_SIZE(pathvec); i++) { + pp2 = VECTOR_SLOT(pathvec, i); + + if (strcmp(pp1->wwid, pp2->wwid)) + continue; + + pp2->mpp = mpp; + + if (pp2->size != mpp->size) { + /* + * ouch, avoid feeding that to the DM + */ + condlog(3, "path size mismatch : discard %s", + mpp->wwid); + mpp->action = ACT_NOTHING; + } + if (pp2->priority < 0) + mpp->action = ACT_NOTHING; + + if (store_path(mpp->paths, pp2)) + return 1; + } + if (setup_map(mpp)) { + free_multipath(mpp, KEEP_PATHS); + continue; + } + condlog(3, "action preset to %i", mpp->action); + + if (mpp->action == ACT_UNDEF) + select_action(mpp, curmp); + + condlog(3, "action set to %i", mpp->action); + + domap(mpp); + free_multipath(mpp, KEEP_PATHS); + } + return 0; +} + +static void +usage (char * progname) +{ + fprintf (stderr, VERSION_STRING); + fprintf (stderr, "Usage: %s\t[-v level] [-d] [-l] [-S]\n", + progname); + fprintf (stderr, + "\t\t\t[-p failover|multibus|group_by_serial|group_by_prio]\n" \ + "\t\t\t[device]\n" \ + "\n" \ + "\t-v level\tverbosty level\n" \ + "\t 0\t\t\tno output\n" \ + "\t 1\t\t\tprint created devmap names only\n" \ + "\t 2\t\t\tdefault verbosity\n" \ + "\t 3\t\t\tprint debug information\n" \ + "\t-d\t\tdry run, do not create or update devmaps\n" \ + "\t-l\t\tlist the current multipath topology\n" \ + "\t-F\t\tflush all multipath device maps\n" \ + "\t-p policy\tforce all maps to specified policy :\n" \ + "\t failover\t\t1 path per priority group\n" \ + "\t multibus\t\tall paths 
in 1 priority group\n" \ + "\t group_by_serial\t1 priority group per serial\n" \ + "\t group_by_prio\t1 priority group per priority lvl\n" \ + "\t group_by_node_name\t1 priority group per target node\n" \ + "\n" \ + "\tdevice\t\tlimit scope to the device's multipath\n" \ + "\t\t\t(udev-style $DEVNAME reference, eg /dev/sdb\n" \ + "\t\t\tor major:minor or a device map name)\n" \ + ); + + exit(1); +} + +static int +update_pathvec (vector pathvec) +{ + int i; + struct path * pp; + + vector_foreach_slot (pathvec, pp, i) { + if (pp->dev && pp->dev_t && strlen(pp->dev) == 0) { + devt2devname(pp->dev, pp->dev_t); + pathinfo(pp, conf->hwtable, + DI_SYSFS | DI_CHECKER | DI_SERIAL | DI_PRIO); + } + if (pp->checkfn && pp->state == PATH_UNCHECKED) + pp->state = pp->checkfn(pp->fd, NULL, NULL); + } + return 0; +} + +static int +get_dm_mpvec (vector curmp, vector pathvec, char * refwwid) +{ + int i; + struct multipath * mpp; + char * wwid; + + if (dm_get_maps(curmp, DEFAULT_TARGET)) + return 1; + + vector_foreach_slot (curmp, mpp, i) { + wwid = get_mpe_wwid(mpp->alias); + + if (wwid) { + strncpy(mpp->wwid, wwid, WWID_SIZE); + wwid = NULL; + } else + strncpy(mpp->wwid, mpp->alias, WWID_SIZE); + + if (refwwid && strncmp(mpp->wwid, refwwid, WWID_SIZE)) + continue; + + condlog(3, "params = %s", mpp->params); + condlog(3, "status = %s", mpp->status); + disassemble_map(pathvec, mpp->params, mpp); + update_pathvec(pathvec); + disassemble_status(mpp->status, mpp); + + if (conf->list) + print_mp(mpp); + + if (!conf->dry_run) + reinstate_paths(mpp); + } + return 0; +} + +int +main (int argc, char *argv[]) +{ + vector curmp = NULL; + vector pathvec = NULL; + int i; + int arg; + extern char *optarg; + extern int optind; + char * refwwid = NULL; + + if (dm_prereq(DEFAULT_TARGET, 1, 0, 3)) { + condlog(0, "device mapper prerequisites not met"); + exit(1); + } + if (sysfs_get_mnt_path(sysfs_path, FILE_NAME_SIZE)) { + condlog(0, "multipath tools need sysfs mounted"); + exit(1); + } + if 
(load_config(DEFAULT_CONFIGFILE)) + exit(1); + + while ((arg = getopt(argc, argv, ":qdlFi:M:v:p:")) != EOF ) { + switch(arg) { + case 1: printf("optarg : %s\n",optarg); + break; + case 'v': + if (sizeof(optarg) > sizeof(char *) || + !isdigit(optarg[0])) + usage (argv[0]); + + conf->verbosity = atoi(optarg); + break; + case 'd': + conf->dry_run = 1; + break; + case 'F': + dm_flush_maps(DEFAULT_TARGET); + goto out; + break; + case 'l': + conf->list = 1; + conf->dry_run = 1; + break; + case 'M': +#if _DEBUG_ + debug = atoi(optarg); +#endif + break; + case 'p': + conf->pgpolicy_flag = get_pgpolicy_id(optarg); + if (conf->pgpolicy_flag == -1) { + printf("'%s' is not a valid policy\n", optarg); + usage(argv[0]); + } + break; + case ':': + fprintf(stderr, "Missing option arguement\n"); + usage(argv[0]); + case '?': + fprintf(stderr, "Unknown switch: %s\n", optarg); + usage(argv[0]); + default: + usage(argv[0]); + } + } + if (optind < argc) { + conf->dev = MALLOC(FILE_NAME_SIZE); + + if (!conf->dev) + goto out; + + strncpy(conf->dev, argv[optind], FILE_NAME_SIZE); + + if (filepresent(conf->dev)) + conf->dev_type = DEV_DEVNODE; + else if (sscanf(conf->dev, "%d:%d", &i, &i) == 2) + conf->dev_type = DEV_DEVT; + else + conf->dev_type = DEV_DEVMAP; + + } + + /* + * allocate core vectors to store paths and multipaths + */ + curmp = vector_alloc(); + pathvec = vector_alloc(); + + if (!curmp || !pathvec) { + condlog(0, "can not allocate memory"); + goto out; + } + + /* + * if we have a blacklisted device parameter, exit early + */ + if (conf->dev && blacklist(conf->blist, conf->dev)) + goto out; + + if (!cache_cold(CACHE_EXPIRE)) { + condlog(3, "load path identifiers cache"); + cache_load(pathvec); + } + + /* + * get a path list + */ + if (path_discovery(pathvec, conf, DI_CHECKER) || VECTOR_SIZE(pathvec) == 0) + goto out; + + if (conf->verbosity > 2) { + fprintf(stdout, "#\n# all paths :\n#\n"); + print_all_paths(pathvec); + } + + refwwid = get_refwwid(pathvec); + + if 
(get_dm_mpvec(curmp, pathvec, refwwid)) + goto out; + + cache_dump(pathvec); + filter_pathvec(pathvec, refwwid); + + if (conf->list) + goto out; + + /* + * core logic entry point + */ + coalesce_paths(curmp, pathvec); + +out: + if (refwwid) + FREE(refwwid); + + free_multipathvec(curmp, KEEP_PATHS); + free_pathvec(pathvec, FREE_PATHS); + free_config(conf); +#ifdef _DEBUG_ + dbg_free_final(NULL); +#endif + exit(0); +} diff --git a/multipath/main.h b/multipath/main.h new file mode 100644 index 0000000..a5edb01 --- /dev/null +++ b/multipath/main.h @@ -0,0 +1,55 @@ +/* + * Soft: Description here... + * + * Version: $Id: main.h,v 0.0.1 2003/09/18 15:13:38 cvaroqui Exp $ + * + * Author: Copyright (C) 2003 Christophe Varoqui + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + * See the GNU General Public License for more details. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#ifndef _MAIN_H +#define _MAIN_H + +/* + * configurator actions + */ +#define ACT_NOTHING_STR "unchanged" +#define ACT_RELOAD_STR "reload" +#define ACT_SWITCHPG_STR "switchpg" +#define ACT_CREATE_STR "create" + +enum actions { + ACT_UNDEF, + ACT_NOTHING, + ACT_RELOAD, + ACT_SWITCHPG, + ACT_CREATE +}; + +/* + * Build version + */ +#define PROG "multipath" + +#define VERSION_CODE 0x000404 +#define DATE_CODE 0x100405 + +#define MULTIPATH_VERSION(version) \ + (version >> 16) & 0xFF, \ + (version >> 8) & 0xFF, \ + version & 0xFF + +#define VERSION_STRING PROG" v%d.%d.%d (%.2d/%.2d, 20%.2d)\n", \ + MULTIPATH_VERSION(VERSION_CODE), \ + MULTIPATH_VERSION(DATE_CODE) + +#endif diff --git a/multipath/multipath.8 b/multipath/multipath.8 new file mode 100644 index 0000000..79e3b00 --- /dev/null +++ b/multipath/multipath.8 @@ -0,0 +1,90 @@ +.TH MULTIPATH 8 "February 2004" "" "Linux Administrator's Manual" +.SH NAME +multipath \- Device mapper target autoconfig +.SH SYNOPSIS +.B multipath +.RB [\| \-v\ \c +.IR verbosity \|] +.RB [\| \-d \|] +.RB [\| \-l \|] +.RB [\| \-i\ \c +.IR int \|] +.RB [\| \-p\ \c +.BR failover | multibus | group_by_serial | group_by_prio | group_by_node_name \|] +.RB [\| -S \|] +.RB [\| device \|] +.SH DESCRIPTION +.B multipath +is used to detect multiple paths to devices for fail-over or performance reasons and coalesces them +.SH OPTIONS +.TP +.B \-v " level" +verbosity, print all paths and multipaths +.RS 1.2i +.TP 1.2i +.B 0 +no output +.TP +.B 1 +print the created or updated multipath names only, for use to feed other tools like kpartx +.TP +.B 2 + +print all info : detected paths, coalesced paths (ie multipaths) and device maps +.RE +.TP +.B \-d +dry run, do not create or update devmaps +.TP +.B \-l +list the current multipath configuration +.TP +.TP +.BI \-i " interval" +multipath target param: polling interval +.TP +.BI \-D " major:minor" +update only the devmap the path pointed by +.I major:minor +is in +.TP +.B \-F +flush all the 
multipath device maps +.TP +.BI \-p " policy" +force maps to specified policy: +.RS 1.2i +.TP 1.2i +.B failover +1 path per priority group +.TP +.B multibus +all paths in 1 priority group +.TP +.B group_by_serial +1 priority group per serial +.TP +.B group_by_prio +1 priority group per priority value. Priorities are determined by callout programs specified as a global, per-controller or per-multipath option in the configuration file +.TP +.B group_by_node_name +1 priority group per target node name. Target node names are fetched from /sys/class/fc_transport/target*/node_name. +.RE +.TP +.B \-S +do not send signal to multipathd. -d activates this silently. +.TP +.BI device +update only the devmap the path pointed to by +.I device +is in. +.I device +is in the /dev/sdb (as shown by udev in the $DEVNAME variable) or major:minor format. +.I device +may alternatively be a multipath mapname +.SH "SEE ALSO" +.BR udev (8), +.BR dmsetup (8), +.BR hotplug (8) +.SH AUTHORS +.B multipath +was developed by Christophe Varoqui, <christophe.varoqui@free.fr> and others. diff --git a/multipath/multipath.dev b/multipath/multipath.dev new file mode 100644 index 0000000..dab433d --- /dev/null +++ b/multipath/multipath.dev @@ -0,0 +1,13 @@ +#!/bin/sh -e + +if [ !
"${ACTION}" = add ] ; then + exit +fi + +if [ "${DEVPATH:7:3}" = "dm-" ] ; then + dev=$(</sys${DEVPATH}/dev) + map=$(/sbin/devmap_name $dev) + /sbin/kpartx -v -a /dev/$map +else + /sbin/multipath -v0 ${DEVNAME} +fi diff --git a/multipath/multipath.rules b/multipath/multipath.rules new file mode 100644 index 0000000..39f266e --- /dev/null +++ b/multipath/multipath.rules @@ -0,0 +1,3 @@ +# multipath wants the devmaps presented as meaninglful device names +# so name them after their devmap name +KERNEL="dm-[0-9]*", PROGRAM="/sbin/devmap_name %M %m", NAME="%k", SYMLINK="%c" diff --git a/multipathd/Makefile b/multipathd/Makefile new file mode 100644 index 0000000..9d656af --- /dev/null +++ b/multipathd/Makefile @@ -0,0 +1,64 @@ +BUILD = glibc +EXEC = multipathd + +include ../Makefile.inc + +# +# directories where to put stuff +# +bindir = /usr/bin +mandir = /usr/share/man/man8 +rcdir = /etc/init.d + +# +# basic flags setting +# +CFLAGS = -pipe -g -Wall -Wunused -Wstrict-prototypes \ + -DDAEMON -I$(multipathdir) -I$(checkersdir) +LDFLAGS = -lpthread -ldevmapper -lsysfs + +# +# object files +# +OBJS = main.o copy.o log.o log_pthread.o pidfile.o \ + $(MULTIPATHLIB)-glibc.a \ + $(CHECKERSLIB)-glibc.a \ + + +# +# directives +# +all : $(BUILD) + +glibc: $(EXEC) + +klibc: + $(MAKE) BUILD=glibc glibc + +$(EXEC): clean $(OBJS) + $(CC) $(OBJS) -o $(EXEC) $(LDFLAGS) + $(STRIP) $(EXEC) + $(GZIP) $(EXEC).8 > $(EXEC).8.gz + +$(CHECKERSLIB)-glibc.a: + $(MAKE) -C $(checkersdir) BUILD=glibc glibc + +$(MULTIPATHLIB)-glibc.a: + $(MAKE) -C $(multipathdir) DAEMON=1 BUILD=glibc glibc + +install: + install -d $(DESTDIR)$(bindir) + install -m 755 $(EXEC) $(DESTDIR)$(bindir) + install -d $(DESTDIR)$(rcdir) + install -d $(DESTDIR)$(mandir) + install -m 644 $(EXEC).8.gz $(DESTDIR)$(mandir) + +uninstall: + rm -f $(DESTDIR)$(bindir)/$(EXEC) + rm -f $(DESTDIR)$(rcdir)/$(EXEC) + rm -f $(DESTDIR)$(mandir)/$(EXEC).8.gz + +clean: + $(MAKE) -C $(multipathdir) clean + rm -f core *.o $(EXEC) *.gz + diff 
--git a/multipathd/clone_platform.h b/multipathd/clone_platform.h new file mode 100644 index 0000000..cbaa518 --- /dev/null +++ b/multipathd/clone_platform.h @@ -0,0 +1,15 @@ +/* + * This file is copied from the LTP project. Thanks. + * It is covered by the GPLv2, see the LICENSE file + */ +#define CHILD_STACK_SIZE 16384 + +#if defined (__s390__) || (__s390x__) +#define clone __clone +extern int __clone(int(void*),void*,int,void*); +#elif defined(__ia64__) +#define clone2 __clone2 +extern int __clone2(int (*fn) (void *arg), void *child_stack_base, + size_t child_stack_size, int flags, void *arg, + pid_t *parent_tid, void *tls, pid_t *child_tid); +#endif diff --git a/multipathd/copy.c b/multipathd/copy.c new file mode 100644 index 0000000..060a457 --- /dev/null +++ b/multipathd/copy.c @@ -0,0 +1,99 @@ +#include <fcntl.h> +#include <stdlib.h> +#include <stdio.h> +#include <sys/mman.h> +#include <sys/stat.h> +#include <sys/types.h> +#include <unistd.h> +#include <string.h> +#include <errno.h> +#include <util.h> + +#include "log_pthread.h" + +#define FILESIZE 128 + +int +copy (char * src, char * dst) +{ + int fdin; + int fdout; + char * mmsrc; + char * mmdst; + struct stat statbuf; + + fdin = open (src, O_RDONLY); + + if (fdin < 0) { + log_safe(3, "[copy.c] cannot open %s", src); + return -1; + } + /* + * Stat the input file to obtain its size + */ + if (fstat (fdin, &statbuf) < 0) { + log_safe(3, "[copy.c] cannot stat %s", src); + goto out1; + } + /* + * Open the output file for writing, + * with the same permissions as the source file + */ + fdout = open (dst, O_RDWR | O_CREAT | O_TRUNC, statbuf.st_mode); + + if (fdout < 0) { + log_safe(3, "[copy.c] cannot open %s", dst); + goto out1; + } + + if (lseek (fdout, statbuf.st_size - 1, SEEK_SET) == -1) { + log_safe(3, "[copy.c] cannot lseek %s", dst); + goto out2; + } + + if (write (fdout, "", 1) != 1) { + log_safe(3, "[copy.c] cannot write dummy char"); + goto out2; + } + /* + * Blast the bytes from one file to the other 
+ */ + if ((mmsrc = mmap(0, statbuf.st_size, PROT_READ, MAP_SHARED, fdin, 0)) + == (caddr_t) -1) { + log_safe(3, "[copy.c] cannot mmap %s", src); + goto out2; + } + + if ((mmdst = mmap(0, statbuf.st_size, PROT_READ | PROT_WRITE, + MAP_SHARED, fdout, 0)) == (caddr_t) -1) { + log_safe(3, "[copy.c] cannot mmap %s", dst); + goto out3; + } + memcpy(mmdst, mmsrc, statbuf.st_size); + +/* done */ + munmap(mmdst, statbuf.st_size); +out3: + munmap(mmsrc, statbuf.st_size); +out2: + close (fdout); +out1: + close (fdin); + + return 0; +} + +int +copytodir (char * src, char * dstdir) +{ + char dst[FILESIZE]; + char filename[FILESIZE]; + + basename(src, filename); + if (FILESIZE <= snprintf(dst, FILESIZE, "%s/%s", dstdir, filename)) { + log_safe(3, "[copy.c] filename buffer overflow : %s ", filename); + return -1; + } + + return copy(src, dst); +} diff --git a/multipathd/copy.h b/multipathd/copy.h new file mode 100644 index 0000000..999391f --- /dev/null +++ b/multipathd/copy.h @@ -0,0 +1,6 @@ +#ifndef _COPY_H +#define _COPY_H + +int copytodir (char *, char *); + +#endif /* _COPY_H */ diff --git a/multipathd/log.c b/multipathd/log.c new file mode 100644 index 0000000..21879a2 --- /dev/null +++ b/multipathd/log.c @@ -0,0 +1,181 @@ +#include <stdio.h> +#include <stdlib.h> +#include <stdarg.h> +#include <string.h> +#include <syslog.h> + +#include "log.h" + +#if LOGDBG +static void dump_logarea (void) +{ + struct logmsg * msg; + + logdbg(stderr, "\n==== area: start addr = %p, end addr = %p ====\n", + la->start, la->end); + logdbg(stderr, "|addr |next |prio|msg\n"); + + for (msg = (struct logmsg *)la->head; (void *)msg != la->tail; + msg = msg->next) + logdbg(stderr, "|%p |%p |%i |%s\n", (void *)msg, msg->next, + msg->prio, (char *)&msg->str); + + logdbg(stderr, "|%p |%p |%i |%s\n", (void *)msg, msg->next, + msg->prio, (char *)&msg->str); + + logdbg(stderr, "\n\n"); +} +#endif + +static int logarea_init (int size) +{ + logdbg(stderr,"enter logarea_init\n"); + la = malloc(sizeof(struct 
logarea)); + + if (!la) + return 1; + + if (size < MAX_MSG_SIZE) + size = DEFAULT_AREA_SIZE; + + la->start = malloc(size); + + if (!la->start) { + free(la); + return 1; + } + memset(la->start, 0, size); + + la->empty = 1; + la->end = la->start + size; + la->head = la->start; + la->tail = la->start; + + la->buff = malloc(MAX_MSG_SIZE + sizeof(struct logmsg)); + + if (!la->buff) { + free(la->start); + free(la); + return 1; + } + return 0; + +} + +int log_init(char *program_name, int size) +{ + logdbg(stderr,"enter log_init\n"); + openlog(program_name, 0, LOG_DAEMON); + + if (logarea_init(size)) + return 1; + + return 0; +} + +void free_logarea (void) +{ + free(la->start); + free(la->buff); + free(la); + return; +} + +void log_close (void) +{ + free_logarea(); + closelog(); + + return; +} + +int log_enqueue (int prio, const char * fmt, va_list ap) +{ + int len, fwd; + char buff[MAX_MSG_SIZE]; + struct logmsg * msg; + struct logmsg * lastmsg; + + lastmsg = (struct logmsg *)la->tail; + + if (!la->empty) { + fwd = sizeof(struct logmsg) + + strlen((char *)&lastmsg->str) * sizeof(char) + 1; + la->tail += fwd; + } + vsnprintf(buff, MAX_MSG_SIZE, fmt, ap); + len = strlen(buff) * sizeof(char) + 1; + + /* not enough space on tail : rewind */ + if (la->head <= la->tail && + (len + sizeof(struct logmsg)) > (la->end - la->tail)) { + logdbg(stderr, "enqueue: rewind tail to %p\n", la->tail); + la->tail = la->start; + } + + /* not enough space on head : drop msg */ + if (la->head > la->tail && + (len + sizeof(struct logmsg)) > (la->head - la->tail)) { + logdbg(stderr, "enqueue: log area overrun, drop msg\n"); + + if (!la->empty) + la->tail = lastmsg; + + return 1; + } + + /* ok, we can stage the msg in the area */ + la->empty = 0; + msg = (struct logmsg *)la->tail; + msg->prio = prio; + memcpy((void *)&msg->str, buff, len); + lastmsg->next = la->tail; + msg->next = la->head; + + logdbg(stderr, "enqueue: %p, %p, %i, %s\n", (void *)msg, msg->next, + msg->prio, (char *)&msg->str); +
+#if LOGDBG + dump_logarea(); +#endif + return 0; +} + +int log_dequeue (void * buff) +{ + struct logmsg * src = (struct logmsg *)la->head; + struct logmsg * dst = (struct logmsg *)buff; + struct logmsg * lst = (struct logmsg *)la->tail; + + if (la->empty) + return 1; + + int len = strlen((char *)&src->str) * sizeof(char) + + sizeof(struct logmsg) + 1; + + dst->prio = src->prio; + memcpy(dst, src, len); + + if (la->tail == la->head) + la->empty = 1; /* we purge the last logmsg */ + else { + la->head = src->next; + lst->next = la->head; + } + logdbg(stderr, "dequeue: %p, %p, %i, %s\n", + (void *)src, src->next, src->prio, (char *)&src->str); + + memset((void *)src, 0, len); + + return la->empty; +} + +/* + * this one can block under memory pressure + */ +void log_syslog (void * buff) +{ + struct logmsg * msg = (struct logmsg *)buff; + + syslog(msg->prio, "%s", (char *)&msg->str); +} diff --git a/multipathd/log.h b/multipathd/log.h new file mode 100644 index 0000000..c697118 --- /dev/null +++ b/multipathd/log.h @@ -0,0 +1,42 @@ +#ifndef LOG_H +#define LOG_H + +#define DEFAULT_AREA_SIZE 8192 +#define MAX_MSG_SIZE 128 + +#ifndef LOGLEVEL +#define LOGLEVEL 5 +#endif + +#if LOGDBG +#define logdbg(file, fmt, args...) fprintf(file, fmt, ##args) +#else +#define logdbg(file, fmt, args...) 
do {} while (0) +#endif + +struct logmsg { + short int prio; + void * next; + char * str; +}; + +struct logarea { + int empty; + void * head; + void * tail; + void * start; + void * end; + char * buff; +}; + +struct logarea * la; + +int log_init (char * progname, int size); +void log_close (void); +int log_enqueue (int prio, const char * fmt, va_list ap); +int log_dequeue (void *); +void log_syslog (void *); +void dump_logmsg (void *); +void free_logarea (void); + +#endif /* LOG_H */ diff --git a/multipathd/log_pthread.c b/multipathd/log_pthread.c new file mode 100644 index 0000000..278d173 --- /dev/null +++ b/multipathd/log_pthread.c @@ -0,0 +1,92 @@ +#include <stdio.h> +#include <stdlib.h> +#include <stdarg.h> +#include <pthread.h> +#include <sys/mman.h> + +#include "log_pthread.h" +#include "log.h" + +void log_safe (int prio, char * fmt, ...) +{ + va_list ap; + + pthread_mutex_lock(logq_lock); + va_start(ap, fmt); + log_enqueue(prio, fmt, ap); + va_end(ap); + pthread_mutex_unlock(logq_lock); + + pthread_mutex_lock(logev_lock); + pthread_cond_signal(logev_cond); + pthread_mutex_unlock(logev_lock); +} + +static void flush_logqueue (void) +{ + int empty; + + do { + pthread_mutex_lock(logq_lock); + empty = log_dequeue(la->buff); + pthread_mutex_unlock(logq_lock); + log_syslog(la->buff); + } while (empty == 0); +} + +static void * log_thread (void * et) +{ + mlockall(MCL_CURRENT | MCL_FUTURE); + logdbg(stderr,"enter log_thread\n"); + + while (1) { + pthread_mutex_lock(logev_lock); + pthread_cond_wait(logev_cond, logev_lock); + pthread_mutex_unlock(logev_lock); + + flush_logqueue(); + } +} + +void log_thread_start (void) +{ + pthread_attr_t attr; + + logdbg(stderr,"enter log_thread_start\n"); + + logq_lock = (pthread_mutex_t *) malloc(sizeof(pthread_mutex_t)); + logev_lock = (pthread_mutex_t *) malloc(sizeof(pthread_mutex_t)); + logev_cond = (pthread_cond_t *) malloc(sizeof(pthread_cond_t)); + + pthread_mutex_init(logq_lock, NULL); + pthread_mutex_init(logev_lock, 
NULL); + pthread_cond_init(logev_cond, NULL); + + pthread_attr_init(&attr); + pthread_attr_setstacksize(&attr, 64 * 1024); + + if (log_init("multipathd", 0)) { + fprintf(stderr,"can't initialize log buffer\n"); + exit(1); + } + pthread_create(&log_thr, &attr, log_thread, NULL); + + return; +} + +void log_thread_stop (void) +{ + logdbg(stderr,"enter log_thread_stop\n"); + + pthread_mutex_lock(logq_lock); + pthread_cancel(log_thr); + pthread_mutex_unlock(logq_lock); + + flush_logqueue(); + + pthread_mutex_destroy(logq_lock); + pthread_mutex_destroy(logev_lock); + pthread_cond_destroy(logev_cond); + + free_logarea(); +} diff --git a/multipathd/log_pthread.h b/multipathd/log_pthread.h new file mode 100644 index 0000000..adbbb7a --- /dev/null +++ b/multipathd/log_pthread.h @@ -0,0 +1,14 @@ +#ifndef _LOG_PTHREAD_H +#define _LOG_PTHREAD_H + +pthread_t log_thr; + +pthread_mutex_t *logq_lock; +pthread_mutex_t *logev_lock; +pthread_cond_t *logev_cond; + +void log_safe(int prio, char * fmt, ...); +void log_thread_start(void); +void log_thread_stop(void); + +#endif /* _LOG_PTHREAD_H */ diff --git a/multipathd/main.c b/multipathd/main.c new file mode 100644 index 0000000..0f36b77 --- /dev/null +++ b/multipathd/main.c @@ -0,0 +1,1059 @@ +#include <string.h> +#include <pthread.h> +#include <stdio.h> +#include <unistd.h> +#include <linux/unistd.h> +#include <stdlib.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <fcntl.h> +#include <libdevmapper.h> +#include <signal.h> +#include <wait.h> +#include <sched.h> +#include <errno.h> +#include <sys/mount.h> +#include <sys/mman.h> + +/* + * libsysfs + */ +#include <sysfs/libsysfs.h> +#include <sysfs/dlist.h> + +/* + * libcheckers + */ +#include <checkers.h> +#include <path_state.h> + +/* + * libmultipath + */ +#include <parser.h> +#include <vector.h> +#include <memory.h> +#include <config.h> +#include <callout.h> +#include <util.h> +#include <blacklist.h> +#include <hwtable.h> +#include <defaults.h> +#include <structs.h> 
+#include <dmparser.h> +#include <devmapper.h> +#include <dict.h> +#include <discovery.h> +#include <debug.h> +#include <propsel.h> +#include <uevent.h> + +#include "main.h" +#include "copy.h" +#include "clone_platform.h" +#include "pidfile.h" + +#define FILE_NAME_SIZE 256 +#define CMDSIZE 160 + +#define CALLOUT_DIR "/var/cache/multipathd" + +#define LOG_MSG(a,b) \ + if (strlen(a)) { \ + log_safe(LOG_WARNING, "%s: %s", b, a); \ + memset(a, 0, MAX_CHECKER_MSG_SIZE); \ + } + +#ifdef LCKDBG +#define lock(a) \ + fprintf(stderr, "%s:%s(%i) lock %p\n", __FILE__, __FUNCTION__, __LINE__, a); \ + pthread_mutex_lock(a) +#define unlock(a) \ + fprintf(stderr, "%s:%s(%i) unlock %p\n", __FILE__, __FUNCTION__, __LINE__, a); \ + pthread_mutex_unlock(a) +#else +#define lock(a) pthread_mutex_lock(a) +#define unlock(a) pthread_mutex_unlock(a) +#endif + +/* + * global vars + */ +int pending_event = 0; +pthread_mutex_t *event_lock; +pthread_cond_t *event; + +/* + * structs + */ +struct paths { + pthread_mutex_t *lock; + vector pathvec; +}; + +struct event_thread { + pthread_t *thread; + pthread_mutex_t *waiter_lock; + int lease; + int event_nr; + char mapname[WWID_SIZE]; + struct paths *allpaths; +}; + +int +uev_trigger (struct uevent * uev, void * trigger_data) +{ + int r = 0; + int i; + char devname[32]; + struct path * pp; + struct paths * allpaths; + + allpaths = (struct paths *)trigger_data; + + if (strncmp(uev->devpath, "/block", 6)) + goto out; + + basename(uev->devpath, devname); + lock(allpaths->lock); + pp = find_path_by_dev(allpaths->pathvec, devname); + + r = 1; + + if (pp && !strncmp(uev->action, "remove", 6)) { + condlog(2, "remove %s path checker", devname); + i = find_slot(allpaths->pathvec, (void *)pp); + vector_del_slot(allpaths->pathvec, i); + free_path(pp); + } + if (!pp && !strncmp(uev->action, "add", 3)) { + condlog(2, "add %s path checker", devname); + store_pathinfo(allpaths->pathvec, conf->hwtable, devname); + } + unlock(allpaths->lock); + + r = 0; +out: + 
FREE(uev); + return r; +} + +static void * +ueventloop (void * ap) +{ + uevent_listen(&uev_trigger, ap); + + return NULL; +} + +static void +strvec_free (vector vec) +{ + int i; + char * str; + + vector_foreach_slot (vec, str, i) + if (str) + FREE(str); + + vector_free(vec); +} + +static int +exit_daemon (int status) +{ + if (status != 0) + fprintf(stderr, "bad exit status. see daemon.log\n"); + + log_safe(LOG_INFO, "umount ramfs"); + umount(CALLOUT_DIR); + + log_safe(LOG_INFO, "unlink pidfile"); + unlink(DEFAULT_PIDFILE); + + log_safe(LOG_NOTICE, "--------shut down-------"); + log_thread_stop(); + exit(status); +} + +static void +set_paths_owner (struct paths * allpaths, struct multipath * mpp) +{ + int i; + struct path * pp; + + lock(allpaths->lock); + + vector_foreach_slot (allpaths->pathvec, pp, i) + if (!strncmp(mpp->wwid, pp->wwid, WWID_SIZE)) + pp->mpp = mpp; + + unlock(allpaths->lock); +} + +static int +get_dm_mpvec (vector mpvec, struct paths * allpaths) +{ + int i; + struct multipath * mpp; + char * wwid; + + if (dm_get_maps(mpvec, "multipath")) + return 1; + + vector_foreach_slot (mpvec, mpp, i) { + wwid = get_mpe_wwid(mpp->alias); + + if (wwid) { + strncpy(mpp->wwid, wwid, WWID_SIZE); + wwid = NULL; + } else + strncpy(mpp->wwid, mpp->alias, WWID_SIZE); + + set_paths_owner(allpaths, mpp); + } + return 0; +} + +static int +path_discovery_locked (struct paths *allpaths, char *sysfs_path) +{ + lock(allpaths->lock); + path_discovery(allpaths->pathvec, conf, 0); + unlock(allpaths->lock); + + return 0; +} + +static int +mark_failed_path (struct paths *allpaths, char *mapname) +{ + struct multipath *mpp; + struct pathgroup *pgp; + struct path *pp; + struct path *app; + int i, j; + int r = 1; + + if (!dm_map_present(mapname)) + return 0; + + mpp = alloc_multipath(); + + if (!mpp) + return 1; + + if (dm_get_map(mapname, &mpp->size, (char *)mpp->params)) + goto out; + + if (dm_get_status(mapname, mpp->status)) + goto out; + + lock(allpaths->lock); + r = 
disassemble_map(allpaths->pathvec, mpp->params, mpp); + unlock(allpaths->lock); + + if (r) + goto out; + + r = disassemble_status(mpp->status, mpp); + + if (r) + goto out; + + r = 0; /* can't fail from here on */ + lock(allpaths->lock); + + vector_foreach_slot (mpp->pg, pgp, i) { + vector_foreach_slot (pgp->paths, pp, j) { + if (pp->dmstate != PSTATE_FAILED) + continue; + + app = find_path_by_devt(allpaths->pathvec, pp->dev_t); + + if (app && app->state != PATH_DOWN) { + log_safe(LOG_NOTICE, "mark %s as failed", + pp->dev_t); + app->state = PATH_DOWN; + } + } + } + unlock(allpaths->lock); +out: + free_multipath(mpp, KEEP_PATHS); + + return r; +} + +static void * +waiteventloop (struct event_thread * waiter, char * cmd) +{ + struct dm_task *dmt; + int event_nr; + + if (!waiter->event_nr) + waiter->event_nr = dm_geteventnr(waiter->mapname); + + if (!(dmt = dm_task_create(DM_DEVICE_WAITEVENT))) + return 0; + + if (!dm_task_set_name(dmt, waiter->mapname)) + goto out; + + if (waiter->event_nr && !dm_task_set_event_nr(dmt, waiter->event_nr)) + goto out; + + dm_task_no_open_count(dmt); + + dm_task_run(dmt); + + waiter->event_nr++; + + /* + * upon event ... 
+ */ + while (1) { + log_safe(LOG_DEBUG, "%s", cmd); + log_safe(LOG_NOTICE, "devmap event (%i) on %s", + waiter->event_nr, waiter->mapname); + + mark_failed_path(waiter->allpaths, waiter->mapname); + + event_nr = dm_geteventnr(waiter->mapname); + + if (waiter->event_nr == event_nr) + break; + + waiter->event_nr = event_nr; + } + +out: + dm_task_destroy(dmt); + + /* + * tell waiterloop we have an event + */ + lock(event_lock); + pending_event++; + pthread_cond_signal(event); + unlock(event_lock); + + return NULL; +} + +static void * +waitevent (void * et) +{ + struct event_thread *waiter; + char cmd[CMDSIZE]; + + mlockall(MCL_CURRENT | MCL_FUTURE); + + waiter = (struct event_thread *)et; + lock(waiter->waiter_lock); + pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL); + + if (safe_snprintf(cmd, CMDSIZE, "%s %s", + conf->multipath, waiter->mapname)) { + log_safe(LOG_ERR, "command too long, abord reconfigure"); + goto out; + } + while (1) + waiteventloop(waiter, cmd); + +out: + /* + * release waiter_lock so that waiterloop knows we are gone + */ + unlock(waiter->waiter_lock); + pthread_exit(waiter->thread); + + return NULL; +} + +static void * +alloc_waiter (void) +{ + + struct event_thread * wp; + + wp = MALLOC(sizeof(struct event_thread)); + + if (!wp) + return NULL; + + wp->thread = MALLOC(sizeof(pthread_t)); + + if (!wp->thread) + goto out; + + wp->waiter_lock = (pthread_mutex_t *)MALLOC(sizeof(pthread_mutex_t)); + + if (!wp->waiter_lock) + goto out1; + + pthread_mutex_init(wp->waiter_lock, NULL); + return wp; + +out1: + free(wp->thread); +out: + free(wp); + return NULL; +} + +static void +free_waiter (struct event_thread * wp) +{ + pthread_mutex_destroy(wp->waiter_lock); + free(wp->thread); + free(wp); +} + +static void +fail_path (struct path * pp) +{ + if (!pp->mpp) + return; + + log_safe(LOG_NOTICE, "checker failed path %s in map %s", + pp->dev_t, pp->mpp->alias); + + dm_fail_path(pp->mpp->alias, pp->dev_t); +} + +static void * +waiterloop (void *ap) +{ 
+ struct paths *allpaths; + struct multipath *mpp; + vector mpvec = NULL; + vector waiters; + struct event_thread *wp; + pthread_attr_t attr; + int r; + char buff[1]; + int i, j; + + mlockall(MCL_CURRENT | MCL_FUTURE); + log_safe(LOG_NOTICE, "start DM events thread"); + + if (sysfs_get_mnt_path(sysfs_path, FILE_NAME_SIZE)) { + log_safe(LOG_ERR, "can not find sysfs mount point"); + return NULL; + } + + /* + * inits + */ + allpaths = (struct paths *)ap; + waiters = vector_alloc(); + + if (!waiters) + return NULL; + + if (pthread_attr_init(&attr)) + return NULL; + + pthread_attr_setstacksize(&attr, 32 * 1024); + + /* + * update paths list + */ + log_safe(LOG_INFO, "fetch paths list"); + + while(path_discovery_locked(allpaths, sysfs_path)) { + log_safe(LOG_ERR, "can't update path list ... retry"); + sleep(5); + } + + log_safe(LOG_NOTICE, "initial reconfigure multipath maps"); +// execute_program(conf->multipath, buff, 1); + + while (1) { + /* + * revoke the leases + */ + vector_foreach_slot(waiters, wp, i) + wp->lease = 0; + + /* + * update multipaths list + */ + log_safe(LOG_INFO, "refresh multipaths list"); + + if (mpvec) + free_multipathvec(mpvec, KEEP_PATHS); + + while (1) { + /* + * we're not allowed to fail here + */ + mpvec = vector_alloc(); + + if (mpvec && !get_dm_mpvec(mpvec, allpaths)) + break; + + log_safe(LOG_ERR, "can't get mpvec ... 
retry"); + sleep(5); + } + + /* + * start waiters on all mpvec + */ + log_safe(LOG_INFO, "start up event loops"); + + vector_foreach_slot (mpvec, mpp, i) { + /* + * find out if devmap already has + * a running waiter thread + */ + vector_foreach_slot (waiters, wp, j) + if (!strcmp(wp->mapname, mpp->alias)) + break; + + /* + * no event_thread struct : init it + */ + if (j == VECTOR_SIZE(waiters)) { + wp = alloc_waiter(); + + if (!wp) + continue; + + strncpy(wp->mapname, mpp->alias, WWID_SIZE); + wp->allpaths = allpaths; + + if (!vector_alloc_slot(waiters)) { + free_waiter(wp); + continue; + } + vector_set_slot(waiters, wp); + } + + /* + * event_thread struct found + */ + else if (j < VECTOR_SIZE(waiters)) { + r = pthread_mutex_trylock(wp->waiter_lock); + + /* + * thread already running : next devmap + */ + if (r) { + log_safe(LOG_DEBUG, + "event checker running : %s", + wp->mapname); + + /* + * renew the lease + */ + wp->lease = 1; + continue; + } + pthread_mutex_unlock(wp->waiter_lock); + } + + if (pthread_create(wp->thread, &attr, waitevent, wp)) { + log_safe(LOG_ERR, + "cannot create event checker : %s", + wp->mapname); + free_waiter(wp); + vector_del_slot(waiters, j); + continue; + } + wp->lease = 1; + log_safe(LOG_NOTICE, "event checker started : %s", + wp->mapname); + } + vector_foreach_slot (waiters, wp, i) { + if (wp->lease == 0) { + log_safe(LOG_NOTICE, "reap event checker : %s", + wp->mapname); + + pthread_cancel(*wp->thread); + free_waiter(wp); + vector_del_slot(waiters, i); + i--; + } + } + /* + * wait event condition + */ + lock(event_lock); + + if (pending_event > 0) + pending_event--; + + log_safe(LOG_INFO, "%i pending event(s)", pending_event); + if(pending_event == 0) + pthread_cond_wait(event, event_lock); + + unlock(event_lock); + } + return NULL; +} + +static void * +checkerloop (void *ap) +{ + struct paths *allpaths; + struct path *pp; + int i; + int newstate; + char buff[1]; + char cmd[CMDSIZE]; + char checker_msg[MAX_CHECKER_MSG_SIZE]; + + 
mlockall(MCL_CURRENT | MCL_FUTURE); + + memset(checker_msg, 0, MAX_CHECKER_MSG_SIZE); + allpaths = (struct paths *)ap; + + log_safe(LOG_NOTICE, "path checkers start up"); + + while (1) { + lock(allpaths->lock); + log_safe(LOG_DEBUG, "checking paths"); + + vector_foreach_slot (allpaths->pathvec, pp, i) { + if (!pp->checkfn) { + pathinfo(pp, conf->hwtable, DI_SYSFS); + select_checkfn(pp); + } + + if (!pp->checkfn) { + log_safe(LOG_ERR, "%s: checkfn is void", + pp->dev); + continue; + } + newstate = pp->checkfn(pp->fd, checker_msg, + &pp->checker_context); + + if (newstate != pp->state) { + pp->state = newstate; + LOG_MSG(checker_msg, pp->dev_t); + + /* + * proactively fail path in the DM + */ + if (newstate == PATH_DOWN || + newstate == PATH_SHAKY) { + fail_path(pp); + continue; + } + + /* + * reconfigure map now + */ + if (safe_snprintf(cmd, CMDSIZE, "%s %s", + conf->multipath, pp->dev_t)) { + log_safe(LOG_ERR, "command too long," + " abord reconfigure"); + } else { + log_safe(LOG_DEBUG, "%s", cmd); + log_safe(LOG_INFO, + "reconfigure %s multipath", + pp->dev_t); + execute_program(cmd, buff, 1); + } + + /* + * tell waiterloop we have an event + */ + lock (event_lock); + pending_event++; + pthread_cond_signal(event); + unlock (event_lock); + } + pp->state = newstate; + } + unlock(allpaths->lock); + sleep(conf->checkint); + } + return NULL; +} + +static struct paths * +init_paths (void) +{ + struct paths *allpaths; + + allpaths = MALLOC(sizeof(struct paths)); + + if (!allpaths) + return NULL; + + allpaths->lock = + (pthread_mutex_t *)MALLOC(sizeof(pthread_mutex_t)); + + if (!allpaths->lock) + goto out; + + allpaths->pathvec = vector_alloc(); + + if (!allpaths->pathvec) + goto out1; + + pthread_mutex_init (allpaths->lock, NULL); + + return (allpaths); +out1: + FREE(allpaths->lock); +out: + FREE(allpaths); + return NULL; +} + +static int +init_event (void) +{ + event = (pthread_cond_t *)MALLOC(sizeof (pthread_cond_t)); + + if (!event) + return 1; + + pthread_cond_init 
(event, NULL); + event_lock = (pthread_mutex_t *) MALLOC (sizeof (pthread_mutex_t)); + + if (!event_lock) + goto out; + + pthread_mutex_init (event_lock, NULL); + + return 0; +out: + FREE(event); + return 1; +} +/* + * this logic is all about keeping callouts working in case of + * system disk outage (think system over SAN) + * this needs the clone syscall, so don't bother if not present + * (Debian Woody) + */ +#ifdef CLONE_NEWNS +static int +prepare_namespace(void) +{ + mode_t mode = S_IRWXU; + struct stat *buf; + char ramfs_args[64]; + int i; + int fd; + char * bin; + size_t size = 10; + struct stat statbuf; + + buf = MALLOC(sizeof(struct stat)); + + /* + * create a temp mount point for ramfs + */ + if (stat(CALLOUT_DIR, buf) < 0) { + if (mkdir(CALLOUT_DIR, mode) < 0) { + log_safe(LOG_ERR, "cannot create " CALLOUT_DIR); + return -1; + } + log_safe(LOG_DEBUG, "created " CALLOUT_DIR); + } + + /* + * compute the optimal ramdisk size + */ + vector_foreach_slot (conf->binvec, bin,i) { + if ((fd = open(bin, O_RDONLY)) < 0) { + log_safe(LOG_ERR, "cannot open %s", bin); + return -1; + } + if (fstat(fd, &statbuf) < 0) { + log_safe(LOG_ERR, "cannot stat %s", bin); + return -1; + } + size += statbuf.st_size; + close(fd); + } + log_safe(LOG_INFO, "ramfs maxsize is %u", (unsigned int) size); + + /* + * mount the ramfs + */ + if (safe_sprintf(ramfs_args, "maxsize=%u", (unsigned int) size)) { + fprintf(stderr, "ramfs_args too small\n"); + return -1; + } + if (mount(NULL, CALLOUT_DIR, "ramfs", MS_SYNCHRONOUS, ramfs_args) < 0) { + log_safe(LOG_ERR, "cannot mount ramfs on " CALLOUT_DIR); + return -1; + } + log_safe(LOG_DEBUG, "mount ramfs on " CALLOUT_DIR); + + /* + * populate the ramfs with callout binaries + */ + vector_foreach_slot (conf->binvec, bin,i) { + if (copytodir(bin, CALLOUT_DIR) < 0) { + log_safe(LOG_ERR, "cannot copy %s in ramfs", bin); + exit_daemon(1); + } + log_safe(LOG_DEBUG, "cp %s in ramfs", bin); + } + strvec_free(conf->binvec); + + /* + * bind the ramfs to : 
+ * /sbin : default home of multipath ... + * /bin : default home of scsi_id ... + * /tmp : home of scsi_id temp files + */ + if (mount(CALLOUT_DIR, "/sbin", NULL, MS_BIND, NULL) < 0) { + log_safe(LOG_ERR, "cannot bind ramfs on /sbin"); + return -1; + } + log_safe(LOG_DEBUG, "bind ramfs on /sbin"); + if (mount(CALLOUT_DIR, "/bin", NULL, MS_BIND, NULL) < 0) { + log_safe(LOG_ERR, "cannot bind ramfs on /bin"); + return -1; + } + log_safe(LOG_DEBUG, "bind ramfs on /bin"); + if (mount(CALLOUT_DIR, "/tmp", NULL, MS_BIND, NULL) < 0) { + log_safe(LOG_ERR, "cannot bind ramfs on /tmp"); + return -1; + } + log_safe(LOG_DEBUG, "bind ramfs on /tmp"); + + return 0; +} +#endif + +static void * +signal_set(int signo, void (*func) (int)) +{ + int r; + struct sigaction sig; + struct sigaction osig; + + sig.sa_handler = func; + sigemptyset(&sig.sa_mask); + sig.sa_flags = 0; + + r = sigaction(signo, &sig, &osig); + + if (r < 0) + return (SIG_ERR); + else + return (osig.sa_handler); +} + +static void +sighup (int sig) +{ + log_safe(LOG_NOTICE, "SIGHUP received"); + +#ifdef _DEBUG_ + dbg_free_final(NULL); +#endif +} + +static void +sigend (int sig) +{ + exit_daemon(0); +} + +static void +signal_init(void) +{ + signal_set(SIGHUP, sighup); + signal_set(SIGINT, sigend); + signal_set(SIGTERM, sigend); + signal_set(SIGKILL, sigend); +} + +static void +setscheduler (void) +{ + int res; + static struct sched_param sched_param = { + sched_priority: 99 + }; + + res = sched_setscheduler (0, SCHED_RR, &sched_param); + + if (res == -1) + log_safe(LOG_WARNING, "Could not set SCHED_RR at priority 99"); + return; +} + +static void +set_oom_adj (int val) +{ + FILE *fp; + + fp = fopen("/proc/self/oom_adj", "w"); + + if (!fp) + return; + + fprintf(fp, "%i", val); + fclose(fp); +} + +static int +child (void * param) +{ + pthread_t wait_thr, check_thr, uevent_thr; + pthread_attr_t attr; + struct paths * allpaths; + + mlockall(MCL_CURRENT | MCL_FUTURE); + + log_thread_start(); + log_safe(LOG_NOTICE, 
"--------start up--------"); + + if (pidfile_create(DEFAULT_PIDFILE, getpid())) { + log_thread_stop(); + exit(1); + } + signal_init(); + setscheduler(); + set_oom_adj(-17); + allpaths = init_paths(); + + if (!allpaths || init_event()) + exit(1); + + conf->checkint = CHECKINT; + + setlogmask(LOG_UPTO(conf->verbosity + 3)); + + condlog(2, "read " DEFAULT_CONFIGFILE); + init_data(DEFAULT_CONFIGFILE, init_keywords); + + /* + * fill the voids left in the config file + */ + if (conf->binvec == NULL) { + conf->binvec = vector_alloc(); + push_callout("/sbin/scsi_id"); + } + if (conf->multipath == NULL) { + conf->multipath = MULTIPATH; + push_callout(conf->multipath); + } + if (conf->hwtable == NULL) { + conf->hwtable = vector_alloc(); + setup_default_hwtable(conf->hwtable); + } + if (conf->blist == NULL) { + conf->blist = vector_alloc(); + setup_default_blist(conf->blist); + } + if (conf->default_selector == NULL) + conf->default_selector = set_default(DEFAULT_SELECTOR); + + if (conf->udev_dir == NULL) + conf->udev_dir = set_default(DEFAULT_UDEVDIR); + + if (conf->default_getuid == NULL) + conf->default_getuid = set_default(DEFAULT_GETUID); + + if (conf->default_features == NULL) + conf->default_features = set_default(DEFAULT_FEATURES); + + if (conf->default_hwhandler == NULL) + conf->default_hwhandler = set_default(DEFAULT_HWHANDLER); + + +#ifdef CLONE_NEWNS + if (prepare_namespace() < 0) { + log_safe(LOG_ERR, "cannot prepare namespace"); + exit_daemon(1); + } +#endif + + /* + * start threads + */ + pthread_attr_init(&attr); + pthread_attr_setstacksize(&attr, 64 * 1024); + + pthread_create(&wait_thr, &attr, waiterloop, allpaths); + pthread_create(&check_thr, &attr, checkerloop, allpaths); + pthread_create(&uevent_thr, &attr, ueventloop, allpaths); + pthread_join(wait_thr, NULL); + pthread_join(check_thr, NULL); + pthread_join(uevent_thr, NULL); + + return 0; +} + +int +main (int argc, char *argv[]) +{ + extern char *optarg; + extern int optind; + int arg; + int err; + 
void * child_stack; + + if (getuid() != 0) { + fprintf(stderr, "need to be root, exit"); + exit(1); + } + + /* make sure we don't lock any path */ + chdir("/"); + umask(umask(077) | 022); + + child_stack = (void *)malloc(CHILD_STACK_SIZE); + + if (!child_stack) + exit(1); + + conf = alloc_config(); + + if (!conf) + exit(1); + + conf->verbosity = 2; + + while ((arg = getopt(argc, argv, ":qdlFSi:v:p:")) != EOF ) { + switch(arg) { + case 'v': + if (sizeof(optarg) > sizeof(char *) || + !isdigit(optarg[0])) + exit(1); + + conf->verbosity = atoi(optarg); + break; + default: + ; + } + } + +#ifdef CLONE_NEWNS /* recent systems have clone() */ + +# if defined(__hppa__) || defined(__powerpc64__) + err = clone(child, child_stack, CLONE_NEWNS, NULL); +# elif defined(__ia64__) + err = clone2(child, child_stack, + CHILD_STACK_SIZE, CLONE_NEWNS, NULL, + NULL, NULL, NULL); +# else + err = clone(child, child_stack + CHILD_STACK_SIZE, CLONE_NEWNS, NULL); +# endif + if (err < 0) + exit (1); + + exit(0); +#else /* older system fallback to fork() */ + err = fork(); + + if (err < 0) + exit (1); + + return (child(child_stack)); +#endif + +} diff --git a/multipathd/main.h b/multipathd/main.h new file mode 100644 index 0000000..911c125 --- /dev/null +++ b/multipathd/main.h @@ -0,0 +1,8 @@ +#ifndef HWTABLE_H +#define HWTABLE_H + +#define DAEMON 1 +#define CHECKINT 5 +#define MULTIPATH "/sbin/multipath -v0 -S" + +#endif diff --git a/multipathd/multipathd.8 b/multipathd/multipathd.8 new file mode 100644 index 0000000..48b1b04 --- /dev/null +++ b/multipathd/multipathd.8 @@ -0,0 +1,22 @@ +.TH MULTIPATHD 8 "October 2004" "Linux Administrator's Manual" +.SH NAME +multipathd \- multipath daemon +.SH SYNOPSYS +.B multipathd + +This daemon is in charge of checking for failed paths. When this happens, +it will reconfigure the multipath map the path belongs to, so that this map +regain its maximum performance and redundancy. + +This daemon executes the external multipath config tool when events occur. 
+In turn, the multipath tool signals the multipathd daemon it is done with +devmap reconfiguration, so that it can refresh its failed path list. + +.SH "SEE ALSO" +.BR multipath (8) +.BR kpartx (8) +.BR hotplug (8) +.SH "AUTHORS" +This man page was assembled by Patrick Caulfield +for the Debian project. From documentation provided +by the multipath author Christophe Varoqui, <christophe.varoqui@free.fr> and others. diff --git a/multipathd/multipathd.init.debian b/multipathd/multipathd.init.debian new file mode 100644 index 0000000..f1e2de0 --- /dev/null +++ b/multipathd/multipathd.init.debian @@ -0,0 +1,35 @@ +#!/bin/sh + +PATH=/bin:/usr/bin:/sbin:/usr/sbin +DAEMON=/usr/bin/multipathd +PIDFILE=/var/run/multipathd.pid + +test -x $DAEMON || exit 0 + +case "$1" in + start) + echo -n "Starting multipath daemon: multipathd" + $DAEMON + echo "." + ;; + stop) + echo -n "Stopping multipath daemon: multipathd" + echo "." + if [ -f $PIDFILE ] + then + kill `cat $PIDFILE` + else + echo "multipathd not running: Nothing to stop..." + fi + ;; + force-reload|restart) + $0 stop + $0 start + ;; + *) + echo "Usage: /etc/init.d/multipathd {start|stop|restart|force-reload}" + exit 1 + ;; +esac + +exit 0 diff --git a/multipathd/multipathd.init.redhat b/multipathd/multipathd.init.redhat new file mode 100644 index 0000000..eee039a --- /dev/null +++ b/multipathd/multipathd.init.redhat @@ -0,0 +1,91 @@ +#!/bin/bash + +# +# /etc/rc.d/init.d/multipathd +# +# Starts the multipath daemon +# +# chkconfig: 2345 13 87 +# description: Manage device-mapper multipath devices +# processname: multipathd + +DAEMON=/sbin/multipathd +prog=`basename $DAEMON` +initdir=/etc/rc.d/init.d +lockdir=/var/lock/subsys +sysconfig=/etc/sysconfig + + +system=redhat + +if [ $system = redhat ]; then + # Source function library. + . $initdir/functions +fi + +test -x $DAEMON || exit 0 +test -r $sysconfig/$prog && . $sysconfig/$prog + +RETVAL=0 + +# +# See how we were called. 
+# + +start() { + echo -n $"Starting $prog daemon: " + daemon $DAEMON + RETVAL=$? + [ $RETVAL -eq 0 ] && touch $lockdir/$prog + echo +} + +stop() { + echo -n $"Stopping $prog daemon: " + killproc $DAEMON + RETVAL=$? + [ $RETVAL -eq 0 ] && rm -f $lockdir/$prog + echo +} + +restart() { + stop + start +} + +reload() { + echo -n "Reloading $prog: " + trap "" SIGHUP + killproc $DAEMON -HUP + RETVAL=$? + echo +} + +case "$1" in +start) + start + ;; +stop) + stop + ;; +reload) + reload + ;; +restart) + restart + ;; +condrestart) + if [ -f $lockdir/$prog ]; then + restart + fi + ;; +status) + status $prog + RETVAL=$? + ;; +*) + echo $"Usage: $0 {start|stop|status|restart|condrestart|reload}" + RETVAL=1 +esac + +exit $RETVAL diff --git a/multipathd/pidfile.c b/multipathd/pidfile.c new file mode 100644 index 0000000..e3fb896 --- /dev/null +++ b/multipathd/pidfile.c @@ -0,0 +1,67 @@ +#include <sys/types.h> /* for pid_t */ +#include <sys/stat.h> /* for open */ +#include <signal.h> /* for kill() */ +#include <errno.h> /* for ESHRC */ +#include <stdio.h> /* for f...() */ +#include <string.h> /* for memset() */ +#include <stdlib.h> /* for atoi() */ +#include <unistd.h> /* for unlink() */ +#include <fcntl.h> /* for fcntl() */ + +#include <debug.h> + +#include "pidfile.h" + +int pidfile_create(const char *pidFile, pid_t pid) +{ + char buf[20]; + struct flock lock; + int fd, value; + + if((fd = open(pidFile, O_WRONLY | O_CREAT, + (S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH))) < 0) { + condlog(0, "Cannot open pidfile [%s], error was [%s]", + pidFile, strerror(errno)); + return 1; + } + lock.l_type = F_WRLCK; + lock.l_start = 0; + lock.l_whence = SEEK_SET; + lock.l_len = 0; + + if (fcntl(fd, F_SETLK, &lock) < 0) { + if (errno != EACCES && errno != EAGAIN) + condlog(0, "Cannot lock pidfile [%s], error was [%s]", + pidFile, strerror(errno)); + else + condlog(0, "process is already running"); + goto fail; + } + if (ftruncate(fd, 0) < 0) { + condlog(0, "Cannot truncate pidfile [%s], error was 
[%s]", + pidFile, strerror(errno)); + goto fail; + } + memset(buf, 0, sizeof(buf)); + snprintf(buf, sizeof(buf)-1, "%u", pid); + if (write(fd, buf, strlen(buf)) != strlen(buf)) { + condlog(0, "Cannot write pid to pidfile [%s], error was [%s]", + pidFile, strerror(errno)); + goto fail; + } + if ((value = fcntl(fd, F_GETFD, 0)) < 0) { + condlog(0, "Cannot get close-on-exec flag from pidfile [%s], " + "error was [%s]", pidFile, strerror(errno)); + goto fail; + } + value |= FD_CLOEXEC; + if (fcntl(fd, F_SETFD, value) < 0) { + condlog(0, "Cannot set close-on-exec flag from pidfile [%s], " + "error was [%s]", pidFile, strerror(errno)); + goto fail; + } + return 0; +fail: + close(fd); + return 1; +} diff --git a/multipathd/pidfile.h b/multipathd/pidfile.h new file mode 100644 index 0000000..d308892 --- /dev/null +++ b/multipathd/pidfile.h @@ -0,0 +1 @@ +int pidfile_create(const char *pidFile, pid_t pid); diff --git a/path_priority/Makefile b/path_priority/Makefile new file mode 100644 index 0000000..a5c1abc --- /dev/null +++ b/path_priority/Makefile @@ -0,0 +1,27 @@ +# Makefile +# +# Copyright (C) 2003 Christophe Varoqui, <christophe.varoqui@free.fr> +# +DEBUG = 0 + +SUBDIRS = $(shell find . -type d -mindepth 1 -maxdepth 1|cut -c3-) + +all: + @for DIR in $(SUBDIRS); do \ + $(MAKE) -C $$DIR BUILD=$(BUILD) VERSION=$(VERSION); \ + done + +install: + @for DIR in $(SUBDIRS); do \ + $(MAKE) -C $$DIR install; \ + done + +uninstall: + @for DIR in $(SUBDIRS); do \ + $(MAKE) -C $$DIR uninstall; \ + done + +clean: + @for DIR in $(SUBDIRS); do \ + $(MAKE) -C $$DIR clean; \ + done diff --git a/path_priority/pp_alua/LICENSE b/path_priority/pp_alua/LICENSE new file mode 100644 index 0000000..9e31bbf --- /dev/null +++ b/path_priority/pp_alua/LICENSE @@ -0,0 +1,483 @@ + + GNU LIBRARY GENERAL PUBLIC LICENSE + Version 2, June 1991 + + Copyright (C) 1991 Free Software Foundation, Inc. 
+                    59 Temple Place - Suite 330, Boston, MA 02111-1307, USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+[This is the first released version of the library GPL. It is
+ numbered 2 because it goes with version 2 of the ordinary GPL.]
+
+                            Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it. By contrast, the GNU General Public
+Licenses are intended to guarantee your freedom to share and change
+free software--to make sure the software is free for all its users.
+
+  This license, the Library General Public License, applies to some
+specially designated Free Software Foundation software, and to any
+other libraries whose authors decide to use it. You can use it for
+your libraries, too.
+
+  When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if
+you distribute copies of the library, or if you modify it.
+
+  For example, if you distribute copies of the library, whether gratis
+or for a fee, you must give the recipients all the rights that we gave
+you. You must make sure that they, too, receive or can get the source
+code. If you link a program with the library, you must provide
+complete object files to the recipients so that they can relink them
+with the library, after making changes to the library and recompiling
+it. And you must show them these terms so they know their rights.
+
+  Our method of protecting your rights has two steps: (1) copyright
+the library, and (2) offer you this license which gives you legal
+permission to copy, distribute and/or modify the library.
+
+  Also, for each distributor's protection, we want to make certain
+that everyone understands that there is no warranty for this free
+library. If the library is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original
+version, so that any problems introduced by others will not reflect on
+the original authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents. We wish to avoid the danger that companies distributing free
+software will individually obtain patent licenses, thus in effect
+transforming the program into proprietary software. To prevent this,
+we have made it clear that any patent must be licensed for everyone's
+free use or not licensed at all.
+
+  Most GNU software, including some libraries, is covered by the ordinary
+GNU General Public License, which was designed for utility programs. This
+license, the GNU Library General Public License, applies to certain
+designated libraries. This license is quite different from the ordinary
+one; be sure to read it in full, and don't assume that anything in it is
+the same as in the ordinary license.
+
+  The reason we have a separate public license for some libraries is that
+they blur the distinction we usually make between modifying or adding to a
+program and simply using it. Linking a program with a library, without
+changing the library, is in some sense simply using the library, and is
+analogous to running a utility program or application program. However, in
+a textual and legal sense, the linked executable is a combined work, a
+derivative of the original library, and the ordinary General Public License
+treats it as such.
+
+  Because of this blurred distinction, using the ordinary General
+Public License for libraries did not effectively promote software
+sharing, because most developers did not use the libraries. We
+concluded that weaker conditions might promote sharing better.
+
+  However, unrestricted linking of non-free programs would deprive the
+users of those programs of all benefit from the free status of the
+libraries themselves. This Library General Public License is intended to
+permit developers of non-free programs to use free libraries, while
+preserving your freedom as a user of such programs to change the free
+libraries that are incorporated in them. (We have not seen how to achieve
+this as regards changes in header files, but we have achieved it as regards
+changes in the actual functions of the Library.) The hope is that this
+will lead to faster development of free libraries.
+
+  The precise terms and conditions for copying, distribution and
+modification follow. Pay close attention to the difference between a
+"work based on the library" and a "work that uses the library". The
+former contains code derived from the library, while the latter only
+works together with the library.
+
+  Note that it is possible for a library to be covered by the ordinary
+General Public License rather than by this special one.
+
+                  GNU LIBRARY GENERAL PUBLIC LICENSE
+   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+  0. This License Agreement applies to any software library which
+contains a notice placed by the copyright holder or other authorized
+party saying it may be distributed under the terms of this Library
+General Public License (also called "this License"). Each licensee is
+addressed as "you".
+
+  A "library" means a collection of software functions and/or data
+prepared so as to be conveniently linked with application programs
+(which use some of those functions and data) to form executables.
+
+  The "Library", below, refers to any such software library or work
+which has been distributed under these terms. A "work based on the
+Library" means either the Library or any derivative work under
+copyright law: that is to say, a work containing the Library or a
+portion of it, either verbatim or with modifications and/or translated
+straightforwardly into another language. (Hereinafter, translation is
+included without limitation in the term "modification".)
+
+  "Source code" for a work means the preferred form of the work for
+making modifications to it. For a library, complete source code means
+all the source code for all modules it contains, plus any associated
+interface definition files, plus the scripts used to control compilation
+and installation of the library.
+
+  Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope. The act of
+running a program using the Library is not restricted, and output from
+such a program is covered only if its contents constitute a work based
+on the Library (independent of the use of the Library in a tool for
+writing it). Whether that is true depends on what the Library does
+and what the program that uses the Library does.
+
+  1. You may copy and distribute verbatim copies of the Library's
+complete source code as you receive it, in any medium, provided that
+you conspicuously and appropriately publish on each copy an
+appropriate copyright notice and disclaimer of warranty; keep intact
+all the notices that refer to this License and to the absence of any
+warranty; and distribute a copy of this License along with the
+Library.
+
+  You may charge a fee for the physical act of transferring a copy,
+and you may at your option offer warranty protection in exchange for a
+fee.
+
+  2. You may modify your copy or copies of the Library or any portion
+of it, thus forming a work based on the Library, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+    a) The modified work must itself be a software library.
+
+    b) You must cause the files modified to carry prominent notices
+    stating that you changed the files and the date of any change.
+
+    c) You must cause the whole of the work to be licensed at no
+    charge to all third parties under the terms of this License.
+
+    d) If a facility in the modified Library refers to a function or a
+    table of data to be supplied by an application program that uses
+    the facility, other than as an argument passed when the facility
+    is invoked, then you must make a good faith effort to ensure that,
+    in the event an application does not supply such function or
+    table, the facility still operates, and performs whatever part of
+    its purpose remains meaningful.
+
+    (For example, a function in a library to compute square roots has
+    a purpose that is entirely well-defined independent of the
+    application. Therefore, Subsection 2d requires that any
+    application-supplied function or table used by this function must
+    be optional: if the application does not supply it, the square
+    root function must still compute square roots.)
+
+These requirements apply to the modified work as a whole. If
+identifiable sections of that work are not derived from the Library,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works. But when you
+distribute the same sections as part of a whole which is a work based
+on the Library, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote
+it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Library.
+
+In addition, mere aggregation of another work not based on the Library
+with the Library (or with a work based on the Library) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+  3. You may opt to apply the terms of the ordinary GNU General Public
+License instead of this License to a given copy of the Library. To do
+this, you must alter all the notices that refer to this License, so
+that they refer to the ordinary GNU General Public License, version 2,
+instead of to this License. (If a newer version than version 2 of the
+ordinary GNU General Public License has appeared, then you can specify
+that version instead if you wish.) Do not make any other change in
+these notices.
+
+  Once this change is made in a given copy, it is irreversible for
+that copy, so the ordinary GNU General Public License applies to all
+subsequent copies and derivative works made from that copy.
+
+  This option is useful when you wish to copy part of the code of
+the Library into a program that is not a library.
+
+  4. You may copy and distribute the Library (or a portion or
+derivative of it, under Section 2) in object code or executable form
+under the terms of Sections 1 and 2 above provided that you accompany
+it with the complete corresponding machine-readable source code, which
+must be distributed under the terms of Sections 1 and 2 above on a
+medium customarily used for software interchange.
+
+  If distribution of object code is made by offering access to copy
+from a designated place, then offering equivalent access to copy the
+source code from the same place satisfies the requirement to
+distribute the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+  5. A program that contains no derivative of any portion of the
+Library, but is designed to work with the Library by being compiled or
+linked with it, is called a "work that uses the Library". Such a
+work, in isolation, is not a derivative work of the Library, and
+therefore falls outside the scope of this License.
+
+  However, linking a "work that uses the Library" with the Library
+creates an executable that is a derivative of the Library (because it
+contains portions of the Library), rather than a "work that uses the
+library". The executable is therefore covered by this License.
+Section 6 states terms for distribution of such executables.
+
+  When a "work that uses the Library" uses material from a header file
+that is part of the Library, the object code for the work may be a
+derivative work of the Library even though the source code is not.
+Whether this is true is especially significant if the work can be
+linked without the Library, or if the work is itself a library. The
+threshold for this to be true is not precisely defined by law.
+
+  If such an object file uses only numerical parameters, data
+structure layouts and accessors, and small macros and small inline
+functions (ten lines or less in length), then the use of the object
+file is unrestricted, regardless of whether it is legally a derivative
+work. (Executables containing this object code plus portions of the
+Library will still fall under Section 6.)
+
+  Otherwise, if the work is a derivative of the Library, you may
+distribute the object code for the work under the terms of Section 6.
+Any executables containing that work also fall under Section 6,
+whether or not they are linked directly with the Library itself.
+
+  6. As an exception to the Sections above, you may also compile or
+link a "work that uses the Library" with the Library to produce a
+work containing portions of the Library, and distribute that work
+under terms of your choice, provided that the terms permit
+modification of the work for the customer's own use and reverse
+engineering for debugging such modifications.
+
+  You must give prominent notice with each copy of the work that the
+Library is used in it and that the Library and its use are covered by
+this License. You must supply a copy of this License. If the work
+during execution displays copyright notices, you must include the
+copyright notice for the Library among them, as well as a reference
+directing the user to the copy of this License. Also, you must do one
+of these things:
+
+    a) Accompany the work with the complete corresponding
+    machine-readable source code for the Library including whatever
+    changes were used in the work (which must be distributed under
+    Sections 1 and 2 above); and, if the work is an executable linked
+    with the Library, with the complete machine-readable "work that
+    uses the Library", as object code and/or source code, so that the
+    user can modify the Library and then relink to produce a modified
+    executable containing the modified Library. (It is understood
+    that the user who changes the contents of definitions files in the
+    Library will not necessarily be able to recompile the application
+    to use the modified definitions.)
+
+    b) Accompany the work with a written offer, valid for at
+    least three years, to give the same user the materials
+    specified in Subsection 6a, above, for a charge no more
+    than the cost of performing this distribution.
+
+    c) If distribution of the work is made by offering access to copy
+    from a designated place, offer equivalent access to copy the above
+    specified materials from the same place.
+
+    d) Verify that the user has already received a copy of these
+    materials or that you have already sent this user a copy.
+
+  For an executable, the required form of the "work that uses the
+Library" must include any data and utility programs needed for
+reproducing the executable from it. However, as a special exception,
+the source code distributed need not include anything that is normally
+distributed (in either source or binary form) with the major
+components (compiler, kernel, and so on) of the operating system on
+which the executable runs, unless that component itself accompanies
+the executable.
+
+  It may happen that this requirement contradicts the license
+restrictions of other proprietary libraries that do not normally
+accompany the operating system. Such a contradiction means you cannot
+use both them and the Library together in an executable that you
+distribute.
+
+  7. You may place library facilities that are a work based on the
+Library side-by-side in a single library together with other library
+facilities not covered by this License, and distribute such a combined
+library, provided that the separate distribution of the work based on
+the Library and of the other library facilities is otherwise
+permitted, and provided that you do these two things:
+
+    a) Accompany the combined library with a copy of the same work
+    based on the Library, uncombined with any other library
+    facilities. This must be distributed under the terms of the
+    Sections above.
+
+    b) Give prominent notice with the combined library of the fact
+    that part of it is a work based on the Library, and explaining
+    where to find the accompanying uncombined form of the same work.
+
+  8. You may not copy, modify, sublicense, link with, or distribute
+the Library except as expressly provided under this License. Any
+attempt otherwise to copy, modify, sublicense, link with, or
+distribute the Library is void, and will automatically terminate your
+rights under this License. However, parties who have received copies,
+or rights, from you under this License will not have their licenses
+terminated so long as such parties remain in full compliance.
+
+  9. You are not required to accept this License, since you have not
+signed it. However, nothing else grants you permission to modify or
+distribute the Library or its derivative works. These actions are
+prohibited by law if you do not accept this License. Therefore, by
+modifying or distributing the Library (or any work based on the
+Library), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Library or works based on it.
+
+  10. Each time you redistribute the Library (or any work based on the
+Library), the recipient automatically receives a license from the
+original licensor to copy, distribute, link with or modify the Library
+subject to these terms and conditions. You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+  11. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License. If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Library at all. For example, if a patent
+license would not permit royalty-free redistribution of the Library by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Library.
+
+If any portion of this section is held invalid or unenforceable under any
+particular circumstance, the balance of the section is intended to apply,
+and the section as a whole is intended to apply in other circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system which is
+implemented by public license practices. Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+  12. If the distribution and/or use of the Library is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Library under this License may add
+an explicit geographical distribution limitation excluding those countries,
+so that distribution is permitted only in or among countries not thus
+excluded. In such case, this License incorporates the limitation as if
+written in the body of this License.
+
+  13. The Free Software Foundation may publish revised and/or new
+versions of the Library General Public License from time to time.
+Such new versions will be similar in spirit to the present version,
+but may differ in detail to address new problems or concerns.
+
+Each version is given a distinguishing version number. If the Library
+specifies a version number of this License which applies to it and
+"any later version", you have the option of following the terms and
+conditions either of that version or of any later version published by
+the Free Software Foundation. If the Library does not specify a
+license version number, you may choose any version ever published by
+the Free Software Foundation.
+
+  14. If you wish to incorporate parts of the Library into other free
+programs whose distribution conditions are incompatible with these,
+write to the author to ask for permission. For software which is
+copyrighted by the Free Software Foundation, write to the Free
+Software Foundation; we sometimes make exceptions for this. Our
+decision will be guided by the two goals of preserving the free status
+of all derivatives of our free software and of promoting the sharing
+and reuse of software generally.
+
+                            NO WARRANTY
+
+  15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
+WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
+EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
+OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
+KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
+LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
+THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+
+  16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
+WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
+AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
+FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
+CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
+LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
+RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
+FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
+SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
+DAMAGES.
+
+                     END OF TERMS AND CONDITIONS
+
+     Appendix: How to Apply These Terms to Your New Libraries
+
+  If you develop a new library, and you want it to be of the greatest
+possible use to the public, we recommend making it free software that
+everyone can redistribute and change. You can do so by permitting
+redistribution under these terms (or, alternatively, under the terms of the
+ordinary General Public License).
+
+  To apply these terms, attach the following notices to the library. It is
+safest to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least the
+"copyright" line and a pointer to where the full notice is found.
+
+    <one line to give the library's name and a brief idea of what it does.>
+    Copyright (C) <year> <name of author>
+
+    This library is free software; you can redistribute it and/or
+    modify it under the terms of the GNU Library General Public
+    License as published by the Free Software Foundation; either
+    version 2 of the License, or (at your option) any later version.
+
+    This library is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+    Library General Public License for more details.
+
+    You should have received a copy of the GNU Library General Public
+    License along with this library; if not, write to the Free
+    Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
+    MA 02111-1307, USA
+
+Also add information on how to contact you by electronic and paper mail.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the library, if
+necessary. Here is a sample; alter the names:
+
+  Yoyodyne, Inc., hereby disclaims all copyright interest in the
+  library `Frob' (a library for tweaking knobs) written by James Random Hacker.
+
+  <signature of Ty Coon>, 1 April 1990
+  Ty Coon, President of Vice
+
+That's all there is to it!
diff --git a/path_priority/pp_alua/Makefile b/path_priority/pp_alua/Makefile
new file mode 100644
index 0000000..ce5af15
--- /dev/null
+++ b/path_priority/pp_alua/Makefile
@@ -0,0 +1,55 @@
+#==============================================================================
+# (C) Copyright IBM Corp. 2004, 2005 All Rights Reserved.
+#
+# Makefile
+#
+# Tool to make use of a SCSI-feature called Asymmetric Logical Unit Access.
+# It determines the ALUA state of a device and prints a priority value to
+# stdout.
+#
+# Author(s): Jan Kunigk
+#            S. Bader <shbader@de.ibm.com>
+#
+# This file is released under the GPL.
+#==============================================================================
+EXEC = pp_alua
+BUILD = glibc
+DEBUG = 0
+DEBUG_DUMPHEX = 0
+OBJS = main.o rtpg.o
+
+TOPDIR = ../..
+
+ifneq ($(shell ls $(TOPDIR)/Makefile.inc 2>/dev/null),)
+include $(TOPDIR)/Makefile.inc
+else
+# "out of tree building"
+STRIP = strip --strip-all -R .comment -R .note
+endif
+
+CFLAGS = -pipe -g -O2 -Wall -Wunused -Wstrict-prototypes -DDEBUG=$(DEBUG)
+
+all: $(BUILD)
+
+glibc: $(OBJS)
+	$(CC) -o $(EXEC) $(OBJS) $(LDFLAGS)
+	$(STRIP) $(EXEC)
+
+klibc: $(OBJS)
+	$(CC) -static -o $(EXEC) $(OBJS)
+	$(STRIP) $(EXEC)
+
+install: $(EXEC)
+	install -m 755 $(EXEC) $(DESTDIR)$(bindir)/$(EXEC)
+
+uninstall:
+	rm $(DESTDIR)$(bindir)/$(EXEC)
+clean:
+	rm -f *.o $(EXEC)
+
+%.o: %.c
+	$(CC) $(CFLAGS) -c -o $@ $<
+
+main.o: main.c rtpg.h spc3.h
+
+rtpg.o: rtpg.c rtpg.h spc3.h
diff --git a/path_priority/pp_alua/main.c b/path_priority/pp_alua/main.c
new file mode 100644
index 0000000..c243d70
--- /dev/null
+++ b/path_priority/pp_alua/main.c
@@ -0,0 +1,229 @@
+/*
+ * (C) Copyright IBM Corp. 2004, 2005 All Rights Reserved.
+ *
+ * main.c
+ *
+ * Tool to make use of a SCSI-feature called Asymmetric Logical Unit Access.
+ * It determines the ALUA state of a device and prints a priority value to
+ * stdout.
+ *
+ * Author(s): Jan Kunigk
+ *            S. Bader <shbader@de.ibm.com>
+ *
+ * This file is released under the GPL.
+ */
+#include <sys/types.h>
+#include <sys/stat.h>
+
+#include <unistd.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <stdio.h>
+
+#include "rtpg.h"
+
+#define ALUA_PRIO_SUCCESS 0
+#define ALUA_PRIO_INVALID_COMMANDLINE 1
+#define ALUA_PRIO_OPEN_FAILED 2
+#define ALUA_PRIO_NOT_SUPPORTED 3
+#define ALUA_PRIO_RTPG_FAILED 4
+#define ALUA_PRIO_GETAAS_FAILED 5
+
+#define ALUA_PRIO_MAJOR 0
+#define ALUA_PRIO_MINOR 4
+
+#define PRINT_ERROR(f, a...) \
+	if (verbose) \
+		fprintf(stderr, "ERROR: " f, ##a)
+#define PRINT_VERBOSE(f, a...) \
+	if (verbose) \
+		printf(f, ##a)
+
+char * devicename = NULL;
+int verbose = 0;
+
+char *basename(char *p)
+{
+	char *r;
+
+	for(r = p; *r != '\0'; r++);
+	for(; r > p && *(r - 1) != '/'; r--);
+
+	return r;
+}
+
+void
+print_help(char *command)
+{
+	printf("Usage: %s <options> <device> [<device> [...]]\n\n",
+		basename(command));
+	printf("Options are:\n");
+
+	printf("\t-v\n");
+	printf("\t\tTurn on verbose output.\n");
+
+	printf("\t-V\n");
+	printf("\t\tPrints the version number and exits.\n");
+}
+
+void
+print_version(char *command)
+{
+	printf("(C) Copyright IBM Corp. 2004, 2005 All Rights Reserved.\n");
+	printf("This is %s version %u.%u\n",
+		basename(command),
+		ALUA_PRIO_MAJOR,
+		ALUA_PRIO_MINOR
+	);
+}
+
+int
+open_block_device(char *name)
+{
+	int fd;
+	struct stat st;
+
+	if (stat(name, &st) != 0) {
+		PRINT_ERROR("Cannot get file status from %s (errno = %i)!\n",
+			name, errno);
+		return -ALUA_PRIO_OPEN_FAILED;
+	}
+	if (!S_ISBLK(st.st_mode)) {
+		PRINT_ERROR("%s is not a block device!\n", name);
+		return -ALUA_PRIO_OPEN_FAILED;
+	}
+	fd = open(name, O_RDONLY);
+	if (fd < 0) {
+		PRINT_ERROR("Couldn't open %s (errno = %i)!\n", name, errno);
+		return -ALUA_PRIO_OPEN_FAILED;
+	}
+	return fd;
+}
+
+int
+close_block_device(int fd)
+{
+	return close(fd);
+}
+
+int
+get_alua_info(int fd)
+{
+	char * aas_string[] = {
+		[AAS_OPTIMIZED] = "active/optimized",
+		[AAS_NON_OPTIMIZED] = "active/non-optimized",
+		[AAS_STANDBY] = "standby",
+		[AAS_UNAVAILABLE] = "unavailable",
+		[AAS_TRANSITIONING] = "transitioning between states",
+	};
+	int rc;
+	int tpg;
+
+	rc = get_target_port_group_support(fd);
+	if (rc < 0)
+		return rc;
+
+	if (verbose) {
+		printf("Target port groups are ");
+		switch(rc) {
+			case TPGS_NONE:
+				printf("not");
+				break;
+			case TPGS_IMPLICIT:
+				printf("implicitly");
+				break;
+			case TPGS_EXPLICIT:
+				printf("explicitly");
+				break;
+			case TPGS_BOTH:
+				printf("implicitly and explicitly");
+				break;
+		}
+		printf(" supported.\n");
+	}
+
+	if (rc == TPGS_NONE)
+		return -ALUA_PRIO_NOT_SUPPORTED;
+
+	tpg = get_target_port_group(fd);
+	if (tpg < 0) {
+		PRINT_ERROR("Couldn't get target port group!\n");
+		return -ALUA_PRIO_RTPG_FAILED;
+	}
+	PRINT_VERBOSE("Reported target port group is %i", tpg);
+
+	rc = get_asymmetric_access_state(fd, tpg);
+	if (rc < 0) {
+		PRINT_VERBOSE(" [get AAS failed]\n");
+		PRINT_ERROR("Couln't get asymmetric access state!\n");
+		return -ALUA_PRIO_GETAAS_FAILED;
+	}
+	PRINT_VERBOSE(" [%s]\n",
+		(aas_string[rc]) ? aas_string[rc] : "invalid/reserved"
+	);
+
+	return rc;
+}
+
+int
+main (int argc, char **argv)
+{
+	char * s_opts = "hvV";
+	int fd;
+	int rc;
+	int c;
+
+	while ((c = getopt(argc, argv, s_opts)) >= 0) {
+		switch(c) {
+			case 'h':
+				print_help(argv[0]);
+				return ALUA_PRIO_SUCCESS;
+			case 'V':
+				print_version(argv[0]);
+				return ALUA_PRIO_SUCCESS;
+			case 'v':
+				verbose = 1;
+				break;
+			case '?':
+			case ':':
+			default:
+				return ALUA_PRIO_INVALID_COMMANDLINE;
+		}
+	}
+
+	if (optind == argc) {
+		print_help(argv[0]);
+		printf("\n");
+		PRINT_ERROR("No device specified!\n");
+		return ALUA_PRIO_INVALID_COMMANDLINE;
+	}
+
+	rc = ALUA_PRIO_SUCCESS;
+	for(c = optind; c < argc && !rc; c++) {
+		fd = open_block_device(argv[c]);
+		if (fd < 0) {
+			return -fd;
+		}
+		rc = get_alua_info(fd);
+		if (rc >= 0) {
+			switch(rc) {
+				case AAS_OPTIMIZED:
+					rc = 50;
+					break;
+				case AAS_NON_OPTIMIZED:
+					rc = 10;
+					break;
+				case AAS_STANDBY:
+					rc = 1;
+					break;
+				default:
+					rc = 0;
+			}
+			printf("%u\n", rc);
+			rc = ALUA_PRIO_SUCCESS;
+		}
+		close_block_device(fd);
+	}
+
+	return -rc;
+}
diff --git a/path_priority/pp_alua/rtpg.c b/path_priority/pp_alua/rtpg.c
new file mode 100644
index 0000000..5eb1a1c
--- /dev/null
+++ b/path_priority/pp_alua/rtpg.c
@@ -0,0 +1,280 @@
+/*
+ * (C) Copyright IBM Corp. 2004, 2005 All Rights Reserved.
+ *
+ * rtpg.c
+ *
+ * Tool to make use of a SCSI-feature called Asymmetric Logical Unit Access.
+ * It determines the ALUA state of a device and prints a priority value to
+ * stdout.
+ *
+ * Author(s): Jan Kunigk
+ *            S. Bader <shbader@de.ibm.com>
+ *
+ * This file is released under the GPL.
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+#include <errno.h>
+
+#define __user
+#include <scsi/sg.h>
+
+#include "rtpg.h"
+
+#define SENSE_BUFF_LEN 32
+#define DEF_TIMEOUT 60000
+
+/*
+ * Macro used to print debug messaged.
+ */
+#if DEBUG > 0
+#define PRINT_DEBUG(f, a...) \
+		fprintf(stderr, "DEBUG: " f, ##a)
+#else
+#define PRINT_DEBUG(f, a...)
+#endif
+
+/*
+ * Optionally print the commands sent and the data received a hex dump.
+ */
+#if DEBUG > 0
+#if DEBUG_DUMPHEX > 0
+#define PRINT_HEX(p, l) print_hex(p, l)
+void
+print_hex(unsigned char *p, unsigned long len)
+{
+	int i;
+
+	for(i = 0; i < len; i++) {
+		if (i % 16 == 0)
+			printf("%04x: ", i);
+		printf("%02x%s", p[i], (((i + 1) % 16) == 0) ? "\n" : " ");
+	}
+	printf("\n");
+}
+#else
+#define PRINT_HEX(p, l)
+#endif
+#else
+#define PRINT_HEX(p, l)
+#endif
+
+/*
+ * Returns 0 if the SCSI command either was successful or if the an error was
+ * recovered, otherwise 1. (definitions taken from sg_err.h)
+ */
+#define SCSI_CHECK_CONDITION 0x2
+#define SCSI_COMMAND_TERMINATED 0x22
+#define SG_ERR_DRIVER_SENSE 0x08
+#define RECOVERED_ERROR 0x01
+
+static int
+scsi_error(struct sg_io_hdr *hdr)
+{
+	/* Treat SG_ERR here to get rid of sg_err.[ch] */
+	hdr->status &= 0x7e;
+
+	if (
+		(hdr->status == 0) &&
+		(hdr->host_status == 0) &&
+		(hdr->driver_status == 0)
+	) {
+		return 0;
+	}
+
+	if (
+		(hdr->status == SCSI_CHECK_CONDITION) ||
+		(hdr->status == SCSI_COMMAND_TERMINATED) ||
+		((hdr->driver_status & 0xf) == SG_ERR_DRIVER_SENSE)
+	) {
+		if (hdr->sbp && (hdr->sb_len_wr > 2)) {
+			int sense_key;
+			unsigned char * sense_buffer = hdr->sbp;
+
+			if (sense_buffer[0] & 0x2)
+				sense_key = sense_buffer[1] & 0xf;
+			else
+				sense_key = sense_buffer[2] & 0xf;
+
+			if (sense_key == RECOVERED_ERROR)
+				return 0;
+		}
+	}
+
+	return 1;
+}
+
+/*
+ * Helper function to setup and run a SCSI inquiry command.
+ */ +int +do_inquiry(int fd, int evpd, unsigned int codepage, void *resp, int resplen) +{ + struct inquiry_command cmd; + struct sg_io_hdr hdr; + unsigned char sense[SENSE_BUFF_LEN]; + + memset(&cmd, 0, sizeof(cmd)); + cmd.op = OPERATION_CODE_INQUIRY; + if (evpd) { + cmd.evpd = 1; + cmd.page = codepage; + } + set_uint16(cmd.length, resplen); + PRINT_HEX((unsigned char *) &cmd, sizeof(cmd)); + + memset(&hdr, 0, sizeof(hdr)); + hdr.interface_id = 'S'; + hdr.cmdp = (unsigned char *) &cmd; + hdr.cmd_len = sizeof(cmd); + hdr.dxfer_direction = SG_DXFER_FROM_DEV; + hdr.dxferp = resp; + hdr.dxfer_len = resplen; + hdr.sbp = sense; + hdr.mx_sb_len = sizeof(sense); + hdr.timeout = DEF_TIMEOUT; + + if (ioctl(fd, SG_IO, &hdr) < 0) { + PRINT_DEBUG("do_inquiry: IOCTL failed!\n"); + return -RTPG_INQUIRY_FAILED; + } + + if (scsi_error(&hdr)) { + PRINT_DEBUG("do_inquiry: SCSI error!\n"); + return -RTPG_INQUIRY_FAILED; + } + PRINT_HEX((unsigned char *) resp, resplen); + + return 0; +} + +/* + * This function returns the support for target port groups by evaluating the + * data returned by the standard inquiry command. 
+ */ +int +get_target_port_group_support(int fd) +{ + struct inquiry_data inq; + int rc; + + rc = do_inquiry(fd, 0, 0x00, &inq, sizeof(inq)); + if (!rc) { + rc = inq.tpgs; + } + + return rc; +} + +int +get_target_port_group(int fd) +{ + unsigned char buf[128]; + struct vpd83_data * vpd83; + struct vpd83_dscr * dscr; + int rc; + + rc = do_inquiry(fd, 1, 0x83, buf, sizeof(buf)); + if (!rc) { + vpd83 = (struct vpd83_data *) buf; + + rc = -RTPG_NO_TPG_IDENTIFIER; + FOR_EACH_VPD83_DSCR(vpd83, dscr) { + if ((((char *) dscr) - ((char *) vpd83)) > sizeof(buf)) + break; + + if (dscr->id_type == IDTYPE_TARGET_PORT_GROUP) { + struct vpd83_tpg_dscr * p; + + if (rc != -RTPG_NO_TPG_IDENTIFIER) { + PRINT_DEBUG("get_target_port_group: " + "more than one TPG identifier " + "found!\n"); + continue; + } + + p = (struct vpd83_tpg_dscr *) dscr->data; + rc = get_uint16(p->tpg); + } + } + if (rc == -RTPG_NO_TPG_IDENTIFIER) { + PRINT_DEBUG("get_target_port_group: " + "no TPG identifier found!\n"); + } + } + + return rc; +} + +int +do_rtpg(int fd, void* resp, long resplen) +{ + struct rtpg_command cmd; + struct sg_io_hdr hdr; + unsigned char sense[SENSE_BUFF_LEN]; + + memset(&cmd, 0, sizeof(cmd)); + cmd.op = OPERATION_CODE_RTPG; + cmd.service_action = SERVICE_ACTION_RTPG; + set_uint32(cmd.length, resplen); + PRINT_HEX((unsigned char *) &cmd, sizeof(cmd)); + + memset(&hdr, 0, sizeof(hdr)); + hdr.interface_id = 'S'; + hdr.cmdp = (unsigned char *) &cmd; + hdr.cmd_len = sizeof(cmd); + hdr.dxfer_direction = SG_DXFER_FROM_DEV; + hdr.dxferp = resp; + hdr.dxfer_len = resplen; + hdr.mx_sb_len = sizeof(sense); + hdr.sbp = sense; + hdr.timeout = DEF_TIMEOUT; + + if (ioctl(fd, SG_IO, &hdr) < 0) + return -RTPG_RTPG_FAILED; + + if (scsi_error(&hdr)) { + PRINT_DEBUG("do_rtpg: SCSI error!\n"); + return -RTPG_RTPG_FAILED; + } + PRINT_HEX(resp, resplen); + + return 0; +} + +int +get_asymmetric_access_state(int fd, unsigned int tpg) +{ + unsigned char buf[128]; + struct rtpg_data * tpgd; + struct 
rtpg_tpg_dscr * dscr; + int rc; + + rc = do_rtpg(fd, buf, sizeof(buf)); + if (rc < 0) + return rc; + + tpgd = (struct rtpg_data *) buf; + rc = -RTPG_TPG_NOT_FOUND; + RTPG_FOR_EACH_PORT_GROUP(tpgd, dscr) { + if (get_uint16(dscr->tpg) == tpg) { + if (rc != -RTPG_TPG_NOT_FOUND) { + PRINT_DEBUG("get_asymmetric_access_state: " + "more than one entry with same port " + "group.\n"); + } else { + PRINT_DEBUG("pref=%i\n", dscr->pref); + rc = dscr->aas; + } + } + } + + return rc; +} + diff --git a/path_priority/pp_alua/rtpg.h b/path_priority/pp_alua/rtpg.h new file mode 100644 index 0000000..3c5dcf1 --- /dev/null +++ b/path_priority/pp_alua/rtpg.h @@ -0,0 +1,30 @@ +/* + * (C) Copyright IBM Corp. 2004, 2005 All Rights Reserved. + * + * rtpg.h + * + * Tool to make use of a SCSI-feature called Asymmetric Logical Unit Access. + * It determines the ALUA state of a device and prints a priority value to + * stdout. + * + * Author(s): Jan Kunigk + * S. Bader <shbader@de.ibm.com> + * + * This file is released under the GPL. + */ +#ifndef __RTPG_H__ +#define __RTPG_H__ +#include "spc3.h" + +#define RTPG_SUCCESS 0 +#define RTPG_INQUIRY_FAILED 1 +#define RTPG_NO_TPG_IDENTIFIER 2 +#define RTPG_RTPG_FAILED 3 +#define RTPG_TPG_NOT_FOUND 4 + +int get_target_port_group_support(int fd); +int get_target_port_group(int fd); +int get_asymmetric_access_state(int fd, unsigned int tpg); + +#endif /* __RTPG_H__ */ + diff --git a/path_priority/pp_alua/spc3.h b/path_priority/pp_alua/spc3.h new file mode 100644 index 0000000..eac647d --- /dev/null +++ b/path_priority/pp_alua/spc3.h @@ -0,0 +1,305 @@ +/* + * (C) Copyright IBM Corp. 2004, 2005 All Rights Reserved. + * + * spc3.h + * + * Tool to make use of a SCSI-feature called Asymmetric Logical Unit Access. + * It determines the ALUA state of a device and prints a priority value to + * stdout. + * + * Author(s): Jan Kunigk + * S. Bader <shbader@de.ibm.com> + * + * This file is released under the GPL. 
+ */ +#ifndef __SPC3_H__ +#define __SPC3_H__ +/*============================================================================= + * Some helper functions for getting and setting 16 and 32 bit values. + *============================================================================= + */ +static inline unsigned short +get_uint16(unsigned char *p) +{ + return (p[0] << 8) + p[1]; +} + +static inline void +set_uint16(unsigned char *p, unsigned short v) +{ + p[0] = (v >> 8) & 0xff; + p[1] = v & 0xff; +} + +static inline unsigned int +get_uint32(unsigned char *p) +{ + return (p[0] << 24) + (p[1] << 16) + (p[2] << 8) + p[3]; +} + +static inline void +set_uint32(unsigned char *p, unsigned int v) +{ + p[0] = (v >> 24) & 0xff; + p[1] = (v >> 16) & 0xff; + p[2] = (v >> 8) & 0xff; + p[3] = v & 0xff; +} + +/*============================================================================= + * Definitions to support the standard inquiry command as defined in SPC-3. + * If the evpd (enable vital product data) bit is set the data that will be + * returned is selected by the page field. This field must be 0 if the evpd + * bit is not set. + *============================================================================= + */ +#define OPERATION_CODE_INQUIRY 0x12 + +struct inquiry_command { + unsigned char op; + unsigned char reserved1 : 6; + unsigned char obsolete1 : 1; + unsigned char evpd : 1; + unsigned char page; + unsigned char length[2]; + unsigned char control; +} __attribute__((packed)); + +/*----------------------------------------------------------------------------- + * Data returned by the standard inquiry command. + *----------------------------------------------------------------------------- + * + * Peripheral qualifier codes. + */ +#define PQ_CONNECTED 0x0 +#define PQ_DISCONNECTED 0x1 +#define PQ_UNSUPPORTED 0x3 + +/* Defined peripheral device types. 
*/ +#define PDT_DIRECT_ACCESS 0x00 +#define PDT_SEQUENTIAL_ACCESS 0x01 +#define PDT_PRINTER 0x02 +#define PDT_PROCESSOR 0x03 +#define PDT_WRITE_ONCE 0x04 +#define PDT_CD_DVD 0x05 +#define PDT_SCANNER 0x06 +#define PDT_OPTICAL_MEMORY 0x07 +#define PDT_MEDIUM_CHANGER 0x08 +#define PDT_COMMUNICATIONS 0x09 +#define PDT_STORAGE_ARRAY_CONTROLLER 0x0c +#define PDT_ENCLOSURE_SERVICES 0x0d +#define PDT_SIMPLIFIED_DIRECT_ACCESS 0x0e +#define PDT_OPTICAL_CARD_READER_WRITER 0x0f +#define PDT_BRIDGE_CONTROLLER 0x10 +#define PDT_OBJECT_BASED 0x11 +#define PDT_AUTOMATION_INTERFACE 0x12 +#define PDT_LUN 0x1e +#define PDT_UNKNOWN 0x1f + +/* Defined version codes. */ +#define VERSION_NONE 0x00 +#define VERSION_SPC 0x03 +#define VERSION_SPC2 0x04 +#define VERSION_SPC3 0x05 + +/* Defined TPGS field values. */ +#define TPGS_NONE 0x0 +#define TPGS_IMPLICIT 0x1 +#define TPGS_EXPLICIT 0x2 +#define TPGS_BOTH 0x3 + +struct inquiry_data { + unsigned char peripheral_qualifier : 3; + unsigned char peripheral_device_type : 5; + /* Removable Medium Bit (1 == removable) */ + unsigned char rmb : 1; + unsigned char reserved1 : 7; + unsigned char version; + unsigned char obsolete1 : 2; + /* Normal ACA Supported */ + unsigned char norm_aca : 1; + /* Hierarchical LUN assignment support */ + unsigned char hi_sup : 1; + /* If 2 then response data is as defined in SPC-3. */ + unsigned char response_data_format : 4; + unsigned char length; + /* Storage Controller Component Supported. */ + unsigned char sccs : 1; + /* Access Controls Coordinator. */ + unsigned char acc : 1; + /* Target Port Group Support */ + unsigned char tpgs : 2; + /* Third Party Copy support. */ + unsigned char tpc : 1; + unsigned char reserved2 : 2; + /* PROTECTion information supported. */ + unsigned char protect : 1; + /* Basic task management model supported (CmdQue must be 0). */ + unsigned char bque : 1; + /* ENClosure SERVices supported. */ + unsigned char encserv : 1; + unsigned char vs1 : 1; + /* MULTIPort support.
*/ + unsigned char multip : 1; + /* Medium CHaNGeR. */ + unsigned char mchngr : 1; + unsigned char obsolete2 : 2; + unsigned char addr16 : 1; + unsigned char obsolete3 : 2; + unsigned char wbus16 : 1; + unsigned char sync : 1; + /* LINKed commands supported. */ + unsigned char link : 1; + unsigned char obsolete4 : 1; + unsigned char cmdque : 1; + unsigned char vs2 : 1; + unsigned char vendor_identification[8]; + unsigned char product_identification[8]; + unsigned char product_revision[4]; + unsigned char vendor_specific[20]; + unsigned char reserved3 : 4; + unsigned char clocking : 2; + unsigned char qas : 1; + unsigned char ius : 1; + unsigned char reserved4; + unsigned char version_descriptor[8][2]; + unsigned char reserved5[22]; + unsigned char vendor_parameters[0]; +} __attribute__((packed)); + +/*----------------------------------------------------------------------------- + * Inquiry data returned when requesting vital product data page 0x83. + *----------------------------------------------------------------------------- + */ +#define CODESET_BINARY 0x1 +#define CODESET_ACSII 0x2 +#define CODESET_UTF8 0x3 + +#define ASSOCIATION_UNIT 0x0 +#define ASSOCIATION_PORT 0x1 +#define ASSOCIATION_DEVICE 0x2 + +#define IDTYPE_VENDOR_SPECIFIC 0x0 +#define IDTYPE_T10_VENDOR_ID 0x1 +#define IDTYPE_EUI64 0x2 +#define IDTYPE_NAA 0x3 +#define IDTYPE_RELATIVE_TPG_ID 0x4 +#define IDTYPE_TARGET_PORT_GROUP 0x5 +#define IDTYPE_LUN_GROUP 0x6 +#define IDTYPE_MD5_LUN_ID 0x7 +#define IDTYPE_SCSI_NAME_STRING 0x8 + +struct vpd83_tpg_dscr { + unsigned char reserved1[2]; + unsigned char tpg[2]; +} __attribute__((packed)); + +struct vpd83_dscr { + unsigned char protocol_id : 4; + unsigned char codeset : 4; + /* Set if the protocol_id field is valid. 
*/ + unsigned char piv : 1; + unsigned char reserved1 : 1; + unsigned char association : 2; + unsigned char id_type : 4; + unsigned char reserved2; + unsigned char length; /* size-4 */ + unsigned char data[0]; +} __attribute__((packed)); + +struct vpd83_data { + unsigned char peripheral_qualifier : 3; + unsigned char peripheral_device_type : 5; + unsigned char page_code; /* 0x83 */ + unsigned char length[2]; /* size-4 */ + struct vpd83_dscr data[0]; +} __attribute__((packed)); + +/*----------------------------------------------------------------------------- + * This macro should be used to walk through all identification descriptors + * defined in the code page 0x83. + * The argument p is a pointer to the code page 0x83 data and d is used to + * point to the current descriptor. + *----------------------------------------------------------------------------- + */ +#define FOR_EACH_VPD83_DSCR(p, d) \ + for( \ + d = p->data; \ + (((char *) d) - ((char *) p)) < \ + get_uint16(p->length); \ + d = (struct vpd83_dscr *) \ + ((char *) d + d->length + 4) \ + ) + +/*============================================================================= + * The following structures and macros are used to call the report target port + * groups command defined in SPC-3. + * This command is used to get information about the target port groups (which + * states are supported, which ports belong to this group, and so on) and the + * current state of each target port group.
+ *============================================================================= + */ +#define OPERATION_CODE_RTPG 0xa3 +#define SERVICE_ACTION_RTPG 0x0a + +struct rtpg_command { + unsigned char op; /* 0xa3 */ + unsigned char reserved1 : 3; + unsigned char service_action : 5; /* 0x0a */ + unsigned char reserved2[4]; + unsigned char length[4]; + unsigned char reserved3; + unsigned char control; +} __attribute__((packed)); + +struct rtpg_tp_dscr { + unsigned char obsolete1[2]; + /* The Relative Target Port Identifier of a target port. */ + unsigned char rtpi[2]; +} __attribute__((packed)); + +#define AAS_OPTIMIZED 0x0 +#define AAS_NON_OPTIMIZED 0x1 +#define AAS_STANDBY 0x2 +#define AAS_UNAVAILABLE 0x3 +#define AAS_TRANSITIONING 0xf + +#define TPG_STATUS_NONE 0x0 +#define TPG_STATUS_SET 0x1 +#define TPG_STATUS_IMPLICIT_CHANGE 0x2 + +struct rtpg_tpg_dscr { + unsigned char pref : 1; + unsigned char reserved1 : 3; + unsigned char aas : 4; + unsigned char reserved2 : 4; + unsigned char u_sup : 1; + unsigned char s_sup : 1; + unsigned char an_sup : 1; + unsigned char ao_sup : 1; + unsigned char tpg[2]; + unsigned char reserved3; + unsigned char status; + unsigned char vendor_unique; + unsigned char port_count; + struct rtpg_tp_dscr data[0]; +} __attribute__((packed)); + +struct rtpg_data { + unsigned char length[4]; /* size-4 */ + struct rtpg_tpg_dscr data[0]; +} __attribute__((packed)); + +#define RTPG_FOR_EACH_PORT_GROUP(p, g) \ + for( \ + g = &(p->data[0]); \ + (((char *) g) - ((char *) p)) < get_uint32(p->length); \ + g = (struct rtpg_tpg_dscr *) ( \ + ((char *) g) + \ + sizeof(struct rtpg_tpg_dscr) + \ + g->port_count * sizeof(struct rtpg_tp_dscr) \ + ) \ + ) + +#endif /* __SPC3_H__ */ + diff --git a/path_priority/pp_balance_units/Makefile b/path_priority/pp_balance_units/Makefile new file mode 100644 index 0000000..0100f79 --- /dev/null +++ b/path_priority/pp_balance_units/Makefile @@ -0,0 +1,47 @@ +# Makefile +# +# Copyright (C) 2003 Christophe Varoqui, 
<christophe.varoqui@free.fr> +# +BUILD = glibc +DEBUG = 0 + +TOPDIR = ../.. +include $(TOPDIR)/Makefile.inc + +ifeq ($(strip $(BUILD)),klibc) + CFLAGS = -I/usr/include -DDEBUG=$(DEBUG) + OBJS = pp_balance_units.o $(MULTIPATHLIB)-$(BUILD).a +else + CFLAGS = -pipe -g -Wall -Wunused -Wstrict-prototypes \ + -I$(multipathdir) -DDEBUG=$(DEBUG) + LDFLAGS = -ldevmapper + OBJS = pp_balance_units.o $(MULTIPATHLIB)-$(BUILD).a +endif + +EXEC = pp_balance_units + +all: $(BUILD) + +prepare: + rm -f core *.o *.gz + +glibc: prepare $(OBJS) + $(CC) -o $(EXEC) $(OBJS) $(LDFLAGS) + $(STRIP) $(EXEC) + +klibc: prepare $(OBJS) + $(CC) -static -o $(EXEC) $(CRT0) $(OBJS) $(KLIBC) $(LIBGCC) + $(STRIP) $(EXEC) + +$(MULTIPATHLIB)-$(BUILD).a: + make -C $(multipathdir) BUILD=$(BUILD) $(BUILD) + +install: + install -d $(DESTDIR)$(bindir) + install -m 755 $(EXEC) $(DESTDIR)$(bindir)/ + +uninstall: + rm $(DESTDIR)$(bindir)/$(EXEC) + +clean: + rm -f core *.o $(EXEC) *.gz diff --git a/path_priority/pp_balance_units/pp_balance_units.c b/path_priority/pp_balance_units/pp_balance_units.c new file mode 100644 index 0000000..307a959 --- /dev/null +++ b/path_priority/pp_balance_units/pp_balance_units.c @@ -0,0 +1,474 @@ +/* + * Christophe Varoqui (2004) + * This code is GPLv2, see license file + * + * This path prioritizer aims to balance logical units over all + * controlers available. 
The logic is : + * + * - list all paths in all primary path groups + * - for each path, get the controler's serial + * - compute the number of active paths attached to each controler + * - compute the max number of paths attached to the same controler + * - if sums are already balanced or if the path passed as parameter is + * attached to controler with less active paths, then return + * (max_path_attached_to_one_controler - number_of_paths_on_this_controler) + * - else, or if anything goes wrong, return 1 as a default prio + * + */ +#define __user + +#include <stdio.h> +#include <stdlib.h> +#include <libdevmapper.h> +#include <vector.h> +#include <memory.h> + +#include <string.h> +#include <fcntl.h> +#include <sys/ioctl.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <unistd.h> +#include <scsi/sg.h> + +#define SERIAL_SIZE 255 +#define WORD_SIZE 255 +#define PARAMS_SIZE 255 +#define FILE_NAME_SIZE 255 +#define INQUIRY_CMDLEN 6 +#define INQUIRY_CMD 0x12 +#define SENSE_BUFF_LEN 32 +#define DEF_TIMEOUT 60000 +#define RECOVERED_ERROR 0x01 +#define MX_ALLOC_LEN 255 +#define SCSI_CHECK_CONDITION 0x2 +#define SCSI_COMMAND_TERMINATED 0x22 +#define SG_ERR_DRIVER_SENSE 0x08 + +#if DEBUG +#define debug(format, arg...) fprintf(stderr, format "\n", ##arg) +#else +#define debug(format, arg...) do {} while(0) +#endif + +#define safe_sprintf(var, format, args...) \ + snprintf(var, sizeof(var), format, ##args) >= sizeof(var) +#define safe_snprintf(var, size, format, args...) 
\ + snprintf(var, size, format, ##args) >= size + +struct path { + char dev_t[WORD_SIZE]; + char serial[SERIAL_SIZE]; +}; + +struct controler { + char serial[SERIAL_SIZE]; + int path_count; +}; + +static int +exit_tool (int ret) +{ + printf("1\n"); + exit(ret); +} + +static int +opennode (char * devt, int mode) +{ + char devpath[FILE_NAME_SIZE]; + unsigned int major; + unsigned int minor; + int fd; + + sscanf(devt, "%u:%u", &major, &minor); + memset(devpath, 0, FILE_NAME_SIZE); + + if (safe_sprintf(devpath, "/tmp/.pp_balance.%u.%u.devnode", + major, minor)) { + fprintf(stderr, "devpath too small\n"); + return -1; + } + unlink (devpath); + mknod(devpath, S_IFBLK|S_IRUSR|S_IWUSR, makedev(major, minor)); + fd = open(devpath, mode); + + if (fd < 0) + unlink(devpath); + + return fd; + +} + +static void +closenode (char * devt, int fd) +{ + char devpath[FILE_NAME_SIZE]; + unsigned int major; + unsigned int minor; + + if (fd >= 0) + close(fd); + + sscanf(devt, "%u:%u", &major, &minor); + if (safe_sprintf(devpath, "/tmp/.pp_balance.%u.%u.devnode", + major, minor)) { + fprintf(stderr, "devpath too small\n"); + return; + } + unlink(devpath); +} + +static int +do_inq(int sg_fd, int cmddt, int evpd, unsigned int pg_op, + void *resp, int mx_resp_len, int noisy) +{ + unsigned char inqCmdBlk[INQUIRY_CMDLEN] = + { INQUIRY_CMD, 0, 0, 0, 0, 0 }; + unsigned char sense_b[SENSE_BUFF_LEN]; + struct sg_io_hdr io_hdr; + + if (cmddt) + inqCmdBlk[1] |= 2; + if (evpd) + inqCmdBlk[1] |= 1; + inqCmdBlk[2] = (unsigned char) pg_op; + inqCmdBlk[3] = (unsigned char)((mx_resp_len >> 8) & 0xff); + inqCmdBlk[4] = (unsigned char) (mx_resp_len & 0xff); + memset(&io_hdr, 0, sizeof (struct sg_io_hdr)); + io_hdr.interface_id = 'S'; + io_hdr.cmd_len = sizeof (inqCmdBlk); + io_hdr.mx_sb_len = sizeof (sense_b); + io_hdr.dxfer_direction = SG_DXFER_FROM_DEV; + io_hdr.dxfer_len = mx_resp_len; + io_hdr.dxferp = resp; + io_hdr.cmdp = inqCmdBlk; + io_hdr.sbp = sense_b; + io_hdr.timeout = DEF_TIMEOUT; + + if 
(ioctl(sg_fd, SG_IO, &io_hdr) < 0) + return -1; + + /* treat SG_ERR here to get rid of sg_err.[ch] */ + io_hdr.status &= 0x7e; + if ((0 == io_hdr.status) && (0 == io_hdr.host_status) && + (0 == io_hdr.driver_status)) + return 0; + if ((SCSI_CHECK_CONDITION == io_hdr.status) || + (SCSI_COMMAND_TERMINATED == io_hdr.status) || + (SG_ERR_DRIVER_SENSE == (0xf & io_hdr.driver_status))) { + if (io_hdr.sbp && (io_hdr.sb_len_wr > 2)) { + int sense_key; + unsigned char * sense_buffer = io_hdr.sbp; + if (sense_buffer[0] & 0x2) + sense_key = sense_buffer[1] & 0xf; + else + sense_key = sense_buffer[2] & 0xf; + if(RECOVERED_ERROR == sense_key) + return 0; + } + } + return -1; +} + +static int +get_serial (char * str, char * devt) +{ + int fd; + int len; + char buff[MX_ALLOC_LEN + 1]; + + fd = opennode(devt, O_RDONLY); + + if (fd < 0) + return 0; + + if (0 == do_inq(fd, 0, 1, 0x80, buff, MX_ALLOC_LEN, 0)) { + len = buff[3]; + if (len > 0) { + memcpy(str, buff + 4, len); + buff[len] = '\0'; + } + close(fd); + return 1; + } + + closenode(devt, fd); + return 0; +} + +static void * +get_params (void) +{ + struct dm_task *dmt, *dmt1; + struct dm_names *names = NULL; + unsigned next = 0; + void *nexttgt; + uint64_t start, length; + char *target_type = NULL; + char *params; + char *pp; + vector paramsvec = NULL; + + if (!(dmt = dm_task_create(DM_DEVICE_LIST))) + return NULL; + + if (!dm_task_run(dmt)) + goto out; + + if (!(names = dm_task_get_names(dmt))) + goto out; + + if (!names->dev) { + debug("no devmap found"); + goto out; + } + do { + /* + * keep only multipath maps + */ + names = (void *) names + next; + nexttgt = NULL; + debug("devmap %s :", names->name); + + if (!(dmt1 = dm_task_create(DM_DEVICE_TABLE))) + goto out; + + if (!dm_task_set_name(dmt1, names->name)) + goto out1; + + if (!dm_task_run(dmt1)) + goto out1; + + do { + nexttgt = dm_get_next_target(dmt1, nexttgt, + &start, + &length, + &target_type, + &params); + debug("\\_ %lu %lu %s", (unsigned long) start, + (unsigned
long) length, + target_type); + + if (!target_type) { + debug("unknown target type"); + goto out1; + } + + if (!strncmp(target_type, "multipath", 9)) { + if (!paramsvec) + paramsvec = vector_alloc(); + + pp = malloc(PARAMS_SIZE); + strncpy(pp, params, PARAMS_SIZE); + vector_alloc_slot(paramsvec); + vector_set_slot(paramsvec, pp); + } else + debug("skip non multipath target"); + } while (nexttgt); +out1: + dm_task_destroy(dmt1); + next = names->next; + } while (next); +out: + dm_task_destroy(dmt); + return paramsvec; +} + +static int +get_word (char *sentence, char *word) +{ + char *p; + int skip = 0; + + while (*sentence == ' ') { + sentence++; + skip++; + } + p = sentence; + + while (*p != ' ' && *p != '\0') + p++; + + skip += (p - sentence); + + if (p - sentence > WORD_SIZE) { + fprintf(stderr, "word too small\n"); + exit_tool(1); + } + strncpy(word, sentence, WORD_SIZE); + word += p - sentence; + *word = '\0'; + + if (*p == '\0') + return 0; + + return skip; +} + +static int +is_path (char * word) +{ + char *p; + + if (!word) + return 0; + + p = word; + + while (*p != '\0') { + if (*p == ':') + return 1; + p++; + } + return 0; +} + +static int +get_paths (vector pathvec) +{ + vector paramsvec = NULL; + char * str; + struct path * pp; + int i; + enum where {BEFOREPG, INPG, AFTERPG}; + int pos = BEFOREPG; + + if (!pathvec) + return 1; + + if (!(paramsvec = get_params())) + exit_tool(0); + + vector_foreach_slot (paramsvec, str, i) { + debug("params %s", str); + while (pos != AFTERPG) { + pp = zalloc(sizeof(struct path)); + str += get_word(str, pp->dev_t); + + if (!is_path(pp->dev_t)) { + debug("skip \"%s\"", pp->dev_t); + free(pp); + + if (pos == INPG) + pos = AFTERPG; + + continue; + } + if (pos == BEFOREPG) + pos = INPG; + + get_serial(pp->serial, pp->dev_t); + vector_alloc_slot(pathvec); + vector_set_slot(pathvec, pp); + debug("store %s [%s]", + pp->dev_t, pp->serial); + } + pos = BEFOREPG; + } + return 0; +} + +static void * +find_controler (vector controlers, 
char * serial) +{ + int i; + struct controler * cp; + + if (!controlers) + return NULL; + + vector_foreach_slot (controlers, cp, i) + if (!strncmp(cp->serial, serial, SERIAL_SIZE)) + return cp; + return NULL; +} + +static void +get_controlers (vector controlers, vector pathvec) +{ + int i; + struct path * pp; + struct controler * cp; + + if (!controlers) + return; + + vector_foreach_slot (pathvec, pp, i) { + if (!pp || !strlen(pp->serial)) + continue; + + cp = find_controler(controlers, pp->serial); + + if (!cp) { + cp = zalloc(sizeof(struct controler)); + vector_alloc_slot(controlers); + vector_set_slot(controlers, cp); + strncpy(cp->serial, pp->serial, SERIAL_SIZE); + } + cp->path_count++; + } +} + +static int +get_max_path_count (vector controlers) +{ + int i; + int max = 0; + struct controler * cp; + + if (!controlers) + return 0; + + vector_foreach_slot (controlers, cp, i) { + debug("controler %s : %i paths", cp->serial, cp->path_count); + if(cp->path_count > max) + max = cp->path_count; + } + debug("max_path_count = %i", max); + return max; +} + +int +main (int argc, char **argv) +{ + vector pathvec = NULL; + vector controlers = NULL; + struct path * ref_path = NULL; + struct controler * cp = NULL; + int max_path_count = 0; + + ref_path = zalloc(sizeof(struct path)); + + if (!ref_path) + exit_tool(1); + + if (argc != 2) + exit_tool(1); + + if (optind<argc) + strncpy(ref_path->dev_t, argv[optind], WORD_SIZE); + + get_serial(ref_path->serial, ref_path->dev_t); + + if (!ref_path->serial || !strlen(ref_path->serial)) + exit_tool(0); + + pathvec = vector_alloc(); + controlers = vector_alloc(); + + get_paths(pathvec); + get_controlers(controlers, pathvec); + max_path_count = get_max_path_count(controlers); + cp = find_controler(controlers, ref_path->serial); + + if (!cp) { + debug("no other active path on serial %s\n", + ref_path->serial); + exit_tool(0); + } + + printf("%i\n", max_path_count - cp->path_count + 1); + + return(0); +} diff --git 
a/path_priority/pp_emc/Makefile b/path_priority/pp_emc/Makefile new file mode 100644 index 0000000..d02a069 --- /dev/null +++ b/path_priority/pp_emc/Makefile @@ -0,0 +1,29 @@ +EXEC = pp_emc +BUILD = glibc +OBJS = pp_emc.o + +TOPDIR = ../.. +include $(TOPDIR)/Makefile.inc + +CFLAGS = -pipe -g -O2 -Wall -Wunused -Wstrict-prototypes + +all: $(BUILD) + +glibc: $(OBJS) + $(CC) -o $(EXEC) $(OBJS) $(LDFLAGS) + $(STRIP) $(EXEC) + +klibc: $(OBJS) + $(CC) -static -o $(EXEC) $(OBJS) + $(STRIP) $(EXEC) + +install: $(EXEC) + install -m 755 $(EXEC) $(DESTDIR)$(bindir)/$(EXEC) + +uninstall: + rm $(DESTDIR)$(bindir)/$(EXEC) +clean: + rm -f *.o $(EXEC) + +%.o: %.c + $(CC) $(CFLAGS) -c -o $@ $< diff --git a/path_priority/pp_emc/pp_emc.c b/path_priority/pp_emc/pp_emc.c new file mode 100644 index 0000000..dd58424 --- /dev/null +++ b/path_priority/pp_emc/pp_emc.c @@ -0,0 +1,97 @@ +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <unistd.h> +#include <fcntl.h> +#include <sys/ioctl.h> +#include <errno.h> + +#include "../../libmultipath/sg_include.h" + +#define INQUIRY_CMD 0x12 +#define INQUIRY_CMDLEN 6 + +int emc_clariion_prio(const char *dev) +{ + unsigned char sense_buffer[256]; + unsigned char sb[128]; + unsigned char inqCmdBlk[INQUIRY_CMDLEN] = {INQUIRY_CMD, 1, 0xC0, 0, + sizeof(sb), 0}; + struct sg_io_hdr io_hdr; + int ret = 0; + int fd; + + fd = open(dev, O_RDWR|O_NONBLOCK); + + if (fd <= 0) { + fprintf(stderr, "Opening the device failed.\n"); + goto out; + } + + memset(&io_hdr, 0, sizeof (struct sg_io_hdr)); + io_hdr.interface_id = 'S'; + io_hdr.cmd_len = sizeof (inqCmdBlk); + io_hdr.mx_sb_len = sizeof (sb); + io_hdr.dxfer_direction = SG_DXFER_FROM_DEV; + io_hdr.dxfer_len = sizeof (sense_buffer); + io_hdr.dxferp = sense_buffer; + io_hdr.cmdp = inqCmdBlk; + io_hdr.sbp = sb; + io_hdr.timeout = 60000; + io_hdr.pack_id = 0; + if (ioctl(fd, SG_IO, &io_hdr) < 0) { + fprintf(stderr, "sending query command failed\n"); + 
goto out; + } + if (io_hdr.info & SG_INFO_OK_MASK) { + fprintf(stderr, "query command indicates error"); + goto out; + } + + close(fd); + + if (/* Verify the code page - right page & revision */ + sense_buffer[1] != 0xc0 || sense_buffer[9] != 0x00) { + fprintf(stderr, "Path unit report page in unknown format"); + goto out; + } + + if ( /* Effective initiator type */ + sense_buffer[27] != 0x03 + /* Failover mode should be set to 1 */ + || (sense_buffer[28] & 0x07) != 0x04 + /* Arraycommpath should be set to 1 */ + || (sense_buffer[30] & 0x04) != 0x04) { + fprintf(stderr, "Path not correctly configured for failover"); + } + + if ( /* LUN operations should indicate normal operations */ + sense_buffer[48] != 0x00) { + fprintf(stderr, "Path not available for normal operations"); + } + + /* Is the default owner equal to this path? */ + /* Note this will switch to the default priority group, even if + * it is not the currently active one. */ + ret = (sense_buffer[5] == sense_buffer[8]) ? 1 : 0; + +out: + return(ret); +} + +int +main (int argc, char **argv) +{ + int prio; + if (argc != 2) { + fprintf(stderr, "Arguments wrong!\n"); + prio = 0; + } else + prio = emc_clariion_prio(argv[1]); + + printf("%d\n", prio); + exit(0); +} + diff --git a/path_priority/pp_random b/path_priority/pp_random new file mode 100755 index 0000000..ec535ad --- /dev/null +++ b/path_priority/pp_random @@ -0,0 +1,2 @@ +#!/bin/sh +echo $RANDOM |