author: Patrick McCarty <patrick.mccarty@linux.intel.com> 2013-02-08 13:26:27 -0800
committer: Patrick McCarty <patrick.mccarty@linux.intel.com> 2013-02-08 13:26:27 -0800
commit: 9bb81f8a90ecc8b70c955bff72ec59dd3d9e5ae7 (patch)
tree: 881eebfa461e4f8aa6b6f44b96ac0decd3bc887a /00QUICKSTART
Imported Upstream version 4.87 (upstream/4.87)
Diffstat (limited to '00QUICKSTART')
-rw-r--r-- 00QUICKSTART 1023
1 files changed, 1023 insertions, 0 deletions
diff --git a/00QUICKSTART b/00QUICKSTART
new file mode 100644
index 0000000..697734e
--- /dev/null
+++ b/00QUICKSTART
@@ -0,0 +1,1023 @@
+
+ A Quick Start for Lsof
+
+1. Introduction
+================
+
+ Agreed, the lsof man page is dense and lsof has a plethora of
+ options. There are examples, but the manual page format buries
+ them at the end. How does one get started with lsof?
+
+ This file is an attempt to answer that question. It plunges
+ immediately into examples of lsof use to solve problems that
+ involve looking at the open files of Unix processes.
+
+
+ Contents
+
+ 1. Introduction
+ 2. Finding Uses of a Specific Open File
+ 3. Finding Open Files Filling a File System
+ a. Finding an Unlinked Open File
+ 4. Finding Processes Blocking Umount
+ 5. Finding Listening Sockets
+ 6. Finding a Particular Network Connection
+ 7. Identifying a Netstat Connection
+ 8. Finding Files Open to a Named Command
+ 9. Deciphering the Remote Login Trail
+ a. The Fundamentals
+ b. The idrlogin.perl[5] Scripts
+ 10. Watching an Ftp or Rcp Transfer
+ 11. Listing Open NFS Files
+ 12. Listing Files Open by a Specific Login
+ a. Ignoring a Specific Login
+ 13. Listing Files Open to a Specific Process Group
+ 14. When Lsof Seems to Hang
+ a. Kernel lstat(), readlink(), and stat() Blockages
+ b. Problems with /dev or /devices
+ c. Host and Service Name Lookup Hangs
+ d. UID to Login Name Conversion Delays
+ 15. Output for Other Programs
+ 16. The Lsof Exit Code and Shell Scripts
+ 17. Strange messages in the NAME column
+
+ Options
+
+ A. Selection Options
+ B. Output Options
+ C. Precautionary Options
+ D. Miscellaneous Lsof Options
+
+
+2. Finding Uses of a Specific Open File
+========================================
+
+ Often you're interested in knowing who is using a specific file.
+ You know the path to it and you want lsof to tell you the processes
+ that have open references to it.
+
+ Simple -- execute lsof and give it the path name of the file of
+ interest -- e.g.,
+
+ $ lsof /etc/passwd
+
+ Caveat: this only works if lsof has permission to get the status
+ (via stat(2)) of the file at the named path. Unless the lsof
+ process has enough authority -- e.g., it is being run with a
+ real User ID (UID) of root -- this AIX example won't work:
+
+ $ lsof /etc/security/passwd
+ lsof: status error on /etc/security/passwd: Permission denied
+
+ Further caveat: this use of lsof will fail if the stat(2) kernel
+ syscall returns different file parameters -- particularly device
+ and inode numbers -- than lsof finds in kernel node structures.
+ This condition is rare and is usually documented in the 00FAQ
+ file of the lsof distribution.
+
+
+3. Finding Open Files Filling a File System
+============================================
+
+ Oh! Oh! /tmp is filling and ls doesn't show that any large files
+ are being created. Can lsof help?
+
+ Maybe. If there's a process that is writing to a file that has
+ been unlinked, lsof may be able to discover the process for you.
+ You ask it to list all open files on the file system where /tmp
+ is located.
+
+ Sometimes /tmp is a file system by itself. In that case,
+
+ $ lsof /tmp
+
+ is the appropriate command. If, however, /tmp is part of another
+ file system, typically /, then you may have to ask lsof to list
+ all files open on the containing file system and locate the
+ offending file and its process by inspection -- e.g.,
+
+ $ lsof / | more
+ or
+ $ lsof / | grep ...
+
+ Caveat: there must be a file open to a process for the lsof search to
+ succeed. Sometimes the kernel may cause a file reference to
+ persist, even where there's no file open to a process. (Can you
+ say kernel bug? Maybe.) In any event, lsof won't be able to
+ help in this case.
+
+ a. Finding an Unlinked Open File
+ =================================
+
+ A pesky variant of a file that is filling a file system is an
+ unlinked file to which some process is still writing. When a
+ process opens a file and then unlinks it, the file's resources
+ remain in use by the process, but the file's directory entries
+ are removed. Hence, even when you know the directory where the
+ file once resided, you can't detect it with ls.
+
+ This can be an administrative problem when the unlinked file is
+ large, and the process that holds it open continues to write to
+ it. Only when the process closes the file will its resources,
+ particularly disk space, be released.
+
+ Lsof can help you find unlinked files on local disks. It has an
+ option, +L, that will list the link counts of open files. That
+ helps because an unlinked file on a local disk has a zero link
+ count. Note: this is NOT true for NFS files, accessed from a
+ remote server.
+
+ You could use the option to list all files and look for a zero
+ link count in the NLINK column -- e.g.,
+
+ $ lsof +L
+ COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME
+ ...
+ less 25366 abe txt VREG 6,0 40960 1 76319 /usr/...
+ ...
+ > less 25366 abe 3r VREG 6,0 17360 0 98768 / (/dev/sd0a)
+
+ Better yet, you can specify an upper bound to the +L option, and
+ lsof will select only files that have a link count less than the
+ upper bound. For example:
+
+ $ lsof +L1
+ COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME
+ less 25366 abe 3r VREG 6,0 17360 0 98768 / (/dev/sd0a)
+
+ You can use lsof's -a (AND) option to narrow the link count search
+ to a particular file system. For example, to look for zero link
+ counts on the /home file system, use:
+
+ $ lsof -a +L1 /home
+
+ CAUTION: lsof can't always report link counts for all file types
+ -- e.g., it may not report them for FIFOs, pipes, or sockets.
+ Remember also that link counts for NFS files on an NFS client
+ host don't behave as do link counts for files on local disks.
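+
+ When several processes hold unlinked files, it can help to total
+ the space each one is holding. The following is a minimal sketch,
+ not part of the lsof distribution. It assumes lsof field output
+ (the -F option, see section 15), where ``p'' lines carry the PID
+ and ``s'' lines carry a file size. The helper name
+ sum_unlinked_by_pid and the /tmp file system in the usage comment
+ are illustrative assumptions.
+
```shell
# Hypothetical helper: read lsof field output on stdin and print
# "PID total_bytes" for each process that reported at least one size.
sum_unlinked_by_pid() {
    awk '
        /^p/ { pid = substr($0, 2) }           # p<PID> starts a process set
        /^s/ { total[pid] += substr($0, 2) }   # s<size> is a file size in bytes
        END  { for (p in total) printf "%s %.0f\n", p, total[p] }
    ' | sort -n
}

# Assumed usage; /tmp stands in for the file system of interest:
#   lsof -a +L1 -F ps /tmp | sum_unlinked_by_pid
```
+
+ Files for which lsof cannot report a size contribute nothing to
+ the totals, so treat the result as a lower bound.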
+
+
+4. Finding Processes Blocking Umount
+=====================================
+
+ When you need to unmount a file system with the umount command,
+ you may find the operation blocked by a process that has a file
+ open on the file system. Lsof may be able to help you find the
+ process. In response to:
+
+ $ lsof <file_system_name>
+
+ lsof will display all open files on the named file system. It
+ will also set its exit code zero when it finds some open files
+ and non-zero when it doesn't, making this type of lsof call
+ useful in shell scripts. (See section 16.)
+
+ Consult the output of the df command for file system names.
+
+ See the caveat in the preceding section about file references
+ that persist in the kernel without open file traces. That
+ situation may hamper lsof's ability to help with umount, too.
+
+
+5. Finding Listening Sockets
+=============================
+
+ Sooner or later you may wonder if someone has installed a network
+ server that you don't know about. Lsof can list for you all the
+ network socket files open on your machine with:
+
+ $ lsof -i
+
+ The -i option without further qualification lists all open Internet
+ socket files. You can add network names or addresses, protocol
+ names, and service names or port numbers to the -i option to
+ refine the search. (See the next section.)
+
+
+6. Finding a Particular Network Connection
+===========================================
+
+ When you know the source or destination of a network connection
+ whose open files and process you'd like to identify, the -i option
+ may help.
+
+ If, for example, you want to know what process has a connection
+ open to or from the Internet host named aaa.bbb.ccc, you can ask
+ lsof to search for it with:
+
+ $ lsof -i@aaa.bbb.ccc
+
+ If you're interested in a particular protocol -- TCP or UDP --
+ and a specific port number or service name, you can add those
+ discriminators to the -i information:
+
+ $ lsof -iTCP@aaa.bbb.ccc:ftp-data
+
+ If you're interested in a particular IP version -- IPv4 or IPv6
+ -- and your UNIX dialect supports both (it does if "IPv[46]"
+ appears in the lsof -h output), you can add the '4' or '6'
+ selector immediately after -i:
+
+ $ lsof -i4
+ $ lsof -i6
+
+
+7. Identifying a Netstat Connection
+====================================
+
+ How do I identify the process that has a network connection
+ described in netstat output? For example, if netstat says:
+
+ Proto Recv-Q Send-Q Local Address Foreign Address (state)
+ tcp 0 0 vic.1023 ipscgate.login ESTABLISHED
+
+ What process is connected to service name ``login'' on ipscgate?
+
+ Use lsof's -i option:
+
+ $ lsof -iTCP@ipscgate:login
+ COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
+ rlogin 25023 abe 3u inet 0x10144168 0t184 TCP lsof.itap.purdue.edu:1023->ipscgate.cc.purdue.edu:login
+ ...
+
+ There's another way. Notice the 0x10144168 in the DEVICE column
+ of the lsof output? That's the protocol control block (PCB)
+ address. Many netstat applications will display it when given
+ the -A option:
+
+ $ netstat -A
+ PCB Proto Recv-Q Send-Q Local Address Foreign Address (state)
+ 10144168 tcp 0 0 vic.1023 ipscgate.login ESTABLISHED
+ ...
+
+ Using the PCB address, lsof, and grep, you can find the process this
+ way, too:
+
+ $ lsof -i | grep 10144168
+ rlogin 25023 abe 3u inet 0x10144168 0t184 TCP lsof.itap.purdue.edu:1023->ipscgate.cc.purdue.edu:login
+ ...
+
+ If the file is a UNIX socket, netstat may reveal an address for it,
+ as in this Solaris 11 example:
+
+ $ netstat -a -f unix
+ Active UNIX domain sockets
+ Address Type Vnode Conn Local Addr Remote Addr
+ ffffff0084253b68 stream-ord 0000000 0000000
+
+ Piping the output of lsof's -U option to a grep on that address
+ yields:
+
+ $ lsof -U | grep ffffff0084253b68
+ squid 1638 nobody 12u unix 18,98 0t10 9437188 /devices/pseudo/tl@0:ticots->0xffffff0084253b68 stream-ord
+
+
+8. Finding Files Open to a Named Command
+=========================================
+
+ When you want to look at the files open to a particular command,
+ you can look up the PID of the process running the command and
+ use lsof's -p option to specify it.
+
+ $ lsof -p <PID>
+
+ However, there's a quicker way, using lsof's -c option, provided
+ you don't mind seeing output for every process running the named
+ command.
+
+ $ lsof -c <first_characters_of_command_name_that_interest_you>
+
+ The lsof -c option is useful when you want to see how many instances
+ of a given command are executing and what their open files are.
+ One useful example is for the sendmail command.
+
+ $ lsof -c sendmail
+
+
+9. Deciphering the Remote Login Trail
+======================================
+
+ If the network connection you're interested in tracing has been
+ initiated externally and is connected to an rlogind, sshd, or
+ telnetd process, asking lsof to identify that process might not
+ give a wholly satisfying answer. The report may be that the
+ connection exists, but to a process owned by root.
+
+ a. The Fundamentals
+ ====================
+
+ How do you get from there to the login name really using the
+ connection? You have to know a little about how real and pseudo
+ ttys are paired in your system, and then use several lsof probes
+ to identify the login.
+
+ This example comes from a Solaris 2.4 system, named klaatu.cc.
+ I've logged on to it via rlogin from lsof.itap. The first lsof
+ probe,
+
+ $ lsof -i@lsof.itap
+
+ yields (among other things):
+
+ COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
+ in.rlogin 7362 root 0u inet 0xfc0193b0 0t242 TCP klaatu.cc.purdue.edu:login->lsof.itap.purdue.edu:1023
+ ...
+
+ This confirms that a connection exists. A second lsof probe
+ shows:
+
+ $ lsof -p7362
+ COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
+ ...
+ in.rlogin 7362 root 0u inet 0xfc0193b0 0t242 TCP klaatu.cc.purdue.edu:login->lsof.itap.purdue.edu:1023
+ ...
+ in.rlogin 7362 root 3u VCHR 23, 0 0t66 52928 /devices/pseudo/clone@0:ptmx->pckt->ptm
+
+ 7362 is the Process ID (PID) of the in.rlogin process, discovered
+ in the first lsof probe. (I've abbreviated the output to simplify
+ the example.) Now comes a need to understand Solaris pseudo-ttys.
+ The key indicator is in the DEVICE column for FD 3, the major/minor
+ device number of 23,0. This translates to /dev/pts/0, so a third
+ lsof probe,
+
+ $ lsof /dev/pts/0
+ COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
+ ksh 7364 abe 0u VCHR 24, 0 0t2410 53410 /dev/pts/../../devices/pseudo/pts@0:0
+
+ shows in part that login abe has a ksh process on /dev/pts/0.
+ (The NAME that lsof shows is not /dev/pts/0 but the full expansion
+ of the symbolic link that lsof finds at /dev/pts/0.)
+
+ Here's a second example, done on an HP-UX 9.01 host named ghg.ecn.
+ Again, I've logged on to it from lsof.itap, so I start with:
+
+ $ lsof -i@lsof.itap
+ COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
+ rlogind 10214 root 0u inet 0x041d5f00 0t1536 TCP ghg.ecn.purdue.edu:login->lsof.itap.purdue.edu:1023
+ ...
+
+ Then,
+
+ $ lsof -p10214
+ COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
+ ...
+ rlogind 10214 root 0u inet 0x041d5f00 0t2005 TCP ghg.ecn.purdue.edu:login->lsof.itap.purdue.edu:1023
+ ...
+ rlogind 10214 root 3u VCHR 16,0x000030 0t2037 24642 /dev/ptym/ptys0
+
+ Here the key is the NAME /dev/ptym/ptys0. In HP-UX 9.01 tty and
+ pseudo tty devices are paired with the names like /dev/ptym/ptys0
+ and /dev/pty/ttys0, so the following lsof probe is the final step.
+
+ $ lsof /dev/pty/ttys0
+ COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
+ ksh 10215 abe 0u VCHR 17,0x000030 0t3399 22607 /dev/pty/ttys0
+ ...
+
+ Here's a third example for an AIX 4.1.4 system. I've used telnet
+ to connect to it from lsof.itap.purdue.edu. I start with:
+
+ $ lsof -i@lsof.itap.purdue.edu
+ COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
+ ...
+ telnetd 15616 root 0u inet 0x05a93400 0t5156 TCP cloud.cc.purdue.edu:telnet->lsof.itap.purdue.edu:3369
+
+ Then I look at the telnetd process:
+
+ $ lsof -p15616
+ COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
+ ...
+ telnetd 15616 root 0u inet 0x05a93400 0t5641 TCP cloud.cc.purdue.edu:telnet->lsof.itap.purdue.edu:3369
+ ...
+ telnetd 15616 root 3u VCHR 25, 0 0t5493 103 /dev/ptc/0
+
+ Here the key is /dev/ptc/0. In AIX it's paired with /dev/pts/0.
+ The last probe for that shows:
+
+ $ lsof /dev/pts/0
+ COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
+ ...
+ ksh 16642 abe 0u VCHR 26, 0 0t6461 360 /dev/pts/0
+
+ b. The idrlogin.perl[5] Scripts
+ ================================
+
+ There's another, perhaps easier way, to go about the job of
+ tracing a network connection. The lsof distribution contains
+ two Perl scripts, idrlogin.perl (Perl 4) and idrlogin.perl5
+ (Perl 5), that use lsof field output to display values for
+ shells that are parented by rlogind, sshd, or telnetd, or
+ connected directly to TCP sockets. The lsof test suite contains
+ a C library that can be adapted for use with C programs that
+ need to call lsof and process its field output.
+
+ The two Perl scripts use the lsof -R option; it causes the
+ paRent process ID (PPID) to be listed in the lsof output. The
+ scripts identify all shell processes -- e.g., ones whose command
+ names end in ``sh'' -- and determine if: 1) the ultimate ancestor
+ process before a PID greater than 2 (e.g., init's PID is 1) is
+ rlogind, sshd, or telnetd; or 2) the shell process has open
+ TCP socket files.
+
+ Here's an example of output from idrlogin.perl on a Solaris 2.4
+ system:
+
+ centurion: 1 = cd src/lsof4/scripts
+ centurion: 2 = ./idrlogin.perl
+ Login Shell PID Via PID TTY From
+ oboyle ksh 12640 in.telnetd 12638 pts/5 opal.cc.purdue.edu
+ icdtest ksh 15158 in.rlogind 15155 pts/6 localhost
+ sh csh 18207 in.rlogind 18205 pts/1 babylon5.cc.purdue.edu
+ root csh 18242 in.rlogind 18205 pts/1 babylon5.cc.purdue.edu
+ trouble ksh 19208 in.rlogind 18205 pts/1 babylon5.cc.purdue.edu
+ abe ksh 21334 in.rlogind 21332 pts/2 lsof.itap.purdue.edu
+
+ Each script assumes that its parent directory contains an
+ executable lsof. If you decide to use one of the scripts, you
+ may want to customize it for your local lsof and perl paths.
+
+ Note that processes executing as remote shells are also
+ identified.
+
+ Here's another example from a UnixWare 7.1.0 system.
+
+ tweeker: 1 = cd src/lsof4/scripts
+ tweeker: 9 = ./idrlogin.perl
+ Login Shell PID Via PID TTY From
+ abe ksh 9438 in.telnetd 9436 pts/3 lsof.itap.purdue.edu
+
+
+10. Watching an Ftp or Rcp Transfer
+===================================
+
+ Internet performance can be unpredictable, so occasionally you
+ want to know if a file transfer, being done by ftp or rcp, is
+ making any progress.
+
+ To use lsof for watching a file transfer, you need to know the
+ PID of the file transfer process. You can use ps to find that.
+ Then use lsof,
+
+ $ lsof -p<PID>
+
+ to examine the files open to the transfer process. Usually the
+ ftp files of interest are at file descriptors 9 and 10 or 10 and
+ 11; for rcp, 3 and 4. They describe the network socket file and
+ the local data file.
+
+ If you want to watch only those file descriptors as the file
+ transfer progresses, try these lsof forms (for ftp in the example):
+
+ $ lsof -p<PID> -ad9,10 -r
+ or
+ $ lsof -p<PID> -ad10,11 -r
+
+ Some options need explaining:
+
+ -p<PID> specifies that lsof is to restrict its attention
+ to the process whose ID is <PID>. You can specify
+ a set of PIDs by separating them with commas.
+
+ $ lsof -p 1234,5678,9012
+
+ -a specifies that lsof is to AND its tests together.
+ The two tests that are specified are tests on the
+ PID and tests on file descriptors (``d9,10'').
+
+ d9,10 specifies that lsof is to test only file descriptors
+ 9 and 10. Note that the `-' is absent, since ``-a''
+ is a unary option and can be followed immediately
+ by another lsof option.
+
+ -r tells lsof to list the requested open file information,
+ sleep for a default 15 seconds, then list the open
+ file information again. You can specify a different
+ time (in seconds) after -r and override the default.
+ Lsof issues a short line of equal signs between
+ each set of output to distinguish it.
+
+ For an rcp transfer, the above example becomes:
+
+ $ lsof -p<PID> -ad3,4 -r
+
+
+11. Listing Open NFS Files
+==========================
+
+ Lsof will list all files open on remote file systems, supported
+ by an NFS server. Just use:
+
+ $ lsof -N
+
+ Note, however, that when run on an NFS server, lsof will not list
+ files open to the server from one of its clients. That's because
+ lsof can only examine the processes running on the machine where
+ it is called -- i.e., on the NFS server.
+
+ If you run lsof on the NFS client, using the -N option, it will
+ list files open by processes on the client that are on remote
+ NFS file systems.
+
+
+12. Listing Files Open by a Specific Login
+==========================================
+
+ If you're interested in knowing what files the processes owned
+ by a particular login name have open, lsof can help.
+
+ $ lsof -u<login>
+ or
+ $ lsof -u<User ID number>
+
+ You can specify either the login name or the UID associated with
+ it. You can specify multiple login names and UID numbers, mixed
+ together, by separating them with commas.
+
+ $ lsof -u548,abe
+
+ On the subject of login names and UIDs, it's worth noting that
+ lsof can be told to report either. By default it reports login
+ names; the -l option switches reporting to UIDs. You might want
+ to use -l if login name lookup is slow for some reason.
+
+ a. Ignoring a Specific Login
+ =============================
+
+ The -u option can also be used to direct lsof to ignore a
+ specific login name or UID, or a list of them. Simply prefix
+ the login names or UIDs with a `^' character, as you might do
+ in a regular expression. The `^' prefix is useful, for example,
+ when you want to have lsof ignore the files open to system
+ processes, owned by the root (UID 0) login. Try:
+
+ $ lsof -u ^root
+ or
+ $ lsof -u ^0
+
+
+13. Listing Files Open to a Specific Process Group
+==================================================
+
+ There's a Unix collection of processes called a process group.
+ The name indicates that the processes of the group have a common
+ association and are grouped so that a signal sent to one (e.g.,
+ a keyboard kill stroke) is delivered to all.
+
+ For example, this pipeline causes Unix to create a two element
+ process group:
+
+ $ lsof | less
+
+ You can use lsof to look at the open files of all members of a
+ process group, if you know the process group ID number. Assuming
+ that it is 12717 for the above example, this lsof command:
+
+ $ lsof -g12717 -adcwd
+
+ would produce on a Solaris 8 system:
+
+ $ lsof -g12717 -adcwd
+ COMMAND PID PGID USER FD TYPE DEVICE SIZE/OFF NODE NAME
+ sshd 11369 12717 root cwd VDIR 0,2 189 1449175 /tmp (swap)
+ sshd 12717 12717 root cwd VDIR 136,0 1024 2 /
+
+ The ``-g12717'' option specifies the process group ID of interest;
+ the ``-adcwd'' option specifies that options are to be ANDed and
+ that lsof should limit file output to information about current
+ working directory (``cwd'') files.
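+
+ If you know only a PID, you can usually look up its process group
+ ID with ps first. This is a minimal sketch under the assumption
+ that your ps accepts the POSIX ``-o pgid='' and ``-p'' options.
+ The helper name lsof_group_cwd is hypothetical.
+
```shell
# Hypothetical helper: list the current working directories of every
# member of the process group that the given PID belongs to.
lsof_group_cwd() {
    # "-o pgid=" asks ps for the process group ID only, with no header;
    # tr strips the padding blanks that ps may add.
    pgid=$(ps -o pgid= -p "$1" | tr -d ' ')
    if [ -z "$pgid" ]
    then
        echo "no such process: $1" >&2
        return 1
    fi
    lsof -g"$pgid" -adcwd
}

# Assumed usage, with the PID from the example above:
#   lsof_group_cwd 11369
```
+
+ The lookup and the lsof call are kept in one function so that the
+ PGID is always taken from the same ps invocation that lsof uses.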
+
+
+14. When Lsof Seems to Hang
+===========================
+
+ On occasion when you run lsof it seems to hang and produce no
+ output. This may result from system conditions beyond the control
+ of lsof. Lsof has a number of options that may allow you to
+ bypass the blockage.
+
+ a. Kernel lstat(), readlink(), and stat() Blockages
+ ====================================================
+
+ Lsof uses the kernel (system) calls lstat(), readlink(), and
+ stat() to locate mounted file system information. When a file
+ system has been mounted from an NFS server and that server is
+ temporarily unavailable, the calls lsof uses may block in the
+ kernel.
+
+ Lsof will announce that it is being blocked with warning messages
+ (unless they have been suppressed by the lsof builder), but
+ only after a default waiting period of fifteen seconds has
+ expired for each file system whose server is unavailable. If
+ you have a number of such file systems, the total wait may be
+ unacceptably long.
+
+ You can do two things to shorten your suffering: 1) reduce the
+ wait time with the -S option; or 2) tell lsof to avoid the
+ kernel calls that might block by specifying the -b option.
+
+ $ lsof -S 5
+ or
+ $ lsof -b
+
+ Avoiding the kernel calls that might block may result in the
+ lack of some information that lsof needs to know about mounted
+ file systems. Thus, when you use -b, lsof warns that it might
+ lack important information.
+
+ The warnings that result from using -b (unless suppressed by
+ the lsof builder) can themselves be annoying. You can suppress
+ them by adding the -w option. (Of course, if you do, you won't
+ know what warning messages lsof might have issued.)
+
+ $ lsof -bw
+
+ Note: if the lsof builder suppressed warning message issuance,
+ you don't need to use -w to suppress them. You can tell what
+ the default state of message warning issuance is by looking at
+ the -h (help) output. If it says ``-w enable warnings'', then
+ warnings are disabled by default; if it says ``-w disable
+ warnings'', they are enabled by default.
+
+ b. Problems with /dev or /devices
+ ==================================
+
+ Lsof scans the /dev or /devices branch of your file system to
+ obtain information about your system's devices. (The scan isn't
+ necessary when a device cache file exists.)
+
+ Sometimes that scan can take a very long time, especially if
+ you have a large number of devices, and if your kernel is
+ relatively slow to process the stat() system call on device
+ nodes. You can't do anything about the stat() system call
+ speed.
+
+ However, you can make sure that lsof is allowed to use its
+ device cache file feature. When lsof can use a device cache
+ file, it retains information it gleans via the stat() calls
+ on /dev or /devices in a separate file for later, faster
+ access.
+
+ The device cache file feature is described in the lsof man
+ page. See the DEVICE CACHE FILE, LSOF PERMISSIONS THAT AFFECT
+ DEVICE CACHE FILE ACCESS, DEVICE CACHE FILE PATH FROM THE -D
+ OPTION, DEVICE CACHE PATH FROM AN ENVIRONMENT VARIABLE,
+ SYSTEM-WIDE DEVICE CACHE PATH, PERSONAL DEVICE CACHE PATH
+ (DEFAULT), and MODIFIED PERSONAL DEVICE CACHE PATH sections.
+
+ There is also a separate file in the lsof distribution, named
+ 00DCACHE, that describes the device cache file in detail,
+ including information about possible security problems.
+
+ One final observation: don't overlook the possibility that your
+ /dev or /devices tree might be damaged. See if
+
+ $ ls -R /dev
+ or
+ $ ls -R /devices
+
+ completes or hangs. If it hangs, then lsof will probably hang,
+ too, and you should try to discover why ls hangs.
+
+ c. Host and Service Name Lookup Hangs
+ ======================================
+
+ Lsof can hang up when it tries to convert an Internet dot-form
+ address to a host name, or a port number to a service name. Both
+ hangs are caused by the lookup functions of your system.
+
+ An independent check for both types of hangs can be made with
+ the netstat program. Run it without arguments. If it hangs,
+ then it is probably having lookup difficulties. When you run
+ it with -n it shouldn't hang and should report network and port
+ numbers instead of names.
+
+ Lsof has two options that serve the same purpose as netstat's
+ -n option. The lsof -n option tells it to avoid host name
+ lookups; and -P, service name lookups. Try those options when
+ you suspect lsof may be hanging because of lookup problems.
+
+ $ lsof -n
+ or
+ $ lsof -P
+ or
+ $ lsof -nP
+
+ d. UID to Login Name Conversion Delays
+ =======================================
+
+ By default lsof converts User IDentification (UID) numbers to
+ login names when it produces output. That conversion process
+ may sometimes hang because of system problems or interlocks.
+
+ You can tell lsof to skip the lookup with the -l option; it
+ will then report UIDs in the USER column.
+
+ $ lsof -l
+
+
+15. Output for Other Programs
+=============================
+
+ The -F option allows you to specify that lsof should describe
+ open files with a special form of output, called field output,
+ that can be parsed easily by a subsequent program. The lsof
+ distribution comes with sample AWK, Perl 4, and Perl 5 scripts
+ that post-process field output. The lsof test suite has a C
+ library that could be adapted for use by C programs that want to
+ process lsof field output from an in-bound pipe.
+
+ The lsof manual page describes field output in detail in its
+ OUTPUT FOR OTHER PROGRAMS section. A quick look at a sample
+ script in the scripts/ subdirectory of the lsof distribution will
+ also give you an idea how field output works.
+
+ The most important thing about field output is that it is relatively
+ homogeneous across Unix dialects. Thus, if you write a script
+ to post-process field output for AIX, it probably will work for
+ HP-UX, Solaris, and Ultrix as well.
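+
+ To give a feel for field output, here is a minimal sketch of a
+ post-processor. It is not one of the scripts shipped with lsof.
+ It assumes the field identifiers described in the man page: ``p''
+ for PID, ``c'' for command name, ``f'' for file descriptor, and
+ ``n'' for name, each on its own line. The helper name
+ fields_to_lines is hypothetical.
+
```shell
# Hypothetical helper: turn lsof field output (read from stdin) into
# one "PID COMMAND FD NAME" line per open file.
fields_to_lines() {
    awk '
        /^p/ { pid = substr($0, 2); cmd = "" }      # p<PID> starts a process set
        /^c/ { cmd = substr($0, 2) }                # c<command name>
        /^f/ { fd  = substr($0, 2) }                # f<descriptor> starts a file set
        /^n/ { print pid, cmd, fd, substr($0, 2) }  # n<name> completes one file
    '
}

# Assumed usage:
#   lsof -F pcfn -c sendmail | fields_to_lines
```
+
+ Because each line is identified by its first character, the
+ script does not depend on column positions, which is what makes
+ field output easier to parse than the default listing.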
+
+
+16. The Lsof Exit Code and Shell Scripts
+========================================
+
+ When lsof exits successfully it returns an exit code based on
+ the result of its search for specified files. (If no files were
+ specified, then the successful exit code is 0 (zero).)
+
+ If lsof was asked to search for specific files, including any
+ files on specified file systems, it returns an exit code of 0
+ (zero) if it found all the specified files and at least one file
+ on each specified file system. Otherwise it returns a 1 (one).
+
+ If lsof detects an error and makes an unsuccessful exit, it
+ returns an exit code of 1 (one).
+
+ You can use the exit code in a shell script to search for files
+ on a file system and take action based on the result -- e.g.,
+
+ #!/bin/sh
+ lsof <file_system_name> > /dev/null 2>&1
+ if test $? -eq 0
+ then
+ echo "<file_system_name> has some users."
+ else
+ echo "<file_system_name> may have no users."
+ fi
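+
+ Since lsof returns zero only when it finds at least one file on
+ each specified file system, a script that checks several file
+ systems may do better to test them one at a time. The following
+ sketch does that. The function name report_fs_users is
+ hypothetical.
+
```shell
# Hypothetical helper: report, one file system at a time, whether lsof
# found any open files on it.
report_fs_users() {
    for fs in "$@"
    do
        # lsof exits 0 when it finds open files, 1 otherwise (or on error).
        if lsof "$fs" > /dev/null 2>&1
        then
            echo "$fs has some users."
        else
            echo "$fs may have no users."
        fi
    done
}

# Assumed usage:
#   report_fs_users /home /tmp
```
+
+ As in the script above, a non-zero exit code can also mean that
+ lsof detected an error, hence the cautious ``may have no users''.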
+
+
+17. Strange messages in the NAME column
+=======================================
+
+ When lsof encounters problems analyzing a particular file, it may
+ put a message in the file's NAME column. Many of those messages
+ are explained in the 00FAQ file of the lsof distribution.
+
+ So consult 00FAQ first if you encounter a NAME column message you
+ don't understand. (00FAQ is a possible source of information
+ about other unfamiliar things in lsof output, too.)
+
+ If you can't find help in 00FAQ, you can use grep to look in the
+ lsof source files for the message -- e.g.,
+
+ $ cd .../lsof_4.76_src
+ $ grep "can't identify protocol" *.[ch]
+
+ The code associated with the message will usually make clear the
+ reason for the message.
+
+ If you have an lsof source tree that has been processed by the
+ lsof Configure script, you need grep only there. If, however,
+ your source tree hasn't been processed by Configure, you may
+ have to look in the top-level lsof source directory and in the
+ dialects sub-directory for the UNIX dialect you are using -- e.g.,
+
+ $ cd .../lsof_4.76_src
+ $ grep "can't identify protocol" *.[ch]
+ $ cd dialects/Linux
+ $ grep "can't identify protocol" *.[ch]
+
+ In rare cases you may have to look in the lsof library, too --
+ e.g.,
+
+ $ cd .../lsof_4.76_src
+ $ grep "can't identify protocol" *.[ch]
+ $ cd dialects/Linux
+ $ grep "can't identify protocol" *.[ch]
+ $ cd ../../lib
+ $ grep "can't identify protocol" *.[ch]
+
+
+Options
+=======
+
+ The following appendices describe the lsof options in detail.
+
+
+A. Selection Options
+====================
+
+ Lsof has a rich set of options for selecting the files to be
+ displayed. These include:
+
+ -a tells lsof to AND the set of selection options that
+ are specified. Normally lsof ORs them.
+
+ For example, if you specify the -p<PID> and -u<UID>
+ options, lsof will display all files for the
+ specified PID or for the specified UID.
+
+ By adding -a, you specify that the listed files
+ should be limited to PIDs owned by the specified
+ UIDs -- i.e., they match the PIDs *and* the UIDs.
+
+ $ lsof -p1234 -au 5678
+
+ -c specifies that lsof should list files belonging
+ to processes having the associated command name.
+
+ Hint: if you want to select files based on more than
+ one command name, use multiple -c<name> specifications.
+
+ $ lsof -clsof -cksh
+
+ -d tells lsof to select by the associated file descriptor
+ (FD) set. An FD set is a comma-separated list of
+ numbers and the names lsof normally displays in
+ its FD column: cwd, Lnn, ltx, <number>, etc. See
+ the OUTPUT section of the lsof man page for the
+ complete list of possible file descriptors. Example:
+
+ $ lsof -dcwd,0,1,2
+
+ -g tells lsof to select by the associated process
+ group ID (PGID) set. The PGID set is a comma-separated
+ list of PGID numbers. When -g is specified, it also
+ enables the display of PGID numbers.
+
+ Note: when -g isn't followed by a PGID set, it
+ simply selects the listing of PGID for all processes.
+ Examples:
+
+ $ lsof -g
+ $ lsof -g1234,5678
+
+ -i tells lsof to display Internet socket files. If no
+ protocol/address/port specification follows -i,
+ lsof lists all Internet socket files.
+
+ If a specification follows -i, lsof lists only the
+ socket files whose Internet addresses match the
+ specification.
+
+ Hint: multiple addresses may be specified with
+ multiple -i options. Examples:
+
+ $ lsof -iTCP
+ $ lsof -i@lsof.itap.purdue.edu:sendmail
+
+ -N selects the listing of files mounted on NFS devices.
+
+ -U selects the listing of socket files in the Unix
+ domain.
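+
+ The difference between the default OR and the -a AND can be
+ sketched with two small shell functions. The function names
+ and the LSOF variable are inventions of this sketch, not part
+ of lsof; setting LSOF=echo previews the arguments without
+ running lsof.
+

```shell
# LSOF names the lsof binary to run. It is a convenience of this sketch;
# lsof itself does not read it.
LSOF=${LSOF:-lsof}

# Default behavior: files open to process $1 OR to login/UID $2.
list_either() { "$LSOF" -p "$1" -u "$2"; }

# With -a: only files open to process $1 AND belonging to login/UID $2.
list_both() { "$LSOF" -a -p "$1" -u "$2"; }
```

+
+ For example, list_both "$$" "$(id -u)" should list only the
+ current shell's files, while list_either would also list the
+ files of every other process owned by the same UID.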
+
+
+B. Output Options
+==================
+
+ Lsof has these options to control its output format:
+
+ -F produce output that can be parsed by a subsequent
+ program.
+
+ -g print process group IDs (PGIDs).
+
+ -l list UID numbers instead of login names.
+
+ -n list network numbers instead of host names.
+
+ -o always list file offset.
+
+ -P list port numbers instead of port service names.
+
+ -s always list file size.
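+
+ Field output (-F) is the format meant for other programs. As
+ a minimal sketch, assuming the field identifiers documented in
+ the lsof man page (p for PID, c for command name, n for file
+ name), the following function turns field output into one line
+ per open file. The function name is hypothetical.
+

```shell
# Read lsof field output on stdin; print "PID COMMAND NAME" per file.
fields_to_lines() {
    awk '
        /^p/ { pid = substr($0, 2) }            # start of a process set
        /^c/ { cmd = substr($0, 2) }            # command name of that process
        /^n/ { print pid, cmd, substr($0, 2) }  # one name field per open file
    '
}
```

+
+ It might be used as: lsof -Fpcn | fields_to_lines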
+
+
+C. Precautionary Options
+=========================
+
+ Lsof uses system functions that can block or take a long time,
+ depending on the health of the Unix dialect supporting it. The
+ following options provide precautions against such problems:
+
+ -b directs lsof to avoid system functions -- e.g.,
+ lstat(2), readlink(2), stat(2) -- that might block
+ in the kernel. See the BLOCKS AND TIMEOUTS
+ section of the lsof man page.
+
+ You might want to use this option when you have
+ a mount from an NFS server that is not responding.
+
+ -C tells lsof to ignore the kernel's name cache. As
+ a precaution this option will have little effect on
+ lsof performance, but might be useful if the kernel's
+ name cache is scrambled. (I've never seen that
+ happen.)
+
+ -D might be used to direct lsof to ignore an existing
+ device cache file and generate a new one from /dev
+ (and /devices). This might be useful if you have
+ doubts about the integrity of an existing device
+ cache file.
+
+ -l tells lsof to list UID numbers instead of login
+ names -- this is useful when UID to login name
+ conversion is slow or inoperative.
+
+ -n tells lsof to avoid converting Internet addresses
+ to host names. This might be useful when your
+ host name lookup (e.g., DNS) is inoperative.
+
+ -O tells lsof to avoid its strategy of forking to
+ perform potentially blocking kernel operations.
+ While the forking allows lsof to detect that a
+ block has occurred (and possibly break it), the
+ fork operation is a costly one. Use the -O option
+ with care, lest your lsof be blocked.
+
+ -P directs lsof to list port numbers instead of trying
+ to convert them to port service names. This might
+ be useful if port to service name lookups (e.g.,
+ via NIS) are slow or failing.
+
+ -S can be used to change the lstat/readlink/stat
+ timeout interval that governs how long lsof waits
+ for response from the kernel. This might be useful
+ when an NFS server is slow or unresponsive. When
+ lsof times out of a kernel function, it may have
+ less information to display. Example:
+
+ $ lsof -S2
+
+ -w tells lsof to suppress warning messages if they
+ are enabled by default, or to enable them if they
+ are disabled by default. Check the -h (help) output
+ to determine their status: ``-w enable warnings''
+ means warning messages are disabled by default;
+ ``-w disable warnings'' means they are enabled by
+ default.
+
+ This may be a useful option, for example, when you
+ specify -b, if warning messages are enabled, because
+ it will suppress the warning messages lsof issues
+ about avoiding functions that might block in the
+ kernel.
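+
+ Several of these options combine naturally when a host is
+ unhealthy. A minimal sketch, using a hypothetical wrapper name
+ and an LSOF variable invented for this example: -b avoids the
+ kernel functions that might block, and -l, -n, and -P skip the
+ UID, host name, and port service name lookups.
+

```shell
LSOF=${LSOF:-lsof}   # binary to run; a convenience of this sketch only

# Run lsof with blocking kernel calls and all name lookups avoided.
# Any further options or file names are passed through unchanged.
cautious_lsof() { "$LSOF" -b -l -n -P "$@"; }
```

+
+ For example, cautious_lsof -i lists Internet socket files. If
+ the warnings caused by -b clutter the output, -w may be added
+ after checking the -h output as described above.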
+
+
+D. Miscellaneous Lsof Options
+==============================
+
+ There are some lsof options that are hard to classify, including:
+
+ -? these options select help output.
+ -h
+
+ -F selects field output. Field output is a mode where
+ lsof produces output that can be parsed easily by
+ subsequent programs -- e.g., AWK or Perl scripts.
+ See ``15. Output for Other Programs'' for more
+ information.
+
+ -k specifies an alternate kernel symbol file -- i.e.,
+ where nlist() will get its information. Example:
+
+ $ lsof -k/usr/crash/vmunix.1
+
+ -m specifies an alternate kernel memory file from
+ which lsof will read kernel structures in place
+ of /dev/kmem or kvm_read(). Example:
+
+ $ lsof -m/usr/crash/vmcore.n
+
+ -r tells lsof to repeat its scan every 15 seconds (the
+ default when no associated value is specified). A
+ repeat time, different from the default, can follow
+ -r. Example:
+
+ $ lsof -r30
+
+ -v displays information about the building of the
+ lsof executable.
+
+ -- The double minus sign option may be used to
+ signal the end of options. It's particularly useful
+ when arguments to the last option are optional and
+ you want to supply a file path that could be confused
+ for arguments to the last option. Example:
+
+ $ lsof -g -- 1
+
+ Where `1' is a file path, not PGID 1.
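+
+ Repeat mode (-r) pairs well with field output (-F). According
+ to the lsof man page, when both are in effect lsof ends each
+ cycle with a line consisting of the marker `m'. Assuming that
+ behavior, this sketch counts the files reported in each cycle;
+ the function name is hypothetical.
+

```shell
# Read repeated lsof field output on stdin; print a file count per cycle.
files_per_cycle() {
    awk '
        /^f/ { n++ }    # every file set starts with an f (descriptor) field
        /^m/ { printf "cycle %d: %d open files\n", ++cycle, n; n = 0 }
    '
}
```

+
+ It might be used as: lsof -r30 -Ff -cksh | files_per_cycle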
+
+
+Vic Abell <abe@purdue.edu>
+January 18, 2010