Diffstat (limited to '00QUICKSTART')
-rw-r--r-- | 00QUICKSTART | 1023 |
1 files changed, 1023 insertions, 0 deletions
diff --git a/00QUICKSTART b/00QUICKSTART new file mode 100644 index 0000000..697734e --- /dev/null +++ b/00QUICKSTART @@ -0,0 +1,1023 @@ + + A Quick Start for Lsof + +1. Introduction +================ + + Agreed, the lsof man page is dense and lsof has a plethora of + options. There are examples, but the manual page format buries + them at the end. How does one get started with lsof? + + This file is an attempt to answer that question. It plunges + immediately into examples of lsof use to solve problems that + involve looking at the open files of Unix processes. + + + Contents + + 1. Introduction + 2. Finding Uses of a Specific Open File + 3. Finding Open Files Filling a File System + a. Finding an Unlinked Open File + 4. Finding Processes Blocking Umount + 5. Finding Listening Sockets + 6. Finding a Particular Network Connection + 7. Identifying a Netstat Connection + 8. Finding Files Open to a Named Command + 9. Deciphering the Remote Login Trail + a. The Fundamentals + b. The idrlogin.perl[5] Scripts + 10. Watching an Ftp or Rcp Transfer + 11. Listing Open NFS Files + 12. Listing Files Open by a Specific Login + a. Ignoring a Specific Login + 13. Listing Files Open to a Specific Process Group + 14. When Lsof Seems to Hang + a. Kernel lstat(), readlink(), and stat() Blockages + b. Problems with /dev or /devices + c. Host and Service Name Lookup Hangs + d. UID to Login Name Conversion Delays + 15. Output for Other Programs + 16. The Lsof Exit Code and Shell Scripts + 17. Strange messages in the NAME column + + Options + + A. Selection Options + B. Output Options + C. Precautionary Options + D. Miscellaneous Lsof Options + + +2. Finding Uses of a Specific Open File +======================================== + + Often you're interested in knowing who is using a specific file. + You know the path to it and you want lsof to tell you the processes + that have open references to it. 
+ + Simple -- execute lsof and give it the path name of the file of + interest -- e.g., + + $ lsof /etc/passwd + + Caveat: this only works if lsof has permission to get the status + (via stat(2)) of the file at the named path. Unless the lsof + process has enough authority -- e.g., it is being run with a + real User ID (UID) of root -- this AIX example won't work: + + $ lsof /etc/security/passwd + lsof: status error on /etc/security/passwd: Permission denied + + Further caveat: this use of lsof will fail if the stat(2) kernel + syscall returns different file parameters -- particularly device + and inode numbers -- than lsof finds in kernel node structures. + This condition is rare and is usually documented in the 00FAQ + file of the lsof distribution. + + +3. Finding Open Files Filling a File System +============================================ + + Oh! Oh! /tmp is filling and ls doesn't show that any large files + are being created. Can lsof help? + + Maybe. If there's a process that is writing to a file that has + been unlinked, lsof may be able to discover the process for you. + You ask it to list all open files on the file system where /tmp + is located. + + Sometimes /tmp is a file system by itself. In that case, + + $ lsof /tmp + + is the appropriate command. If, however, /tmp is part of another + file system, typically /, then you may have to ask lsof to list + all files open on the containing file system and locate the + offending file and its process by inspection -- e.g., + + $ lsof / | more + or + $ lsof / | grep ... + + Caveat: there must be a file open to a process for the lsof search + to succeed. Sometimes the kernel may cause a file reference to + persist, even where there's no file open to a process. (Can you + say kernel bug? Maybe.) In any event, lsof won't be able to + help in this case. + + a.
Finding an Unlinked Open File + ================================= + + A pesky variant of a file that is filling a file system is an + unlinked file to which some process is still writing. When a + process opens a file and then unlinks it, the file's resources + remain in use by the process, but the file's directory entries + are removed. Hence, even when you know the directory where the + file once resided, you can't detect it with ls. + + This can be an administrative problem when the unlinked file is + large, and the process that holds it open continues to write to + it. Only when the process closes the file will its resources, + particularly disk space, be released. + + Lsof can help you find unlinked files on local disks. It has an + option, +L, that will list the link counts of open files. That + helps because an unlinked file on a local disk has a zero link + count. Note: this is NOT true for NFS files, accessed from a + remote server. + + You could use the option to list all files and look for a zero + link count in the NLINK column -- e.g., + + $lsof +L + COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME + ... + less 25366 abe txt VREG 6,0 40960 1 76319 /usr/... + ... + > less 25366 abe 3r VREG 6,0 17360 0 98768 / (/dev/sd0a) + + Better yet, you can specify an upper bound to the +L option, and + lsof will select only files that have a link count less than the + upper bound. For example: + + $ lsof +L1 + COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME + less 25366 abe 3r VREG 6,0 17360 0 98768 / (/dev/sd0a) + + You can use lsof's -a (AND) option to narrow the link count search + to a particular file system. For example, to look for zero link + counts on the /home file system, use: + + $ lsof -a +L1 /home + + CAUTION: lsof can't always report link counts for all file types + -- e.g., it may not report them for FIFOs, pipes, or sockets. 
+ Remember also that link counts for NFS files on an NFS client + host don't behave as do link counts for files on local disks. + + +4. Finding Processes Blocking Umount +===================================== + + When you need to unmount a file system with the umount command, + you may find the operation blocked by a process that has a file + open on the file system. Lsof may be able to help you find the + process. In response to: + + $ lsof <file_system_name> + + lsof will display all open files on the named file system. It + will also set its exit code to zero when it finds some open files + and non-zero when it doesn't, making this type of lsof call + useful in shell scripts. (See section 16.) + + Consult the output of the df command for file system names. + + See the caveat in the preceding section about file references + that persist in the kernel without open file traces. That + situation may hamper lsof's ability to help with umount, too. + + +5. Finding Listening Sockets +============================= + + Sooner or later you may wonder if someone has installed a network + server that you don't know about. Lsof can list for you all the + network socket files open on your machine with: + + $ lsof -i + + The -i option without further qualification lists all open Internet + socket files. You can add network names or addresses, protocol + names, and service names or port numbers to the -i option to + refine the search. (See the next section.) + + +6. Finding a Particular Network Connection +=========================================== + + When you know the source or destination of a network connection + whose open files and process you'd like to identify, the -i option + may help.
+ + If, for example, you want to know what process has a connection + open to or from the Internet host named aaa.bbb.ccc, you can ask + lsof to search for it with: + + $ lsof -i@aaa.bbb.ccc + + If you're interested in a particular protocol -- TCP or UDP -- + and a specific port number or service name, you can add those + discriminators to the -i information: + + $ lsof -iTCP@aaa.bbb.ccc:ftp-data + + If you're interested in a particular IP version -- IPv4 or IPv6 + -- and your UNIX dialect supports both (It does if "IPv[46]" + appears in the lsof -h output.), you can add the '4' or '6' + selector immediately after -i: + + $ lsof -i4 + $ lsof -i6 + + +7. Identifying a Netstat Connection +==================================== + + How do I identify the process that has a network connection + described in netstat output? For example, if netstat says: + + Proto Recv-Q Send-Q Local Address Foreign Address (state) + tcp 0 0 vic.1023 ipscgate.login ESTABLISHED + + What process is connected to service name ``login'' on ipscgate? + + Use lsof's -i option: + + $lsof -iTCP@ipscgate:login + COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME + rlogin 25023 abe 3u inet 0x10144168 0t184 TCP lsof.itap.purdue.edu:1023->ipscgate.cc.purdue.edu:login + ... + + There's another way. Notice the 0x10144168 in the DEVICE column + of the lsof output? That's the protocol control block (PCB) + address. Many netstat applications will display it when given + the -A option: + + $ netstat -A + PCB Proto Recv-Q Send-Q Local Address Foreign Address (state) + 10144168 tcp 0 0 vic.1023 ipscgate.login ESTABLISHED + ... + + Using the PCB address, lsof, and grep, you can find the process this + way, too: + + $ lsof -i | grep 10144168 + rlogin 25023 abe 3u inet 0x10144168 0t184 TCP lsof.itap.purdue.edu:1023->ipscgate.cc.purdue.edu:login + ... 
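The grep above matches the address anywhere on a line. A slightly stricter variant compares only the DEVICE column. What follows is a minimal sketch, not something shipped with lsof: the function name pcb_filter is hypothetical, and it assumes the default column layout shown above (DEVICE is the sixth whitespace-separated field) and command names that contain no spaces.

```shell
# pcb_filter: hypothetical helper, not part of the lsof distribution.
# Reads lsof output on stdin and prints only the lines whose DEVICE
# column (assumed to be field 6) equals the PCB address given as $1.
# The address may be given with or without a leading 0x.
pcb_filter() {
    awk -v pcb="${1#0x}" '$6 == ("0x" pcb) { print }'
}
```

It might be used as `lsof -i | pcb_filter 10144168`, with the address taken from the netstat -A output.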
+ + If the file is a UNIX socket and netstat reveals an address for it, + as in this Solaris 11 example: + + $ netstat -a -f unix + Active UNIX domain sockets + Address Type Vnode Conn Local Addr Remote Addr + ffffff0084253b68 stream-ord 0000000 0000000 + + then piping the output of lsof's -U option to a grep on the address + yields: + + $ lsof -U | grep ffffff0084253b68 + squid 1638 nobody 12u unix 18,98 0t10 9437188 /devices/pseudo/tl@0:ticots->0xffffff0084253b68 stream-ord + + +8. Finding Files Open to a Named Command +========================================= + + When you want to look at the files open to a particular command, + you can look up the PID of the process running the command and + use lsof's -p option to specify it. + + $ lsof -p <PID> + + However, there's a quicker way, using lsof's -c option, provided + you don't mind seeing output for every process running the named + command. + + $ lsof -c <first_characters_of_command_name_that_interest_you> + + The lsof -c option is useful when you want to see how many instances + of a given command are executing and what their open files are. + One useful example is for the sendmail command. + + $ lsof -c sendmail + + +9. Deciphering the Remote Login Trail +====================================== + + If the network connection you're interested in tracing has been + initiated externally and is connected to an rlogind, sshd, or + telnetd process, asking lsof to identify that process might not + give a wholly satisfying answer. The report may be that the + connection exists, but to a process owned by root. + + a. The Fundamentals + ==================== + + How do you get from there to the login name really using the + connection? You have to know a little about how real and pseudo + ttys are paired in your system, and then use several lsof probes + to identify the login. + + This example comes from a Solaris 2.4 system, named klaatu.cc. + I've logged on to it via rlogin from lsof.itap.
The first lsof + probe, + + $ lsof -i@lsof.itap + + yields (among other things): + + COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME + in.rlogin 7362 root 0u inet 0xfc0193b0 0t242 TCP klaatu.cc.purdue.edu:login->lsof.itap.purdue.edu:1023 + ... + + This confirms that a connection exists. A second lsof probe + shows: + + $ lsof -p7362 + COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME + ... + in.rlogin 7362 root 0u inet 0xfc0193b0 0t242 TCP klaatu.cc.purdue.edu:login->lsof.itap.purdue.edu:1023 + ... + in.rlogin 7362 root 3u VCHR 23, 0 0t66 52928 /devices/pseudo/clone@0:ptmx->pckt->ptm + + 7362 is the Process ID (PID) of the in.rlogin process, discovered + in the first lsof probe. (I've abbreviated the output to simplify + the example.) Now comes a need to understand Solaris pseudo-ttys. + The key indicator is in the DEVICE column for FD 3, the major/minor + device number of 23,0. This translates to /dev/pts/0, so a third + lsof probe, + + $ lsof /dev/pts/0 + COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME + ksh 7364 abe 0u VCHR 24, 0 0t2410 53410 /dev/pts/../../devices/pseudo/pts@0:0 + + shows in part that login abe has a ksh process on /dev/pts/0. + (The NAME that lsof shows is not /dev/pts/0 but the full expansion + of the symbolic link that lsof finds at /dev/pts/0.) + + Here's a second example, done on an HP-UX 9.01 host named ghg.ecn. + Again, I've logged on to it from lsof.itap, so I start with: + + $ lsof -i@lsof.itap + COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME + rlogind 10214 root 0u inet 0x041d5f00 0t1536 TCP ghg.ecn.purdue.edu:login->lsof.itap.purdue.edu:1023 + ... + + Then, + + $ lsof -p10214 + COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME + ... + rlogind 10214 root 0u inet 0x041d5f00 0t2005 TCP ghg.ecn.purdue.edu:login->lsof.itap.purdue.edu:1023 + ... + rlogind 10214 root 3u VCHR 16,0x000030 0t2037 24642 /dev/ptym/ptys0 + + Here the key is the NAME /dev/ptym/ptys0. 
In HP-UX 9.01 tty and + pseudo tty devices are paired with the names like /dev/ptym/ptys0 + and /dev/pty/ttys0, so the following lsof probe is the final step. + + $ lsof /dev/pty/ttys0 + COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME + ksh 10215 abe 0u VCHR 17,0x000030 0t3399 22607 /dev/pty/ttys0 + ... + + Here's a third example for an AIX 4.1.4 system. I've used telnet + to connect to it from lsof.itap.purdue.edu. I start with: + + $ lsof -i@lsof.itap.purdue.edu + COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME + ... + telnetd 15616 root 0u inet 0x05a93400 0t5156 TCP cloud.cc.purdue.edu:telnet->lsof.itap.purdue.edu:3369 + + Then I look at the telnetd process: + + $ lsof -p15616 + COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME + ... + telnetd 15616 root 0u inet 0x05a93400 0t5641 TCP cloud.cc.purdue.edu:telnet->lsof.itap.purdue.edu:3369 + ... + telnetd 15616 root 3u VCHR 25, 0 0t5493 103 /dev/ptc/0 + + Here the key is /dev/ptc/0. In AIX it's paired with /dev/pts/0. + The last probe for that shows: + + $ lsof /dev/pts/0 + COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME + ... + ksh 16642 abe 0u VCHR 26, 0 0t6461 360 /dev/pts/0 + + b. The idrlogin.perl[5] Scripts + ================================ + + There's another, perhaps easier way, to go about the job of + tracing a network connection. The lsof distribution contains + two Perl scripts, idrlogin.perl (Perl 4) and idrlogin.perl5 + (Perl 5), that use lsof field output to display values for + shells that are parented by rlogind, sshd, or telnetd, or + connected directly to TCP sockets. The lsof test suite contains + a C library that can be adapted for use with C programs that + need to call lsof and process its field output. + + The two Perl scripts use the lsof -R option; it causes the + paRent process ID (PPID) to be listed in the lsof output. 
The + scripts identify all shell processes -- e.g., ones whose command + names end in ``sh'' -- and determine if: 1) the ultimate ancestor + process before a PID greater than 2 (e.g., init's PID is 1) is + rlogind, sshd, or telnetd; or 2) the shell process has open + TCP socket files. + + Here's an example of output from idrlogin.perl on a Solaris 2.4 + system: + + centurion: 1 = cd src/lsof4/scripts + centurion: 2 = ./idrlogin.perl + Login Shell PID Via PID TTY From + oboyle ksh 12640 in.telnetd 12638 pts/5 opal.cc.purdue.edu + icdtest ksh 15158 in.rlogind 15155 pts/6 localhost + sh csh 18207 in.rlogind 18205 pts/1 babylon5.cc.purdue.edu + root csh 18242 in.rlogind 18205 pts/1 babylon5.cc.purdue.edu + trouble ksh 19208 in.rlogind 18205 pts/1 babylon5.cc.purdue.edu + abe ksh 21334 in.rlogind 21332 pts/2 lsof.itap.purdue.edu + + Each script assumes that its parent directory contains an + executable lsof. If you decide to use one of the scripts, you + may want to customize it for your local lsof and perl paths. + + Note that processes executing as remote shells are also + identified. + + Here's another example from a UnixWare 7.1.0 system. + + tweeker: 1 = cd src/lsof4/scripts + tweeker: 9 = ./idrlogin.perl + Login Shell PID Via PID TTY From + abe ksh 9438 in.telnetd 9436 pts/3 lsof.itap.purdue.edu + + +10. Watching an Ftp or Rcp Transfer +=================================== + + Internet performance being unpredictable at times, you + occasionally want to know whether a file transfer, being + done by ftp or rcp, is making any progress. + + To use lsof for watching a file transfer, you need to know the + PID of the file transfer process. You can use ps to find that. + Then use lsof, + + $ lsof -p<PID> + + to examine the files open to the transfer process. Usually the + ftp files of interest are at file descriptors 9 and 10 or 10 and + 11; for rcp, 3 and 4. They describe the network socket file and + the local data file.
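Progress shows up in the SIZE/OFF column of those two files. Lsof marks a file offset with a leading 0t (decimal) or 0x (hexadecimal), while a plain number is a size. The following is a minimal sketch for turning such a value into a byte count that can be compared between listings; the function name off_to_bytes is hypothetical, and a POSIX shell is assumed.

```shell
# off_to_bytes: hypothetical helper, not part of the lsof distribution.
# Converts one SIZE/OFF value, passed as $1, into a decimal number.
off_to_bytes() {
    case "$1" in
        0t*) echo "${1#0t}" ;;        # decimal offset: drop the 0t marker
        0x*) printf '%d\n' "$1" ;;    # hexadecimal offset: let printf convert
        *)   echo "$1" ;;             # plain size: pass through unchanged
    esac
}
```

For instance, `off_to_bytes 0t184` would print 184, the offset reported for the rlogin socket in section 7.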
+ + If you want to watch only those file descriptors as the file + transfer progresses, try these lsof forms (for ftp in the example): + + $ lsof -p<PID> -ad9,10 -r + or + $ lsof -p<PID> -ad10,11 -r + + Some options need explaining: + + -p<PID> specifies that lsof is to restrict its attention + to the process whose ID is <PID>. You can specify + a set of PIDs by separating them with commas. + + $ lsof -p 1234,5678,9012 + + -a specifies that lsof is to AND its tests together. + The two tests that are specified are tests on the + PID and tests on file descriptors (``d9,10''). + + d9,10 specifies that lsof is to test only file descriptors + 9 and 10. Note that the `-' is absent, since ``-a'' + is a unary option and can be followed immediately + by another lsof option. + + -r tells lsof to list the requested open file information, + sleep for a default 15 seconds, then list the open + file information again. You can specify a different + time (in seconds) after -r and override the default. + Lsof issues a short line of equal signs between + each set of output to distinguish it. + + For an rcp transfer, the above example becomes: + + $ lsof -p<PID> -ad3,4 -r + + +11. Listing Open NFS Files +========================== + + Lsof will list all files open on remote file systems, supported + by an NFS server. Just use: + + $ lsof -N + + Note, however, that when run on an NFS server, lsof will not list + files open to the server from one of its clients. That's because + lsof can only examine the processes running on the machine where + it is called -- i.e., on the NFS server. + + If you run lsof on the NFS client, using the -N option, it will + list files open by processes on the client that are on remote + NFS file systems. + + +12. Listing Files Open by a Specific Login +========================================== + + If you're interested in knowing what files the processes owned + by a particular login name have open, lsof can help.
+ + $ lsof -u<login> + or + $ lsof -u<User ID number> + + You can specify either the login name or the UID associated with + it. You can specify multiple login names and UID numbers, mixed + together, by separating them with commas. + + $ lsof -u548,abe + + On the subject of login names and UIDs, it's worth noting that + lsof can be told to report either. By default it reports login + names; the -l option switches reporting to UIDs. You might want + to use -l if login name lookup is slow for some reason. + + a. Ignoring a Specific Login + ============================= + + The -u option can also be used to direct lsof to ignore a + specific login name or UID, or a list of them. Simply prefix + the login names or UIDs with a `^' character, as you might do + in a regular expression. The `^' prefix is useful, for example, + when you want to have lsof ignore the files open to system + processes, owned by the root (UID 0) login. Try: + + $ lsof -u ^root + or + $ lsof -u ^0 + + +13. Listing Files Open to a Specific Process Group +================================================== + + There's a Unix collection of processes called a process group. + The name indicates that the processes of the group have a common + association and are grouped so that a signal sent to one (e.g., + a keyboard kill stroke) is delivered to all. + + This causes Unix to create a two element process group: + + $ lsof | less + + You can use lsof to look at the open files of all members of a + process group, if you know the process group ID number. 
Assuming + that it is 12717 for the above example, this lsof command: + + $ lsof -g12717 -adcwd + + would produce on a Solaris 8 system: + + $ lsof -g12717 -adcwd + COMMAND PID PGID USER FD TYPE DEVICE SIZE/OFF NODE NAME + sshd 11369 12717 root cwd VDIR 0,2 189 1449175 /tmp (swap) + sshd 12717 12717 root cwd VDIR 136,0 1024 2 / + + The ``-g12717'' option specifies the process group ID of interest; + the ``-adcwd'' option specifies that options are to be ANDed and + that lsof should limit file output to information about current + working directory (``cwd'') files. + + +14. When Lsof Seems to Hang +=========================== + + On occasion when you run lsof it seems to hang and produce no + output. This may result from system conditions beyond the control + of lsof. Lsof has a number of options that may allow you to + bypass the blockage. + + a. Kernel lstat(), readlink(), and stat() Blockages + ==================================================== + + Lsof uses the kernel (system) calls lstat(), readlink(), and + stat() to locate mounted file system information. When a file + system has been mounted from an NFS server and that server is + temporarily unavailable, the calls lsof uses may block in the + kernel. + + Lsof will announce that it is being blocked with warning messages + (unless they have been suppressed by the lsof builder), but + only after a default waiting period of fifteen seconds has + expired for each file system whose server is unavailable. If + you have a number of such file systems, the total wait may be + unacceptably long. + + You can do two things to shorten your suffering: 1) reduce the + wait time with the -S option; or 2) tell lsof to avoid the + kernel calls that might block by specifying the -b option. + + $ lsof -S 5 + or + $ lsof -b + + Avoiding the kernel calls that might block may result in the + lack of some information that lsof needs to know about mounted + file systems. 
Thus, when you use -b, lsof warns that it might + lack important information. + + The warnings that result from using -b (unless suppressed by + the lsof builder) can themselves be annoying. You can suppress + them by adding the -w option. (Of course, if you do, you won't + know what warning messages lsof might have issued.) + + $ lsof -bw + + Note: if the lsof builder suppressed warning message issuance, + you don't need to use -w to suppress them. You can tell what + the default state of message warning issuance is by looking at + the -h (help) output. If it says ``-w enable warnings'' then + warnings are disabled by default; ``-w disable warnings'', they + are enabled by default. + + b. Problems with /dev or /devices + ================================== + + Lsof scans the /dev or /devices branch of your file system to + obtain information about your system's devices. (The scan isn't + necessary when a device cache file exists.) + + Sometimes that scan can take a very long time, especially if + you have a large number of devices, and if your kernel is + relatively slow to process the stat() system call on device + nodes. You can't do anything about the stat() system call + speed. + + However, you can make sure that lsof is allowed to use its + device cache file feature. When lsof can use a device cache + file, it retains information it gleans via the stat() calls + on /dev or /devices in a separate file for later, faster + access. + + The device cache file feature is described in the lsof man + page. See the DEVICE CACHE FILE, LSOF PERMISSIONS THAT AFFECT + DEVICE CACHE FILE ACCESS, DEVICE CACHE FILE PATH FROM THE -D + OPTION, DEVICE CACHE PATH FROM AN ENVIRONMENT VARIABLE, + SYSTEM-WIDE DEVICE CACHE PATH, PERSONAL DEVICE CACHE PATH + (DEFAULT), and MODIFIED PERSONAL DEVICE CACHE PATH sections. 
+ + There is also a separate file in the lsof distribution, named + 00DCACHE, that describes the device cache file in detail, + including information about possible security problems. + + One final observation: don't overlook the possibility that your + /dev or /devices tree might be damaged. See if + + $ ls -R /dev + or + $ ls -R /devices + + completes or hangs. If it hangs, then lsof will probably hang, + too, and you should try to discover why ls hangs. + + c. Host and Service Name Lookup Hangs + ====================================== + + Lsof can hang up when it tries to convert an Internet dot-form + address to a host name, or a port number to a service name. Both + hangs are caused by the lookup functions of your system. + + An independent check for both types of hangs can be made with + the netstat program. Run it without arguments. If it hangs, + then it is probably having lookup difficulties. When you run + it with -n it shouldn't hang and should report network and port + numbers instead of names. + + Lsof has two options that serve the same purpose as netstat's + -n option. The lsof -n option tells it to avoid host name + lookups; and -P, service name lookups. Try those options when + you suspect lsof may be hanging because of lookup problems. + + $ lsof -n + or + $ lsof -P + or + $ lsof -nP + + d. UID to Login Name Conversion Delays + ======================================= + + By default lsof converts User IDentification (UID) numbers to + login names when it produces output. That conversion process + may sometimes hang because of system problems or interlocks. + + You can tell lsof to skip the lookup with the -l option; it + will then report UIDs in the USER column. + + $ lsof -l + + +15. Output for Other Programs +============================= + + The -F option allows you to specify that lsof should describe + open files with a special form of output, called field output, + that can be parsed easily by a subsequent program. 
The lsof + distribution comes with sample AWK, Perl 4, and Perl 5 scripts + that post-process field output. The lsof test suite has a C + library that could be adapted for use by C programs that want to + process lsof field output from an in-bound pipe. + + The lsof manual page describes field output in detail in its + OUTPUT FOR OTHER PROGRAMS section. A quick look at a sample + script in the scripts/ subdirectory of the lsof distribution will + also give you an idea how field output works. + + The most important thing about field output is that it is relatively + homogeneous across Unix dialects. Thus, if you write a script + to post-process field output for AIX, it probably will work for + HP-UX, Solaris, and Ultrix as well. + + +16. The Lsof Exit Code and Shell Scripts +======================================== + + When lsof exits successfully it returns an exit code based on + the result of its search for specified files. (If no files were + specified, then the successful exit code is 0 (zero).) + + If lsof was asked to search for specific files, including any + files on specified file systems, it returns an exit code of 0 + (zero) if it found all the specified files and at least one file + on each specified file system. Otherwise it returns a 1 (one). + + If lsof detects an error and makes an unsuccessful exit, it + returns an exit code of 1 (one). + + You can use the exit code in a shell script to search for files + on a file system and take action based on the result -- e.g., + + #!/bin/sh + lsof <file_system_name> > /dev/null 2>&1 + if test $? -eq 0 + then + echo "<file_system_name> has some users." + else + echo "<file_system_name> may have no users." + fi + + +17. Strange messages in the NAME column +======================================= + + When lsof encounters problems analyzing a particular file, it may + put a message in the file's NAME column. Many of those messages + are explained in the 00FAQ file of the lsof distribution. 
+ + So consult 00FAQ first if you encounter a NAME column message you + don't understand. (00FAQ is a possible source of information + about other unfamiliar things in lsof output, too.) + + If you can't find help in 00FAQ, you can use grep to look in the + lsof source files for the message -- e.g., + + $ cd .../lsof_4.76_src + $ grep "can't identify protocol" *.[ch] + + The code associated with the message will usually make clear the + reason for the message. + + If you have an lsof source tree that has been processed by the + lsof Configure script, you need grep only there. If, however, + your source tree hasn't been processed by Configure, you may + have to look in the top-level lsof source directory and in the + dialects sub-directory for the UNIX dialect you are using - e.g., + + $ cd .../lsof_4.76_src + $ grep "can't identify protocol" *.[ch] + $ cd dialects/Linux + $ grep "can't identify protocol" *.[ch] + + In rare cases you may have to look in the lsof library, too -- + e.g., + + $ cd .../lsof_4.76_src + $ grep "can't identify protocol" *.[ch] + $ cd dialects/Linux + $ grep "can't identify protocol" *.[ch] + $ cd ../../lib + $ grep "can't identify protocol" *.[ch] + + +Options +======= + + The following appendices describe the lsof options in detail. + + +A. Selection Options +==================== + + Lsof has a rich set of options for selecting the files to be + displayed. These include: + + -a tells lsof to AND the set of selection options that + are specified. Normally lsof ORs them. + + For example, if you specify the -p<PID> and -u<UID> + options, lsof will display all files for the + specified PID or for the specified UID. + + By adding -a, you specify that the listed files + should be limited to PIDs owned by the specified + UIDs -- i.e., they match the PIDs *and* the UIDs. + + $ lsof -p1234 -au 5678 + + -c specifies that lsof should list files belonging + to processes having the associated command name. 
+ + Hint: if you want to select files based on more than + one command name, use multiple -c<name> specifications. + + $ lsof -clsof -cksh + + -d tells lsof to select by the associated file descriptor + (FD) set. An FD set is a comma-separated list of + numbers and the names lsof normally displays in + its FD column: cwd, Lnn, ltx, <number>, etc. See + the OUTPUT section of the lsof man page for the + complete list of possible file descriptors. Example: + + $ lsof -dcwd,0,1,2 + + -g tells lsof to select by the associated process + group ID (PGID) set. The PGID set is a comma-separated + list of PGID numbers. When -g is specified, it also + enables the display of PGID numbers. + + Note: when -g isn't followed by a PGID set, it + simply selects the listing of PGID for all processes. + Examples: + + $ lsof -g + $ lsof -g1234,5678 + + -i tells lsof to display Internet socket files. If no + protocol/address/port specification follows -i, + lsof lists all Internet socket files. + + If a specification follows -i, lsof lists only the + socket files whose Internet addresses match the + specification. + + Hint: multiple addresses may be specified with + multiple -i options. Examples: + + $ lsof -iTCP + $ lsof -i@lsof.itap.purdue.edu:sendmail + + -N selects the listing of files mounted on NFS devices. + + -U selects the listing of socket files in the Unix + domain. + + +B. Output Options +================== + + Lsof has these options to control its output format: + + -F produce output that can be parsed by a subsequent + program. + + -g print process group (PGID) IDs. + + -l list UID numbers instead of login names. + + -n list network numbers instead of host names. + + -o always list file offset. + + -P list port numbers instead of port service names. + + -s always list file size. + + +C. Precautionary Options +========================= + + Lsof uses system functions that can block or take a long time, + depending on the health of the Unix dialect supporting it. 
Options + that guard against this include: + + -b directs lsof to avoid system functions -- e.g., + lstat(2), readlink(2), stat(2) -- that might block + in the kernel. See the BLOCKS AND TIMEOUTS + section of the lsof man page. + + You might want to use this option when you have + a mount from an NFS server that is not responding. + + -C tells lsof to ignore the kernel's name cache. This + option will have little effect on lsof performance, + but might be a useful precaution if the kernel's + name cache is scrambled. (I've never seen that + happen.) + + -D might be used to direct lsof to ignore an existing + device cache file and generate a new one from /dev + (and /devices). This might be useful if you have + doubts about the integrity of an existing device + cache file. + + -l tells lsof to list UID numbers instead of login + names -- this is useful when UID to login name + conversion is slow or inoperative. + + -n tells lsof to avoid converting Internet addresses + to host names. This might be useful when your + host name lookup (e.g., DNS) is inoperative. + + -O tells lsof to avoid its strategy of forking to + perform potentially blocking kernel operations. + While the forking allows lsof to detect that a + block has occurred (and possibly break it), the + fork operation is a costly one. Use the -O option + with care, lest your lsof be blocked. + + -P directs lsof to list port numbers instead of trying + to convert them to port service names. This might + be useful if port to service name lookups (e.g., + via NIS) are slow or failing. + + -S can be used to change the lstat/readlink/stat + timeout interval that governs how long lsof waits + for a response from the kernel. This might be useful + when an NFS server is slow or unresponsive. When + lsof times out of a kernel function, it may have + less information to display. Example: + + $ lsof -S2 + + -w tells lsof to avoid issuing warning messages, if + they are enabled by default, or to enable them if they + are disabled by default.
Check the -h (help) output + to determine their status. If it says ``-w enable + warnings'', then warning messages are disabled by + default; if it says ``-w disable warnings'', they + are enabled by default. + + This may be a useful option, for example, when you + specify -b while warning messages are enabled, because + it will suppress the warning messages lsof issues + about avoiding functions that might block in the + kernel. + + +D. Miscellaneous Lsof Options +============================== + + There are some lsof options that are hard to classify, including: + + -? these options select help output. + -h + + -F selects field output. Field output is a mode where + lsof produces output that can be parsed easily by + subsequent programs -- e.g., AWK or Perl scripts. + See ``15. Output for Other Programs'' for more + information. + + -k specifies an alternate kernel symbol file -- i.e., + where nlist() will get its information. Example: + + $ lsof -k/usr/crash/vmunix.1 + + -m specifies an alternate kernel memory file from + which lsof will read kernel structures in place + of /dev/kmem or kvm_read(). Example: + + $ lsof -m/usr/crash/vmcore.n + + -r tells lsof to repeat its scan every 15 seconds (the + default when no associated value is specified). A + different repeat time can follow -r. Example: + + $ lsof -r30 + + -v displays information about the building of the + lsof executable. + + -- The double minus sign option may be used to + signal the end of options. It's particularly useful + when arguments to the last option are optional and + you want to supply a file path that could be mistaken + for arguments to the last option. Example: + + $ lsof -g -- 1 + + Where `1' is a file path, not PGID 1. + + +Vic Abell <abe@purdue.edu> +January 18, 2010
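As a supplement to the -F entry in Appendix D, here is a minimal sketch of how a subsequent program might consume field output. It assumes the one-character field identifiers described in the OUTPUT FOR OTHER PROGRAMS section of the lsof man page (p for PID, c for command name, f for file descriptor, n for file name). The function name lsof_fields_to_table and the sample input are hypothetical, chosen only for illustration; they are not part of lsof.

```shell
# Turn lsof field output (as from "lsof -Fpcfn") into one line per open file.
# Each input line is a one-character field identifier followed by its value:
#   p = PID, c = command name, f = file descriptor, n = file name.
# lsof_fields_to_table is a hypothetical helper name, not an lsof feature.
lsof_fields_to_table() {
    awk '
        /^p/ { pid = substr($0, 2); cmd = ""; next }   # start of a process set
        /^c/ { cmd = substr($0, 2); next }
        /^f/ { fd = substr($0, 2); next }               # start of a file set
        /^n/ { printf "%s\t%s\t%s\t%s\n", pid, cmd, fd, substr($0, 2) }
    '
}

# Hypothetical sample input, so the sketch runs without lsof installed.
sample_table=$(printf 'p1234\ncksh\nfcwd\nn/home/abe\nf0\nn/dev/pts/3\n' | lsof_fields_to_table)
printf '%s\n' "$sample_table"
```

With a real lsof, the same function could presumably be fed directly, e.g. by `lsof -Fpcfn -cksh | lsof_fields_to_table`. The sketch prints a row only when a name field arrives, so a file set without an n field would be skipped; a more careful parser would flush on each new f or p line instead.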