A Quick Start for Lsof
1. Introduction
================
Agreed, the lsof man page is dense and lsof has a plethora of
options. There are examples, but the manual page format buries
them at the end. How does one get started with lsof?
This file is an attempt to answer that question. It plunges
immediately into examples of lsof use to solve problems that
involve looking at the open files of Unix processes.
Contents
1. Introduction
2. Finding Uses of a Specific Open File
3. Finding Open Files Filling a File System
a. Finding an Unlinked Open File
4. Finding Processes Blocking Umount
5. Finding Listening Sockets
6. Finding a Particular Network Connection
7. Identifying a Netstat Connection
8. Finding Files Open to a Named Command
9. Deciphering the Remote Login Trail
a. The Fundamentals
b. The idrlogin.perl[5] Scripts
10. Watching an Ftp or Rcp Transfer
11. Listing Open NFS Files
12. Listing Files Open by a Specific Login
a. Ignoring a Specific Login
13. Listing Files Open to a Specific Process Group
14. When Lsof Seems to Hang
a. Kernel lstat(), readlink(), and stat() Blockages
b. Problems with /dev or /devices
c. Host and Service Name Lookup Hangs
d. UID to Login Name Conversion Delays
15. Output for Other Programs
16. The Lsof Exit Code and Shell Scripts
17. Strange messages in the NAME column
Options
A. Selection Options
B. Output Options
C. Precautionary Options
D. Miscellaneous Lsof Options
2. Finding Uses of a Specific Open File
========================================
Often you're interested in knowing who is using a specific file.
You know the path to it and you want lsof to tell you the processes
that have open references to it.
Simple -- execute lsof and give it the path name of the file of
interest -- e.g.,
$ lsof /etc/passwd
Caveat: this only works if lsof has permission to get the status
(via stat(2)) of the file at the named path. Unless the lsof
process has enough authority -- e.g., it is being run with a
real User ID (UID) of root -- this AIX example won't work:
$ lsof /etc/security/passwd
lsof: status error on /etc/security/passwd: Permission denied
Further caveat: this use of lsof will fail if the stat(2) kernel
syscall returns different file parameters -- particularly device
and inode numbers -- than lsof finds in kernel node structures.
This condition is rare and is usually documented in the 00FAQ
file of the lsof distribution.
3. Finding Open Files Filling a File System
============================================
Oh! Oh! /tmp is filling and ls doesn't show that any large files
are being created. Can lsof help?
Maybe. If there's a process that is writing to a file that has
been unlinked, lsof may be able to discover the process for you.
You ask it to list all open files on the file system where /tmp
is located.
Sometimes /tmp is a file system by itself. In that case,
$ lsof /tmp
is the appropriate command. If, however, /tmp is part of another
file system, typically /, then you may have to ask lsof to list
all files open on the containing file system and locate the
offending file and its process by inspection -- e.g.,
$ lsof / | more
or
$ lsof / | grep ...
Caveat: there must be a file open to a process for the lsof search to
succeed. Sometimes the kernel may cause a file reference to
persist, even where there's no file open to a process. (Can you
say kernel bug? Maybe.) In any event, lsof won't be able to
help in this case.
a. Finding an Unlinked Open File
=================================
A pesky variant of a file that is filling a file system is an
unlinked file to which some process is still writing. When a
process opens a file and then unlinks it, the file's resources
remain in use by the process, but the file's directory entries
are removed. Hence, even when you know the directory where the
file once resided, you can't detect it with ls.
This can be an administrative problem when the unlinked file is
large, and the process that holds it open continues to write to
it. Only when the process closes the file will its resources,
particularly disk space, be released.
Lsof can help you find unlinked files on local disks. It has an
option, +L, that will list the link counts of open files. That
helps because an unlinked file on a local disk has a zero link
count. Note: this is NOT true for NFS files, accessed from a
remote server.
You could use the option to list all files and look for a zero
link count in the NLINK column -- e.g.,
$ lsof +L
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME
...
less 25366 abe txt VREG 6,0 40960 1 76319 /usr/...
...
> less 25366 abe 3r VREG 6,0 17360 0 98768 / (/dev/sd0a)
Better yet, you can specify an upper bound to the +L option, and
lsof will select only files that have a link count less than the
upper bound. For example:
$ lsof +L1
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME
less 25366 abe 3r VREG 6,0 17360 0 98768 / (/dev/sd0a)
You can use lsof's -a (AND) option to narrow the link count search
to a particular file system. For example, to look for zero link
counts on the /home file system, use:
$ lsof -a +L1 /home
CAUTION: lsof can't always report link counts for all file types
-- e.g., it may not report them for FIFOs, pipes, or sockets.
Remember also that link counts for NFS files on an NFS client
host don't behave as do link counts for files on local disks.
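The +L1 selection can also be combined with field output (the -F
option; see section 15) so that a script can report what it finds.
What follows is a minimal sketch, not something from the lsof
distribution. The function name report_unlinked is hypothetical, and
the sketch assumes field output in which a `p' line starts each
process set, an `f' line starts each file set, and `c', `s', `k', and
`n' lines carry the command, size, link count, and name.

```shell
#!/bin/sh
# report_unlinked: read lsof field output on stdin and print
# "PID COMMAND SIZE NAME" for each open file whose link count is zero.
# (Hypothetical helper; expects output such as that of: lsof -F pcfskn +L1)
report_unlinked() {
    awk '
        # Print the pending file set if its link count was zero.
        function emit() {
            if (have && links == 0)
                printf "%s %s %s %s\n", pid, cmd, size, name
            have = 0
        }
        /^p/ { emit(); pid = substr($0, 2); cmd = "?" }
        /^c/ { cmd = substr($0, 2) }
        /^f/ { emit(); have = 1; size = "?"; links = -1; name = "?" }
        /^s/ { size = substr($0, 2) }
        /^k/ { links = substr($0, 2) + 0 }
        /^n/ { name = substr($0, 2) }
        END  { emit() }
    '
}
```

A typical call would be:
$ lsof -F pcfskn +L1 | report_unlinked
The same cautions apply: a file set for which lsof reports no link
count is never printed, and NFS link counts can't be trusted.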
4. Finding Processes Blocking Umount
=====================================
When you need to unmount a file system with the umount command,
you may find the operation blocked by a process that has a file
open on the file system. Lsof may be able to help you find the
process. In response to:
$ lsof <file_system_name>
lsof will display all open files on the named file system. It
will also set its exit code to zero when it finds some open files
and non-zero when it doesn't, making this type of lsof call
useful in shell scripts. (See section 16.)
Consult the output of the df command for file system names.
See the caveat in the preceding section about file references
that persist in the kernel without open file traces. That
situation may hamper lsof's ability to help with umount, too.
5. Finding Listening Sockets
=============================
Sooner or later you may wonder if someone has installed a network
server that you don't know about. Lsof can list for you all the
network socket files open on your machine with:
$ lsof -i
The -i option without further qualification lists all open Internet
socket files. You can add network names or addresses, protocol
names, and service names or port numbers to the -i option to
refine the search. (See the next section.)
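Since -i lists every Internet socket, connected or not, you may want
to reduce the listing to sockets that are waiting for connections.
One way is to filter the output for the TCP LISTEN state. This is
only a sketch: it assumes your lsof appends the TCP state in
parentheses to the NAME column, which not every revision or dialect
does, and the function name only_listening is hypothetical.

```shell
#!/bin/sh
# only_listening: keep the lsof header line and any line whose NAME
# column ends with the TCP state "(LISTEN)".
# (Hypothetical helper; feed it the output of: lsof -i -n -P)
only_listening() {
    awk 'NR == 1 || /\(LISTEN\)$/'
}
```

A typical call would be:
$ lsof -i -n -P | only_listening
UDP sockets have no LISTEN state, so this filter drops them; inspect
them separately with lsof -iUDP.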
6. Finding a Particular Network Connection
===========================================
When you know the source or destination of a network connection
whose open files and process you'd like to identify, the -i option
may help.
If, for example, you want to know what process has a connection
open to or from the Internet host named aaa.bbb.ccc, you can ask
lsof to search for it with:
$ lsof -i@aaa.bbb.ccc
If you're interested in a particular protocol -- TCP or UDP --
and a specific port number or service name, you can add those
discriminators to the -i information:
$ lsof -iTCP@aaa.bbb.ccc:ftp-data
If you're interested in a particular IP version -- IPv4 or IPv6
-- and your UNIX dialect supports both (it does if "IPv[46]"
appears in the lsof -h output), you can add the '4' or '6'
selector immediately after -i:
$ lsof -i4
$ lsof -i6
7. Identifying a Netstat Connection
====================================
How do I identify the process that has a network connection
described in netstat output? For example, if netstat says:
Proto Recv-Q Send-Q Local Address Foreign Address (state)
tcp 0 0 vic.1023 ipscgate.login ESTABLISHED
What process is connected to service name ``login'' on ipscgate?
Use lsof's -i option:
$ lsof -iTCP@ipscgate:login
COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
rlogin 25023 abe 3u inet 0x10144168 0t184 TCP lsof.itap.purdue.edu:1023->ipscgate.cc.purdue.edu:login
...
There's another way. Notice the 0x10144168 in the DEVICE column
of the lsof output? That's the protocol control block (PCB)
address. Many netstat applications will display it when given
the -A option:
$ netstat -A
PCB Proto Recv-Q Send-Q Local Address Foreign Address (state)
10144168 tcp 0 0 vic.1023 ipscgate.login ESTABLISHED
...
Using the PCB address, lsof, and grep, you can find the process this
way, too:
$ lsof -i | grep 10144168
rlogin 25023 abe 3u inet 0x10144168 0t184 TCP lsof.itap.purdue.edu:1023->ipscgate.cc.purdue.edu:login
...
The file may be a UNIX socket for which netstat reveals an address,
as in this Solaris 11 example:
$ netstat -a -f unix
Active UNIX domain sockets
Address Type Vnode Conn Local Addr Remote Addr
ffffff0084253b68 stream-ord 0000000 0000000
Using lsof's -U option and piping its output to a grep on the address
yields:
$ lsof -U | grep ffffff0084253b68
squid 1638 nobody 12u unix 18,98 0t10 9437188 /devices/pseudo/tl@0:ticots->0xffffff0084253b68 stream-ord
8. Finding Files Open to a Named Command
=========================================
When you want to look at the files open to a particular command,
you can look up the PID of the process running the command and
use lsof's -p option to specify it.
$ lsof -p <PID>
However, there's a quicker way, using lsof's -c option, provided
you don't mind seeing output for every process running the named
command.
$ lsof -c <first_characters_of_command_name_that_interest_you>
The lsof -c option is useful when you want to see how many instances
of a given command are executing and what their open files are.
One useful example is for the sendmail command.
$ lsof -c sendmail
9. Deciphering the Remote Login Trail
======================================
If the network connection you're interested in tracing has been
initiated externally and is connected to an rlogind, sshd, or
telnetd process, asking lsof to identify that process might not
give a wholly satisfying answer. The report may be that the
connection exists, but to a process owned by root.
a. The Fundamentals
====================
How do you get from there to the login name really using the
connection? You have to know a little about how real and pseudo
ttys are paired in your system, and then use several lsof probes
to identify the login.
This example comes from a Solaris 2.4 system, named klaatu.cc.
I've logged on to it via rlogin from lsof.itap. The first lsof
probe,
$ lsof -i@lsof.itap
yields (among other things):
COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
in.rlogin 7362 root 0u inet 0xfc0193b0 0t242 TCP klaatu.cc.purdue.edu:login->lsof.itap.purdue.edu:1023
...
This confirms that a connection exists. A second lsof probe
shows:
$ lsof -p7362
COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
...
in.rlogin 7362 root 0u inet 0xfc0193b0 0t242 TCP klaatu.cc.purdue.edu:login->lsof.itap.purdue.edu:1023
...
in.rlogin 7362 root 3u VCHR 23, 0 0t66 52928 /devices/pseudo/clone@0:ptmx->pckt->ptm
7362 is the Process ID (PID) of the in.rlogin process, discovered
in the first lsof probe. (I've abbreviated the output to simplify
the example.) Now comes a need to understand Solaris pseudo-ttys.
The key indicator is in the DEVICE column for FD 3, the major/minor
device number of 23,0. This translates to /dev/pts/0, so a third
lsof probe,
$ lsof /dev/pts/0
COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
ksh 7364 abe 0u VCHR 24, 0 0t2410 53410 /dev/pts/../../devices/pseudo/pts@0:0
shows in part that login abe has a ksh process on /dev/pts/0.
(The NAME that lsof shows is not /dev/pts/0 but the full expansion
of the symbolic link that lsof finds at /dev/pts/0.)
Here's a second example, done on an HP-UX 9.01 host named ghg.ecn.
Again, I've logged on to it from lsof.itap, so I start with:
$ lsof -i@lsof.itap
COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
rlogind 10214 root 0u inet 0x041d5f00 0t1536 TCP ghg.ecn.purdue.edu:login->lsof.itap.purdue.edu:1023
...
Then,
$ lsof -p10214
COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
...
rlogind 10214 root 0u inet 0x041d5f00 0t2005 TCP ghg.ecn.purdue.edu:login->lsof.itap.purdue.edu:1023
...
rlogind 10214 root 3u VCHR 16,0x000030 0t2037 24642 /dev/ptym/ptys0
Here the key is the NAME /dev/ptym/ptys0. In HP-UX 9.01 tty and
pseudo tty devices are paired with the names like /dev/ptym/ptys0
and /dev/pty/ttys0, so the following lsof probe is the final step.
$ lsof /dev/pty/ttys0
COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
ksh 10215 abe 0u VCHR 17,0x000030 0t3399 22607 /dev/pty/ttys0
...
Here's a third example for an AIX 4.1.4 system. I've used telnet
to connect to it from lsof.itap.purdue.edu. I start with:
$ lsof -i@lsof.itap.purdue.edu
COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
...
telnetd 15616 root 0u inet 0x05a93400 0t5156 TCP cloud.cc.purdue.edu:telnet->lsof.itap.purdue.edu:3369
Then I look at the telnetd process:
$ lsof -p15616
COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
...
telnetd 15616 root 0u inet 0x05a93400 0t5641 TCP cloud.cc.purdue.edu:telnet->lsof.itap.purdue.edu:3369
...
telnetd 15616 root 3u VCHR 25, 0 0t5493 103 /dev/ptc/0
Here the key is /dev/ptc/0. In AIX it's paired with /dev/pts/0.
The last probe for that shows:
$ lsof /dev/pts/0
COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
...
ksh 16642 abe 0u VCHR 26, 0 0t6461 360 /dev/pts/0
b. The idrlogin.perl[5] Scripts
================================
There's another, perhaps easier way, to go about the job of
tracing a network connection. The lsof distribution contains
two Perl scripts, idrlogin.perl (Perl 4) and idrlogin.perl5
(Perl 5), that use lsof field output to display values for
shells that are parented by rlogind, sshd, or telnetd, or
connected directly to TCP sockets. The lsof test suite contains
a C library that can be adapted for use with C programs that
need to call lsof and process its field output.
The two Perl scripts use the lsof -R option; it causes the
paRent process ID (PPID) to be listed in the lsof output. The
scripts identify all shell processes -- e.g., ones whose command
names end in ``sh'' -- and determine if: 1) the ultimate ancestor
process before a PID greater than 2 (e.g., init's PID is 1) is
rlogind, sshd, or telnetd; or 2) the shell process has open
TCP socket files.
Here's an example of output from idrlogin.perl on a Solaris 2.4
system:
centurion: 1 = cd src/lsof4/scripts
centurion: 2 = ./idrlogin.perl
Login Shell PID Via PID TTY From
oboyle ksh 12640 in.telnetd 12638 pts/5 opal.cc.purdue.edu
icdtest ksh 15158 in.rlogind 15155 pts/6 localhost
sh csh 18207 in.rlogind 18205 pts/1 babylon5.cc.purdue.edu
root csh 18242 in.rlogind 18205 pts/1 babylon5.cc.purdue.edu
trouble ksh 19208 in.rlogind 18205 pts/1 babylon5.cc.purdue.edu
abe ksh 21334 in.rlogind 21332 pts/2 lsof.itap.purdue.edu
Each script assumes that its parent directory contains an
executable lsof. If you decide to use one of the scripts, you
may want to customize it for your local lsof and perl paths.
Note that processes executing as remote shells are also
identified.
Here's another example from a UnixWare 7.1.0 system.
tweeker: 1 = cd src/lsof4/scripts
tweeker: 9 = ./idrlogin.perl
Login Shell PID Via PID TTY From
abe ksh 9438 in.telnetd 9436 pts/3 lsof.itap.purdue.edu
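To give an idea of what the scripts do with -R and field output,
here is a much reduced sketch in shell and awk. It is not the
distributed script: it checks only the immediate parent rather than
the whole ancestor chain, it does not look at TCP sockets, and the
function name shells_via_login is hypothetical. It assumes field
output with `p' (PID), `R' (parent PID), `c' (command), and `L'
(login name) lines.

```shell
#!/bin/sh
# shells_via_login: read lsof field output on stdin and list shells whose
# parent process looks like rlogind, sshd, or telnetd.
# Prints "LOGIN SHELL PID VIA VIA-PID" for each match.
# (Hypothetical helper; expects output such as that of: lsof -R -F pRcL)
shells_via_login() {
    awk '
        /^p/ { pid = substr($0, 2); n++; order[n] = pid }
        /^R/ { ppid[pid]  = substr($0, 2) }
        /^c/ { cmd[pid]   = substr($0, 2) }
        /^L/ { login[pid] = substr($0, 2) }
        END {
            for (i = 1; i <= n; i++) {
                p = order[i]
                via = cmd[ppid[p]]
                # A shell is any command whose name ends in "sh".
                # Daemon names are matched loosely because lsof may
                # truncate them (in.rlogin for in.rlogind).
                if (cmd[p] ~ /sh$/ && via ~ /rlogin|sshd|telnet/)
                    printf "%s %s %s %s %s\n", login[p], cmd[p], p, via, ppid[p]
            }
        }
    '
}
```

A typical call would be:
$ lsof -R -F pRcL | shells_via_login
All process sets are collected before any matching is done, because
a child process can appear in the output before its parent.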
10. Watching an Ftp or Rcp Transfer
===================================
Internet performance can be unpredictable, so occasionally you
want to know whether a file transfer being done by ftp or rcp
is making any progress.
To use lsof for watching a file transfer, you need to know the
PID of the file transfer process. You can use ps to find that.
Then use lsof,
$ lsof -p<PID>
to examine the files open to the transfer process. Usually the
ftp files of interest are at file descriptors 9 and 10 or 10 and
11; for rcp, 3 and 4. They describe the network socket file and
the local data file.
If you want to watch only those file descriptors as the file
transfer progresses, try these lsof forms (for ftp in the example):
$ lsof -p<PID> -ad9,10 -r
or
$ lsof -p<PID> -ad10,11 -r
Some options need explaining:
-p<PID> specifies that lsof is to restrict its attention
to the process whose ID is <PID>. You can specify
a set of PIDs by separating them with commas.
$ lsof -p 1234,5678,9012
-a specifies that lsof is to AND its tests together.
The two tests that are specified are tests on the
PID and tests on file descriptors (``d9,10'').
d9,10 specifies that lsof is to test only file descriptors
9 and 10. Note that the `-' is absent, since ``-a''
is a unary option and can be followed immediately
by another lsof option.
-r tells lsof to list the requested open file information,
sleep for a default 15 seconds, then list the open
file information again. You can specify a different
time (in seconds) after -r and override the default.
Lsof issues a short line of equal signs between
each set of output to distinguish it.
For an rcp transfer, the above example becomes:
$ lsof -p<PID> -ad3,4 -r
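When watching a transfer, what matters is how the SIZE/OFF value
changes from one listing to the next. Offsets appear with a leading
``0t'' (decimal) or ``0x'' (hexadecimal), as in the 0t184 of earlier
examples. A minimal sketch of converting such a value to a plain
byte count, so that two samples can be subtracted, follows. The
function name offset_to_bytes is hypothetical.

```shell
#!/bin/sh
# offset_to_bytes: convert an lsof SIZE/OFF value to a decimal byte count.
# "0t" marks a decimal offset and "0x" a hexadecimal one; a bare number
# (a size) is passed through unchanged.
offset_to_bytes() {
    case "$1" in
        0t*) echo "${1#0t}" ;;
        0x*) printf '%d\n' "$1" ;;
        *)   echo "$1" ;;
    esac
}
```

For example, two samples of the same descriptor can be compared with
shell arithmetic:
$ echo $(( $(offset_to_bytes 0t5641) - $(offset_to_bytes 0t5156) ))
485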
11. Listing Open NFS Files
==========================
Lsof will list all files open on remote file systems, supported
by an NFS server. Just use:
$ lsof -N
Note, however, that when run on an NFS server, lsof will not list
files open to the server from one of its clients. That's because
lsof can only examine the processes running on the machine where
it is called -- i.e., on the NFS server.
If you run lsof on the NFS client, using the -N option, it will
list files open by processes on the client that are on remote
NFS file systems.
12. Listing Files Open by a Specific Login
==========================================
If you're interested in knowing what files the processes owned
by a particular login name have open, lsof can help.
$ lsof -u<login>
or
$ lsof -u<User ID number>
You can specify either the login name or the UID associated with
it. You can specify multiple login names and UID numbers, mixed
together, by separating them with commas.
$ lsof -u548,abe
On the subject of login names and UIDs, it's worth noting that
lsof can be told to report either. By default it reports login
names; the -l option switches reporting to UIDs. You might want
to use -l if login name lookup is slow for some reason.
a. Ignoring a Specific Login
=============================
The -u option can also be used to direct lsof to ignore a
specific login name or UID, or a list of them. Simply prefix
the login names or UIDs with a `^' character, as you might do
in a regular expression. The `^' prefix is useful, for example,
when you want to have lsof ignore the files open to system
processes, owned by the root (UID 0) login. Try:
$ lsof -u ^root
or
$ lsof -u ^0
13. Listing Files Open to a Specific Process Group
==================================================
There's a Unix collection of processes called a process group.
The name indicates that the processes of the group have a common
association and are grouped so that a signal sent to one (e.g.,
a keyboard kill stroke) is delivered to all.
For example, this pipeline causes Unix to create a two-element
process group:
$ lsof | less
You can use lsof to look at the open files of all members of a
process group, if you know the process group ID number. Assuming
that it is 12717 for the above example, this lsof command:
$ lsof -g12717 -adcwd
would produce on a Solaris 8 system:
$ lsof -g12717 -adcwd
COMMAND PID PGID USER FD TYPE DEVICE SIZE/OFF NODE NAME
sshd 11369 12717 root cwd VDIR 0,2 189 1449175 /tmp (swap)
sshd 12717 12717 root cwd VDIR 136,0 1024 2 /
The ``-g12717'' option specifies the process group ID of interest;
the ``-adcwd'' option specifies that options are to be ANDed and
that lsof should limit file output to information about current
working directory (``cwd'') files.
14. When Lsof Seems to Hang
===========================
On occasion when you run lsof it seems to hang and produce no
output. This may result from system conditions beyond the control
of lsof. Lsof has a number of options that may allow you to
bypass the blockage.
a. Kernel lstat(), readlink(), and stat() Blockages
====================================================
Lsof uses the kernel (system) calls lstat(), readlink(), and
stat() to locate mounted file system information. When a file
system has been mounted from an NFS server and that server is
temporarily unavailable, the calls lsof uses may block in the
kernel.
Lsof will announce that it is being blocked with warning messages
(unless they have been suppressed by the lsof builder), but
only after a default waiting period of fifteen seconds has
expired for each file system whose server is unavailable. If
you have a number of such file systems, the total wait may be
unacceptably long.
You can do two things to shorten your suffering: 1) reduce the
wait time with the -S option; or 2) tell lsof to avoid the
kernel calls that might block by specifying the -b option.
$ lsof -S 5
or
$ lsof -b
Avoiding the kernel calls that might block may result in the
lack of some information that lsof needs to know about mounted
file systems. Thus, when you use -b, lsof warns that it might
lack important information.
The warnings that result from using -b (unless suppressed by
the lsof builder) can themselves be annoying. You can suppress
them by adding the -w option. (Of course, if you do, you won't
know what warning messages lsof might have issued.)
$ lsof -bw
Note: if the lsof builder suppressed warning message issuance,
you don't need to use -w to suppress them. You can tell what
the default state of message warning issuance is by looking at
the -h (help) output. If it says ``-w enable warnings'' then
warnings are disabled by default; if it says ``-w disable
warnings'', they are enabled by default.
b. Problems with /dev or /devices
==================================
Lsof scans the /dev or /devices branch of your file system to
obtain information about your system's devices. (The scan isn't
necessary when a device cache file exists.)
Sometimes that scan can take a very long time, especially if
you have a large number of devices, and if your kernel is
relatively slow to process the stat() system call on device
nodes. You can't do anything about the stat() system call
speed.
However, you can make sure that lsof is allowed to use its
device cache file feature. When lsof can use a device cache
file, it retains information it gleans via the stat() calls
on /dev or /devices in a separate file for later, faster
access.
The device cache file feature is described in the lsof man
page. See the DEVICE CACHE FILE, LSOF PERMISSIONS THAT AFFECT
DEVICE CACHE FILE ACCESS, DEVICE CACHE FILE PATH FROM THE -D
OPTION, DEVICE CACHE PATH FROM AN ENVIRONMENT VARIABLE,
SYSTEM-WIDE DEVICE CACHE PATH, PERSONAL DEVICE CACHE PATH
(DEFAULT), and MODIFIED PERSONAL DEVICE CACHE PATH sections.
There is also a separate file in the lsof distribution, named
00DCACHE, that describes the device cache file in detail,
including information about possible security problems.
One final observation: don't overlook the possibility that your
/dev or /devices tree might be damaged. See if
$ ls -R /dev
or
$ ls -R /devices
completes or hangs. If it hangs, then lsof will probably hang,
too, and you should try to discover why ls hangs.
c. Host and Service Name Lookup Hangs
======================================
Lsof can hang up when it tries to convert an Internet dot-form
address to a host name, or a port number to a service name. Both
hangs are caused by the lookup functions of your system.
An independent check for both types of hangs can be made with
the netstat program. Run it without arguments. If it hangs,
then it is probably having lookup difficulties. When you run
it with -n it shouldn't hang and should report network and port
numbers instead of names.
Lsof has two options that serve the same purpose as netstat's
-n option. The lsof -n option tells it to avoid host name
lookups; and -P, service name lookups. Try those options when
you suspect lsof may be hanging because of lookup problems.
$ lsof -n
or
$ lsof -P
or
$ lsof -nP
d. UID to Login Name Conversion Delays
=======================================
By default lsof converts User IDentification (UID) numbers to
login names when it produces output. That conversion process
may sometimes hang because of system problems or interlocks.
You can tell lsof to skip the lookup with the -l option; it
will then report UIDs in the USER column.
$ lsof -l
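The options from this section can be combined when you simply want
lsof to answer quickly. The wrapper below is a sketch only: the
function name quick_lsof and the LSOF variable are hypothetical
conveniences, and -w is included on the assumption that warnings
are enabled by default in your lsof build (see the note in 14.a).

```shell
#!/bin/sh
# quick_lsof: run lsof with the options from this section that avoid the
# usual causes of hangs. LSOF names the lsof executable to run
# (the variable and the function name are hypothetical conveniences).
LSOF=${LSOF:-lsof}
quick_lsof() {
    # -b  avoid kernel calls that might block
    # -w  suppress the warnings that -b provokes (assumes warnings are
    #     enabled by default in this lsof build)
    # -n  no host name lookups; -P  no service name lookups
    # -l  report UIDs instead of login names
    "$LSOF" -b -w -n -P -l "$@"
}
```

Any further arguments are passed through unchanged, so
$ quick_lsof -i
lists Internet socket files with all of those precautions in effect.
Keep in mind that -b may deprive lsof of information it needs about
mounted file systems.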
15. Output for Other Programs
=============================
The -F option allows you to specify that lsof should describe
open files with a special form of output, called field output,
that can be parsed easily by a subsequent program. The lsof
distribution comes with sample AWK, Perl 4, and Perl 5 scripts
that post-process field output. The lsof test suite has a C
library that could be adapted for use by C programs that want to
process lsof field output from an in-bound pipe.
The lsof manual page describes field output in detail in its
OUTPUT FOR OTHER PROGRAMS section. A quick look at a sample
script in the scripts/ subdirectory of the lsof distribution will
also give you an idea how field output works.
The most important thing about field output is that it is relatively
homogeneous across Unix dialects. Thus, if you write a script
to post-process field output for AIX, it probably will work for
HP-UX, Solaris, and Ultrix as well.
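As a small illustration of post-processing, the sketch below turns
field output into one line per open file. It is not one of the
distributed sample scripts. The function name fields_to_lines is
hypothetical, and the sketch assumes the fields were selected with
-F pcfn, so that each line starts with a one-character field
identifier (`p' PID, `c' command, `f' file descriptor, `n' name).

```shell
#!/bin/sh
# fields_to_lines: read lsof field output on stdin and print
# "PID COMMAND FD NAME" for each open file.
# (Hypothetical helper; expects output such as that of: lsof -F pcfn)
fields_to_lines() {
    awk '
        /^p/ { pid = substr($0, 2) }   # a "p" line starts a process set
        /^c/ { cmd = substr($0, 2) }
        /^f/ { fd  = substr($0, 2) }   # an "f" line starts a file set
        /^n/ { printf "%s %s %s %s\n", pid, cmd, fd, substr($0, 2) }
    '
}
```

A typical call would be:
$ lsof -F pcfn | fields_to_lines
A line is printed only when a name field arrives, so a file set for
which lsof reports no name is skipped.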
16. The Lsof Exit Code and Shell Scripts
========================================
When lsof exits successfully it returns an exit code based on
the result of its search for specified files. (If no files were
specified, then the successful exit code is 0 (zero).)
If lsof was asked to search for specific files, including any
files on specified file systems, it returns an exit code of 0
(zero) if it found all the specified files and at least one file
on each specified file system. Otherwise it returns a 1 (one).
If lsof detects an error and makes an unsuccessful exit, it
returns an exit code of 1 (one).
You can use the exit code in a shell script to search for files
on a file system and take action based on the result -- e.g.,
#!/bin/sh
lsof <file_system_name> > /dev/null 2>&1
if test $? -eq 0
then
echo "<file_system_name> has some users."
else
echo "<file_system_name> may have no users."
fi
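Building on the exit code, a script can also poll until a file
system appears to be free, for instance before calling umount. The
following is a minimal sketch. The names wait_until_unused and LSOF
are hypothetical, as are the default retry count and interval.
Remember that lsof also returns 1 when it detects an error, so a
non-zero exit means only that the file system may have no users.

```shell
#!/bin/sh
# wait_until_unused FS [TRIES [INTERVAL]]
# Poll lsof until it reports no open files on FS.
# Returns 0 once lsof exits non-zero (no open files found, or an error),
# or 1 if FS still had open files after TRIES attempts.
# LSOF names the lsof executable (hypothetical convenience variable).
LSOF=${LSOF:-lsof}
wait_until_unused() {
    fs=$1
    tries=${2:-10}
    interval=${3:-5}
    while [ "$tries" -gt 0 ]; do
        # A non-zero exit code means lsof found no open files (or failed).
        "$LSOF" "$fs" > /dev/null 2>&1 || return 0
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}
```

It might be used like this, with /mnt standing in for the file
system of interest:
$ wait_until_unused /mnt 12 5 && umount /mnt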
17. Strange messages in the NAME column
=======================================
When lsof encounters problems analyzing a particular file, it may
put a message in the file's NAME column. Many of those messages
are explained in the 00FAQ file of the lsof distribution.
So consult 00FAQ first if you encounter a NAME column message you
don't understand. (00FAQ is a possible source of information
about other unfamiliar things in lsof output, too.)
If you can't find help in 00FAQ, you can use grep to look in the
lsof source files for the message -- e.g.,
$ cd .../lsof_4.76_src
$ grep "can't identify protocol" *.[ch]
The code associated with the message will usually make clear the
reason for the message.
If you have an lsof source tree that has been processed by the
lsof Configure script, you need grep only there. If, however,
your source tree hasn't been processed by Configure, you may
have to look in the top-level lsof source directory and in the
dialects sub-directory for the UNIX dialect you are using -- e.g.,
$ cd .../lsof_4.76_src
$ grep "can't identify protocol" *.[ch]
$ cd dialects/Linux
$ grep "can't identify protocol" *.[ch]
In rare cases you may have to look in the lsof library, too --
e.g.,
$ cd .../lsof_4.76_src
$ grep "can't identify protocol" *.[ch]
$ cd dialects/Linux
$ grep "can't identify protocol" *.[ch]
$ cd ../../lib
$ grep "can't identify protocol" *.[ch]
Options
=======
The following appendices describe the lsof options in detail.
A. Selection Options
====================
Lsof has a rich set of options for selecting the files to be
displayed. These include:
-a tells lsof to AND the set of selection options that
are specified. Normally lsof ORs them.
For example, if you specify the -p<PID> and -u<UID>
options, lsof will display all files for the
specified PID or for the specified UID.
By adding -a, you specify that the listed files
should be limited to PIDs owned by the specified
UIDs -- i.e., they match the PIDs *and* the UIDs.
$ lsof -p1234 -au 5678
-c specifies that lsof should list files belonging
to processes having the associated command name.
Hint: if you want to select files based on more than
one command name, use multiple -c<name> specifications.
$ lsof -clsof -cksh
-d tells lsof to select by the associated file descriptor
(FD) set. An FD set is a comma-separated list of
numbers and the names lsof normally displays in
its FD column: cwd, Lnn, ltx, <number>, etc. See
the OUTPUT section of the lsof man page for the
complete list of possible file descriptors. Example:
$ lsof -dcwd,0,1,2
-g tells lsof to select by the associated process
group ID (PGID) set. The PGID set is a comma-separated
list of PGID numbers. When -g is specified, it also
enables the display of PGID numbers.
Note: when -g isn't followed by a PGID set, it
simply selects the listing of PGID for all processes.
Examples:
$ lsof -g
$ lsof -g1234,5678
-i tells lsof to display Internet socket files. If no
protocol/address/port specification follows -i,
lsof lists all Internet socket files.
If a specification follows -i, lsof lists only the
socket files whose Internet addresses match the
specification.
Hint: multiple addresses may be specified with
multiple -i options. Examples:
$ lsof -iTCP
$ lsof -i@lsof.itap.purdue.edu:sendmail
-N selects the listing of files mounted on NFS devices.
-U selects the listing of socket files in the Unix
domain.
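As an illustration of combining selection options with -a, here is
a small wrapper sketch. The function name lsof_user_cmd and the LSOF
variable are hypothetical; LSOF only lets you point the wrapper at
a different lsof binary.

```shell
#!/bin/sh
# lsof_user_cmd USER COMMAND
# Hypothetical wrapper: list files open by processes that are owned
# by USER *and* are running COMMAND. Without -a lsof would OR the
# two selections.
lsof_user_cmd() {
    ${LSOF:-lsof} -a -u "$1" -c "$2"
}
```

For example, to list the files that user abe has open in ksh
processes:

$ lsof_user_cmd abe ksh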
B. Output Options
==================
Lsof has these options to control its output format:
-F produce output that can be parsed by a subsequent
program.
-g print process group IDs (PGIDs).
-l list UID numbers instead of login names.
-n list network numbers instead of host names.
-o always list file offset.
-P list port numbers instead of port service names.
-s always list file size.
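To give a feel for -F output, here is a sketch of a filter that
turns field output back into one line per open file. It assumes the
p (PID), c (command) and n (name) fields were requested, each
arriving on its own line prefixed by its field letter. The function
name lsof_fields_to_lines is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter: read `lsof -Fpcn` output on standard input
# and print one "PID COMMAND NAME" line per open file.
lsof_fields_to_lines() {
    awk '
        /^p/ { pid = substr($0, 2) }            # new process: remember PID
        /^c/ { cmd = substr($0, 2) }            # command name of that process
        /^n/ { print pid, cmd, substr($0, 2) }  # one output line per file name
    '
}
```

It might be used this way:

$ lsof -Fpcn | lsof_fields_to_lines

Other field lines (for instance the f lines lsof adds for each file
descriptor) are simply ignored by this sketch.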
C. Precautionary Options
=========================
Lsof uses system functions that can block or take a long time,
depending on the health of the UNIX dialect supporting it. Lsof
has options that guard against such problems. These include:
-b directs lsof to avoid system functions -- e.g.,
lstat(2), readlink(2), stat(2) -- that might block
in the kernel. See the BLOCKS AND TIMEOUTS
section of the lsof man page.
You might want to use this option when you have
a mount from an NFS server that is not responding.
-C tells lsof to ignore the kernel's name cache. This
option will have little effect on lsof performance,
but as a precaution it might be useful if the kernel's
name cache is scrambled. (I've never seen that
happen.)
-D might be used to direct lsof to ignore an existing
device cache file and generate a new one from /dev
(and /devices). This might be useful if you have
doubts about the integrity of an existing device
cache file.
-l tells lsof to list UID numbers instead of login
names -- this is useful when UID to login name
conversion is slow or inoperative.
-n tells lsof to avoid converting Internet addresses
to host names. This might be useful when your
host name lookup (e.g., DNS) is inoperative.
-O tells lsof to avoid its strategy of forking to
perform potentially blocking kernel operations.
While the forking allows lsof to detect that a
block has occurred (and possibly break it), the
fork operation is a costly one. Use the -O option
with care, lest your lsof be blocked.
-P directs lsof to list port numbers instead of trying
to convert them to port service names. This might
be useful if port to service name lookups (e.g.,
via NIS) are slow or failing.
-S can be used to change the lstat/readlink/stat
timeout interval that governs how long lsof waits
for response from the kernel. This might be useful
when an NFS server is slow or unresponsive. When
lsof times out of a kernel function, it may have
less information to display. Example:
$ lsof -S2
-w toggles warning messages: it suppresses them if
they are enabled by default, and enables them if they
are disabled by default. Check the -h (help) output
to determine their status. If it says ``-w enable
warnings'', then warning messages are disabled by
default; if it says ``-w disable warnings'', they
are enabled by default.
This may be a useful option, for example, when you
specify -b, if warning messages are enabled, because
it will suppress the warning messages lsof issues
about avoiding functions that might block in the
kernel.
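Several of the precautionary options are often wanted together. The
following wrapper is a sketch only; the function name lsof_cautious
and the LSOF variable are hypothetical. It leaves out -w, because
what -w does depends on how your lsof was built.

```shell
#!/bin/sh
# Hypothetical wrapper: run lsof with options that avoid blocking
# kernel functions and slow name lookups. Any extra arguments are
# passed through unchanged.
lsof_cautious() {
    # -b  avoid kernel functions that might block
    # -l  list UID numbers, not login names
    # -n  list network numbers, not host names
    # -P  list port numbers, not service names
    ${LSOF:-lsof} -b -l -n -P "$@"
}
```

For example:

$ lsof_cautious -iTCP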
D. Miscellaneous Lsof Options
==============================
There are some lsof options that are hard to classify, including:
-?, -h these options select help output.
-F selects field output. Field output is a mode where
lsof produces output that can be parsed easily by
subsequent programs -- e.g., AWK or Perl scripts.
See ``15. Output for Other Programs'' for more
information.
-k specifies an alternate kernel symbol file -- i.e.,
where nlist() will get its information. Example:
$ lsof -k/usr/crash/vmunix.1
-m specifies an alternate kernel memory file from
which lsof will read kernel structures in place
of /dev/kmem or kvm_read(). Example:
$ lsof -m/usr/crash/vmcore.n
-r tells lsof to repeat its scan every 15 seconds (the
default when no associated value is specified). A
repeat time, different from the default, can follow
-r. Example:
$ lsof -r30
-v displays information about the building of the
lsof executable.
-- The double minus sign option may be used to
signal the end of options. It's particularly useful
when arguments to the last option are optional and
you want to supply a file path that could be confused
for arguments to the last option. Example:
$ lsof -g -- 1
Where `1' is a file path, not PGID 1.
Vic Abell <abe@purdue.edu>
January 18, 2010