afsmonitor - Monitors File Servers and Cache Managers

afsmonitor [B<initcmd>] [B<-config> I<configuration file>]
[B<-frequency> I<poll frequency, in seconds>]
[B<-output> I<storage file name>] [B<-detailed>]
[B<-debug> I<turn debugging output on to the named file>]
[B<-fshosts> I<list of file servers to monitor> ...]
[B<-cmhosts> I<list of cache managers to monitor> ...]
[B<-buffers> I<number of buffer slots>] [B<-help>]

afsmonitor [B<i>] [B<-co> I<configuration file>]
[B<-fr> I<poll frequency, in seconds>]
[B<-o> I<storage file name>] [B<-det>]
[B<-deb> I<turn debugging output on to the named file>]
[B<-fs> I<list of file servers to monitor> ...]
[B<-cm> I<list of cache managers to monitor> ...]
[B<-b> I<number of buffer slots>] [B<-h>]

The C<afsmonitor> command initializes a program that gathers and displays
statistics about specified File Server and Cache Manager operations.
It allows the issuer to monitor, from a single location, a wide range
of File Server and Cache Manager operations on any number of machines
in both local and foreign cells.

There are 271 available File Server statistics and 570 available Cache
Manager statistics, listed in the appendix about C<afsmonitor> statistics
in the IBM AFS Administration Guide. By default, the command displays
all of the relevant statistics for the file server machines named by
the B<-fshosts> argument and the client machines named by the B<-cmhosts>
argument. To limit the display to only the statistics of interest,
list them in the configuration file specified by the B<-config> argument.
In addition, use the configuration file for the following purposes:

To set threshold values for any monitored statistic. When the
value of a statistic exceeds the threshold, the C<afsmonitor> command
displays it in reverse video. There are no default threshold
values.

To invoke a program or script automatically when a statistic
exceeds its threshold. The AFS distribution does not include any
such programs or scripts.

To list the file server and client machines to monitor, instead of
using the B<-fshosts> and B<-cmhosts> arguments.

For a description of the configuration file, see the B<afsmonitor
Configuration File> reference page.
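The threshold rule described above can be sketched in a few lines of Python. This is only an illustration of the comparison, not the program's own code: the statistic names and values are made up, and the sketch assumes that a statistic with no configured threshold never raises an alert, since there are no default thresholds.

```python
def find_alerts(values, thresholds):
    """Return the names of statistics whose value exceeds its threshold."""
    alerts = []
    for name, value in values.items():
        limit = thresholds.get(name)  # no default: unset means no alert
        if limit is not None and value > limit:
            alerts.append(name)
    return alerts


# Hypothetical statistic names and values, for illustration only.
sample_values = {"stat_a": 120, "stat_b": 5, "stat_c": 40}
sample_thresholds = {"stat_a": 100, "stat_b": 10}
print(find_alerts(sample_values, sample_thresholds))
```

A value equal to its threshold does not count, because the text says the value must exceed the threshold.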

=item B<initcmd>

Accommodates the command's use of the AFS command parser, and
is optional.

=item B<-config> I<configuration file>

Names the configuration file that lists the machines to
monitor, statistics to display, and threshold values, if any. A
partial pathname is interpreted relative to the current working
directory. Provide this argument when providing neither the
B<-fshosts> argument nor the B<-cmhosts> argument. For instructions on
creating this file, see the preceding B<Description> section, and
the section on the C<afsmonitor> program in the IBM AFS
Administration Guide.

=item B<-frequency> I<poll frequency, in seconds>

Specifies in seconds how often the C<afsmonitor> program probes
the File Servers and Cache Managers. Valid values range from B<1>
to B<86400> (which is 24 hours); the default value is B<60>. This
frequency applies to both File Servers and Cache Managers, but
the C<afsmonitor> program initiates the two types of probes, and
processes their results, separately. The actual interval
between probes to a host is the probe frequency plus the time
required for all hosts to respond.
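As a rough illustration of the arithmetic above, the following Python sketch checks a frequency against the documented range and adds the response time to estimate the real interval between probes. The function names are invented for this example, and the response time is a value the caller would have to measure.

```python
MIN_FREQ, MAX_FREQ, DEFAULT_FREQ = 1, 86400, 60  # seconds, from the text


def check_frequency(seconds=DEFAULT_FREQ):
    """Return seconds if it lies in the documented range, else raise."""
    if not MIN_FREQ <= seconds <= MAX_FREQ:
        raise ValueError(f"frequency must be {MIN_FREQ}..{MAX_FREQ} seconds")
    return seconds


def effective_interval(frequency, response_time):
    """Probe frequency plus the time all hosts needed to respond."""
    return check_frequency(frequency) + response_time


print(effective_interval(60, 2.5))
```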

=item B<-output> I<storage file name>

Names the file to which the C<afsmonitor> program writes all of
the statistics that it collects. By default, no output file is
created. See the section on the C<afsmonitor> command in the IBM
AFS Administration Guide for information on this file.

=item B<-detailed>

Formats the information in the output file named by the B<-output>
argument in a maximally readable format. Provide the B<-output>
argument along with this one.

=item B<-fshosts> I<list of file servers to monitor> ...

Names one or more machines from which to gather File Server
statistics. For each machine, provide either a fully qualified
host name, or an unambiguous abbreviation (the ability to
resolve an abbreviation depends on the state of the cell's name
service at the time the command is issued). This argument can
be combined with the B<-cmhosts> argument, but not with the
B<-config> argument.

=item B<-cmhosts> I<list of cache managers to monitor> ...

Names one or more machines from which to gather Cache Manager
statistics. For each machine, provide either a fully qualified
host name, or an unambiguous abbreviation (the ability to
resolve an abbreviation depends on the state of the cell's name
service at the time the command is issued). This argument can
be combined with the B<-fshosts> argument, but not with the
B<-config> argument.

=item B<-buffers> I<number of buffer slots>

Is nonoperational and provided to accommodate potential future
enhancements to the program.

=item B<-help>

Prints the online help for this command. All other valid
options are ignored.
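The constraints among these options can be checked before the command is run. The sketch below builds an argument list in Python using only the flags documented above. It is an illustration under assumptions: the function name, the host names, and the output file name are hypothetical, and the rule that machines must be named one way or the other is inferred from the option descriptions rather than from the program itself.

```python
def build_command(config=None, fshosts=(), cmhosts=(), frequency=None,
                  output=None, detailed=False):
    """Return an argv list for afsmonitor, or raise ValueError."""
    if config and (fshosts or cmhosts):
        raise ValueError("-config cannot be combined with -fshosts/-cmhosts")
    if not (config or fshosts or cmhosts):
        raise ValueError("name machines with -config, -fshosts, or -cmhosts")
    if detailed and not output:
        raise ValueError("-detailed needs -output")
    if frequency is not None and not 1 <= frequency <= 86400:
        raise ValueError("-frequency must be 1..86400 seconds")

    argv = ["afsmonitor"]
    if config:
        argv += ["-config", config]
    if fshosts:
        argv += ["-fshosts", *fshosts]
    if cmhosts:
        argv += ["-cmhosts", *cmhosts]
    if frequency is not None:
        argv += ["-frequency", str(frequency)]
    if output:
        argv += ["-output", output]
    if detailed:
        argv.append("-detailed")
    return argv


# Hypothetical host and file names.
print(build_command(fshosts=["fs1.example.com", "fs2.example.com"],
                    frequency=30, output="stats.out", detailed=True))
```

The resulting list could be handed to C<subprocess.run>, keeping in mind that the program needs a dedicated foreground terminal.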

The C<afsmonitor> program displays its data on three screens:

C<System Overview>: This screen appears automatically when the
C<afsmonitor> program initializes. It summarizes separately for File
Servers and Cache Managers the number of machines being monitored
and how many of them have I<alerts> (statistics that have exceeded
their thresholds). It then lists the hostname and number of alerts
for each machine being monitored, indicating if appropriate that a
process failed to respond to the last probe.

C<File Servers>: This screen displays File Server statistics for each
file server machine being monitored. It highlights statistics that
have exceeded their thresholds, and identifies machines that
failed to respond to the last probe.

C<Cache Managers>: This screen displays Cache Manager statistics for
each client machine being monitored. It highlights statistics that
have exceeded their thresholds, and identifies machines that
failed to respond to the last probe.

Fields at the corners of every screen display the following
information:

In the top left corner, the program name and version number.

In the top right corner, the screen name, current and total page
numbers, and current and total column numbers. The page number
(for example, p. 1 of 3) indicates the index of the current page
and the total number of (vertical) pages over which data is
displayed. The column number (for example, c. 1 of 235) indicates
the index of the current leftmost column and the total number of
columns in which data appears. (The symbol >>> indicates that
there is additional data to the right; the symbol <<< indicates
that there is additional data to the left.)

In the bottom left corner, a list of the available commands. Enter
the first letter in the command name to run that command. Only the
currently possible options appear; for example, if there is only
one page of data, the C<next> and C<prev> commands, which scroll the
screen down and up respectively, do not appear. For descriptions
of the commands, see the following section about navigating the
display screens.

In the bottom right corner, the C<probes> field reports how many
times the program has probed File Servers (C<fs>), Cache Managers
(C<cm>), or both. The counts for File Servers and Cache Managers can
differ. The C<freq> field reports how often the program sends probes.
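The column indicator and the two scroll symbols follow from three numbers: the leftmost column shown, how many columns fit, and the total. The Python sketch below derives them. It illustrates the description only; the function name is invented, and the number of visible columns is an input the caller must supply.

```python
def column_indicator(first_col, visible, total):
    """Return (label, left_symbol, right_symbol) for the top right corner.

    first_col is the 1-based index of the leftmost displayed column.
    """
    label = f"c. {first_col} of {total}"
    more_left = first_col > 1                      # columns hidden on the left
    more_right = first_col + visible - 1 < total   # columns hidden on the right
    return label, "<<<" if more_left else "", ">>>" if more_right else ""


print(column_indicator(1, 6, 235))
```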

=head1 Navigating the afsmonitor Display Screens

As noted, the lower left hand corner of every display screen displays
the names of the commands currently available for moving to alternate
screens, which can either be a different type or display more
statistics or machines of the current type. To execute a command,
press the lowercase version of the first letter in its name. Some
commands also have an uppercase version that has a somewhat different
effect, as indicated in the following list.

Switches to the C<Cache Managers> screen. Available only on the
C<System Overview> and C<File Servers> screens.

Switches to the C<File Servers> screen. Available only on the
C<System Overview> and the C<Cache Managers> screens.

Scrolls horizontally to the left, to access the data columns
situated to the left of the current set. Available when the <<<
symbol appears at the top left of the screen. Press uppercase B<L>
to scroll horizontally all the way to the left (to display the
first set of data columns).

Scrolls down vertically to the next page of machine names.
Available when there are two or more pages of machines and the
final page is not currently displayed. Press uppercase B<N> to
scroll to the final page.

Switches to the C<System Overview> screen. Available only on the
C<Cache Managers> and C<File Servers> screens.

Scrolls up vertically to the previous page of machine names.
Available when there are two or more pages of machines and the
first page is not currently displayed. Press uppercase B<P> to
scroll to the first page.

Scrolls horizontally to the right, to access the data columns
situated to the right of the current set. This command is
available when the >>> symbol appears at the upper right of the
screen. Press uppercase B<R> to scroll horizontally all the way to
the right (to display the final set of data columns).
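The availability rules in this list can be summarized as a small function. The Python sketch below is an illustration only: apart from C<next> and C<prev>, which the text names, the command labels (C<cm>, C<fs>, C<oview>, C<left>, C<right>) are placeholders chosen for this example, and the paging inputs are supplied by the caller.

```python
def available_commands(screen, page, pages, first_col=1, visible=1, total=1):
    """Return the set of commands that apply to the current display state."""
    cmds = set()
    # Screen switches are offered only from the other two screens.
    if screen != "Cache Managers":
        cmds.add("cm")
    if screen != "File Servers":
        cmds.add("fs")
    if screen != "System Overview":
        cmds.add("oview")
    # Vertical paging needs two or more pages.
    if pages >= 2 and page < pages:
        cmds.add("next")
    if pages >= 2 and page > 1:
        cmds.add("prev")
    # Horizontal scrolling follows the <<< and >>> symbols.
    if first_col > 1:
        cmds.add("left")
    if first_col + visible - 1 < total:
        cmds.add("right")
    return cmds


print(sorted(available_commands("System Overview", page=1, pages=1)))
```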

=head1 The System Overview Screen

The C<System Overview> screen appears automatically as the C<afsmonitor>
program initializes. This screen displays the status of as many File
Server and Cache Manager processes as can fit in the current window;
scroll down to access additional information.

The information on this screen is split into File Server information
on the left and Cache Manager information on the right. The header for
each grouping reports two pieces of information:

The number of machines on which the program is monitoring the
indicated process

The number of alerts and the number of machines affected by them
(an I<alert> means that a statistic has exceeded its threshold or a
process failed to respond to the last probe)

A list of the machines being monitored follows. If there are any
alerts on a machine, the number of them appears in square brackets to
the left of the hostname. If a process failed to respond to the last
probe, the letters C<PF> (probe failure) appear in square brackets to the
left of the hostname.
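The bracketed prefix can be illustrated with a short Python function. This is a sketch of the rule as described, not the program's formatting code: the exact spacing is an assumption, as is the choice to show C<PF> rather than a count when a probe has failed, and the host names are hypothetical.

```python
def overview_entry(hostname, alerts=0, probe_failed=False):
    """Format one machine line for the overview list."""
    if probe_failed:
        return f"[PF] {hostname}"       # probe failure
    if alerts:
        return f"[{alerts}] {hostname}"  # number of alerts on this machine
    return hostname


for entry in (overview_entry("fs1.example.com", alerts=3),
              overview_entry("fs2.example.com", probe_failed=True),
              overview_entry("fs3.example.com")):
    print(entry)
```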

=head1 The File Servers Screen

The C<File Servers> screen displays the values collected at the most
recent probe for File Server statistics.

A summary line at the top of the screen (just below the standard
program version and screen title blocks) specifies the number of
monitored File Servers, the number of alerts, and the number of
machines affected by the alerts.

The first column always displays the hostnames of the machines running
the monitored File Servers.

To the right of the hostname column appear as many columns of
statistics as can fit within the current width of the display screen
or window; each column requires space for 10 characters. The name of
the statistic appears at the top of each column. If the File Server on
a machine did not respond to the most recent probe, a pair of dashes
(--) appears in each column. If a value exceeds its configured
threshold, it is highlighted in reverse video. If a value is too large
to fit into the allotted column width, it overflows into the next row
in the same column.
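The column layout can be sketched in Python as follows. The 10-character column width comes from the text. The width of the hostname column, the right alignment, and the function names are assumptions made for this example, and overflow of oversized values is not modeled.

```python
COLUMN_WIDTH = 10  # from the text: each column requires 10 characters


def columns_that_fit(screen_width, host_width=20):
    """How many statistic columns fit beside the hostname column."""
    return max(0, (screen_width - host_width) // COLUMN_WIDTH)


def format_row(hostname, values, responded=True, host_width=20):
    """Format one machine's row; show -- in every column after a failed probe."""
    cells = []
    for value in values:
        text = str(value) if responded else "--"
        cells.append(text.rjust(COLUMN_WIDTH))
    return hostname.ljust(host_width) + "".join(cells)


print(columns_that_fit(80))
print(format_row("fs1.example.com", [12, 3456]))
print(format_row("fs2.example.com", [0, 0], responded=False))
```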

=head1 The Cache Managers Screen

The C<Cache Managers> screen displays the values collected at the most
recent probe for Cache Manager statistics.

A summary line at the top of the screen (just below the standard
program version and screen title blocks) specifies the number of
monitored Cache Managers, the number of alerts, and the number of
machines affected by the alerts.

The first column always displays the hostnames of the machines running
the monitored Cache Managers.

To the right of the hostname column appear as many columns of
statistics as can fit within the current width of the display screen
or window; each column requires space for 10 characters. The name of
the statistic appears at the top of each column. If the Cache Manager
on a machine did not respond to the most recent probe, a pair of
dashes (--) appears in each column. If a value exceeds its configured
threshold, it is highlighted in reverse video. If a value is too large
to fit into the allotted column width, it overflows into the next row
in the same column.

=head1 Writing to an Output File

Include the B<-output> argument to name the file into which the
C<afsmonitor> program writes all of the statistics it collects. The
output file can be useful for tracking performance over long periods
of time, and enables the administrator to apply post-processing
techniques that reveal system trends. The AFS distribution does not
include any post-processing programs.

The output file is in ASCII format and records the same information as
the File Server and Cache Manager display screens. Each line in the
file uses the following format to record the time at which the
C<afsmonitor> program gathered the indicated statistic from the Cache
Manager (C<CM>) or File Server (C<FS>) running on the machine called
I<host_name>. If a probe failed, the error code B<-1> appears in the
I<statistic> field.

I<time> I<host_name> CM|FS I<statistic>
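Because the AFS distribution includes no post-processing programs, a site has to write its own. The Python sketch below splits one output line into the four fields of the format shown above. It rests on assumptions: fields are taken to be separated by whitespace, the C<CM> or C<FS> token is used as the anchor because the exact form of the time field is not specified here, and the sample lines are invented.

```python
def parse_output_line(line):
    """Split a line of the form: time host_name CM|FS statistic."""
    tokens = line.split()
    for i, token in enumerate(tokens):
        # The type marker must leave room for a time and a host name before it.
        if token in ("CM", "FS") and i >= 2:
            statistic = " ".join(tokens[i + 1:])
            return {
                "time": " ".join(tokens[:i - 1]),
                "host": tokens[i - 1],
                "kind": token,
                "statistic": statistic,
                "probe_failed": statistic == "-1",  # error code for a failed probe
            }
    raise ValueError(f"unrecognized line: {line!r}")


# Invented sample lines; real files may format the fields differently.
print(parse_output_line("12:34:56 fs1.example.com FS 42"))
print(parse_output_line("Mon Jan 1 12:34:56 2001 client1.example.com CM -1"))
```

Anchoring on the type marker keeps the sketch usable whether the time field is one token or several.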

If the administrator usually reviews the output file manually, rather
than using it as input to an automated analysis program or script,
including the B<-detailed> flag formats the data in a more easily readable
format.

For examples of commands, display screens, and configuration files,
see the section about the C<afsmonitor> program in the IBM AFS
Administration Guide.

=head1 PRIVILEGE REQUIRED

The following software must be accessible to a machine where the
C<afsmonitor> program is running:

The AFS B<xstat> libraries, which the C<afsmonitor> program uses to
gather data

The B<curses> graphics package, which most UNIX distributions provide
as a standard utility

The C<afsmonitor> screens format successfully both on so-called dumb
terminals and in windowing systems that emulate terminals. For the
output to look its best, the display environment needs to support
reverse video and cursor addressing. Set the TERM environment variable
to the correct terminal type, or to a value that has characteristics
similar to the actual terminal type. The display window or terminal
must be at least 80 columns wide and 12 lines long.
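A wrapper script can check these display requirements before starting the program. The Python sketch below is one way to do so, using the standard C<shutil.get_terminal_size> call. The function name is invented, and the check only confirms that TERM is set, not that its value matches the terminal.

```python
import os
import shutil

MIN_COLUMNS, MIN_LINES = 80, 12  # minimum display size from the text


def terminal_ok(columns, lines, term):
    """True if TERM is set and the display meets the minimum size."""
    return bool(term) and columns >= MIN_COLUMNS and lines >= MIN_LINES


size = shutil.get_terminal_size()  # falls back to 80x24 when unknown
print(terminal_ok(size.columns, size.lines, os.environ.get("TERM")))
```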

The C<afsmonitor> program must run in the foreground, and in its own
separate, dedicated window or terminal. The window or terminal is
unavailable for any other activity as long as the C<afsmonitor> program
is running. Any number of instances of the C<afsmonitor> program can run
on a single machine, as long as each instance runs in its own
dedicated window or terminal. Note that it can take up to three
minutes to start an additional instance.

IBM Corporation 2000. <http://www.ibm.com/> All Rights Reserved.

Converted from html to pod by Alf Wachsmann <alfw@slac.stanford.edu>, 2003,
Stanford Linear Accelerator Center, a department of Stanford University.

L<afsmonitor_Configuration_File(1)>,