1 <?xml version="1.0" encoding="UTF-8"?>
2 <refentry id="udebug1">
4 <refentrytitle>udebug</refentrytitle>
5 <manvolnum>1</manvolnum>
8 <refname>udebug</refname>
9 <refpurpose>Reports Ubik process status for a database server process</refpurpose>
12 <title>Synopsis</title>
<para><emphasis role="bold">udebug</emphasis> <emphasis role="bold">-servers</emphasis> &lt;<emphasis>server machine</emphasis>&gt; [<emphasis role="bold">-port</emphasis> &lt;<emphasis>IP port</emphasis>&gt;]
[<emphasis role="bold">-long</emphasis>] [<emphasis role="bold">-help</emphasis>]</para>
<para><emphasis role="bold">udebug</emphasis> <emphasis role="bold">-s</emphasis> &lt;<emphasis>server machine</emphasis>&gt; [<emphasis role="bold">-p</emphasis> &lt;<emphasis>IP port</emphasis>&gt;] [<emphasis role="bold">-l</emphasis>] [<emphasis role="bold">-h</emphasis>]</para>
20 <title>Description</title>
21 <para>The <emphasis role="bold">udebug</emphasis> command displays the status of the lightweight Ubik process
22 for the database server process identified by the <emphasis role="bold">-port</emphasis> argument that
23 is running on the database server machine named by the <emphasis role="bold">-servers</emphasis>
24 argument. The output identifies the machines where peer database server
25 processes are running, which of them is the synchronization site (Ubik
26 coordinator), and the status of the connections between them.</para>
30 <title>Options</title>
<term><emphasis role="bold">-servers</emphasis> &lt;<emphasis>server machine</emphasis>&gt;</term>
<para>Names the database server machine that is running the process for which to
display status information. Provide the machine's IP address in dotted
decimal format, its fully qualified host name (for example,
<emphasis role="bold">fs1.abc.com</emphasis>), or the shortest abbreviated form of its host name that
distinguishes it from other machines. Successful use of an abbreviated
form depends on the availability of a name resolution service (such as the
Domain Name Service or a local host table) at the time the command is
issued.</para>
<term><emphasis role="bold">-port</emphasis> &lt;<emphasis>IP port</emphasis>&gt;</term>
49 <para>Identifies the database server process for which to display status
50 information, either by its process name or port number. Provide one of the
51 following values.</para>
55 <term><emphasis role="bold">buserver</emphasis> or 7021 for the Backup Server</term>
62 <term><emphasis role="bold">kaserver</emphasis> or 7004 for the Authentication Server</term>
69 <term><emphasis role="bold">ptserver</emphasis> or 7002 for the Protection Server</term>
76 <term><emphasis role="bold">vlserver</emphasis> or 7003 for the Volume Location Server</term>
86 <term><emphasis role="bold">-long</emphasis></term>
88 <para>Reports additional information about each peer of the machine named by the
89 <emphasis role="bold">-servers</emphasis> argument. The information appears by default if that machine
90 is the synchronization site.</para>
95 <term><emphasis role="bold">-help</emphasis></term>
<para>Prints the online help for this command. All other valid options are
ignored.</para>
105 <title>Output</title>
<para>Several of the messages in the output provide basic status information
about the Ubik process on the machine specified by the <emphasis role="bold">-servers</emphasis>
argument, and the remaining messages are useful mostly for debugging.</para>
111 <para>To check basic Ubik status, issue the command for each database server
112 machine in turn. In the output for each, one of the following messages
113 appears in the top third of the output.</para>
116 I am sync site . . . (&lt;#_sites&gt; servers)
123 <para>For the synchronization site, the following message indicates that all
124 sites have the same version of the database, which implies that Ubik is
125 functioning correctly. See the following for a description of values other
126 than <computeroutput>1f</computeroutput>.</para>
132 <para>For correct Ubik operation, the database server machine clocks must agree
133 on the time. The following messages, which are the second and third lines
134 in the output, report the current date and time according to the database
135 server machine's clock and the clock on the machine where the <emphasis role="bold">udebug</emphasis>
136 command is issued.</para>
139 Host's &lt;IP_addr&gt; time is &lt;dbserver_date/time&gt;
140 Local time is &lt;local_date/time&gt; (time differential &lt;skew&gt; secs)
<para>The &lt;skew&gt; is the difference between the database server machine clock and
the local clock. Its absolute value is not vital for Ubik functioning, but
a difference of more than a few seconds between the <emphasis>skew</emphasis> values for the
database server machines indicates that their clocks are not synchronized
and that Ubik performance may suffer.</para>
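<para>The comparison of skew values can be sketched in Python as follows. This
is only an illustration: the host names and the two extra sample lines are
hypothetical, and the regular expression assumes the
<computeroutput>Local time is</computeroutput> line has exactly the format shown above.</para>

<programlisting><![CDATA[
```python
import re

# Matches the "(time differential <skew> secs)" part of the third output line.
SKEW_RE = re.compile(r"\(time differential (-?\d+) secs\)")


def parse_skew(output):
    """Return the skew in seconds found in udebug output, or None."""
    match = SKEW_RE.search(output)
    return int(match.group(1)) if match else None


def skew_spread(skews):
    """Return the largest difference between any two skew values."""
    values = list(skews)
    return max(values) - min(values)


# Hypothetical lines collected by running udebug against three database
# server machines from the same client; the host names are made up.
samples = {
    "db1": "Local time is Wed Oct 27 09:49:52 1999 (time differential 2 secs)",
    "db2": "Local time is Wed Oct 27 09:49:53 1999 (time differential 1 secs)",
    "db3": "Local time is Wed Oct 27 09:50:08 1999 (time differential -247 secs)",
}

skews = {host: parse_skew(text) for host, text in samples.items()}
spread = skew_spread(skews.values())
print(skews, spread)
```
]]></programlisting>

<para>A large spread between the per-server values, rather than a large single
value, is what suggests that the database server machine clocks disagree.</para>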
149 <para>Following is a description of all messages in the output. As noted, it is
150 useful mostly for debugging and most meaningful to someone who understands
151 Ubik's implementation.</para>
153 <para>The output begins with the following messages. The first message reports
154 the IP addresses that are configured with the operating system on the
155 machine specified by the <emphasis role="bold">-servers</emphasis> argument. As previously noted, the
156 second and third messages report the current date and time according to
157 the clocks on the database server machine and the machine where the
158 <emphasis role="bold">udebug</emphasis> command is issued, respectively. All subsequent timestamps in
159 the output are expressed in terms of the local clock rather than the
160 database server machine clock.</para>
163 Host's addresses are: &lt;list_of_IP_addrs&gt;
164 Host's &lt;IP_addr&gt; time is &lt;dbserver_date/time&gt;
165 Local time is &lt;local_date/time&gt; (time differential &lt;skew&gt; secs)
<para>If the &lt;skew&gt; is more than about 10 seconds, the following message
appears. As noted, it does not necessarily indicate Ubik malfunction: it
denotes clock skew between the database server machine and the local
machine, rather than among the database server machines.</para>
177 <para>If the udebug command is issued during the coordinator election process
178 and voting has not yet begun, the following message appears next.</para>
181 Last yes vote not cast yet
184 <para>Otherwise, the output continues with the following messages.</para>
187 Last yes vote for &lt;sync_IP_addr&gt; was &lt;last_vote&gt; secs ago (sync site);
188 Last vote started &lt;vote_start&gt; secs ago (at &lt;date/time&gt;)
189 Local db version is &lt;db_version&gt;
<para>The first message indicates which peer this Ubik process last voted for as
coordinator (it can vote for itself) and how long ago it sent the vote.
The second message indicates how long ago the Ubik coordinator requested
confirming votes from the secondary sites. Usually, the &lt;last_vote&gt; and
&lt;vote_start&gt; values are the same; a difference between them can indicate
clock skew or a slow network connection between the two database server
machines. A small difference is not harmful. The third message reports the
current version number &lt;db_version&gt; of the database maintained by this
Ubik process. It has two fields separated by a period. The field before
the period is based on a timestamp that reflects when the database first
changed after the most recent coordinator election, and the field after
the period indicates the number of changes since the election.</para>
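<para>As a rough illustration, the two fields of a version number can be split
apart in Python. The sketch assumes that the first field is a count of
seconds since the Unix epoch, which the text above does not state
explicitly; the function name is hypothetical.</para>

<programlisting><![CDATA[
```python
from datetime import datetime, timezone


def parse_db_version(version):
    """Split '<timestamp>.<counter>' into a UTC datetime and a change count.

    Assumes the first field is seconds since the Unix epoch.
    """
    epoch_text, counter_text = version.split(".", 1)
    first_change = datetime.fromtimestamp(int(epoch_text), tz=timezone.utc)
    return first_change, int(counter_text)


# Version number taken from the Examples section of this page.
first_change, changes = parse_db_version("940902602.674")
print(first_change.isoformat(), changes)
```
]]></programlisting>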
205 <para>The output continues with messages that differ depending on whether the
206 Ubik process is the coordinator or not.</para>
210 <para>If there is only one database server machine, it is always the coordinator
211 (synchronization site), as indicated by the following message.</para>
214 I am sync site forever (1 server)
219 <para>If there are multiple database sites, and the <emphasis role="bold">-servers</emphasis> argument names
220 the coordinator (synchronization site), the output continues with the
221 following two messages.</para>
224 I am sync site until &lt;expiration&gt; secs from now (at &lt;date/time&gt;)
225 (&lt;#_sites&gt; servers)
226 Recovery state &lt;flags&gt;
229 <para>The first message (which is reported on one line) reports how much longer
230 the site remains coordinator even if the next attempt to maintain quorum
231 fails, and how many sites are participating in the quorum. The <emphasis>flags</emphasis>
232 field in the second message is a hexadecimal number that indicates the
233 current state of the quorum. A value of <computeroutput>1f</computeroutput> indicates complete database
234 synchronization, whereas a value of <computeroutput>f</computeroutput> means that the coordinator has
235 the correct database but cannot contact all secondary sites to determine
236 if they also have it. Lesser values are acceptable if the <emphasis role="bold">udebug</emphasis>
237 command is issued during coordinator election, but they denote a problem
238 if they persist. The individual flags have the following meanings:</para>
244 <para>This machine is the coordinator.</para>
251 <para>The coordinator has determined which site has the database with the
252 highest version number.</para>
<para>The coordinator has a copy of the database with the highest version
number.</para>
267 <para>The database's version number has been updated correctly.</para>
274 <para>All sites have the database with the highest version number.</para>
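<para>The following Python sketch decodes a recovery state value into the
meanings listed above. The bit assigned to each meaning is an assumption
inferred from the order of the list and from the <computeroutput>1f</computeroutput> and
<computeroutput>f</computeroutput> values described earlier; the function names are hypothetical.</para>

<programlisting><![CDATA[
```python
# Assumed bit assignments, lowest bit first, in the order the flags are
# described in the text. Verify them before relying on the result.
RECOVERY_FLAGS = [
    (0x01, "this machine is the coordinator"),
    (0x02, "coordinator knows which site has the highest database version"),
    (0x04, "coordinator has a copy of the highest database version"),
    (0x08, "database version number has been updated correctly"),
    (0x10, "all sites have the highest database version"),
]


def decode_recovery_state(flags_hex):
    """Return the meanings of the flags that are set in a hex string."""
    value = int(flags_hex, 16)
    return [label for bit, label in RECOVERY_FLAGS if value & bit]


def missing_flags(flags_hex):
    """Return the meanings of the flags that are not set."""
    value = int(flags_hex, 16)
    return [label for bit, label in RECOVERY_FLAGS if not value & bit]


for state in ("1f", "f"):
    print(state, "missing:", missing_flags(state))
```
]]></programlisting>

<para>Under these assumptions a value of <computeroutput>f</computeroutput> lacks only the last flag, which
matches the description of a coordinator that cannot confirm the state of
all secondary sites.</para>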
279 <para>If the udebug command is issued while the coordinator is writing a change
280 into the database, the following additional message appears.</para>
I am currently managing write transaction &lt;identifier&gt;
288 <para>If the <emphasis role="bold">-servers</emphasis> argument names a secondary site, the output continues
289 with the following messages.</para>
293 Lowest host &lt;lowest_IP_addr&gt; was set &lt;low_time&gt; secs ago
294 Sync host &lt;sync_IP_addr&gt; was set &lt;sync_time&gt; secs ago
<para>The &lt;lowest_IP_addr&gt; is the lowest IP address of any peer from which the
Ubik process has received a message recently, whereas the &lt;sync_IP_addr&gt;
is the IP address of the current coordinator. If they differ, the machine
with the lowest IP address is not currently the coordinator. The Ubik
process continues voting for the current coordinator as long as they
remain in contact, which provides for maximum stability. However, in the
event of another coordinator election, this Ubik process votes for the
&lt;lowest_IP_addr&gt; site instead (assuming they are in contact), because it
has a bias to vote in elections for the site with the lowest IP address.</para>
<para>For both the synchronization and secondary sites, the output continues
with the following messages. The first message reports the version number
of the database at the synchronization site, which needs to match the
&lt;db_version&gt; reported by the preceding <computeroutput>Local db version</computeroutput> message. The
second message indicates how many VLDB records are currently locked for
any operation or for writing in particular. The values are nonzero if the
<emphasis role="bold">udebug</emphasis> command is issued while an operation is in progress.</para>
318 Sync site's db version is &lt;db_version&gt;
319 &lt;locked&gt; locked pages, &lt;writes&gt; of them for write
322 <para>The following messages appear next only if there are any read or write
323 locks on database records:</para>
326 There are read locks held
327 There are write locks held
<para>Similarly, one or more of the following messages appear next only if there
are any read or write transactions in progress when the <emphasis role="bold">udebug</emphasis> command
is issued.</para>
335 There is an active write transaction
336 There is at least one active read transaction
337 Transaction tid is &lt;tid&gt;
<para>If the machine named by the <emphasis role="bold">-servers</emphasis> argument is the coordinator, the
next message reports when the current coordinator last updated the
database.</para>
345 Last time a new db version was labelled was:
346 &lt;last_restart&gt; secs ago (at &lt;date/time&gt;)
349 <para>If the machine named by the <emphasis role="bold">-servers</emphasis> argument is the coordinator, the
350 output concludes with an entry for each secondary site that is
351 participating in the quorum, in the following format.</para>
354 Server (&lt;IP_address&gt;): (db &lt;db_version&gt;)
355 last vote rcvd &lt;last_vote&gt; secs ago (at &lt;date/time&gt;),
356 last beacon sent &lt;last_beacon&gt; secs ago (at &lt;date/time&gt;),
357 last vote was { yes | no }
358 dbcurrent={ 0 | 1 }, up={ 0 | 1 } beaconSince={ 0 | 1 }
<para>The first line reports the site's IP address and the version number of the
database it is maintaining. The &lt;last_vote&gt; field reports how long ago the
coordinator received a vote message from the Ubik process at the site, and
the &lt;last_beacon&gt; field reports how long ago the coordinator last requested
a vote message. If the <emphasis role="bold">udebug</emphasis> command is issued during the coordinator
election process and voting has not yet begun, the following messages
appear instead.</para>
371 Last beacon never sent
374 <para>On the final line of each entry, the fields have the following meaning:</para>
378 <para><computeroutput>dbcurrent</computeroutput> is <computeroutput>1</computeroutput> if the site has the database with the highest version
379 number, <computeroutput>0</computeroutput> if it does not.</para>
383 <para><computeroutput>up</computeroutput> is <computeroutput>1</computeroutput> if the Ubik process at the site is functioning correctly,
384 <computeroutput>0</computeroutput> if it is not.</para>
388 <para><computeroutput>beaconSince</computeroutput> is <computeroutput>1</computeroutput> if the site has responded to the coordinator's last
389 request for votes, <computeroutput>0</computeroutput> if it has not.</para>
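<para>The final line of a peer entry can be read programmatically, as in the
Python sketch below. The function names are hypothetical, and treating a
peer as healthy only when all three fields are <computeroutput>1</computeroutput> is an interpretation
of the descriptions above rather than a rule stated by this page.</para>

<programlisting><![CDATA[
```python
import re

# Matches each name=value pair on the final line of a peer entry.
STATUS_RE = re.compile(r"\b(dbcurrent|up|beaconSince)=([01])")

FIELDS = ("dbcurrent", "up", "beaconSince")


def parse_peer_status(line):
    """Return a dict mapping each field found on the line to True or False."""
    return {name: value == "1" for name, value in STATUS_RE.findall(line)}


def peer_is_healthy(status):
    """Return True only if every expected field is present and set to 1."""
    return all(status.get(name, False) for name in FIELDS)


status = parse_peer_status("dbcurrent=1, up=1 beaconSince=1")
print(status, peer_is_healthy(status))
```
]]></programlisting>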
<para>Including the <emphasis role="bold">-long</emphasis> flag produces peer entries even when the
<emphasis role="bold">-servers</emphasis> argument names a secondary site, but in that case only the
<emphasis>IP_address</emphasis> field is guaranteed to be accurate. For example, the value
in the &lt;db_version&gt; field is usually <computeroutput>0.0</computeroutput>, because secondary sites do
not poll their peers for this information. The values in the <emphasis>last_vote</emphasis>
and <emphasis>last_beacon</emphasis> fields indicate when this site last received or
requested a vote as coordinator; they generally indicate the time of the
last coordinator election.</para>
404 <title>Examples</title>
405 <para>This example checks the status of the Ubik process for the Volume Location
406 Server on the machine <computeroutput>afs1</computeroutput>, which is the synchronization site.</para>
409 % udebug afs1 vlserver
410 Host's addresses are: 192.12.107.33
411 Host's 192.12.107.33 time is Wed Oct 27 09:49:50 1999
412 Local time is Wed Oct 27 09:49:52 1999 (time differential 2 secs)
413 Last yes vote for 192.12.107.33 was 1 secs ago (sync site);
414 Last vote started 1 secs ago (at Wed Oct 27 09:49:51 1999)
415 Local db version is 940902602.674
416 I am sync site until 58 secs from now (at Wed Oct 27 09:50:50 1999) (3 servers)
418 Sync site's db version is 940902602.674
419 0 locked pages, 0 of them for write
420 Last time a new db version was labelled was:
421 129588 secs ago (at Mon Oct 25 21:50:04 1999)
425 Server( 192.12.107.35 ): (db 940902602.674)
426 last vote rcvd 2 secs ago (at Wed Oct 27 09:49:50 1999),
427 last beacon sent 1 secs ago (at Wed Oct 27 09:49:51 1999), last vote was yes
428 dbcurrent=1, up=1 beaconSince=1
432 Server( 192.12.107.34 ): (db 940902602.674)
433 last vote rcvd 2 secs ago (at Wed Oct 27 09:49:50 1999),
434 last beacon sent 1 secs ago (at Wed Oct 27 09:49:51 1999), last vote was yes
435 dbcurrent=1, up=1 beaconSince=1
438 <para>This example checks the status of the Authentication Server on the machine
439 with IP address 192.12.107.34, which is a secondary site. The local clock
440 is about 4 minutes behind the database server machine's clock.</para>
443 % udebug 192.12.107.34 7004
444 Host's addresses are: 192.12.107.34
445 Host's 192.12.107.34 time is Wed Oct 27 09:54:15 1999
446 Local time is Wed Oct 27 09:50:08 1999 (time differential -247 secs)
448 Last yes vote for 192.12.107.33 was 6 secs ago (sync site);
449 Last vote started 6 secs ago (at Wed Oct 27 09:50:02 1999)
450 Local db version is 940906574.25
452 Lowest host 192.12.107.33 was set 6 secs ago
453 Sync host 192.12.107.33 was set 6 secs ago
454 Sync site's db version is 940906574.25
455 0 locked pages, 0 of them for write
460 <title>Privilege Required</title>
465 <title>See Also</title>
466 <para><link linkend="buserver8">buserver(8)</link>,
467 <link linkend="kaserver8">kaserver(8)</link>,
468 <link linkend="ptserver8">ptserver(8)</link>,
469 <link linkend="vlserver8">vlserver(8)</link></para>
473 <title>Copyright</title>
<para>IBM Corporation 2000. &lt;http://www.ibm.com/&gt; All Rights Reserved.</para>
476 <para>This documentation is covered by the IBM Public License Version 1.0. It was
477 converted from HTML to POD by software written by Chas Williams and Russ
478 Allbery, based on work by Alf Wachsmann and Elizabeth Cassell.</para>