1 <?xml version="1.0" encoding="UTF-8"?>
3 <title>Administering Server Machines</title>
7 <primary>server machine</primary>
9 <secondary>administering</secondary>
13 <primary>administering</primary>
15 <secondary>server machine</secondary>
18 This chapter describes how to administer an AFS server machine. It describes the following configuration information and
19 administrative tasks: <itemizedlist>
21 <para>The binary and configuration files that must reside in the subdirectories of the <emphasis
22 role="bold">/usr/afs</emphasis> directory on every server machine's local disk; see <link linkend="HDRWQ83">Local Disk Files
23 on a Server Machine</link>.</para>
27 <para>The various <emphasis>roles</emphasis> or functions that an AFS server machine can perform, and how to determine which
28 machines perform each role; see <link linkend="HDRWQ90">The Four Roles for File Server Machines</link>.</para>
32 <para>How to maintain database server machines; see <link linkend="HDRWQ101">Administering Database Server
33 Machines</link>.</para>
37 <para>How to maintain the list of database server machines in the <emphasis role="bold">/usr/afs/etc/CellServDB</emphasis>
38 file; see <link linkend="HDRWQ118">Maintaining the Server CellServDB File</link>.</para>
42 <para>How to control authorization checking on a server machine; see <link linkend="HDRWQ123">Managing Authentication and
43 Authorization Requirements</link>.</para>
47 <para>How to install new disks or partitions on a file server machine; see <link linkend="HDRWQ130">Adding or Removing Disks
48 and Partitions</link>.</para>
52 <para>How to change a server machine's IP addresses and manage VLDB server entries; see <link linkend="HDRWQ138">Managing
53 Server IP Addresses and VLDB Server Entries</link>.</para>
57 <para>How to reboot a file server machine; see <link linkend="HDRWQ139">Rebooting a Server Machine</link>.</para>
59 </itemizedlist></para>
61 <para>To learn how to install and configure a new server machine, see the <emphasis>OpenAFS Quick Beginnings</emphasis>.</para>
63 <para>To learn how to administer the server processes themselves, see <link linkend="HDRWQ142">Monitoring and Controlling Server
64 Processes</link>.</para>
66 <para>To learn how to administer volumes, see <link linkend="HDRWQ174">Managing Volumes</link>.</para>
69 <title>Summary of Instructions</title>
71 <para>This chapter explains how to perform the following tasks by using the indicated commands:</para>
73 <informaltable frame="none">
75 <colspec colwidth="70*" />
77 <colspec colwidth="30*" />
81 <entry>Install new binaries</entry>
83 <entry><emphasis role="bold">bos install</emphasis></entry>
87 <entry>Examine binary check-and-restart time</entry>
89 <entry><emphasis role="bold">bos getrestart</emphasis></entry>
93 <entry>Set binary check-and-restart time</entry>
95 <entry><emphasis role="bold">bos setrestart</emphasis></entry>
99 <entry>Examine compilation dates on binary files</entry>
101 <entry><emphasis role="bold">bos getdate</emphasis></entry>
105 <entry>Restart a process to use new binaries</entry>
107 <entry><emphasis role="bold">bos restart</emphasis></entry>
111 <entry>Revert to old version of binaries</entry>
113 <entry><emphasis role="bold">bos uninstall</emphasis></entry>
117 <entry>Remove obsolete <emphasis role="bold">.BAK</emphasis> and <emphasis role="bold">.OLD</emphasis> versions</entry>
119 <entry><emphasis role="bold">bos prune</emphasis></entry>
123 <entry>List partitions on a file server machine</entry>
125 <entry><emphasis role="bold">vos listpart</emphasis></entry>
129 <entry>Shut down AFS server processes</entry>
131 <entry><emphasis role="bold">bos shutdown</emphasis></entry>
135 <entry>List volumes on a partition</entry>
137 <entry><emphasis role="bold">vos listvldb</emphasis></entry>
141 <entry>Move read/write volumes</entry>
143 <entry><emphasis role="bold">vos move</emphasis></entry>
147 <entry>List a cell's database server machines</entry>
149 <entry><emphasis role="bold">bos listhosts</emphasis></entry>
153 <entry>Add a database server machine to server <emphasis role="bold">CellServDB</emphasis> file</entry>
155 <entry><emphasis role="bold">bos addhost</emphasis></entry>
159 <entry>Remove a database server machine from server <emphasis role="bold">CellServDB</emphasis> file</entry>
161 <entry><emphasis role="bold">bos removehost</emphasis></entry>
165 <entry>Set authorization checking requirements</entry>
167 <entry><emphasis role="bold">bos setauth</emphasis></entry>
171 <entry>Prevent authentication for <emphasis role="bold">bos</emphasis>, <emphasis role="bold">pts</emphasis>, and
172 <emphasis role="bold">vos</emphasis> commands</entry>
174 <entry>Include <emphasis role="bold">-noauth</emphasis> flag</entry>
178 <entry>Prevent authentication for <emphasis role="bold">kas</emphasis> commands</entry>
180 <entry>Include <emphasis role="bold">-noauth</emphasis> flag on some commands or issue <emphasis
181 role="bold">noauthentication</emphasis> while in interactive mode</entry>
185 <entry>Display all VLDB server entries</entry>
187 <entry><emphasis role="bold">vos listaddrs</emphasis></entry>
191 <entry>Remove a VLDB server entry</entry>
193 <entry><emphasis role="bold">vos changeaddr</emphasis></entry>
197 <entry>Reboot a server machine remotely</entry>
199 <entry><emphasis role="bold">bos exec</emphasis> <emphasis>reboot_command</emphasis></entry>
207 <title>Local Disk Files on a Server Machine</title>
209 <para>Several types of files must reside in the subdirectories of the <emphasis role="bold">/usr/afs</emphasis> directory on an
210 AFS server machine's local disk. They include binaries, configuration files, the administrative database files (on database
211 server machines), log files, and volume header files.</para>
213 <para><emphasis role="bold">Note for Windows users:</emphasis> Some files described in this document might not exist on
214 machines that run a Windows operating system. Also, Windows uses a backslash (<emphasis role="bold">\</emphasis>) rather than a
215 forward slash (<emphasis role="bold">/</emphasis>) to separate the elements in a pathname.</para>
218 <primary>usr/afs/bin directory on server machines</primary>
220 <secondary>contents listed</secondary>
224 <primary>directory</primary>
226 <secondary>/usr/afs/bin on server machines</secondary>
230 <primary>server process binaries</primary>
232 <secondary>in /usr/afs/bin</secondary>
236 <title>Binaries in the /usr/afs/bin Directory</title>
238 <para>The <emphasis role="bold">/usr/afs/bin</emphasis> directory stores the AFS server process and command suite binaries
239 appropriate for the machine's system (CPU and operating system) type. If a process has both a server portion and a client
240 portion (as with the Update Server) or if it has separate components (as with the <emphasis role="bold">fs</emphasis>
241 process), each component resides in a separate file.</para>
243 <para>To ensure predictable system performance, all file server machines must run the same AFS build version of a given
244 process. To maintain consistency easily, use the Update Server process to distribute binaries from a binary distribution
245 machine of each system type, as described further in <link linkend="HDRWQ93">Binary Distribution Machines</link>.</para>
247 <para>It is best to keep the binaries for all processes in the <emphasis role="bold">/usr/afs/bin</emphasis> directory, even
248 if you do not run the process actively on the machine. Doing so simplifies reconfiguring a machine (for example,
249 adding database server functionality to an existing file server machine). Similarly, it is best to keep the command suite
250 binaries in the directory, even if you do not often issue commands while working on the server machine. Doing so enables you to
251 issue commands during recovery from server and machine outages.</para>
253 <para>The following lists the binary files in the <emphasis role="bold">/usr/afs/bin</emphasis> directory that are directly
254 related to the AFS server processes or command suites. Other binaries (for example, for the <emphasis
255 role="bold">klog</emphasis> command) sometimes appear in this directory on a particular file server machine's disk or in an
256 AFS distribution. <variablelist>
258 <primary>files</primary>
260 <secondary>backup command binary</secondary>
264 <primary>backup commands</primary>
266 <secondary>binary in /usr/afs/bin</secondary>
270 <term><emphasis role="bold">backup</emphasis></term>
273 <para>The command suite for the AFS Backup System (the binary for the Backup Server is <emphasis
274 role="bold">buserver</emphasis>).</para>
277 <primary>files</primary>
279 <secondary>bos command binary</secondary>
283 <primary>bos commands</primary>
285 <secondary>binary in /usr/afs/bin</secondary>
291 <term><emphasis role="bold">bos</emphasis></term>
294 <para>The command suite for communicating with the Basic OverSeer (BOS) Server (the binary for the BOS Server is
295 <emphasis role="bold">bosserver</emphasis>).</para>
298 <primary>bosserver</primary>
300 <secondary>binary in /usr/afs/bin</secondary>
304 <primary>bosserver</primary>
306 <secondary></secondary>
308 <see>BOS Server</see>
312 <primary>files</primary>
314 <secondary>bosserver binary</secondary>
318 <primary>programs</primary>
320 <secondary>bosserver</secondary>
324 <primary>processes</primary>
326 <secondary>BOS Server, binary in /usr/afs/bin</secondary>
330 <primary>BOS Server</primary>
332 <secondary>binary in /usr/afs/bin</secondary>
338 <term><emphasis role="bold">bosserver</emphasis></term>
341 <para>The binary for the Basic OverSeer (BOS) Server process.</para>
344 <primary>buserver</primary>
346 <secondary>binary in /usr/afs/bin</secondary>
350 <primary>buserver</primary>
352 <secondary></secondary>
354 <see>Backup Server</see>
358 <primary>files</primary>
360 <secondary>buserver</secondary>
364 <primary>programs</primary>
366 <secondary>buserver</secondary>
370 <primary>processes</primary>
372 <secondary>Backup Server, binary in /usr/afs/bin</secondary>
376 <primary>Backup Server</primary>
378 <secondary>binary in /usr/afs/bin</secondary>
384 <term><emphasis role="bold">buserver</emphasis></term>
387 <para>The binary for the Backup Server process.</para>
390 <primary>fileserver</primary>
392 <secondary>binary in /usr/afs/bin</secondary>
396 <primary>fileserver</primary>
398 <secondary></secondary>
400 <see>File Server</see>
404 <primary>files</primary>
406 <secondary>fileserver</secondary>
410 <primary>programs</primary>
412 <secondary>fileserver</secondary>
416 <primary>processes</primary>
418 <secondary>File Server, binary in /usr/afs/bin</secondary>
422 <primary>File Server</primary>
424 <secondary>binary in /usr/afs/bin</secondary>
430 <term><emphasis role="bold">fileserver</emphasis></term>
433 <para>The binary for the File Server component of the <emphasis role="bold">fs</emphasis> process.</para>
436 <primary>files</primary>
438 <secondary>kas command binary</secondary>
442 <primary>kas commands</primary>
444 <secondary>binary in /usr/afs/bin</secondary>
450 <term><emphasis role="bold">kas</emphasis></term>
453 <para>The command suite for communicating with the Authentication Server (the binary for the Authentication Server is
454 <emphasis role="bold">kaserver</emphasis>).</para>
457 <primary>kaserver process</primary>
459 <secondary>binary in /usr/afs/bin</secondary>
463 <primary>kaserver process</primary>
465 <secondary></secondary>
467 <see>Authentication Server</see>
471 <primary>files</primary>
473 <secondary>kaserver binary file</secondary>
477 <primary>programs</primary>
479 <secondary>kaserver</secondary>
483 <primary>processes</primary>
485 <secondary>Authentication Server, binary in /usr/afs/bin</secondary>
489 <primary>Authentication Server</primary>
491 <secondary>binary in /usr/afs/bin</secondary>
497 <term><emphasis role="bold">kaserver</emphasis></term>
500 <para>The binary for the Authentication Server process.</para>
503 <primary>files</primary>
505 <secondary>pts command binary</secondary>
509 <primary>pts commands</primary>
511 <secondary>binary in /usr/afs/bin</secondary>
517 <term><emphasis role="bold">pts</emphasis></term>
520 <para>The command suite for communicating with the Protection Server process (the binary for the Protection Server is
521 <emphasis role="bold">ptserver</emphasis>).</para>
524 <primary>ptserver process</primary>
526 <secondary>binary in /usr/afs/bin</secondary>
530 <primary>ptserver process</primary>
532 <secondary></secondary>
534 <see>Protection Server</see>
538 <primary>files</primary>
540 <secondary>ptserver binary</secondary>
544 <primary>programs</primary>
546 <secondary>ptserver</secondary>
550 <primary>processes</primary>
552 <secondary>Protection Server, binary in /usr/afs/bin</secondary>
556 <primary>Protection Server</primary>
558 <secondary>binary in /usr/afs/bin</secondary>
564 <term><emphasis role="bold">ptserver</emphasis></term>
567 <para>The binary for the Protection Server process.</para>
570 <primary>Salvager</primary>
572 <secondary>binary in /usr/afs/bin</secondary>
576 <primary>Salvager</primary>
578 <secondary></secondary>
584 <primary>files</primary>
586 <secondary>salvager</secondary>
590 <primary>programs</primary>
592 <secondary>salvager</secondary>
596 <primary>processes</primary>
598 <secondary>Salvager, binary in /usr/afs/bin</secondary>
604 <term><emphasis role="bold">salvager</emphasis></term>
607 <para>The binary for the Salvager component of the <emphasis role="bold">fs</emphasis> process.</para>
610 <primary>udebug</primary>
612 <secondary>binary in /usr/afs/bin</secondary>
616 <primary>files</primary>
618 <secondary>udebug</secondary>
622 <primary>commands</primary>
624 <secondary>udebug</secondary>
628 <primary>programs</primary>
630 <secondary>udebug</secondary>
636 <term><emphasis role="bold">udebug</emphasis></term>
639 <para>The binary for a program that reports the status of AFS's distributed database technology, Ubik.</para>
642 <primary>upclient</primary>
644 <secondary>binary in /usr/afs/bin</secondary>
648 <primary>upclient</primary>
650 <secondary></secondary>
652 <see>Update Server</see>
656 <primary>files</primary>
658 <secondary>upclient</secondary>
662 <primary>programs</primary>
664 <secondary>upclient</secondary>
668 <primary>processes</primary>
670 <secondary>Update Server, binaries in /usr/afs/bin</secondary>
674 <primary>Update Server</primary>
676 <secondary>binaries in /usr/afs/bin</secondary>
682 <term><emphasis role="bold">upclient</emphasis></term>
685 <para>The binary for the client portion of the Update Server process.</para>
688 <primary>upserver</primary>
690 <secondary>binary in /usr/afs/bin</secondary>
694 <primary>upserver</primary>
696 <secondary></secondary>
698 <see>Update Server</see>
702 <primary>files</primary>
704 <secondary>upserver</secondary>
708 <primary>programs</primary>
710 <secondary>upserver</secondary>
716 <term><emphasis role="bold">upserver</emphasis></term>
719 <para>The binary for the server portion of the Update Server process.</para>
722 <primary>vlserver</primary>
724 <secondary>binary in /usr/afs/bin</secondary>
728 <primary>vlserver</primary>
730 <secondary></secondary>
736 <primary>files</primary>
738 <secondary>vlserver</secondary>
742 <primary>programs</primary>
744 <secondary>vlserver</secondary>
748 <primary>processes</primary>
750 <secondary>VL Server, binary in /usr/afs/bin</secondary>
754 <primary>VL Server</primary>
756 <secondary>binary in /usr/afs/bin</secondary>
760 <primary>Volume Location Server</primary>
762 <secondary></secondary>
770 <term><emphasis role="bold">vlserver</emphasis></term>
773 <para>The binary for the Volume Location (VL) Server process.</para>
776 <primary>volserver</primary>
778 <secondary>binary in /usr/afs/bin</secondary>
782 <primary>volserver</primary>
784 <secondary></secondary>
786 <see>Volume Server</see>
790 <primary>files</primary>
792 <secondary>volserver</secondary>
796 <primary>programs</primary>
798 <secondary>volserver</secondary>
802 <primary>processes</primary>
804 <secondary>Volume Server, binary in /usr/afs/bin</secondary>
808 <primary>Volume Server</primary>
810 <secondary>binary in /usr/afs/bin</secondary>
816 <term><emphasis role="bold">volserver</emphasis></term>
819 <para>The binary for the Volume Server component of the <emphasis role="bold">fs</emphasis> process.</para>
822 <primary>files</primary>
824 <secondary>vos command binary</secondary>
828 <primary>vos commands</primary>
830 <secondary>binary in /usr/afs/bin</secondary>
836 <term><emphasis role="bold">vos</emphasis></term>
839 <para>The command suite for communicating with the Volume and VL Server processes (the binaries for the servers are
840 <emphasis role="bold">volserver</emphasis> and <emphasis role="bold">vlserver</emphasis>, respectively).</para>
843 </variablelist></para>
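<para>As an illustration of the recommendation to keep every binary in place, the following minimal sketch reports which of
the binaries listed above are absent from a directory. The function name <emphasis role="bold">missing_binaries</emphasis> is
hypothetical and is not part of AFS; the sketch only checks that a regular file with each name exists, and does not verify
versions or permissions.</para>

<programlisting><![CDATA[
```python
from pathlib import Path

# Binary names taken from the list above.
EXPECTED_BINARIES = (
    "backup", "bos", "bosserver", "buserver", "fileserver", "kas",
    "kaserver", "pts", "ptserver", "salvager", "udebug", "upclient",
    "upserver", "vlserver", "volserver", "vos",
)


def missing_binaries(bin_dir="/usr/afs/bin"):
    """Return the expected binary names that are not regular files in bin_dir."""
    directory = Path(bin_dir)
    return [name for name in EXPECTED_BINARIES
            if not (directory / name).is_file()]
```
]]></programlisting>

<para>Calling <emphasis role="bold">missing_binaries()</emphasis> with no argument inspects <emphasis
role="bold">/usr/afs/bin</emphasis>; an empty result means that every listed binary is present.</para>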
846 <primary>usr/afs/etc directory on server machines</primary>
848 <secondary>contents listed</secondary>
852 <primary>directory</primary>
854 <secondary>/usr/afs/etc</secondary>
858 <primary>files</primary>
860 <secondary>server configuration, in /usr/afs/etc directory</secondary>
864 <primary>common configuration files (server)</primary>
868 <primary>server machine</primary>
870 <secondary>configuration files in /usr/afs/etc</secondary>
875 <title>Common Configuration Files in the /usr/afs/etc Directory</title>
877 <para>The directory <emphasis role="bold">/usr/afs/etc</emphasis> on every file server machine's local disk contains
878 configuration files in ASCII and machine-independent binary format. For predictable AFS performance throughout a cell, all
879 server machines must have the same version of each configuration file: <itemizedlist>
881 <primary>Update Server</primary>
883 <secondary>distributing server configuration files</secondary>
887 <para>Cells conventionally use the Update Server to distribute a common
888 version of each file from the cell's system control machine to other server machines (for more on the system control
889 machine, see <link linkend="HDRWQ94">The System Control Machine</link>). Run the Update Server's server portion on the
890 system control machine, and the client portion on all other server machines. Update the files on the system control
891 machine only, except as directed by instructions for dealing with emergencies.</para>
893 </itemizedlist></para>
895 <para>Never directly edit any of the files in the <emphasis role="bold">/usr/afs/etc</emphasis> directory, except as directed
896 by instructions for dealing with emergencies. In normal circumstances, use the appropriate <emphasis
897 role="bold">bos</emphasis> commands to change the files. The following list includes pointers to instructions.</para>
899 <para>The files in this directory include: <variablelist>
901 <primary>CellServDB file (server)</primary>
903 <secondary>about</secondary>
907 <primary>files</primary>
909 <secondary>CellServDB (server)</secondary>
913 <term><emphasis role="bold">CellServDB</emphasis></term>
916 <para>An ASCII file that names the cell's database server machines, which run the Authentication, Backup, Protection,
917 and VL Server processes. You create the initial version of this file by issuing the <emphasis role="bold">bos
918 setcellname</emphasis> command while installing your cell's first server machine. It is very important to update this
919 file when you change the identity of your cell's database server machines.</para>
921 <para>The server <emphasis role="bold">CellServDB</emphasis> file is not the same as the <emphasis
922 role="bold">CellServDB</emphasis> file stored in the <emphasis role="bold">/usr/vice/etc</emphasis> directory on
923 client machines. The client version lists the database server machines for every AFS cell that you choose to make
924 accessible from the client machine. The server <emphasis role="bold">CellServDB</emphasis> file lists only the local
925 cell's database server machines, because server processes never contact processes in other cells.</para>
927 <para>For instructions on maintaining this file, see <link linkend="HDRWQ118">Maintaining the Server CellServDB File</link>.</para>
931 <primary>KeyFile file</primary>
933 <secondary>function of</secondary>
937 <primary>files</primary>
939 <secondary>KeyFile</secondary>
943 <primary>server encryption key</primary>
949 <term><emphasis role="bold">KeyFile</emphasis></term>
952 <para>A machine-independent, binary-format file that lists the server encryption keys the AFS server processes use to
953 encrypt and decrypt tickets. The information in this file is the basis for secure communication in the cell, and so is
954 extremely sensitive. The file is specially protected so that only privileged users can read or change it.</para>
956 <para>For instructions on maintaining this file, see <link linkend="HDRWQ355">Managing Server Encryption Keys</link>.</para>
960 <primary>ThisCell file (server)</primary>
964 <primary>files</primary>
966 <secondary>ThisCell (server)</secondary>
972 <term><emphasis role="bold">ThisCell</emphasis></term>
975 <para>An ASCII file that consists of a single line defining the complete Internet domain-style name of the cell (such
976 as <computeroutput>example.com</computeroutput>). You create this file with the <emphasis role="bold">bos
977 setcellname</emphasis> command during the installation of your cell's first file server machine, as instructed in the
978 <emphasis>OpenAFS Quick Beginnings</emphasis>.</para>
980 <para>Note that changing this file is only one step in changing your cell's name. For discussion, see <link
981 linkend="HDRWQ34">Choosing a Cell Name</link>.</para>
984 <primary>UserList file</primary>
988 <primary>files</primary>
990 <secondary>UserList</secondary>
996 <term><emphasis role="bold">UserList</emphasis></term>
999 <para>An ASCII file that lists the usernames of the system administrators authorized to issue privileged <emphasis
1000 role="bold">bos</emphasis>, <emphasis role="bold">vos</emphasis>, and <emphasis role="bold">backup</emphasis>
1001 commands. For instructions on maintaining the file, see <link linkend="HDRWQ592">Administering the UserList File</link>.</para>
1005 </variablelist></para>
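<para>To make the shape of the server <emphasis role="bold">CellServDB</emphasis> file concrete, the following sketch parses
text in the conventional layout: a line that begins with <emphasis role="bold">&gt;</emphasis> names the cell, and each
following line gives a database server machine's address with its hostname after a <emphasis role="bold">#</emphasis>
character. The cell name, addresses, and hostnames in the sample are hypothetical, and the parser is for reading only;
continue to use <emphasis role="bold">bos</emphasis> commands to change the file.</para>

<programlisting><![CDATA[
```python
def parse_cellservdb(text):
    """Map each cell name to a list of (address, hostname) pairs."""
    cells = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Text after '#' is the cell description or the machine's hostname.
        entry, _, note = line.partition("#")
        entry, note = entry.strip(), note.strip()
        if entry.startswith(">"):
            current = entry[1:].strip()
            cells[current] = []
        elif current is not None and entry:
            cells[current].append((entry, note))
    return cells


# Hypothetical sample content; a real file lists your cell's own machines.
SAMPLE = """\
>example.com      #Example cell (hypothetical)
192.0.2.10        #db1.example.com
192.0.2.11        #db2.example.com
"""
```
]]></programlisting>

<para>Because the server version of the file lists only the local cell, the result for a server file is expected to contain
a single cell name.</para>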
1008 <primary>usr/afs/local directory on server machines</primary>
1010 <secondary>contents listed</secondary>
1014 <primary>directory</primary>
1016 <secondary>/usr/afs/local on server machines</secondary>
1020 <primary>local configuration files (server)</primary>
1024 <primary>file server machine</primary>
1026 <secondary>configuration files in /usr/afs/local</secondary>
1030 <sect2 id="HDRWQ86">
1031 <title>Local Configuration Files in the /usr/afs/local Directory</title>
1033 <para>The directory <emphasis role="bold">/usr/afs/local</emphasis> contains configuration files that are different for each
1034 file server machine in a cell. Thus, they are not updated automatically from a central source like the files in the <emphasis
1035 role="bold">/usr/afs/bin</emphasis> and <emphasis role="bold">/usr/afs/etc</emphasis> directories. The most important file is
1036 the <emphasis role="bold">BosConfig</emphasis> file; it defines which server processes are to run on that machine.</para>
1038 <para>As with the common configuration files in <emphasis role="bold">/usr/afs/etc</emphasis>, you must not edit these files
1039 directly. Use commands from the <emphasis role="bold">bos</emphasis> command suite where appropriate; some files never need to be altered.</para>
1042 <para>The files in this directory include the following: <variablelist>
1044 <primary>BosConfig file</primary>
1048 <primary>files</primary>
1050 <secondary>BosConfig</secondary>
1054 <term><emphasis role="bold">BosConfig</emphasis></term>
1057 <para>This file lists the server processes to run on the server machine, by defining which processes the BOS Server
1058 monitors and what it does if the process fails. It also defines the times at which the BOS Server automatically
1059 restarts processes for maintenance purposes.</para>
1061 <para>As you create server processes during a file server machine's installation, their entries are defined in this
1062 file automatically. The <emphasis>OpenAFS Quick Beginnings</emphasis> outlines the <emphasis
1063 role="bold">bos</emphasis> commands to use. For a more complete description of the file, and instructions for
1064 controlling process status by editing the file with commands from the <emphasis role="bold">bos</emphasis> suite, see
1065 <link linkend="HDRWQ142">Monitoring and Controlling Server Processes</link>.</para>
1068 <primary>NetInfo file (server version)</primary>
1072 <primary>files</primary>
1074 <secondary>NetInfo (server version)</secondary>
1080 <term><emphasis role="bold">NetInfo</emphasis></term>
1083 <para>This optional ASCII file lists one or more of the network interface addresses on the server machine. If it
1084 exists when the File Server initializes, the File Server uses it as the basis for the list of interfaces that it
1085 registers in its Volume Location Database (VLDB) server entry. See <link linkend="HDRWQ138">Managing Server IP
1086 Addresses and VLDB Server Entries</link>.</para>
1089 <primary>NetRestrict file (server version)</primary>
1093 <primary>files</primary>
1095 <secondary>NetRestrict (server version)</secondary>
1101 <term><emphasis role="bold">NetRestrict</emphasis></term>
1104 <para>This optional ASCII file lists one or more network interface addresses. If it exists when the File Server
1105 initializes, the File Server removes the specified addresses from the list of interfaces that it registers in its VLDB
1106 server entry. See <link linkend="HDRWQ138">Managing Server IP Addresses and VLDB Server Entries</link>.</para>
1109 <primary>NoAuth file</primary>
1113 <primary>files</primary>
1115 <secondary>NoAuth</secondary>
1121 <term><emphasis role="bold">NoAuth</emphasis></term>
1124 <para>This zero-length file instructs all AFS server processes running on the machine not to perform authorization
1125 checking. Thus, they perform any action for any user, even <emphasis role="bold">anonymous</emphasis>. This very
1126 insecure state is useful only in rare instances, mainly during the installation of the machine.</para>
1128 <para>The file is created automatically when you start the initial <emphasis role="bold">bosserver</emphasis> process
1129 with the <emphasis role="bold">-noauth</emphasis> flag, or issue the <emphasis role="bold">bos setauth</emphasis>
1130 command to turn off authorization checking. When you use the <emphasis role="bold">bos setauth</emphasis> command
1131 to turn authorization checking back on, the BOS Server removes this file. For more information, see <link
1132 linkend="HDRWQ123">Managing Authentication and Authorization Requirements</link>.</para>
1135 <primary>SALVAGE.fs file</primary>
1139 <primary>files</primary>
1141 <secondary>SALVAGE.fs</secondary>
1147 <term><emphasis role="bold">SALVAGE.fs</emphasis></term>
1150 <para>This zero-length file controls how the BOS Server handles a crash of the File Server component of the <emphasis
1151 role="bold">fs</emphasis> process. The BOS Server creates this file each time it starts or restarts the <emphasis
1152 role="bold">fs</emphasis> process. If the file is present when the File Server crashes, then the BOS Server runs the
1153 Salvager before restarting the File Server and Volume Server. When the File Server exits normally, the BOS
1154 Server removes the file so that the Salvager does not run.</para>
1156 <para>Do not create or remove this file yourself; the BOS Server does so automatically. If necessary, you can salvage
1157 a volume or partition by using the <emphasis role="bold">bos salvage</emphasis> command; see <link
1158 linkend="HDRWQ232">Salvaging Volumes</link>.</para>
1161 <primary>salvage.lock file</primary>
1165 <primary>files</primary>
1167 <secondary>salvage.lock</secondary>
1173 <term><emphasis role="bold">salvage.lock</emphasis></term>
1176 <para>This file guarantees that only one Salvager process runs on a file server machine at a time (the single process
1177 can fork multiple subprocesses to salvage multiple partitions in parallel). When the Salvager starts (whether invoked by
1178 the BOS Server or by the <emphasis role="bold">bos salvage</emphasis> command), it creates this zero-length
1179 file and issues the <emphasis role="bold">flock</emphasis> system call on it. It removes the file when it completes
1180 the salvage operation. Because the Salvager must lock the file in order to run, only one Salvager can run at a time.</para>
1184 <primary>sysid file</primary>
1188 <primary>files</primary>
1190 <secondary>sysid</secondary>
1194 <primary>File Server</primary>
1196 <secondary>interfaces registered in VLDB</secondary>
1198 <tertiary>listed in sysid file</tertiary>
1202 <primary>VLDB</primary>
1204 <secondary>server machine interfaces registered</secondary>
1206 <tertiary>listed in sysid file</tertiary>
1212 <term><emphasis role="bold">sysid</emphasis></term>
1215 <para>This file records the network interface addresses that the File Server (<emphasis
1216 role="bold">fileserver</emphasis> process) registers in its VLDB server entry. When the Cache Manager requests volume
1217 location information, the Volume Location (VL) Server provides all of the interfaces registered for each server
1218 machine that houses the volume. This enables the Cache Manager to make use of multiple addresses when accessing AFS
1219 data stored on a multihomed file server machine. For further information, see <link linkend="HDRWQ138">Managing Server
1220 IP Addresses and VLDB Server Entries</link>.</para>
1223 </variablelist></para>
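<para>The locking behavior described for the <emphasis role="bold">salvage.lock</emphasis> file can be illustrated with a
short sketch that uses the <emphasis role="bold">flock</emphasis> system call through Python's standard
<emphasis role="bold">fcntl</emphasis> module. This is not the Salvager's code: the function names
<emphasis role="bold">try_lock</emphasis> and <emphasis role="bold">release</emphasis> are hypothetical, the sketch assumes a
UNIX system, and it leaves the lock file in place rather than removing it.</para>

<programlisting><![CDATA[
```python
import fcntl
import os


def try_lock(path):
    """Open path and request an exclusive, non-blocking flock on it.

    Returns the open file descriptor on success, or None when another
    open file description already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Someone else holds the lock, so this caller must not proceed.
        os.close(fd)
        return None
    return fd


def release(fd):
    """Drop the lock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```
]]></programlisting>

<para>A second call to <emphasis role="bold">try_lock</emphasis> on the same path returns
<emphasis role="bold">None</emphasis> until the first holder calls <emphasis role="bold">release</emphasis>, which mirrors
the one-Salvager-at-a-time guarantee. Try it on a scratch file, not on the real lock file in
<emphasis role="bold">/usr/afs/local</emphasis>.</para>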
1226 <primary>usr/afs/db directory on server machines</primary>
1228 <secondary>contents listed</secondary>
1232 <primary>directory</primary>
1234 <secondary>/usr/afs/db on server machines</secondary>
1238 <primary>database files</primary>
1242 <primary>replicated database files</primary>
1246 <primary>log files</primary>
1248 <secondary>for replicated databases</secondary>
1252 <primary>file server machine</primary>
1254 <secondary>database files in /usr/afs/db</secondary>
1258 <sect2 id="HDRWQ87">
1259 <title>Replicated Database Files in the /usr/afs/db Directory</title>
1261 <para>The directory <emphasis role="bold">/usr/afs/db</emphasis> contains two types of files pertaining to the four replicated
1262 databases in the cell--the Authentication Database, Backup Database, Protection Database, and Volume Location Database (VLDB):
1265 <para>A file that contains each database, with a <emphasis role="bold">.DB0</emphasis> extension.</para>
1269 <para>A log file for each database, with a <emphasis role="bold">.DBSYS1</emphasis> extension. The database server
1270 process logs each database operation in this file before performing it. If the operation is interrupted, the process
1271 consults this file to learn how to finish it.</para>
1273 </itemizedlist></para>
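<para>The log-before-apply idea behind the <emphasis role="bold">.DBSYS1</emphasis> files can be shown with a toy sketch. It
does not reproduce the real on-disk format or the behavior of the database server processes; the dictionary standing in for
the database, the JSON log entry, and the function names are all assumptions made for illustration.</para>

<programlisting><![CDATA[
```python
import json
import os


def apply_with_log(db, log_path, key, value, crash=False):
    """Record the intended change in the log, then apply it to db."""
    with open(log_path, "w") as log:
        json.dump({"key": key, "value": value}, log)
        log.flush()
        os.fsync(log.fileno())
    if crash:
        return  # simulate an interruption after logging, before applying
    db[key] = value
    os.remove(log_path)  # operation complete; the entry is no longer needed


def recover(db, log_path):
    """Finish an operation that was logged but not completed."""
    if not os.path.exists(log_path):
        return False
    with open(log_path) as log:
        entry = json.load(log)
    db[entry["key"]] = entry["value"]
    os.remove(log_path)
    return True
```
]]></programlisting>

<para>If <emphasis role="bold">apply_with_log</emphasis> is interrupted after the log is written, a later call to
<emphasis role="bold">recover</emphasis> reads the logged entry and completes the change, which is the role the text
assigns to the log file.</para>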
1275 <para>Each database server process (Authentication, Backup, Protection, or VL Server) maintains its own database and log
1276 files. The database files are in binary format, so you must always access or alter them using commands from the <emphasis
1277 role="bold">kas</emphasis> suite (for the Authentication Database), <emphasis role="bold">backup</emphasis> suite (for the
1278 Backup Database), <emphasis role="bold">pts</emphasis> suite (for the Protection Database), or <emphasis
1279 role="bold">vos</emphasis> suite (for the VLDB).</para>
1281 <para>If a cell runs more than one database server machine, each database server process keeps its own copy of its database on
1282 its machine's hard disk. However, it is important that all the copies of a given database are the same. To synchronize them,
1283 the database server processes call on AFS's distributed database technology, Ubik, as described in <link
1284 linkend="HDRWQ102">Replicating the OpenAFS Administrative Databases</link>.</para>
1286 <para>The files listed here appear in this directory only on database server machines. On non-database server machines, this
1287 directory is empty. <variablelist>
1289 <primary>files</primary>
1291 <secondary>bdb.DB0</secondary>
1295 <primary>bdb.DB0 file</primary>
1299 <term><emphasis role="bold">bdb.DB0</emphasis></term>
1302 <para>The Backup Database file.</para>
1305 <primary>files</primary>
1307 <secondary>bdb.DBSYS1</secondary>
1311 <primary>bdb.DBSYS1 file</primary>
1317 <term><emphasis role="bold">bdb.DBSYS1</emphasis></term>
1320 <para>The Backup Database log file.</para>
1323 <primary>files</primary>
1325 <secondary>kaserver.DB0</secondary>
1329 <primary>kaserver.DB0 file</primary>
1335 <term><emphasis role="bold">kaserver.DB0</emphasis></term>
1338 <para>The Authentication Database file.</para>
1341 <primary>files</primary>
1343 <secondary>kaserver.DBSYS1</secondary>
1347 <primary>kaserver.DBSYS1 file</primary>
1353 <term><emphasis role="bold">kaserver.DBSYS1</emphasis></term>
1356 <para>The Authentication Database log file.</para>
1359 <primary>files</primary>
1361 <secondary>prdb.DB0</secondary>
1365 <primary>prdb.DB0 file</primary>
1371 <term><emphasis role="bold">prdb.DB0</emphasis></term>
1374 <para>The Protection Database file.</para>
1377 <primary>files</primary>
1379 <secondary>prdb.DBSYS1</secondary>
1383 <primary>prdb.DBSYS1 file</primary>
1389 <term><emphasis role="bold">prdb.DBSYS1</emphasis></term>
1392 <para>The Protection Database log file.</para>
1395 <primary>files</primary>
1397 <secondary>vldb.DB0</secondary>
1401 <primary>vldb.DB0 file</primary>
1407 <term><emphasis role="bold">vldb.DB0</emphasis></term>
1410 <para>The Volume Location Database file.</para>
1413 <primary>files</primary>
1415 <secondary>vldb.DBSYS1</secondary>
1419 <primary>vldb.DBSYS1 file</primary>
1425 <term><emphasis role="bold">vldb.DBSYS1</emphasis></term>
1428 <para>The Volume Location Database log file.</para>
1431 </variablelist></para>
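<para>The file names listed above follow a regular pattern: one of four database prefixes combined with one of two extensions. The following Python sketch is illustrative only and is not part of AFS; the function name <emphasis>expected_db_files</emphasis> is hypothetical. It enumerates the eight names, which can serve as a checklist when you inspect the directory on a database server machine.</para>
<programlisting><![CDATA[
```python
# Illustrative sketch only; not an AFS tool.
# Prefixes and extensions are taken from the list of files above.
DB_PREFIXES = {
    "bdb": "Backup Database",
    "kaserver": "Authentication Database",
    "prdb": "Protection Database",
    "vldb": "Volume Location Database",
}
EXTENSIONS = {"DB0": "file", "DBSYS1": "log file"}


def expected_db_files():
    """Return a dict mapping each expected file name to a short description."""
    return {
        f"{prefix}.{ext}": f"{database} {kind}"
        for prefix, database in DB_PREFIXES.items()
        for ext, kind in EXTENSIONS.items()
    }


if __name__ == "__main__":
    for name, description in sorted(expected_db_files().items()):
        print(f"{name:16} {description}")
```
]]></programlisting>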
1434 <primary>usr/afs/logs directory on server machines</primary>
1436 <secondary>contents listed</secondary>
1440 <primary>directory</primary>
1442 <secondary>/usr/afs/logs on server machines</secondary>
1446 <primary>file server machine</primary>
1448 <secondary>core files in /usr/afs/logs</secondary>
1452 <primary>file server machine</primary>
1454 <secondary>log files in /usr/afs/logs</secondary>
1458 <primary>log files</primary>
1460 <secondary>for server processes</secondary>
1464 <primary>core files</primary>
1466 <secondary>for server processes</secondary>
1470 <sect2 id="HDRWQ88">
1471 <title>Log Files in the /usr/afs/logs Directory</title>
1473 <para>The <emphasis role="bold">/usr/afs/logs</emphasis> directory contains log files from various server processes. The files
1474 detail interesting events that occur during normal operations. For instance, the Volume Server can record volume moves in the
1475 <emphasis role="bold">VolserLog</emphasis> file. Events are recorded at completion, so the server processes do not use these
1476 files to reconstruct failed operations, unlike the log files in the <emphasis role="bold">/usr/afs/db</emphasis> directory.</para>
1478 <para>The information in log files can be very useful as you evaluate process failures and other problems. For instance, if
1479 you receive a timeout message when you try to access a volume, the <emphasis role="bold">FileLog</emphasis> file
1480 may provide an explanation, for example by showing that the File Server was unable to attach the volume. To examine a log file
1481 remotely, use the <emphasis role="bold">bos getlog</emphasis> command as described in <link linkend="HDRWQ173">Displaying
1482 Server Process Log Files</link>.</para>
1484 <para>This directory also contains the core image files generated if a process being monitored by the BOS Server crashes. The
1485 BOS Server attempts to add an extension to the standard <emphasis role="bold">core</emphasis> name to indicate which process
1486 generated the core file (for example, naming a core file generated by the Protection Server <emphasis
1487 role="bold">core.ptserver</emphasis>). If two processes fail at about the same time, the BOS Server cannot always assign
1488 the correct extension, so do not rely on the extension alone to identify the process that failed.</para>
1490 <para>The directory contains the following files: <variablelist>
1492 <primary>AuthLog file</primary>
1496 <primary>files</primary>
1498 <secondary>AuthLog</secondary>
1502 <term><emphasis role="bold">AuthLog</emphasis></term>
1505 <para>The Authentication Server's log file.</para>
1508 <primary>BackupLog file</primary>
1512 <primary>files</primary>
1514 <secondary>BackupLog</secondary>
1520 <term><emphasis role="bold">BackupLog</emphasis></term>
1523 <para>The Backup Server's log file.</para>
1526 <primary>BosLog file</primary>
1530 <primary>files</primary>
1532 <secondary>BosLog</secondary>
1538 <term><emphasis role="bold">BosLog</emphasis></term>
1541 <para>The BOS Server's log file.</para>
1544 <primary>files</primary>
1546 <secondary>FileLog</secondary>
1550 <primary>FileLog file</primary>
1556 <term><emphasis role="bold">FileLog</emphasis></term>
1559 <para>The File Server's log file.</para>
1562 <primary>files</primary>
1564 <secondary>SalvageLog</secondary>
1568 <primary>SalvageLog file</primary>
1574 <term><emphasis role="bold">SalvageLog</emphasis></term>
1577 <para>The Salvager's log file.</para>
1580 <primary>VLLog file</primary>
1584 <primary>files</primary>
1586 <secondary>VLLog</secondary>
1592 <term><emphasis role="bold">VLLog</emphasis></term>
1595 <para>The Volume Location (VL) Server's log file.</para>
1598 <primary>VolserLog file</primary>
1602 <primary>files</primary>
1604 <secondary>VolserLog</secondary>
1610 <term><emphasis role="bold">VolserLog</emphasis></term>
1613 <para>The Volume Server's log file.</para>
1618 <term><emphasis role="bold">core.process</emphasis></term>
1621 <para>If present, a core image file produced when an AFS server process on the machine crashed (probably the process
1622 named by the <emphasis>process</emphasis> extension).</para>
1625 </variablelist></para>
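<para>As a reading aid for the list above, the following Python sketch sorts directory entries into log files and core files. It is an illustration, not an AFS utility; the function name <emphasis>classify_log_entry</emphasis> is hypothetical, and the table of log names is copied from the list above. Because the BOS Server cannot always assign the correct extension, treat the process name taken from a core file as a hint only.</para>
<programlisting><![CDATA[
```python
# Illustrative sketch only; not an AFS tool.
# Log file names and their owning processes come from the list above.
LOG_OWNERS = {
    "AuthLog": "Authentication Server",
    "BackupLog": "Backup Server",
    "BosLog": "BOS Server",
    "FileLog": "File Server",
    "SalvageLog": "Salvager",
    "VLLog": "Volume Location (VL) Server",
    "VolserLog": "Volume Server",
}


def classify_log_entry(name):
    """Return (kind, detail) for one entry name from /usr/afs/logs.

    kind is "log", "core", or "other". For a log, detail is the owning
    process; for a core file, detail is the extension (or None if absent).
    """
    if name in LOG_OWNERS:
        return ("log", LOG_OWNERS[name])
    if name == "core":
        return ("core", None)
    if name.startswith("core."):
        return ("core", name[len("core."):])
    return ("other", None)


if __name__ == "__main__":
    for entry in ("FileLog", "core.ptserver", "core", "notes.txt"):
        print(entry, classify_log_entry(entry))
```
]]></programlisting>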
1628 <para>To prevent log files from growing unmanageably large, restart the server processes periodically, particularly the
1629 database server processes. Alternatively, to avoid restarting a process, use the UNIX <emphasis role="bold">rm</emphasis> command to
1630 remove its log file while the process is running; the process re-creates the file automatically.</para>
1634 <primary>vicep directory on server machines</primary>
1636 <secondary>contents listed</secondary>
1640 <primary>directory</primary>
1642 <secondary>/vicep on server machines</secondary>
1646 <primary>volume header</primary>
1648 <secondary>in /vicep directories</secondary>
1652 <primary>partition</primary>
1654 <secondary>housing AFS volumes</secondary>
1658 <primary>file server machine</primary>
1660 <secondary>partitions, naming</secondary>
1664 <sect2 id="HDRWQ89">
1665 <title>Volume Headers on Server Partitions</title>
1667 <para>A partition that houses AFS volumes must be mounted at a subdirectory of the machine's root ( / ) directory (not, for
1668 instance, under the <emphasis role="bold">/usr</emphasis> directory). The file server machine's file system registry file
1669 (<emphasis role="bold">/etc/fstab</emphasis> or equivalent) must correctly map the directory name and the partition's device
1670 name. The directory name is of the form <emphasis role="bold">/vicep</emphasis>index, where each index is one or two lowercase
1671 letters. By convention, the first AFS partition on a machine is mounted at <emphasis role="bold">/vicepa</emphasis>, the
1672 second at <emphasis role="bold">/vicepb</emphasis>, and so on. If there are more than 26 partitions, continue with <emphasis
1673 role="bold">/vicepaa</emphasis>, <emphasis role="bold">/vicepab</emphasis> and so on. The <emphasis>OpenAFS Release
1674 Notes</emphasis> specifies the number of supported partitions per server machine.</para>
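<para>The naming convention can be expressed as a small function. The following Python sketch is illustrative only; the name <emphasis>vicep_name</emphasis> is hypothetical. It maps a zero-based partition position to a conventional mount point using one or two lowercase letters, as described above. It does not enforce the supported number of partitions, which can be lower than the number of names the pattern allows; consult the <emphasis>OpenAFS Release Notes</emphasis> for that limit.</para>
<programlisting><![CDATA[
```python
# Illustrative sketch only; not an AFS tool.
import string

LETTERS = string.ascii_lowercase


def vicep_name(position):
    """Return the conventional mount point for the partition at a 0-based position.

    Positions 0-25 map to /vicepa through /vicepz; later positions use two
    letters, starting with /vicepaa, /vicepab, and so on.
    """
    if position < 0:
        raise ValueError("position must not be negative")
    if position < 26:
        return "/vicep" + LETTERS[position]
    offset = position - 26
    if offset < 26 * 26:
        return "/vicep" + LETTERS[offset // 26] + LETTERS[offset % 26]
    raise ValueError("the index is limited to one or two letters")


if __name__ == "__main__":
    for position in (0, 1, 25, 26, 27):
        print(position, vicep_name(position))
```
]]></programlisting>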
1676 <para>Do not store non-AFS files on AFS partitions. The File Server and Volume Server expect to have available all of the
1677 space on the partition.</para>
1679 <para>The <emphasis role="bold">/vicep</emphasis> directories contain two types of files: <variablelist>
1681 <primary>V.<emphasis>vol_ID</emphasis>.vol file</primary>
1685 <primary>files</primary>
1687 <secondary>V.<emphasis>vol_ID</emphasis>.vol</secondary>
1691 <term><emphasis role="bold">Vvol_ID.vol</emphasis></term>
1694 <para>Each such file is a volume header. The vol_ID corresponds to the volume ID number displayed in the output from
1695 the <emphasis role="bold">vos examine</emphasis>, <emphasis role="bold">vos listvldb</emphasis>, and <emphasis
1696 role="bold">vos listvol</emphasis> commands.</para>
1699 <primary>FORCESALVAGE file</primary>
1703 <primary>files</primary>
1705 <secondary>FORCESALVAGE</secondary>
1711 <term><emphasis role="bold">FORCESALVAGE</emphasis></term>
1714 <para>This zero-length file triggers the Salvager to salvage the entire partition. The AFS-modified version of the
1715 <emphasis role="bold">fsck</emphasis> program creates this file if it discovers corruption.</para>
1718 </variablelist></para>
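<para>To make the two file types concrete, the following Python sketch sorts a list of entry names from a <emphasis role="bold">/vicep</emphasis> directory. It is an illustration, not an AFS utility; the function name <emphasis>scan_partition_listing</emphasis> is hypothetical and the sample names in the usage section are invented. The sketch assumes only that a volume header name consists of the letter V, the decimal volume ID, and the <emphasis role="bold">.vol</emphasis> extension.</para>
<programlisting><![CDATA[
```python
# Illustrative sketch only; not an AFS tool.
import re

# Assumption: a header name is "V", a decimal volume ID, then ".vol".
HEADER_RE = re.compile(r"^V(\d+)\.vol$")


def scan_partition_listing(names):
    """Split entry names into (volume_ids, force_salvage, other_names)."""
    volume_ids = []
    force_salvage = False
    other_names = []
    for name in names:
        match = HEADER_RE.match(name)
        if match:
            volume_ids.append(int(match.group(1)))
        elif name == "FORCESALVAGE":
            force_salvage = True
        else:
            other_names.append(name)
    return volume_ids, force_salvage, other_names


if __name__ == "__main__":
    # Invented sample names, for demonstration only.
    sample = ["V0536870912.vol", "FORCESALVAGE", "lost+found"]
    print(scan_partition_listing(sample))
```
]]></programlisting>
<para>Any name that ends up in the third list deserves a closer look, because AFS partitions are not meant to hold non-AFS files.</para>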
1721 <para>For most system types, it is important never to run the standard <emphasis role="bold">fsck</emphasis> program
1722 provided with the operating system on an AFS file server machine. It removes all AFS volume data from server partitions
1723 because it does not recognize their format.</para>
1727 <primary>roles for server machine</primary>
1731 <primary>server machine</primary>
1733 <secondary>roles summarized</secondary>
1738 <sect1 id="HDRWQ90">
1739 <title>The Four Roles for File Server Machines</title>
1741 <para>In cells that have more than one server machine, not all server machines have to perform exactly the same functions. There
1742 are four possible <emphasis>roles</emphasis> a machine can assume, determined by which server processes it is running. A machine
1743 can assume more than one role by running all of the relevant processes. The following list summarizes the four roles, which are
1744 described more completely in subsequent sections. <itemizedlist>
1746 <para>A <emphasis>simple file server</emphasis> machine runs only the processes that store and deliver AFS files to client
1747 machines. You can run as many simple file server machines as you need to satisfy your cell's performance and disk space
1748 requirements.</para>
1752 <para>A <emphasis>database server machine</emphasis> runs the four database server processes that maintain AFS's
1753 replicated administrative databases: the Authentication, Backup, Protection, and Volume Location (VL) Server processes.</para>
1758 <para>A <emphasis>binary distribution machine</emphasis> distributes the AFS server binaries for its system type to all
1759 other server machines of that system type.</para>
1763 <para>The single <emphasis>system control machine</emphasis> distributes common server configuration files to all other
1764 server machines in the cell.</para>
1766 </itemizedlist></para>
1768 <para>If a cell has a single server machine, it assumes the simple file server and database server roles. The instructions in
1769 the <emphasis>OpenAFS Quick Beginnings</emphasis> also have you configure it as the system control machine and binary
1770 distribution machine for its system type, but it does not actually perform those functions until you install another server machine.</para>
1773 <para>It is best to keep the binaries for all of the AFS server processes in the <emphasis role="bold">/usr/afs/bin</emphasis>
1774 directory, even if not all processes are running. You can then change which roles a machine assumes simply by starting or
1775 stopping the processes that define the role.</para>
1778 <primary>simple file server machine</primary>
1782 <primary>server machine</primary>
1784 <secondary>simple file server role</secondary>
1787 <sect2 id="HDRWQ91">
1788 <title>Simple File Server Machines</title>
1790 <para>A <emphasis>simple file server machine</emphasis> runs only the server processes that store and deliver AFS files to
1791 client machines, monitor process status, and pick up binaries and configuration files from the cell's binary distribution and
1792 system control machines.</para>
1794 <para>In general, only cells with more than three server machines need to run simple file server machines. In cells with three
1795 or fewer machines, all of them are usually database server machines (to benefit from replicating the administrative
1796 databases); see <link linkend="HDRWQ92">Database Server Machines</link>.</para>
1798 <para>The following processes run on a simple file server machine: <itemizedlist>
1800 <para>The BOS Server (<emphasis role="bold">bosserver</emphasis> process)</para>
1804 <para>The <emphasis role="bold">fs</emphasis> process, which combines the File Server, Volume Server, and Salvager
1805 processes so that they can coordinate their operations on the data in volumes and avoid the inconsistencies that can
1806 result from multiple simultaneous operations on the same data</para>
1810 <para>A client portion of the Update Server that picks up binary files from the binary distribution machine of its AFS
1811 system type (the <emphasis role="bold">upclientbin</emphasis> process)</para>
1815 <para>A client portion of the Update Server that picks up common configuration files from the system control machine
1816 (the <emphasis role="bold">upclientetc</emphasis> process)</para>
1818 </itemizedlist></para>
1821 <primary>database server machine</primary>
1823 <secondary>defined</secondary>
1827 <primary>server machine</primary>
1829 <secondary>database server role</secondary>
1833 <primary>Backup Server</primary>
1835 <secondary>runs on database server machine</secondary>
1839 <primary>Authentication Server</primary>
1841 <secondary>runs on database server machine</secondary>
1845 <primary>Protection Server</primary>
1847 <secondary>runs on database server machine</secondary>
1851 <primary>VL Server</primary>
1853 <secondary>runs on database server machine</secondary>
1857 <sect2 id="HDRWQ92">
1858 <title>Database Server Machines</title>
1860 <para>A <emphasis>database server machine</emphasis> runs the four processes that maintain the AFS replicated administrative
1861 databases: the Authentication Server, Backup Server, Protection Server, and Volume Location (VL) Server, which maintain the
1862 Authentication Database, Backup Database, Protection Database, and Volume Location Database (VLDB), respectively. To review
1863 the functions of these server processes and their databases, see <link linkend="HDRWQ17">AFS Server Processes and the Cache
1864 Manager</link>.</para>
1866 <para>If a cell has more than one server machine, it is best to run more than one database server machine, but more than three
1867 are rarely necessary. Replicating the databases in this way yields the same benefits as replicating volumes: increased
1868 availability and reliability of information. If one database server machine or process goes down, the information in the
1869 database is still available from others. The load of requests for database information is spread across multiple machines,
1870 preventing any one from becoming overloaded.</para>
1872 <para>Unlike replicated volumes, however, replicated databases do change frequently. Consistent system performance demands
1873 that all copies of the database always be identical, so it is not possible to record changes in only some of them. To
1874 synchronize the copies of a database, the database server processes use AFS's distributed database technology, Ubik. See <link
1875 linkend="HDRWQ102">Replicating the OpenAFS Administrative Databases</link>.</para>
1877 <para>It is critical that the AFS server processes on every server machine in a cell know which machines are the database
1878 server machines. The database server processes in particular must maintain constant contact with their peers in order to
1879 coordinate the copies of the database. The other server processes often need information from the databases. Every file server
1880 machine keeps a list of its cell's database server machines in its local <emphasis
1881 role="bold">/usr/afs/etc/CellServDB</emphasis> file. Cells that use the United States edition of AFS can use the system control
1882 machine to distribute this file (see <link linkend="HDRWQ94">The System Control Machine</link>).</para>
1884 <para>The following processes define a database server machine: <itemizedlist>
1886 <para>The Authentication Server (<emphasis role="bold">kaserver</emphasis> process)</para>
1890 <para>The Backup Server (<emphasis role="bold">buserver</emphasis> process)</para>
1894 <para>The Protection Server (<emphasis role="bold">ptserver</emphasis> process)</para>
1898 <para>The VL Server (<emphasis role="bold">vlserver</emphasis> process)</para>
1900 </itemizedlist></para>
1902 <para>Database server machines can also run the processes that define a simple file server machine, as listed in <link
1903 linkend="HDRWQ91">Simple File Server Machines</link>. One database server machine can act as the cell's system control
1904 machine, and any database server machine can serve as the binary distribution machine for its system type; see <link
1905 linkend="HDRWQ94">The System Control Machine</link> and <link linkend="HDRWQ93">Binary Distribution Machines</link>.</para>
1908 <primary>binary distribution machine</primary>
1910 <secondary>defined</secondary>
1914 <primary>server machine</primary>
1916 <secondary>binary distribution role</secondary>
1920 <primary>Update Server</primary>
1922 <secondary>server portion</secondary>
1924 <tertiary>on binary distribution machine</tertiary>
1928 <primary>Update Server</primary>
1930 <secondary>client portion</secondary>
1932 <tertiary>for binaries</tertiary>
1936 <sect2 id="HDRWQ93">
1937 <title>Binary Distribution Machines</title>
1939 <para>A <emphasis>binary distribution machine</emphasis> stores and distributes the binary files for the AFS processes and
1940 command suites to all other server machines of its system type. Each file server machine keeps its own copy of AFS server
1941 process binaries on its local disk, by convention in the <emphasis role="bold">/usr/afs/bin</emphasis> directory. For
1942 consistent system performance, however, all server machines must run the same version (build level) of a process. For
1943 instructions on checking a binary's build level, see <link linkend="HDRWQ117">Displaying A Binary File's Build Level</link>.
1944 The easiest way to keep the binaries consistent is to have a binary distribution machine of each system type distribute them
1945 to its system-type peers.</para>
1947 <para>The process that defines a binary distribution machine is the server portion of the Update Server (<emphasis
1948 role="bold">upserver</emphasis> process). The client portion of the Update Server (<emphasis
1949 role="bold">upclientbin</emphasis> process) runs on the other server machines of that system type and references the binary
1950 distribution machine.</para>
1952 <para>Binary distribution machines usually also run the processes that define a simple file server machine, as listed in <link
1953 linkend="HDRWQ91">Simple File Server Machines</link>. One binary distribution machine can act as the cell's system control
1954 machine, and any binary distribution machine can serve as a database server machine; see <link linkend="HDRWQ94">The System
1955 Control Machine</link> and <link linkend="HDRWQ92">Database Server Machines</link>.</para>
1958 <primary>system control machine</primary>
1962 <primary>configuration files</primary>
1964 <secondary>server machine, common</secondary>
1968 <primary>server machine</primary>
1970 <secondary>system control role</secondary>
1974 <sect2 id="HDRWQ94">
1975 <title>The System Control Machine</title>
1977 <para>The <emphasis>system control machine</emphasis> stores and
1978 distributes system configuration files shared by all of the server machines in the cell. Each file server machine keeps its
1979 own copy of the configuration files on its local disk, by convention in the <emphasis role="bold">/usr/afs/etc</emphasis>
1980 directory. For consistent system performance, however, all server machines must use the same files. The easiest way to keep
1981 the files consistent is to have the system control machine distribute them. You make changes only to the copy stored on the
1982 system control machine, as directed by the instructions in this document.</para>
1984 <para>For a list of the configuration files stored in the <emphasis role="bold">/usr/afs/etc</emphasis> directory, see <link
1985 linkend="HDRWQ85">Common Configuration Files in the /usr/afs/etc Directory</link>.</para>
1987 <para>The <emphasis>OpenAFS Quick Beginnings</emphasis> configures a cell's first server machine as the system control
1988 machine. If you wish, you can reassign the role to a different machine that you install later, but you must then change the
1989 client portion of the Update Server (<emphasis role="bold">upclientetc</emphasis> process) running on all other server
1990 machines to refer to the new system control machine.</para>
1992 <para>The following process defines the system control machine: <itemizedlist>
1994 <primary>Update Server</primary>
1996 <secondary>server portion</secondary>
1998 <tertiary>on system control machine</tertiary>
2002 <primary>Update Server</primary>
2004 <secondary>client portion</secondary>
2006 <tertiary>for configuration files</tertiary>
2010 <para>The server portion of the Update Server (<emphasis role="bold">upserver</emphasis> process).
2011 The client portion of the Update Server (<emphasis role="bold">upclientetc</emphasis>
2012 process) runs on the other server machines and references the system control machine.</para>
2014 </itemizedlist></para>
2016 <para>The system control machine can also run the processes that define a simple file server machine, as listed in <link
2017 linkend="HDRWQ91">Simple File Server Machines</link>. It can also serve as a database server machine, and by convention acts
2018 as the binary distribution machine for its system type. A single <emphasis role="bold">upserver</emphasis> process can
2019 distribute both configuration files and binaries. See <link linkend="HDRWQ92">Database Server Machines</link> and <link
2020 linkend="HDRWQ93">Binary Distribution Machines</link>.</para>
2023 <primary>determining</primary>
2025 <secondary>roles taken by server machine</secondary>
2029 <primary>identifying</primary>
2031 <secondary>roles taken by server machine</secondary>
2035 <primary>server machine</primary>
2037 <secondary>determining roles</secondary>
2041 <primary>roles for server machine</primary>
2043 <secondary>determining</secondary>
2047 <primary>database server machine</primary>
2049 <secondary>identifying with bos status</secondary>
2053 <primary>determining</primary>
2055 <secondary>identity of database server machines</secondary>
2059 <primary>identifying</primary>
2061 <secondary>database server machine</secondary>
2065 <sect2 id="HDRWQ95">
2066 <title>To locate database server machines</title>
2070 <para>Issue the <emphasis role="bold">bos listhosts</emphasis> command. <programlisting>
2071 % <emphasis role="bold">bos listhosts</emphasis> &lt;<replaceable>machine name</replaceable>&gt;
2072 </programlisting></para>
2074 <para>The machines listed in the output are the cell's database server machines. For complete instructions and example
2075 output, see <link linkend="HDRWQ120">To display a cell's database server machines</link>.</para>
2079 <para><emphasis role="bold">(Optional)</emphasis> Issue the <emphasis role="bold">bos status</emphasis> command to verify
2080 that a machine listed in the output of the <emphasis role="bold">bos listhosts</emphasis> command is actually running the
2081 processes that define it as a database server machine. For complete instructions, see <link linkend="HDRWQ158">Displaying
2082 Process Status and Information from the BosConfig File</link>. <programlisting>
2083 % <emphasis role="bold">bos status</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">buserver kaserver ptserver vlserver</emphasis>
2084 </programlisting></para>
2086 <para>If the specified machine is a database server machine, the output from the <emphasis role="bold">bos
2087 status</emphasis> command includes the following lines:</para>
2090 Instance buserver, currently running normally.
2091 Instance kaserver, currently running normally.
2092 Instance ptserver, currently running normally.
2093 Instance vlserver, currently running normally.
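<para>The check described in this step can be sketched in code. The following Python example is illustrative only and does not run any AFS command; the function names <emphasis>running_instances</emphasis> and <emphasis>is_database_server</emphasis> are hypothetical. It assumes you have captured the text printed by the <emphasis role="bold">bos status</emphasis> command and that each running process is reported on a line of the form shown above.</para>
<programlisting><![CDATA[
```python
# Illustrative sketch only; it parses captured text and runs no AFS command.
DB_PROCESSES = ("buserver", "kaserver", "ptserver", "vlserver")


def running_instances(status_output):
    """Return the set of instance names reported as running normally."""
    running = set()
    for line in status_output.splitlines():
        line = line.strip()
        if line.startswith("Instance ") and "currently running normally" in line:
            # The instance name is the text between "Instance " and the first comma.
            running.add(line[len("Instance "):].split(",", 1)[0])
    return running


def is_database_server(status_output):
    """Return True if all four database server processes are running normally."""
    return set(DB_PROCESSES) <= running_instances(status_output)


if __name__ == "__main__":
    sample = """\
Instance buserver, currently running normally.
Instance kaserver, currently running normally.
Instance ptserver, currently running normally.
Instance vlserver, currently running normally.
"""
    print(is_database_server(sample))
```
]]></programlisting>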
2099 <primary>system control machine</primary>
2101 <secondary>identifying with bos status</secondary>
2105 <primary>determining</primary>
2107 <secondary>identity of system control machine</secondary>
2111 <primary>identifying</primary>
2113 <secondary>system control machine</secondary>
2117 <sect2 id="HDRWQ96">
2118 <title>To locate the system control machine</title>
2122 <para>Issue the <emphasis role="bold">bos status</emphasis> command for any server machine. Complete instructions appear
2123 in <link linkend="HDRWQ158">Displaying Process Status and Information from the BosConfig File</link>. <programlisting>
2124 % <emphasis role="bold">bos status</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">upserver upclientbin upclientetc</emphasis> <emphasis
2125 role="bold">-long</emphasis>
2126 </programlisting></para>
2128 <para>The output you see depends on the machine you have contacted: a simple file server machine, the system control
2129 machine, or a binary distribution machine. See <link linkend="HDRWQ98">Interpreting the Output from the bos status
2130 Command</link>.</para>
2135 <primary>binary distribution machine</primary>
2137 <secondary>identifying with bos status</secondary>
2141 <primary>determining</primary>
2143 <secondary>identity of binary distribution machine</secondary>
2147 <primary>identifying</primary>
2149 <secondary>binary distribution machine</secondary>
2153 <sect2 id="HDRWQ97">
2154 <title>To locate the binary distribution machine for a system type</title>
2158 <para>Issue the <emphasis role="bold">bos status</emphasis> command for a file server machine of the system type you are
2159 checking (to determine a machine's system type, issue the <emphasis role="bold">fs sysname</emphasis> or <emphasis
2160 role="bold">sys</emphasis> command as described in <link linkend="HDRWQ417">Displaying and Setting the System Type
2161 Name</link>). Complete instructions for the <emphasis role="bold">bos status</emphasis> command appear in <link
2162 linkend="HDRWQ158">Displaying Process Status and Information from the BosConfig File</link>. <programlisting>
2163 % <emphasis role="bold">bos status</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">upserver upclientbin upclientetc -long</emphasis>
2164 </programlisting></para>
2166 <para>The output you see depends on the machine you have contacted: a simple file server machine, the system control
2167 machine, or a binary distribution machine. See <link linkend="HDRWQ98">Interpreting the Output from the bos status
2168 Command</link>.</para>
2173 <primary>simple file server machine</primary>
2175 <secondary>identifying with bos status</secondary>
2179 <primary>determining</primary>
2181 <secondary>identity of:</secondary>
2183 <tertiary>simple file server machines</tertiary>
2187 <primary>identifying</primary>
2189 <secondary>simple file server machine</secondary>
2193 <sect2 id="HDRWQ98">
2194 <title>Interpreting the Output from the bos status Command</title>
2196 <para>Interpreting the output of the <emphasis role="bold">bos status</emphasis> command is most straightforward for a simple
2197 file server machine. There is no <emphasis role="bold">upserver</emphasis> process, so the output includes the following message:</para>
2201 bos: failed to get instance info for 'upserver' (no such entity)
2204 <para>A simple file server machine runs the <emphasis role="bold">upclientbin</emphasis> process, so the output includes a
2205 message like the following. It indicates that <emphasis role="bold">fs7.example.com</emphasis> is the binary distribution machine
2206 for this system type.</para>
2209 Instance upclientbin, (type is simple) currently running normally.
2210 Process last started at Wed Mar 10 23:37:09 1999 (1 proc start)
2211 Command 1 is '/usr/afs/bin/upclient fs7.example.com -t 60 /usr/afs/bin'
2214 <para>A simple file server machine also runs the <emphasis
2215 role="bold">upclientetc</emphasis> process, so the output includes a message like the following. It indicates that <emphasis
2216 role="bold">fs1.example.com</emphasis> is the system control machine.</para>
2219 Instance upclientetc, (type is simple) currently running normally.
2220 Process last started at Mon Mar 22 05:23:49 1999 (1 proc start)
2221 Command 1 is '/usr/afs/bin/upclient fs1.example.com -t 60 /usr/afs/etc'
2224 <sect3 id="HDRWQ99">
2225 <title>The Output on the System Control Machine</title>
2227 <para>If you have issued the <emphasis role="bold">bos status</emphasis> command
2228 for the system control machine, the output includes an entry for the <emphasis role="bold">upserver</emphasis> process
2229 similar to the following:</para>
2232 Instance upserver, (type is simple) currently running normally.
2233 Process last started at Mon Mar 22 05:23:54 1999 (1 proc start)
2234 Command 1 is '/usr/afs/bin/upserver'
2237 <para>If you are using the default configuration recommended in the <emphasis>OpenAFS Quick Beginnings</emphasis>, the
2238 system control machine is also the binary distribution machine for its system type, and a single <emphasis
2239 role="bold">upserver</emphasis> process distributes both kinds of updates. In that case, the output includes the following messages:</para>
2243 bos: failed to get instance info for 'upclientbin' (no such entity)
2244 bos: failed to get instance info for 'upclientetc' (no such entity)
2247 <para>If the system control machine is not a binary distribution machine, the output includes an error message for the
2248 <emphasis role="bold">upclientetc</emphasis> process, but a complete listing for the <emphasis
2249 role="bold">upclientbin</emphasis> process (in this case it refers to the machine <emphasis
2250 role="bold">fs5.example.com</emphasis> as the binary distribution machine):</para>
2253 Instance upclientbin, (type is simple) currently running normally.
2254 Process last started at Mon Mar 22 05:23:49 1999 (1 proc start)
2255 Command 1 is '/usr/afs/bin/upclient fs5.example.com -t 60 /usr/afs/bin'
2256 bos: failed to get instance info for 'upclientetc' (no such entity)
2260 <sect3 id="HDRWQ100">
2261 <title>The Output on a Binary Distribution Machine</title>
2263 <para>If you have issued the <emphasis role="bold">bos status</emphasis> command for a binary distribution machine, the
2264 output includes an entry for the <emphasis role="bold">upserver</emphasis> process similar to the following and an error
2265 message for the <emphasis role="bold">upclientbin</emphasis> process:</para>
2268 Instance upserver, (type is simple) currently running normally.
2269 Process last started at Mon Apr 5 05:23:54 1999 (1 proc start)
2270 Command 1 is '/usr/afs/bin/upserver'
2271 bos: failed to get instance info for 'upclientbin' (no such entity)
2274 <para>Unless this machine also happens to be the system control machine, a message like the following references the system
2275 control machine (in this case, <emphasis role="bold">fs3.example.com</emphasis>):</para>
2278 Instance upclientetc, (type is simple) currently running normally.
2279 Process last started at Mon Apr 5 05:23:49 1999 (1 proc start)
2280 Command 1 is '/usr/afs/bin/upclient fs3.example.com -t 60 /usr/afs/etc'
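<para>The interpretation rules in this section can be summarized in code. The following Python sketch is illustrative only and does not run any AFS command; the function names <emphasis>parse_update_instances</emphasis> and <emphasis>distribution_roles</emphasis> are hypothetical. It assumes you have captured the text printed by the <emphasis role="bold">bos status</emphasis> command with the <emphasis role="bold">-long</emphasis> flag, in the format of the examples above, and that the cell follows the conventions described in this section.</para>
<programlisting><![CDATA[
```python
# Illustrative sketch only; it parses captured text and runs no AFS command.
import re

INSTANCE_RE = re.compile(r"^\s*Instance (\w+),")
UPCLIENT_RE = re.compile(r"upclient (\S+)")


def parse_update_instances(status_output):
    """Map each reported instance to the host named in its upclient command.

    Instances whose command names no host (such as upserver) map to None.
    Instances reported with a "no such entity" error are omitted.
    """
    found = {}
    current = None
    for line in status_output.splitlines():
        match = INSTANCE_RE.match(line)
        if match:
            current = match.group(1)
            found[current] = None
            continue
        match = UPCLIENT_RE.search(line)
        if match and current is not None and "Command" in line:
            found[current] = match.group(1)
    return found


def distribution_roles(status_output):
    """Apply the rules from this section to one machine's output."""
    instances = parse_update_instances(status_output)
    has_upserver = "upserver" in instances
    return {
        # An upserver machine with no upclientetc is the system control machine.
        "system_control": has_upserver and "upclientetc" not in instances,
        # An upserver machine with no upclientbin distributes binaries.
        "binary_distribution": has_upserver and "upclientbin" not in instances,
        # Hosts this machine pulls from, if any.
        "system_control_host": instances.get("upclientetc"),
        "binary_distribution_host": instances.get("upclientbin"),
    }
```
]]></programlisting>
<para>For a simple file server machine, both role flags are false and the two host entries name the system control machine and the binary distribution machine for its system type.</para>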
2286 <sect1 id="HDRWQ101">
2287 <title>Administering Database Server Machines</title>
2289 <para>This section explains how to administer database server machines. For installation instructions, see the <emphasis>OpenAFS
2290 Quick Beginnings</emphasis>.</para>
2293 <primary>distribution</primary>
2295 <secondary>of databases</secondary>
2299 <primary>database, distributed</primary>
2301 <secondary></secondary>
2303 <see>administrative database</see>
2307 <primary>distributed database</primary>
2309 <secondary></secondary>
2311 <see>administrative database</see>
2315 <primary>administrative database</primary>
2317 <secondary>about replicating</secondary>
2321 <primary>database server machine</primary>
2323 <secondary>maintaining</secondary>
2327 <primary>Ubik</primary>
2329 <secondary>operation described</secondary>
2333 <primary>synchronization site (Ubik)</primary>
2335 <secondary>defined</secondary>
2339 <primary>secondary site (Ubik)</primary>
2343 <primary>coordinator (Ubik)</primary>
2345 <secondary>defined</secondary>
2349 <primary>Ubik</primary>
2351 <secondary>automatic updates</secondary>
2355 <primary>automatic</primary>
2357 <secondary>update to admin. databases by Ubik</secondary>
2360 <sect2 id="HDRWQ102">
2361 <title>Replicating the OpenAFS Administrative Databases</title>
2363 <para>There are several benefits to replicating the AFS administrative databases (the Authentication, Backup, Protection, and
2364 Volume Location Databases), as discussed in <link linkend="HDRWQ52">Replicating the OpenAFS Administrative Databases</link>. For
2365 correct cell functioning, the copies of each database must be identical at all times. To keep the databases synchronized, AFS
2366 uses a library of utilities called <emphasis>Ubik</emphasis>. Each database server process runs an associated lightweight Ubik
2367 process, and client-side programs call Ubik's client-side subroutines when they submit requests to read and change the
2370 <para>Ubik is designed to work with minimal administrator intervention, but there are several configuration requirements, as
2371 detailed in <link linkend="HDRWQ103">Configuring the Cell for Proper Ubik Operation</link>. The following brief overview of
2372 Ubik's operation is helpful for understanding the requirements. For more details, see <link linkend="HDRWQ104">How Ubik
2373 Operates Automatically</link>.</para>
2375 <para>Ubik is designed to distribute changes made in an AFS administrative database to all copies as quickly as possible. Only
2376 one copy of the database, the <emphasis>synchronization site</emphasis>, accepts change requests from clients; the lightweight
2377 Ubik process running there is the <emphasis>Ubik coordinator</emphasis>. To maintain maximum availability, there is a separate
2378 Ubik coordinator for each database, and the synchronization site for each of the four databases can be on a different machine.
2379 The synchronization site for a database can also move from machine to machine in response to process, machine, or network
2382 <para>The other copies of a database, and the Ubik processes that maintain them, are termed <emphasis>secondary</emphasis>.
2383 The secondary sites do not accept database changes directly from client-side programs, but only from the synchronization
2386 <para>After the Ubik coordinator records a change in its copy of a database, it immediately sends the change to the secondary
2387 sites. During the brief distribution period, clients cannot access any of the copies of the database, even for reading. If the
2388 coordinator cannot reach a majority of the secondary sites, it halts the distribution and informs the client that the
2389 attempted change failed.</para>
2391 <para>To avoid distribution failures, the Ubik processes maintain constant contact by exchanging time-stamped messages. As
2392 long as a majority of the secondary sites respond to the coordinator's messages, there is a <emphasis>quorum</emphasis> of
2393 sites that are synchronized with the coordinator. If a process, machine, or network outage breaks the quorum, the Ubik
2394 processes attempt to elect a new coordinator in order to establish a new quorum among the highest possible number of sites.
2395 See <link linkend="HDRWQ106">A Flexible Coordinator Boosts Availability</link>.</para>
2398 <primary>Ubik</primary>
2400 <secondary>requirements summarized</secondary>
2404 <primary>database server process</primary>
2406 <secondary>need to run all on every database server machine</secondary>
2410 <primary>CellServDB file (server)</primary>
2412 <secondary>importance to Ubik operation</secondary>
2415 <sect3 id="HDRWQ103">
2416 <title>Configuring the Cell for Proper Ubik Operation</title>
2418 <para>This section describes how to configure your cell to maintain proper Ubik operation. <itemizedlist>
2420 <para>Run all four database server processes--Authentication Server, Backup Server, Protection Server, and VL
2421 Server--on all database server machines.</para>
2423 <para>Both the client and server portions of Ubik expect that all the database server machines listed in the <emphasis
2424 role="bold">CellServDB</emphasis> file are running all of the database server processes. There is no mechanism for
2425 indicating that only some database server processes are running on a machine.</para>
2429 <para>Maintain correct information in the <emphasis role="bold">/usr/afs/etc/CellServDB</emphasis> file at all
2432 <para>Ubik consults the <emphasis role="bold">/usr/afs/etc/CellServDB</emphasis> file to determine the sites with
2433 which to establish and maintain a quorum. Incorrect information can result in unsynchronized databases or election of
2434 a coordinator in each of several subgroups of machines, because the Ubik processes on various machines do not agree on
2435 which machines need to participate in the quorum.</para>
2437 <para>If you use the Update Server, it is simplest to maintain the <emphasis
2438 role="bold">/usr/afs/etc/CellServDB</emphasis> file on the system control machine, which distributes its copy to all
2439 other server machines. The <emphasis>OpenAFS Quick Beginnings</emphasis> explains how to configure the Update Server.
2442 <para>The only reason to alter the file is when configuring or decommissioning a database server machine. Use the
2443 appropriate <emphasis role="bold">bos</emphasis> commands rather than editing the file by hand. For instructions, see
2444 <link linkend="HDRWQ118">Maintaining the Server CellServDB File</link>. The instructions in <link
2445 linkend="HDRWQ142">Monitoring and Controlling Server Processes</link> for stopping and starting processes remind you
2446 to alter the <emphasis role="bold">CellServDB</emphasis> file when appropriate, as do the instructions in the
2447 <emphasis>OpenAFS Quick Beginnings</emphasis> for installing or decommissioning a database server machine.</para>
2449 <para>(Client processes and the server processes that do not maintain databases also rely on correct information in
2450 the <emphasis role="bold">CellServDB</emphasis> file for proper operation, but their use of the information does not
2451 affect Ubik's operation. See <link linkend="HDRWQ118">Maintaining the Server CellServDB File</link> and <link
2452 linkend="HDRWQ406">Maintaining Knowledge of Database Server Machines</link>.)</para>
2455 <primary>clocks</primary>
2457 <secondary>need to synchronize for Ubik</secondary>
2462 <para>Keep the clocks synchronized on all machines in the cell, especially the database server machines.</para>
2464 <para>Keeping clocks synchronized is important because the Ubik processes at a database's sites timestamp the messages
2465 that they exchange to maintain constant contact. Timestamping the messages is necessary because in a networked
2466 environment it is not safe to assume that a message reaches its destination instantly. Ubik compares the timestamp on
2467 an incoming message with the current time. If the difference is too great, an outage might be preventing reliable
2468 communication between the Ubik sites, which can result in unsynchronized databases. Ubik therefore considers
2469 the message invalid, which can prompt it to attempt election of a different coordinator.</para>
2471 <para>Electing a new coordinator is appropriate if a timestamped message is expired due to actual interruption of
2472 communication, but not if a message appears expired only because the sender and recipient do not share the same time.
2473 For detailed examples of how unsynchronized clocks can destabilize Ubik operation, see <link linkend="HDRWQ105">How
2474 Ubik Uses Timestamped Messages</link>.</para>
2476 </itemizedlist></para>
2479 <primary>Ubik</primary>
2481 <secondary>features summarized</secondary>
2485 <primary>process</primary>
2487 <secondary>lightweight Ubik</secondary>
2491 <primary>Ubik</primary>
2493 <secondary>server and client portions</secondary>
2497 <sect3 id="HDRWQ104">
2498 <title>How Ubik Operates Automatically</title>
2500 <para>The following Ubik features help keep its maintenance requirements to a minimum: <itemizedlist>
2502 <para>Ubik's server and client portions operate automatically.</para>
2504 <para>Each database server process runs a lightweight process to call on the server portion of the Ubik library. It is
2505 common to refer to this lightweight process itself as Ubik. Because it is lightweight, the Ubik process does not
2506 appear in process listings such as those generated by the UNIX <emphasis role="bold">ps</emphasis> command.
2507 Client-side programs that need to read and change the databases directly call the subroutines in the Ubik library's
2508 client portion, rather than running a separate lightweight process. Examples of such programs are the <emphasis
2509 role="bold">klog</emphasis> command and the commands in the <emphasis role="bold">pts</emphasis> suite.</para>
2513 <para>Ubik tracks database version numbers.</para>
2515 <para>As the coordinator records a change to a database, it increments the database's version number. The version
2516 number makes it easy for the coordinator to determine if a site has the most recent version or not. The version number
2517 speeds the return to normal functioning after election of a new coordinator or when communication is restored after an
2518 outage, because it makes it easy to determine which site has the most current database and which need to be
2523 <para>Ubik's use of timestamped messages guarantees that database copies are always synchronized during normal
2526 <para>Replicating a database to increase data availability is pointless unless all copies of the database are
2527 identical. Inconsistent performance can result if clients receive different information depending on which copy of the
2528 database they access. As previously noted, Ubik sites constantly track the status of their peers by exchanging
2529 timestamped messages. For a detailed description, see <link linkend="HDRWQ105">How Ubik Uses Timestamped
2530 Messages</link>.</para>
2534 <para>The ability to move the coordinator maximizes database availability.</para>
2536 <para>Suppose, for example, that in a cell with three database server machines a network partition separates the two
2537 secondary sites from the coordinator. The coordinator retires because it is no longer in contact with a majority of
2538 the sites listed in the <emphasis role="bold">CellServDB</emphasis> file. The two sites on the other side of the
2539 partition can elect a new coordinator among themselves, and it can then accept database changes from clients. If the
2540 coordinator cannot move in this way, the database has to be read-only until the network partition is repaired. For a
2541 detailed description of Ubik's election procedure, see <link linkend="HDRWQ106">A Flexible Coordinator Boosts
2542 Availability</link>.</para>
2544 </itemizedlist></para>
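<para>The version-number comparison described above can be sketched in a few lines of Python. This is an illustration only, not Ubik code: it assumes each site's version can be represented as a single comparable number, and the function name and site names are hypothetical.</para>

```python
def sites_needing_update(versions):
    """Split sites into current and stale by database version.

    versions maps a site name to its database version number
    (a plain integer here, which is a simplifying assumption).
    Returns (newest_version, current_sites, stale_sites).
    """
    newest = max(versions.values())
    current = sorted(s for s, v in versions.items() if v == newest)
    stale = sorted(s for s, v in versions.items() if v != newest)
    return newest, current, stale


# Hypothetical three-site cell in which site "a" missed the latest change.
print(sites_needing_update({"a": 41, "b": 42, "c": 42}))
```

<para>Because only the highest version matters, a newly elected coordinator needs a single pass over the reported versions to decide which copies to refresh.</para>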
2547 <primary>consistency guarantees</primary>
2549 <secondary>administrative databases</secondary>
2553 <primary>Ubik</primary>
2555 <secondary>consistency guarantees</secondary>
2558 <sect4 id="HDRWQ105">
2559 <title>How Ubik Uses Timestamped Messages</title>
2561 <para>Ubik synchronizes the copies of a database by maintaining constant contact between the synchronization site and the
2562 secondary sites. The Ubik coordinator frequently sends a time-stamped <emphasis>guarantee</emphasis> message to each of
2563 the secondary sites. When the secondary site receives the message, it concludes that it is in contact with the
2564 coordinator. It considers its copy of the database to be valid until time <emphasis>T</emphasis>, which is usually 60
2565 seconds from the time the coordinator sent the message. In response, the secondary site returns a
2566 <emphasis>vote</emphasis> message that acknowledges the coordinator as valid until a certain time <emphasis>X</emphasis>, which is usually 120
2567 seconds in the future.</para>
2569 <para>The coordinator sends guarantee messages more frequently than every <emphasis>T</emphasis> seconds, so that the
2570 expiration periods overlap. There is no danger of expiration unless a network partition or other outage actually
2571 interrupts communication. If the guarantee expires, the secondary site's copy of the database is not necessarily current.
2572 Nonetheless, the database server continues to service client requests. It is considered better for overall cell
2573 functioning that a secondary site remains accessible even if the information it is distributing is possibly out of date.
2574 Most of the AFS administrative databases do not change that frequently, in any case, and making a database inaccessible
2575 causes a timeout for clients that happen to access that copy.</para>
2577 <para>As previously mentioned, Ubik's use of timestamped messages makes it vital to synchronize the clocks on database
2578 server machines. There are two ways that skewed clocks can interrupt normal Ubik functioning, depending on which clock is
2579 ahead of the others.</para>
2581 <para>Suppose, for example, that the Ubik coordinator's clock is ahead of the secondary sites: the coordinator's clock
2582 says 9:35:30, but the secondary clocks say 9:31:30. The secondary sites send vote messages that acknowledge the
2583 coordinator as valid until 9:33:30. This is two minutes in the future according to the secondary clocks, but is already in
2584 the past from the coordinator's perspective. The coordinator concludes that it no longer has enough support to remain
2585 coordinator and forces election of a new coordinator. Election takes about three minutes, during which time no copy of the
2586 database accepts changes.</para>
2588 <para>The opposite possibility is that a secondary site's clock (14:50:00) is ahead of the coordinator's (14:46:30). When
2589 the coordinator sends a guarantee message (good until 14:47:30), it has already expired according to the secondary clock.
2590 Believing that it is out of contact with the coordinator, the secondary site stops sending votes for the coordinator and
2591 tries to get itself elected as coordinator. This is appropriate if the coordinator has actually failed, but is inappropriate
2592 when there is no actual outage.</para>
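<para>The two skew scenarios above can be checked with simple clock arithmetic. The following Python sketch is illustrative only: the helper names are hypothetical, network delay is ignored, and the 60-second and 120-second lifetimes are the "usual" values quoted in this section rather than values read from Ubik itself.</para>

```python
from datetime import datetime, timedelta

# Lifetimes quoted as "usually" 60 and 120 seconds in this section;
# treated here as assumptions, not as values read from Ubik itself.
GUARANTEE_LIFETIME = timedelta(seconds=60)
VOTE_LIFETIME = timedelta(seconds=120)


def at(hms):
    """Parse an HH:MM:SS wall-clock reading (hypothetical helper)."""
    return datetime.strptime(hms, "%H:%M:%S")


def vote_is_valid(secondary_clock, coordinator_clock):
    """A vote expires VOTE_LIFETIME after the secondary stamps it;
    the coordinator judges expiry against its own clock."""
    return secondary_clock + VOTE_LIFETIME > coordinator_clock


def guarantee_is_valid(coordinator_clock, secondary_clock):
    """A guarantee expires GUARANTEE_LIFETIME after the coordinator
    stamps it; the secondary judges expiry against its own clock."""
    return coordinator_clock + GUARANTEE_LIFETIME > secondary_clock


# Coordinator ahead: the vote is good until 09:33:30, already past at 09:35:30.
print(vote_is_valid(at("09:31:30"), at("09:35:30")))
# Secondary ahead: the guarantee is good until 14:47:30, already past at 14:50:00.
print(guarantee_is_valid(at("14:46:30"), at("14:50:00")))
```

<para>Both calls report an expired message even though no outage has occurred, which is why clock skew alone can trigger an unnecessary election.</para>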
2594 <para>The attempt of a single secondary site to get elected as the new coordinator usually does not affect the performance
2595 of the other sites. As long as their clocks agree with the coordinator's, they ignore the other secondary site's request
2596 for votes and continue voting for the current coordinator. However, if enough of the secondary sites' clocks get ahead of
2597 the coordinator's, they can force election of a new coordinator even though the current one is actually working
2601 <primary>Ubik</primary>
2603 <secondary>election of coordinator</secondary>
2607 <primary>coordinator (Ubik)</primary>
2609 <secondary>election procedure described</secondary>
2613 <primary>election of Ubik coordinator</primary>
2617 <primary>flexible synchronization site (Ubik)</primary>
2621 <primary>synchronization site (Ubik)</primary>
2623 <secondary>flexibility</secondary>
2627 <primary>Ubik</primary>
2629 <secondary>majority defined</secondary>
2633 <primary>majority</primary>
2635 <secondary>defined for Ubik</secondary>
2639 <primary>outages</primary>
2641 <secondary>due to Ubik election</secondary>
2645 <primary>system outages</primary>
2647 <secondary>due to Ubik election</secondary>
2651 <sect4 id="HDRWQ106">
2652 <title>A Flexible Coordinator Boosts Availability</title>
2654 <para>Ubik uses timestamped messages to determine when coordinator election is necessary, just as it does to keep the
2655 database copies synchronized. As long as the coordinator receives vote messages from a majority of the sites (it
2656 implicitly votes for itself), it is appropriate for it to continue as coordinator because it is successfully distributing
2657 database changes. A majority is defined as more than 50% of all database sites when there is an odd number of sites; with
2658 an even number of sites, the site with the lowest Internet address has an extra vote for breaking ties as necessary. If the
2659 coordinator is not receiving sufficient votes, it retires and the Ubik sites elect a new coordinator. This does not happen
2660 spontaneously, but only when the coordinator really fails or stops receiving a majority of the votes. The secondary sites
2661 have a built-in bias to continue voting for an existing coordinator, which prevents undue elections.</para>
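<para>The majority rule stated above can be expressed as a short Python sketch. It models only the rule as this section describes it (more than half of the sites, with the lowest-addressed site breaking an exact tie); it is not taken from the Ubik source, and the function name and the sample addresses are hypothetical.</para>

```python
import ipaddress


def has_majority(all_sites, voting_sites):
    """Return True if voting_sites form a majority of all_sites.

    all_sites: every database site address (as listed in CellServDB).
    voting_sites: sites currently voting for the coordinator,
                  including the coordinator itself.
    """
    sites = {ipaddress.ip_address(s) for s in all_sites}
    if not sites:
        return False
    votes = {ipaddress.ip_address(v) for v in voting_sites}.intersection(sites)
    if 2 * len(votes) > len(sites):
        return True
    # Exact tie (only possible with an even number of sites):
    # the lowest-addressed site carries the tie-breaking extra vote.
    if 2 * len(votes) == len(sites):
        return min(sites) in votes
    return False


four = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]
print(has_majority(four, ["192.0.2.1", "192.0.2.2"]))  # tie, lowest address present
print(has_majority(four, ["192.0.2.3", "192.0.2.4"]))  # tie, lowest address absent
```

<para>Addresses are compared numerically through the <emphasis role="bold">ipaddress</emphasis> module rather than as strings, so that 192.0.2.9 correctly sorts below 192.0.2.10.</para>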
2663 <para>The election of the new coordinator is by majority vote. The Ubik subprocesses have a bias to vote for the site with
2664 the lowest Internet address, which helps it gather the necessary majority more quickly than if all the sites were competing to
2665 receive votes themselves. During the election (which normally lasts less than three minutes), clients can read information
2666 from the database, but cannot make any changes.</para>
2668 <para>Ubik's election procedure makes it possible for each database server process's coordinator to be on a different
2669 machine. For example, if the Ubik coordinators for all four processes start out on machine A and the Protection Server on
2670 machine A fails for some reason, then a different site (say machine B) must be elected as the new Protection Database Ubik
2671 coordinator. Machine B remains the coordinator for the Protection Database even after the Protection Server on machine A
2672 is working again. The failure of the Protection Server has no effect on the Authentication, Backup, or VL Servers, so
2673 their coordinators remain on machine A.</para>
2678 <sect2 id="HDRWQ107">
2679 <title>Backing Up and Restoring the Administrative Databases</title>
2681 <para>The AFS administrative databases store information that is critical for AFS operation in your cell. If a database
2682 becomes corrupted due to a hardware failure or other problem on a database server machine, it is likely to be difficult and
2683 time-consuming to recreate all of the information from scratch. To protect yourself against loss of data, back up the
2684 administrative databases to a permanent medium, such as tape, on a regular basis. The recommended method is to use a standard
2685 local disk backup utility such as the UNIX <emphasis role="bold">tar</emphasis> command.</para>
2687 <para>When deciding how often to back up a database, consider the amount of data that you are willing to recreate by hand if
2688 it becomes necessary to restore the database from a backup copy. In most cells, the databases differ quite a bit in how often
2689 and how much they change. Changes to the Authentication Database are probably the least frequent, and consist mostly of
2690 changed user passwords. Protection Database and VLDB changes are probably more frequent, as users add or delete groups and
2691 change group memberships, and as you and other administrators create or move volumes. The number and frequency of changes is
2692 probably greatest in the Backup Database, particularly if you perform backups every day.</para>
2694 <para>The ease with which you can recapture lost changes also differs for the different databases: <itemizedlist>
2696 <para>If regular users make a large proportion of the changes to the Authentication Database and Protection Database in
2697 your cell, then recovering them possibly requires a large amount of detective work and interviewing of users, assuming
2698 that they can even remember what changes they made at what time.</para>
2702 <para>Recovering lost changes to the VLDB is more straightforward, because you can use the <emphasis role="bold">vos
2703 syncserv</emphasis> and <emphasis role="bold">vos syncvldb</emphasis> commands to correct any discrepancies between the
2704 VLDB and the actual state of volumes on server machines. Running these commands can be time-consuming, however.</para>
2708 <para>The configuration information in the Backup Database (Tape Coordinator port offsets, volume sets and entries, the
2709 dump hierarchy, and so on) probably does not change that often, in which case it is not that hard to recover a few
2710 recent changes. In contrast, there are likely to be a large number of new dump records resulting from dump operations.
2711 You can recover these records by using the <emphasis role="bold">-dbadd</emphasis> argument to the <emphasis
2712 role="bold">backup scantape</emphasis> command, reading in information from the backup tapes themselves. This can take a
2713 long time and require numerous tape changes, however, depending on how much data you back up in your cell and how you
2714 append dumps. Furthermore, the <emphasis role="bold">backup scantape</emphasis> command is subject to several
2715 restrictions. The most basic is that it halts if it finds that an existing dump record in the database has the same dump
2716 ID number as a dump on the tape it is scanning. If you want to continue with the scanning operation, you must locate and
2717 remove the existing record from the database. For further discussion, see the <emphasis role="bold">backup
2718 scantape</emphasis> command's reference page in the <emphasis>OpenAFS Administration Reference</emphasis>.</para>
2720 </itemizedlist></para>
2722 <para>These differences between the databases possibly suggest backing up the databases at different frequencies, ranging from
2723 every few days or weekly for the Backup Database to every few weeks for the Authentication Database. On the other hand, it is
2724 probably simpler from a logistical standpoint to back them all up at the same time (and frequently), particularly if tape
2725 consumption is not a major concern. Also, it is not generally necessary to keep backup copies of the databases for a long
2726 time, so you can recycle the tapes fairly frequently.</para>
2729 <primary>administrative database</primary>
2731 <secondary>backing up</secondary>
2735 <primary>backing up</primary>
2737 <secondary>administrative databases</secondary>
2741 <sect2 id="HDRWQ108">
2742 <title>To back up the administrative databases</title>
2746 <para>Log in as the local superuser <emphasis role="bold">root</emphasis> on a database server machine that is not the
2747 synchronization site. The machine with the highest IP address is normally the best choice, since it is least likely to
2748 become the synchronization site in an election.</para>
2751 <listitem id="LIDBBK_SHUTDOWN">
2752 <para>Issue the <emphasis role="bold">bos shutdown</emphasis> command to shut down the
2753 relevant server process on the local machine. For a complete description of the command, see <link linkend="HDRWQ168">To
2754 stop processes temporarily</link>.</para>
2756 <para>For the <emphasis role="bold">-instance</emphasis> argument, specify one or more database server process names
2757 (<emphasis role="bold">buserver</emphasis> for the Backup Server, <emphasis role="bold">kaserver</emphasis> for the
2758 Authentication Server, <emphasis role="bold">ptserver</emphasis> for the Protection Server, or <emphasis
2759 role="bold">vlserver</emphasis> for the Volume Location Server). Include the <emphasis role="bold">-localauth</emphasis>
2760 flag because you are logged in as the local superuser <emphasis role="bold">root</emphasis> but do not necessarily have
2761 administrative tokens.</para>
2764 # <emphasis role="bold">bos shutdown</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">-instance</emphasis> &lt;<replaceable>instances</replaceable>&gt;+ <emphasis
2765 role="bold">-localauth</emphasis> [<emphasis role="bold">-wait</emphasis>]
2770 <para>Use a local disk backup utility, such as the UNIX <emphasis role="bold">tar</emphasis> command, to transfer one or
2771 more database files to tape. If the local database server machine does not have a tape device attached, use a remote copy
2772 command to transfer the file to a machine with a tape device, then use the <emphasis role="bold">tar</emphasis> command
2775 <para>The following command sequence backs up the complete contents of the <emphasis role="bold">/usr/afs/db</emphasis>
2779 # <emphasis role="bold">cd /usr/afs/db</emphasis>
2780 # <emphasis role="bold">tar cvf</emphasis> tape_device <emphasis role="bold">.</emphasis>
2783 <para>To back up individual database files, substitute their names for the period in the preceding <emphasis
2784 role="bold">tar</emphasis> command: <itemizedlist>
2786 <para><emphasis role="bold">bdb.DB0</emphasis> for the Backup Database</para>
2790 <para><emphasis role="bold">kaserver.DB0</emphasis> for the Authentication Database</para>
2794 <para><emphasis role="bold">prdb.DB0</emphasis> for the Protection Database</para>
2798 <para><emphasis role="bold">vldb.DB0</emphasis> for the VLDB</para>
2800 </itemizedlist></para>
2804 <para>Issue the <emphasis role="bold">bos start</emphasis> command to restart the server processes on the local machine.
2805 For a complete description of the command, see <link linkend="HDRWQ166">To start processes by changing their status flags
2806 to Run</link>. Provide the same values for the <emphasis role="bold">-instance</emphasis> argument as in Step <link
2807 linkend="LIDBBK_SHUTDOWN">2</link>, and the <emphasis role="bold">-localauth</emphasis> flag for the same reason.
2809 # <emphasis role="bold">bos start</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">-instance</emphasis> &lt;<replaceable>server process name</replaceable>&gt;+ <emphasis
2810 role="bold">-localauth</emphasis>
2811 </programlisting></para>
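<para>The backup steps above can be summarized as a command plan. The following Python sketch only builds the command lines and does not run them; the function name, the machine name <emphasis role="bold">fs4.example.com</emphasis>, and the tape device path are hypothetical, while the command arguments and database file names are the ones given in this section.</para>

```python
import shlex

# Database file backed up for each server process, as listed in this section.
DB_FILES = {
    "buserver": "bdb.DB0",
    "kaserver": "kaserver.DB0",
    "ptserver": "prdb.DB0",
    "vlserver": "vldb.DB0",
}


def backup_plan(machine, instances, tape_device):
    """Return the shell command lines, in order, for backing up the
    databases of the named server processes on one machine.

    Raises KeyError if an instance is not a database server process.
    Nothing is executed; review the lines before running them as root.
    """
    files = [DB_FILES[name] for name in instances]
    steps = [
        ["bos", "shutdown", machine, "-instance", *instances, "-localauth", "-wait"],
        ["cd", "/usr/afs/db"],
        ["tar", "cvf", tape_device, *files],
        ["bos", "start", machine, "-instance", *instances, "-localauth"],
    ]
    return [shlex.join(step) for step in steps]


# Hypothetical machine and tape device names.
for line in backup_plan("fs4.example.com", ["ptserver"], "/dev/tape0"):
    print(line)
```

<para>Keeping the shutdown and start commands in one plan makes it harder to forget the restart, which would otherwise leave the machine out of the quorum for that database.</para>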
2816 <primary>administrative database</primary>
2818 <secondary>restoring</secondary>
2822 <primary>restoring</primary>
2824 <secondary>administrative databases</secondary>
2828 <sect2 id="HDRWQ109">
2829 <title>To restore an administrative database</title>
2833 <para>Log in as the local superuser <emphasis role="bold">root</emphasis> on each database server machine in the
2837 <listitem id="LIDBREST_SHUTDOWN">
2838 <para>Working on one of the machines, issue the <emphasis role="bold">bos
2839 shutdown</emphasis> command once for each database server machine, to shut down the relevant server process on all of
2840 them. For a complete description of the command, see <link linkend="HDRWQ168">To stop processes temporarily</link>.</para>
2842 <para>For the <emphasis role="bold">-instance</emphasis> argument, specify one or more database server process names
2843 (<emphasis role="bold">buserver</emphasis> for the Backup Server, <emphasis role="bold">kaserver</emphasis> for the
2844 Authentication Server, <emphasis role="bold">ptserver</emphasis> for the Protection Server, or <emphasis
2845 role="bold">vlserver</emphasis> for the Volume Location Server). Include the <emphasis role="bold">-localauth</emphasis>
2846 flag because you are logged in as the local superuser <emphasis role="bold">root</emphasis> but do not necessarily have
2847 administrative tokens.</para>
2850 # <emphasis role="bold">bos shutdown</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">-instance</emphasis> &lt;<replaceable>instances</replaceable>&gt;+ <emphasis
2851 role="bold">-localauth</emphasis> [<emphasis role="bold">-wait</emphasis>]
2856 <para>Remove the database from each database server machine, by issuing the following commands on each one.
2858 # <emphasis role="bold">cd /usr/afs/db</emphasis>
2859 </programlisting></para>
2861 <para>For the Backup Database:</para>
2864 # <emphasis role="bold">rm bdb.DB0</emphasis>
2865 # <emphasis role="bold">rm bdb.DBSYS1</emphasis>
2868 <para>For the Authentication Database:</para>
2871 # <emphasis role="bold">rm kaserver.DB0</emphasis>
2872 # <emphasis role="bold">rm kaserver.DBSYS1</emphasis>
2875 <para>For the Protection Database:</para>
2878 # <emphasis role="bold">rm prdb.DB0</emphasis>
2879 # <emphasis role="bold">rm prdb.DBSYS1</emphasis>
2882 <para>For the VLDB:</para>
2885 # <emphasis role="bold">rm vldb.DB0</emphasis>
2886 # <emphasis role="bold">rm vldb.DBSYS1</emphasis>
2891 <para>Using the local disk backup utility that you used to back up the database, copy the most recently backed-up version
2892 of it to the appropriate file on the database server machine with the lowest IP address. The following is an appropriate
2893 <emphasis role="bold">tar</emphasis> command if the synchronization site has a tape device attached: <programlisting>
2894 # <emphasis role="bold">cd /usr/afs/db</emphasis>
2895 # <emphasis role="bold">tar xvf</emphasis> tape_device database_file
2896 </programlisting></para>
2898 <para>where <emphasis>database_file</emphasis> is one of the following: <itemizedlist>
2900 <para><emphasis role="bold">bdb.DB0</emphasis> for the Backup Database</para>
2904 <para><emphasis role="bold">kaserver.DB0</emphasis> for the Authentication Database</para>
2908 <para><emphasis role="bold">prdb.DB0</emphasis> for the Protection Database</para>
2912 <para><emphasis role="bold">vldb.DB0</emphasis> for the VLDB</para>
2914 </itemizedlist></para>
2918 <para>Working on one of the machines, issue the <emphasis role="bold">bos start</emphasis> command to restart the server
2919 process on each of the database server machines in turn. Start with the machine with the lowest IP address, which becomes
2920 the synchronization site for the restored database. Wait for it to establish itself as the synchronization site before
2921 repeating the command to restart the process on the other database server machines. For a complete description of the
2922 command, see <link linkend="HDRWQ166">To start processes by changing their status flags to Run</link>. Provide the same
2923 values for the <emphasis role="bold">-instance</emphasis> argument as in Step <link linkend="LIDBREST_SHUTDOWN">2</link>,
2924 and the <emphasis role="bold">-localauth</emphasis> flag for the same reason. <programlisting>
2925 # <emphasis role="bold">bos start</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">-instance</emphasis> &lt;<replaceable>server process name</replaceable>&gt;+ <emphasis
2926 role="bold">-localauth</emphasis>
2927 </programlisting></para>
2931 <para>If the database has changed since you last backed it up, issue the appropriate commands from the instructions in the
2932 indicated sections to recreate the information in the restored database. If issuing <emphasis role="bold">pts</emphasis>
2933 commands, you must first obtain administrative tokens. The <emphasis role="bold">backup</emphasis> and <emphasis
2934 role="bold">vos</emphasis> commands accept the <emphasis role="bold">-localauth</emphasis> flag if you are logged in as
2935 the local superuser <emphasis role="bold">root</emphasis>, so you do not need administrative tokens. The Authentication
2936 Server always performs a separate authentication anyway, so you only need to include the <emphasis
2937 role="bold">-admin</emphasis> argument if issuing <emphasis role="bold">kas</emphasis> commands. <itemizedlist>
2939 <para>To define or remove volume sets and volume entries in the Backup Database, see <link
2940 linkend="HDRWQ265">Defining and Displaying Volume Sets and Volume Entries</link>.</para>
2944 <para>To edit the dump hierarchy in the Backup Database, see <link linkend="HDRWQ267">Defining and Displaying the
2945 Dump Hierarchy</link>.</para>
2949 <para>To define or remove Tape Coordinator port offset entries in the Backup Database, see <link
2950 linkend="HDRWQ261">Configuring Tape Coordinator Machines and Tape Devices</link>.</para>
2954 <para>To restore dump records in the Backup Database, see <link linkend="HDRWQ305">To scan the contents of a
2959 <para>To recreate Authentication Database entries or password changes for users, see the appropriate section of
2960 <link linkend="HDRWQ491">Administering User Accounts</link>.</para>
2964 <para>To recreate Protection Database entries or group membership information, see the appropriate section of <link
2965 linkend="HDRWQ531">Administering the Protection Database</link>.</para>
2969 <para>To synchronize the VLDB with volume headers, see <link linkend="HDRWQ227">Synchronizing the VLDB and Volume
2970 Headers</link>.</para>
2972 </itemizedlist></para>
2977 <primary>installing</primary>
2979 <secondary>server process binaries, about</secondary>
2983 <primary>server process binaries</primary>
2985 <secondary>installing</secondary>
2989 <primary>BOS Server</primary>
2991 <secondary>maintainer of server process binaries</secondary>
2995 <primary>server process</primary>
2997 <secondary>binaries</secondary>
2999 <see>server process binaries</see>
3003 <primary>directories (server)</primary>
3005 <secondary>/usr/afs/bin</secondary>
3009 <primary>server machine</primary>
3011 <secondary>need for consistent version of software</secondary>
3016 <sect1 id="HDRWQ110">
3017 <title>Installing Server Process Software</title>
3019 <para>This section explains how to install new server process binaries on file server machines, how to revert to a previous
version if the current version is not working properly, and how to install new disks to house AFS volumes on a file server
machine.</para>
3023 <para>The most frequent reason to replace a server process's binaries is to upgrade AFS to a new version. In general,
3024 installation instructions accompany the updated software, but this chapter provides an additional reference.</para>
3026 <para>Each AFS server machine must store the server process binaries in a local disk directory, called <emphasis
3027 role="bold">/usr/afs/bin</emphasis> by convention. For predictable system performance, it is best that all server machines run
3028 the same build level, or at least the same version, of the server software. For instructions on checking AFS build level, see
3029 <link linkend="HDRWQ117">Displaying A Binary File's Build Level</link>.</para>
3031 <para>The Update Server makes it easy to distribute a consistent version of software to all server machines. You designate one
3032 server machine of each system type as the <emphasis>binary distribution machine</emphasis> by running the server portion of the
3033 Update Server (<emphasis role="bold">upserver</emphasis> process) on it. All other server machines of that system type run the
3034 client portion of the Update Server (<emphasis role="bold">upclientbin</emphasis> process) to retrieve updated software from the
3035 binary distribution machine. The <emphasis>OpenAFS Quick Beginnings</emphasis> explains how to install the appropriate
3036 processes. For more on binary distribution machines, see <link linkend="HDRWQ93">Binary Distribution Machines</link>.</para>
3038 <para>When you use the Update Server, you install new binaries on binary distribution machines only. If you install binaries
3039 directly on a machine that is running the <emphasis role="bold">upclientbin</emphasis> process, they are overwritten the next
3040 time the process compares the contents of the local <emphasis role="bold">/usr/afs/bin</emphasis> directory to the contents on
the binary distribution machine, by default within five minutes.</para>
3043 <para>The following instructions explain how to use the appropriate commands from the <emphasis role="bold">bos</emphasis> suite
3044 to install and uninstall server binaries.</para>
3047 <primary>installing</primary>
3049 <secondary>server binaries</secondary>
3053 <primary>server process binaries</primary>
3055 <secondary>installing</secondary>
3059 <primary>command suite</primary>
3061 <secondary>binaries</secondary>
3063 <tertiary>installing</tertiary>
3067 <primary>file server machine</primary>
3069 <secondary>installing command and process binaries</secondary>
3073 <primary>server process</primary>
3075 <secondary>restarting for changed binaries</secondary>
3078 <sect2 id="HDRWQ111">
3079 <title>Installing New Binaries</title>
3081 <para>An AFS server process does not automatically switch to a new process binary file as soon as it is installed in the
3082 <emphasis role="bold">/usr/afs/bin</emphasis> directory. The process continues to use the previous version of the binary file
3083 until it (the process) next restarts. By default, the BOS Server restarts processes for which there are new binary files every
3084 day at 5:00 a.m., as specified in the <emphasis role="bold">/usr/afs/local/BosConfig</emphasis> file. To display or change
3085 this <emphasis>binary restart time</emphasis>, use the <emphasis role="bold">bos getrestart</emphasis> and <emphasis
3086 role="bold">bos setrestart</emphasis> commands, as described in <link linkend="HDRWQ171">Setting the BOS Server's Restart
3087 Times</link>.</para>
3089 <para>You can force the server machine to start using new server process binaries immediately by issuing the <emphasis
3090 role="bold">bos restart</emphasis> command as described in the following instructions.</para>
3092 <para>You do not need to restart processes when you install new command suite binaries. The new binary is invoked
3093 automatically the next time a command from the suite is issued.</para>
3096 <primary>file extension</primary>
3098 <secondary>.BAK</secondary>
3102 <primary>file extension</primary>
3104 <secondary>.OLD</secondary>
3108 <primary>BAK version of binary file</primary>
3110 <secondary>created by bos install command</secondary>
3114 <primary>OLD version of binary file</primary>
3116 <secondary>created by bos install command</secondary>
3120 <primary>saving</primary>
3122 <secondary>previous version of server binaries</secondary>
3125 <para>When you use the <emphasis role="bold">bos install</emphasis> command, the BOS Server automatically saves the current
3126 version of a binary file by adding a <emphasis role="bold">.BAK</emphasis> extension to its name. It renames the current
3127 <emphasis role="bold">.BAK</emphasis> version, if any, to the <emphasis role="bold">.OLD</emphasis> version, if there is no
3128 <emphasis role="bold">.OLD</emphasis> version already. If there is a current <emphasis role="bold">.OLD</emphasis> version,
3129 the current <emphasis role="bold">.BAK</emphasis> version must be at least seven days old to replace it.</para>
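<para>The rotation rule just described can be sketched as a small shell function that works on a scratch directory. This is
only an illustration of the rule as stated in this section, not the BOS Server's implementation. The function name
<emphasis role="bold">rotate_binary</emphasis> and its arguments are hypothetical, and the age of the <emphasis
role="bold">.BAK</emphasis> file is passed in as a number of days rather than read from the file system.</para>

<programlisting><![CDATA[
```shell
#!/bin/sh
# Sketch of the rotation rule described above; NOT the BOS Server's code.
# rotate_binary DIR NAME NEWFILE BAK_AGE_DAYS
rotate_binary() {
    dir=$1 name=$2 new=$3 bak_age_days=$4
    cur="$dir/$name"
    if [ -e "$cur.BAK" ]; then
        # Promote .BAK to .OLD if there is no .OLD yet, or if the
        # .BAK version is at least seven days old.
        if [ ! -e "$cur.OLD" ] || [ "$bak_age_days" -ge 7 ]; then
            mv -f "$cur.BAK" "$cur.OLD"
        fi
    fi
    # Save the current version as .BAK, then put the new file in place.
    if [ -e "$cur" ]; then
        mv -f "$cur" "$cur.BAK"
    fi
    cp "$new" "$cur"
}
```
]]></programlisting>

<para>Under this rule, installing a fourth version while the <emphasis role="bold">.BAK</emphasis> file is less than seven
days old leaves the first version as <emphasis role="bold">.OLD</emphasis> and saves the third version as <emphasis
role="bold">.BAK</emphasis>; the second version is no longer available.</para>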
3131 <para>It is best to store AFS binaries in the <emphasis role="bold">/usr/afs/bin</emphasis> directory, because that is the
3132 only directory the BOS Server automatically checks for new binaries. You can, however, use the <emphasis role="bold">bos
3133 install</emphasis> command's <emphasis role="bold">-dir</emphasis> argument to install non-AFS binaries into other directories
3134 on a server machine's local disk. See the command's reference page in the <emphasis>OpenAFS Administration
3135 Reference</emphasis> for further information.</para>
3138 <primary>bos commands</primary>
3140 <secondary>install</secondary>
3144 <primary>commands</primary>
3146 <secondary>bos install</secondary>
3150 <sect2 id="Header_130">
3151 <title>To install new server binaries</title>
3155 <para>Verify that you are listed in the <emphasis role="bold">/usr/afs/etc/UserList</emphasis> file. If necessary, issue
3156 the <emphasis role="bold">bos listusers</emphasis> command, which is fully described in <link linkend="HDRWQ593">To
3157 display the users in the UserList file</link>. <programlisting>
% <emphasis role="bold">bos listusers</emphasis> &lt;<replaceable>machine name</replaceable>&gt;
3159 </programlisting></para>
3163 <para>Verify that the binaries are available in the source directory from which you are installing them. If the machine is
3164 also an AFS client, you can retrieve the binaries from a central directory in AFS. Otherwise, you can obtain them directly
3165 from the AFS distribution media, from a local disk directory where you previously installed them, or from a remote machine
3166 using a transfer utility such as the <emphasis role="bold">ftp</emphasis> command.</para>
3169 <listitem id="LIWQ112">
3170 <para>Issue the <emphasis role="bold">bos install</emphasis> command for the binary distribution
3171 machine. (If you have forgotten which machine is performing that role, see <link linkend="HDRWQ97">To locate the binary
3172 distribution machine for a system type</link>.) <programlisting>
% <emphasis role="bold">bos install</emphasis> &lt;<replaceable>machine name</replaceable>&gt; &lt;<replaceable>files to install</replaceable>&gt;+
3174 </programlisting></para>
3176 <para>where <variablelist>
3178 <term><emphasis role="bold">i</emphasis></term>
3181 <para>Is the shortest acceptable abbreviation of <emphasis role="bold">install</emphasis>.</para>
3186 <term><emphasis role="bold">machine name</emphasis></term>
3189 <para>Names the binary distribution machine.</para>
3194 <term><emphasis role="bold">files to install</emphasis></term>
3197 <para>Names each binary file to install into the local <emphasis role="bold">/usr/afs/bin</emphasis> directory.
3198 Partial pathnames are interpreted relative to the current working directory. The last element in each pathname
3199 (the filename itself) matches the name of the file it is replacing, such as <emphasis
3200 role="bold">bosserver</emphasis> or <emphasis role="bold">volserver</emphasis> for server processes, <emphasis
3201 role="bold">bos</emphasis> or <emphasis role="bold">vos</emphasis> for commands.</para>
3203 <para>Each AFS server process other than the <emphasis role="bold">fs</emphasis> process uses a single binary
3204 file. The <emphasis role="bold">fs</emphasis> process uses three binary files: <emphasis
3205 role="bold">fileserver</emphasis>, <emphasis role="bold">volserver</emphasis>, and <emphasis
3206 role="bold">salvager</emphasis>. Installing a new version of one component does not necessarily mean that you need
3207 to replace all three.</para>
3210 </variablelist></para>
3214 <para>Repeat Step <link linkend="LIWQ112">3</link> for each binary distribution machine.</para>
3218 <para><emphasis role="bold">(Optional)</emphasis> If you want to restart processes to use the new binaries immediately,
3219 wait until the <emphasis role="bold">upclientbin</emphasis> process retrieves them from the binary distribution machine.
3220 You can verify the timestamps on binary files by using the <emphasis role="bold">bos getdate</emphasis> command as
3221 described in <link linkend="HDRWQ115">Displaying Binary Version Dates</link>. When the binary files are available on each
3222 server machine, issue the <emphasis role="bold">bos restart</emphasis> command, for which complete instructions appear in
3223 <link linkend="HDRWQ170">Stopping and Immediately Restarting Processes</link>.</para>
3225 <para>If you are working on an AFS client machine, it is a wise precaution to have a copy of the <emphasis
3226 role="bold">bos</emphasis> command suite binaries on the local disk before restarting server processes. In the
3227 conventional configuration, the <emphasis role="bold">/usr/afsws/bin</emphasis> directory that houses the <emphasis
3228 role="bold">bos</emphasis> command binary on client machines is a symbolic link into AFS, which conserves local disk
3229 space. However, restarting certain processes (particularly the database server processes) can make the AFS filespace
3230 inaccessible, particularly if a problem arises during the restart. Having a local copy of the <emphasis
3231 role="bold">bos</emphasis> binary enables you to uninstall or reinstall process binaries or restart processes even in this
3232 case. Use the <emphasis role="bold">cp</emphasis> command to copy the <emphasis role="bold">bos</emphasis> command binary
3233 from the <emphasis role="bold">/usr/afsws/bin</emphasis> directory to a local directory such as <emphasis
3234 role="bold">/tmp</emphasis>.</para>
<para>Restarting a process causes a service outage. It is best to perform the restart at times of low system usage if
possible.</para>

<programlisting>
% <emphasis role="bold">bos restart</emphasis> &lt;<replaceable>machine name</replaceable>&gt; &lt;<replaceable>instances</replaceable>&gt;+
</programlisting>
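<para>The order of operations in this procedure can be sketched as a helper that prints the commands without running them.
The function name <emphasis role="bold">print_upgrade_plan</emphasis>, the machine names, and the choice of the <emphasis
role="bold">fs</emphasis> instance are assumptions made for illustration; substitute the machines, files, and process
instances that apply in your cell, and review the output before issuing any of the commands.</para>

<programlisting><![CDATA[
```shell
#!/bin/sh
# Hypothetical helper: prints the bos commands for an upgrade, one per line.
# It only echoes them; nothing is executed against any server.
# print_upgrade_plan "DIST_MACHINES" "FILES" "ALL_MACHINES" "INSTANCES"
print_upgrade_plan() {
    dist_machines=$1 files=$2 all_machines=$3 instances=$4
    # Install only on the binary distribution machines.
    for m in $dist_machines; do
        echo "bos install $m $files"
    done
    # Check that the new files have arrived on every server machine.
    for m in $all_machines; do
        echo "bos getdate $m $files"
    done
    # Restart the affected process instances on every server machine.
    for m in $all_machines; do
        echo "bos restart $m $instances"
    done
}

# Example with placeholder machine names.
print_upgrade_plan "fs1.example.com" "fileserver volserver salvager" \
    "fs1.example.com fs2.example.com" "fs"
```
]]></programlisting>

<para>Printing the plan first makes it easier to confirm that the <emphasis role="bold">bos install</emphasis> commands
name only binary distribution machines before anything is changed.</para>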
3246 <primary>uninstalling</primary>
3248 <secondary>server process and command suite binaries</secondary>
3252 <primary>reverting</primary>
3254 <secondary>to old version of server process and command binaries</secondary>
3258 <primary>server process binaries</primary>
3260 <secondary>uninstalling</secondary>
3264 <primary>server process binaries</primary>
3266 <secondary>reverting to old version</secondary>
3270 <primary>command suite</primary>
3272 <secondary>binaries</secondary>
3274 <tertiary>uninstalling</tertiary>
3278 <primary>server machine</primary>
<secondary>uninstalling command &amp; process binaries</secondary>
3284 <primary>BAK version of binary file</primary>
3286 <secondary>used by bos uninstall command</secondary>
3290 <primary>OLD version of binary file</primary>
3292 <secondary>used by bos uninstall command</secondary>
3296 <sect2 id="HDRWQ113">
3297 <title>Reverting to the Previous Version of Binaries</title>
3299 <para>In rare cases, installing a new binary can cause problems serious enough to require reverting to the previous version.
Just as with installing binaries, consistent system performance requires reverting every server machine to the same
version. Issue the <emphasis role="bold">bos uninstall</emphasis> command described here on each binary distribution
machine.</para>
3304 <para>When you use the <emphasis role="bold">bos uninstall</emphasis> command, the BOS Server discards the current version of
3305 a binary file and promotes the <emphasis role="bold">.BAK</emphasis> version of the file by removing the extension. It renames
3306 the current <emphasis role="bold">.OLD</emphasis> version, if any, to <emphasis role="bold">.BAK</emphasis>.</para>
3308 <para>If there is no current <emphasis role="bold">.BAK</emphasis> version, the <emphasis role="bold">bos uninstall</emphasis>
3309 command operation fails and generates an error message. If a <emphasis role="bold">.OLD</emphasis> version still exists, issue
3310 the <emphasis role="bold">mv</emphasis> command to rename it to <emphasis role="bold">.BAK</emphasis> before reissuing the
3311 <emphasis role="bold">bos uninstall</emphasis> command.</para>
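<para>The revert rule described above, including the failure when no <emphasis role="bold">.BAK</emphasis> version exists,
can be sketched as follows. This is an illustration of the rule as stated in this section and not the BOS Server's code;
the function name <emphasis role="bold">revert_binary</emphasis> is hypothetical and it is meant to be tried on a scratch
directory only.</para>

<programlisting><![CDATA[
```shell
#!/bin/sh
# Sketch of the revert rule described above; NOT the BOS Server's code.
# revert_binary DIR NAME
revert_binary() {
    cur="$1/$2"
    # Without a .BAK version there is nothing to revert to.
    if [ ! -e "$cur.BAK" ]; then
        echo "revert_binary: no .BAK version of $2" >&2
        return 1
    fi
    # Discard the current version by promoting .BAK over it.
    mv -f "$cur.BAK" "$cur"
    # Move the .OLD version, if any, into the .BAK slot.
    if [ -e "$cur.OLD" ]; then
        mv -f "$cur.OLD" "$cur.BAK"
    fi
    return 0
}
```
]]></programlisting>

<para>Starting with current, <emphasis role="bold">.BAK</emphasis>, and <emphasis role="bold">.OLD</emphasis> versions, the
sketch succeeds twice and fails on the third call, because by then no <emphasis role="bold">.BAK</emphasis> version
remains.</para>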
3313 <para>Just as when you install new binaries, the server processes do not start using a reverted version immediately.
3314 Presumably you are reverting because the current binaries do not work, so the following instructions have you restart the
3315 relevant processes.</para>
3318 <primary>bos commands</primary>
3320 <secondary>uninstall</secondary>
3324 <primary>commands</primary>
<secondary>bos uninstall</secondary>
3330 <sect2 id="Header_132">
3331 <title>To revert to the previous version of binaries</title>
3335 <para>Verify that you are listed in the <emphasis role="bold">/usr/afs/etc/UserList</emphasis> file. If necessary, issue
3336 the <emphasis role="bold">bos listusers</emphasis> command, which is fully described in <link linkend="HDRWQ593">To
3337 display the users in the UserList file</link>. <programlisting>
% <emphasis role="bold">bos listusers</emphasis> &lt;<replaceable>machine name</replaceable>&gt;
3339 </programlisting></para>
<para>Verify that the <emphasis role="bold">.BAK</emphasis> version of each relevant binary is available in the <emphasis
role="bold">/usr/afs/bin</emphasis> directory on each binary distribution machine. To check, you can use the <emphasis
role="bold">bos getdate</emphasis> command as described in <link linkend="HDRWQ115">Displaying Binary Version
Dates</link>. If a binary has no <emphasis role="bold">.BAK</emphasis> version but does have a <emphasis
role="bold">.OLD</emphasis> version, rename the <emphasis role="bold">.OLD</emphasis> version to <emphasis
role="bold">.BAK</emphasis>.</para>
3350 <listitem id="LIWQ114">
3351 <para>Issue the <emphasis role="bold">bos uninstall</emphasis> command for a binary distribution
3352 machine. (If you have forgotten which machine is performing that role, see <link linkend="HDRWQ97">To locate the binary
3353 distribution machine for a system type</link>.) <programlisting>
% <emphasis role="bold">bos uninstall</emphasis> &lt;<replaceable>machine name</replaceable>&gt; &lt;<replaceable>files to uninstall</replaceable>&gt;+
3355 </programlisting></para>
3357 <para>where <variablelist>
3359 <term><emphasis role="bold">u</emphasis></term>
3362 <para>Is the shortest acceptable abbreviation of <emphasis role="bold">uninstall</emphasis>.</para>
3367 <term><emphasis role="bold">machine name</emphasis></term>
3370 <para>Names the binary distribution machine.</para>
3375 <term><emphasis role="bold">files to uninstall</emphasis></term>
3378 <para>Names each binary file in the <emphasis role="bold">/usr/afs/bin</emphasis> directory to replace with its
3379 <emphasis role="bold">.BAK</emphasis> version. The file name alone is sufficient, because the <emphasis
3380 role="bold">/usr/afs/bin</emphasis> directory is assumed.</para>
3383 </variablelist></para>
3387 <para>Repeat Step <link linkend="LIWQ114">3</link> for each binary distribution machine.</para>
<para>Wait until the <emphasis role="bold">upclientbin</emphasis> process on each server machine retrieves the reverted
binaries from the binary distribution machine. You can verify the timestamps on binary files by using the <emphasis role="bold">bos
3393 getdate</emphasis> command as described in <link linkend="HDRWQ115">Displaying Binary Version Dates</link>. When the
3394 binary files are available on each server machine, issue the <emphasis role="bold">bos restart</emphasis> command, for
3395 which complete instructions appear in <link linkend="HDRWQ170">Stopping and Immediately Restarting
3396 Processes</link>.</para>
3398 <para>If you are working on an AFS client machine, it is a wise precaution to have a copy of the <emphasis
3399 role="bold">bos</emphasis> command suite binaries on the local disk before restarting server processes. In the
3400 conventional configuration, the <emphasis role="bold">/usr/afsws/bin</emphasis> directory that houses the <emphasis
3401 role="bold">bos</emphasis> command binary on client machines is a symbolic link into AFS, which conserves local disk
3402 space. However, restarting certain processes (particularly the database server processes) can make the AFS filespace
3403 inaccessible, particularly if a problem arises during the restart. Having a local copy of the <emphasis
3404 role="bold">bos</emphasis> binary enables you to uninstall or reinstall process binaries or restart processes even in this
3405 case. Use the <emphasis role="bold">cp</emphasis> command to copy the <emphasis role="bold">bos</emphasis> command binary
3406 from the <emphasis role="bold">/usr/afsws/bin</emphasis> directory to a local directory such as <emphasis
3407 role="bold">/tmp</emphasis>.</para>
<programlisting>
% <emphasis role="bold">bos restart</emphasis> &lt;<replaceable>machine name</replaceable>&gt; &lt;<replaceable>instances</replaceable>&gt;+
</programlisting>
3416 <primary>server process binaries</primary>
3418 <secondary>displaying time stamp</secondary>
3422 <primary>command suite</primary>
3424 <secondary>binaries</secondary>
3426 <tertiary>displaying time stamp</tertiary>
3430 <primary>time stamp</primary>
3432 <secondary>on binary file, listing</secondary>
3436 <primary>date</primary>
3438 <secondary>on binary file, listing</secondary>
3442 <primary>compilation</primary>
3444 <secondary>date of, listing on binary file</secondary>
3448 <primary>displaying</primary>
3450 <secondary>time stamp on binary file</secondary>
3454 <sect2 id="HDRWQ115">
3455 <title>Displaying Binary Version Dates</title>
<para>You can check the compilation dates for all three versions of a binary file in the <emphasis
role="bold">/usr/afs/bin</emphasis> directory: the current, <emphasis role="bold">.BAK</emphasis>, and <emphasis
role="bold">.OLD</emphasis> versions. This is useful for verifying that new binaries have been copied to a file server machine
3460 from its binary distribution machine before restarting a server process to use the new binaries.</para>
3462 <para>To check dates on binaries in a directory other than <emphasis role="bold">/usr/afs/bin</emphasis>, add the <emphasis
3463 role="bold">-dir</emphasis> argument. See the <emphasis>OpenAFS Administration Reference</emphasis>.</para>
3466 <primary>bos commands</primary>
3468 <secondary>getdate</secondary>
3472 <primary>commands</primary>
3474 <secondary>bos getdate</secondary>
3478 <sect2 id="Header_134">
3479 <title>To display binary version dates</title>
3483 <para>Issue the <emphasis role="bold">bos getdate</emphasis> command. <programlisting>
% <emphasis role="bold">bos getdate</emphasis> &lt;<replaceable>machine name</replaceable>&gt; &lt;<replaceable>files to check</replaceable>&gt;+
3485 </programlisting></para>
3487 <para>where <variablelist>
3489 <term><emphasis role="bold">getd</emphasis></term>
3492 <para>Is the shortest acceptable abbreviation of <emphasis role="bold">getdate</emphasis>.</para>
3497 <term><emphasis role="bold">machine name</emphasis></term>
<para>Names the file server machine for which to display binary dates.</para>
3505 <term><emphasis role="bold">files to check</emphasis></term>
3508 <para>Names each binary file to display.</para>
3511 </variablelist></para>
3516 <primary>BAK version of binary file</primary>
3518 <secondary>removing obsolete</secondary>
3522 <primary>OLD version of binary file</primary>
3524 <secondary>removing obsolete</secondary>
3528 <primary>removing</primary>
3530 <secondary>obsolete .BAK and .OLD version of binaries</secondary>
3534 <primary>server process binaries</primary>
3536 <secondary>removing obsolete</secondary>
3540 <primary>command suite</primary>
3542 <secondary>binaries</secondary>
3544 <tertiary>removing obsolete</tertiary>
3548 <primary>removing</primary>
3550 <secondary>core files from /usr/afs/logs</secondary>
3554 <primary>core files</primary>
3556 <secondary>removing from /usr/afs/logs directory</secondary>
3560 <primary>usr/afs/bin directory</primary>
3562 <secondary>removing obsolete .BAK and .OLD files</secondary>
3566 <sect2 id="HDRWQ116">
3567 <title>Removing Obsolete Binary Files</title>
3569 <para>When processes with new binaries have been running without problems for a number of days, it is generally safe to remove
3570 the <emphasis role="bold">.BAK</emphasis> and <emphasis role="bold">.OLD</emphasis> versions from the <emphasis
role="bold">/usr/afs/bin</emphasis> directory, both to reduce clutter and to free space on the file server machine's local
disk.</para>
3574 <para>You can use the <emphasis role="bold">bos prune</emphasis> command's flags to remove the following types of files:
3577 <para>To remove files in the <emphasis role="bold">/usr/afs/bin</emphasis> directory with a <emphasis
3578 role="bold">.BAK</emphasis> extension, use the <emphasis role="bold">-bak</emphasis> flag.</para>
3582 <para>To remove files in the <emphasis role="bold">/usr/afs/bin</emphasis> directory with a <emphasis
3583 role="bold">.OLD</emphasis> extension, use the <emphasis role="bold">-old</emphasis> flag.</para>
3587 <para>To remove files in the <emphasis role="bold">/usr/afs/logs</emphasis> directory called <emphasis
3588 role="bold">core</emphasis>, with any extension, use the <emphasis role="bold">-core</emphasis> flag.</para>
3592 <para>To remove all three types of files, use the <emphasis role="bold">-all</emphasis> flag.</para>
3594 </itemizedlist></para>
3597 <primary>commands</primary>
3599 <secondary>bos prune</secondary>
3603 <primary>bos commands</primary>
3605 <secondary>prune</secondary>
3609 <sect2 id="Header_136">
3610 <title>To remove obsolete binaries</title>
3614 <para>Verify that you are listed in the <emphasis role="bold">/usr/afs/etc/UserList</emphasis> file. If necessary, issue
3615 the <emphasis role="bold">bos listusers</emphasis> command, which is fully described in <link linkend="HDRWQ593">To
3616 display the users in the UserList file</link>. <programlisting>
% <emphasis role="bold">bos listusers</emphasis> &lt;<replaceable>machine name</replaceable>&gt;
3618 </programlisting></para>
3622 <para>Issue the <emphasis role="bold">bos prune</emphasis> command with one or more of its flags. <programlisting>
% <emphasis role="bold">bos prune</emphasis> &lt;<replaceable>machine name</replaceable>&gt; [<emphasis role="bold">-bak</emphasis>] [<emphasis
3624 role="bold">-old</emphasis>] [<emphasis role="bold">-core</emphasis>] [<emphasis role="bold">-all</emphasis>]
3625 </programlisting></para>
3627 <para>where <variablelist>
3629 <term><emphasis role="bold">p</emphasis></term>
3632 <para>Is the shortest acceptable abbreviation of <emphasis role="bold">prune</emphasis>.</para>
3637 <term><emphasis role="bold">machine name</emphasis></term>
3640 <para>Names the file server machine on which to remove obsolete files.</para>
3645 <term><emphasis role="bold">-bak</emphasis></term>
3648 <para>Removes all the files with a <emphasis role="bold">.BAK</emphasis> extension from the <emphasis
3649 role="bold">/usr/afs/bin</emphasis> directory. Do not combine this flag with the <emphasis
3650 role="bold">-all</emphasis> flag.</para>
3655 <term><emphasis role="bold">-old</emphasis></term>
<para>Removes all the files with a <emphasis role="bold">.OLD</emphasis> extension from the <emphasis
3659 role="bold">/usr/afs/bin</emphasis> directory. Do not combine this flag with the <emphasis
3660 role="bold">-all</emphasis> flag.</para>
3665 <term><emphasis role="bold">-core</emphasis></term>
3668 <para>Removes all core files from the <emphasis role="bold">/usr/afs/logs</emphasis> directory. Do not combine
this flag with the <emphasis role="bold">-all</emphasis> flag.</para>
3674 <term><emphasis role="bold">-all</emphasis></term>
3677 <para>Combines the effect of the other three flags. Do not combine it with the other three flags.</para>
3680 </variablelist></para>
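<para>The rule that <emphasis role="bold">-all</emphasis> stands alone can be checked before a command is issued. The
following sketch validates a set of flags and prints the resulting <emphasis role="bold">bos prune</emphasis> command
without running it. The function name <emphasis role="bold">build_prune_cmd</emphasis> and the machine name are
hypothetical, and the check covers only the four flags described in this section.</para>

<programlisting><![CDATA[
```shell
#!/bin/sh
# Hypothetical helper: checks bos prune flags against the rule above and
# prints the command instead of running it.
# build_prune_cmd MACHINE FLAG...
build_prune_cmd() {
    machine=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "build_prune_cmd: specify at least one flag" >&2
        return 1
    fi
    seen_all=0
    for f in "$@"; do
        case $f in
            -all) seen_all=1 ;;
            -bak|-old|-core) ;;
            *) echo "build_prune_cmd: unknown flag $f" >&2; return 1 ;;
        esac
    done
    # -all already implies the other three flags, so it must stand alone.
    if [ "$seen_all" -eq 1 ] && [ "$#" -gt 1 ]; then
        echo "build_prune_cmd: do not combine -all with other flags" >&2
        return 1
    fi
    echo "bos prune $machine $*"
}

# Example with a placeholder machine name.
build_prune_cmd fs1.example.com -bak -old
```
]]></programlisting>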
3685 <sect2 id="HDRWQ117">
3686 <title>Displaying A Binary File's Build Level</title>
3688 <para>For the most consistent performance on a server machine, and cell-wide, it is best for all server processes to come from
3689 the same AFS distribution. Every AFS binary includes an ASCII string that specifies its version, or <emphasis>build
3690 level</emphasis>. To display it, use the <emphasis role="bold">strings</emphasis> and <emphasis role="bold">grep</emphasis>
3691 commands, which are included in most UNIX distributions.</para>
3694 <primary>commands</primary>
3696 <secondary>which</secondary>
3700 <primary>commands</primary>
3702 <secondary>strings</secondary>
3706 <primary>strings command</primary>
3710 <primary>which command</primary>
3714 <sect2 id="Header_138">
3715 <title>To display an AFS binary's build level</title>
<para>Change to the directory that houses the binary file. If you are not sure where the binary resides, issue the
3720 <emphasis role="bold">which</emphasis> command. <programlisting>
3721 % <emphasis role="bold">which</emphasis> binary_file
3722 /bin_dir_path/binary_file
3723 % <emphasis role="bold">cd</emphasis> bin_dir_path
3724 </programlisting></para>
3728 <para>Issue the <emphasis role="bold">strings</emphasis> command to extract all ASCII strings from the binary file. Pipe
3729 the output to the <emphasis role="bold">grep</emphasis> command to locate the relevant line. <programlisting>
3730 % <emphasis role="bold">strings ./</emphasis>binary_file <emphasis role="bold">| grep Base</emphasis>
3731 </programlisting></para>
3733 <para>The output reports the AFS build level in a format like the following:</para>
<programlisting>
@(#)Base configuration afsversion build_level
</programlisting>
3739 <para>For example, the following string indicates the binary is from AFS M.m build 3.0:</para>
<programlisting>
@(#)Base configuration afsM.m 3.0
</programlisting>
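<para>When checking many binaries, it can help to reduce the matching line to its two fields. The following sketch assumes
the output format shown above; binaries from other releases may embed a differently formatted string, in which case the
filter prints nothing. The function name <emphasis role="bold">parse_build_level</emphasis> is hypothetical.</para>

<programlisting><![CDATA[
```shell
#!/bin/sh
# Hypothetical helper: reads lines on standard input and, for each line in the
# "@(#)Base configuration afsversion build_level" format shown above, prints
# the two fields. Lines in any other format are ignored.
parse_build_level() {
    sed -n 's/^@(#)Base configuration \([^ ]*\) \(.*\)$/version=\1 build=\2/p'
}

# Typical use (binary_file is a placeholder, as in the steps above):
#   strings ./binary_file | grep Base | parse_build_level
printf '%s\n' '@(#)Base configuration afsM.m 3.0' | parse_build_level
```
]]></programlisting>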
3748 <primary>CellServDB file (server)</primary>
3750 <secondary>maintaining</secondary>
3754 <primary>files</primary>
3756 <secondary>CellServDB (server)</secondary>
3760 <primary>database server process</primary>
3762 <secondary>use of CellServDB file</secondary>
3766 <primary>Ubik</primary>
3768 <secondary>use of CellServDB file</secondary>
3772 <primary>server process</primary>
3774 <secondary>use of CellServDB file</secondary>
3779 <sect1 id="HDRWQ118">
3780 <title>Maintaining the Server CellServDB File</title>
<para>Every file server machine maintains a list of its home cell's database server machines in the <emphasis
role="bold">/usr/afs/etc/CellServDB</emphasis> file on its local disk. Both database server processes and non-database server
3784 processes consult the file: <itemizedlist>
3786 <para>The database server processes (the Authentication, Backup, Protection, and Volume Location Servers) maintain
3787 constant contact with their peers in order to keep their copies of the replicated administrative databases
3788 synchronized.</para>
3790 <para>As detailed in <link linkend="HDRWQ102">Replicating the OpenAFS Administrative Databases</link>, the database server
3791 processes use the Ubik utility to synchronize the information in the databases they maintain. The Ubik coordinator at the
3792 synchronization site for each database maintains the single read/write copy of the database and distributes changes to the
3793 secondary sites as necessary. It must maintain contact with a majority of the secondary sites to remain the coordinator,
3794 and consults the <emphasis role="bold">CellServDB</emphasis> file to learn how many peers it has and on which machines
3795 they are running.</para>
3797 <para>If the coordinator loses contact with the majority of its peers, they all cooperate to elect a new coordinator by
3798 majority vote. During the election, all of the Ubik processes consult the <emphasis role="bold">CellServDB</emphasis> file
3799 to learn where to send their votes, and what number constitutes a majority.</para>
3803 <para>The non-database server processes must know which machines are running the database server processes in order to
3804 retrieve information from the databases. For example, the first time that a user accesses an AFS file, the File Server
3805 that houses it contacts the Protection Server to obtain a list of the user's group memberships (the list is called a
3806 current protection subgroup, or CPS). The File Server uses the CPS as it determines if the access control list (ACL)
3807 protecting the file grants the required permissions to the user (for more details, see <link linkend="HDRWQ534">About the
3808 Protection Database</link>).</para>
3810 </itemizedlist></para>
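<para>To make the majority arithmetic concrete, the following sketch counts the server entries in a <emphasis
role="bold">CellServDB</emphasis>-style file and computes the smallest number of sites that is more than half of them. It
assumes the usual file layout, in which a header line beginning with <emphasis role="bold">&gt;</emphasis> names the cell
and each following line lists one database server machine. It is a simplification that ignores any tie-breaking rules Ubik
applies, and the function names are hypothetical.</para>

<programlisting><![CDATA[
```shell
#!/bin/sh
# count_db_servers FILE: count server entries in a CellServDB-style file,
# skipping the ">cellname" header line and blank lines.
count_db_servers() {
    awk '!/^>/ && NF > 0 { n++ } END { print n + 0 }' "$1"
}

# majority_needed N: smallest number of sites that is more than half of N.
# Simplified: ignores any tie-breaking rules that Ubik applies.
majority_needed() {
    echo $(( $1 / 2 + 1 ))
}
```
]]></programlisting>

<para>For example, <emphasis role="bold">majority_needed "$(count_db_servers /usr/afs/etc/CellServDB)"</emphasis> reports
how many sites must agree under this simplified rule, which shows why a listed machine that is not actually running the
database server processes makes a majority harder to reach.</para>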
3813 <primary>CellServDB file (server)</primary>
3815 <secondary>effect of wrong information in</secondary>
3818 <para>The consequences of missing or incorrect information in the <emphasis role="bold">CellServDB</emphasis> file are as
3819 follows: <itemizedlist>
3821 <para>If the file does not list a machine, then it is effectively not a database server machine even if the database
3822 server processes are running. The Ubik coordinator does not send it database updates or include it in the count that
3823 establishes a majority. It does not participate in Ubik elections, and so refuses to distribute database information to
3824 any client machines that happen to contact it (which they can do if their <emphasis
3825 role="bold">/usr/vice/etc/CellServDB</emphasis> file lists it). Users of the client machine must wait for a timeout before
3826 they can contact a correctly functioning database server machine.</para>
3830 <para>If the file lists a machine that is not running the database server processes, the consequences can be serious. The
3831 Ubik coordinator cannot send it database updates, but includes it in the count that establishes a majority. If valid
3832 secondary sites go down and stop sending their votes to the coordinator, it can wrongly appear that the coordinator no
3833 longer has the majority it needs. The resulting election of a new coordinator causes a service outage during which
3834 information from the database becomes unavailable. Furthermore, the lack of a vote from the incorrectly listed site can
3835 disturb the election, if it makes the other sites believe that a majority of sites are not voting for the new coordinator.</para>
3838 <para>A more minor consequence is that non-database server processes attempt to contact the database server processes on
3839 the machine. They experience a timeout delay because the processes are not running.</para>
3841 </itemizedlist></para>
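<para>The majority arithmetic behind the second consequence can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it counts heads and ignores any tie-breaking refinements in the real Ubik algorithm, the function name <emphasis role="bold">has_majority</emphasis> is hypothetical, and <emphasis role="bold">fs9.example.com</emphasis> is an invented machine that is listed but runs no database server processes.</para>

```python
def has_majority(listed_sites, voting_sites):
    """Return True if the voting sites are a strict majority of the listed sites.

    Simplification: a plain head count, without Ubik's tie-breaking rules.
    """
    votes = len(set(voting_sites) & set(listed_sites))
    return votes > len(set(listed_sites)) / 2


# Host names reuse the example.com machines from this chapter.
running = ["fs1.example.com", "fs7.example.com", "fs4.example.com"]
phantom = "fs9.example.com"  # hypothetical: listed, but runs no database server processes

# fs4 goes down, so only fs1 and fs7 still vote.
voters = ["fs1.example.com", "fs7.example.com"]

correct_file = has_majority(running, voters)                # 2 of 3 listed sites
incorrect_file = has_majority(running + [phantom], voters)  # 2 of 4 listed sites

print("correct CellServDB keeps a majority:", correct_file)
print("CellServDB with phantom entry keeps a majority:", incorrect_file)
```

<para>With the correct file, two of three listed sites still form a majority after one site fails. With the phantom entry, the same two votes are only half of the four listed sites, so the coordinator appears to have lost its majority.</para>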
3843 <para>Note that the <emphasis role="bold">/usr/afs/etc/CellServDB</emphasis> file on a server machine is not the same as the
3844 <emphasis role="bold">/usr/vice/etc/CellServDB</emphasis> file on a client machine. The client version includes entries for
3845 foreign cells as well as the local cell. However, it is important to update both versions of the file whenever you change your
3846 cell's database server machines. A server machine that is also a client needs to have both files, and you need to update them
3847 both. For more information on maintaining the client version of the <emphasis role="bold">CellServDB</emphasis> file, see <link
3848 linkend="HDRWQ406">Maintaining Knowledge of Database Server Machines</link>.</para>
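<para>One way to confirm that the two versions agree for the local cell is to compare their entries mechanically. The following Python sketch assumes the usual layout of both files (a cell line that begins with <emphasis role="bold">&gt;</emphasis>, followed by one address line per database server machine, with the host name after a <emphasis role="bold">#</emphasis> character). The function names and the sample file contents are hypothetical, and the parser is deliberately simplified.</para>

```python
def parse_cellservdb(text):
    """Map each cell name to the set of addresses listed under it."""
    cells = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Cell line: ">cellname   #optional comment"
            fields = line[1:].split("#", 1)[0].split()
            current = fields[0] if fields else None
            if current is not None:
                cells.setdefault(current, set())
        elif current is not None:
            # Host line: "address   #hostname"
            address = line.split("#", 1)[0].strip()
            if address:
                cells[current].add(address)
    return cells


def compare_local_cell(server_text, client_text, cell):
    """Return (addresses only in the server file, addresses only in the client file)."""
    server = parse_cellservdb(server_text).get(cell, set())
    client = parse_cellservdb(client_text).get(cell, set())
    return sorted(server - client), sorted(client - server)


# Hypothetical file contents; 192.0.2.x are documentation-only addresses.
SERVER_FILE = """\
>example.com    #Example cell
192.0.2.1    #fs1.example.com
192.0.2.7    #fs7.example.com
192.0.2.4    #fs4.example.com
"""

CLIENT_FILE = """\
>example.com    #Example cell
192.0.2.1    #fs1.example.com
192.0.2.7    #fs7.example.com
>example.org    #A foreign cell
192.0.2.50   #db1.example.org
"""

missing_on_client, extra_on_client = compare_local_cell(SERVER_FILE, CLIENT_FILE, "example.com")
print("missing from client file:", missing_on_client)
print("only in client file:", extra_on_client)
```

<para>Entries for foreign cells in the client file are ignored, because only the local cell's entries are expected to match.</para>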
3851 <primary>system control machine</primary>
3853 <secondary>CellServDB file, distributing to server machines</secondary>
3857 <primary>distribution</primary>
3859 <secondary>of CellServDB file (server)</secondary>
3863 <primary>Update Server</primary>
3865 <secondary>CellServDB file (server), distributing</secondary>
3868 <sect2 id="HDRWQ119">
3869 <title>Distributing the Server CellServDB File</title>
3871 <para>To avoid the negative consequences of incorrect information in the <emphasis
3872 role="bold">/usr/afs/etc/CellServDB</emphasis> file, you must update it on all of your cell's server machines every time you
3873 add or remove a database server machine. The <emphasis>OpenAFS Quick Beginnings</emphasis> provides complete instructions for
3874 installing or removing a database server machine and for updating the <emphasis role="bold">CellServDB</emphasis> file in that
3875 context. This section explains how to distribute the file to your server machines and how to make other cells aware of the
3876 changes if you participate in the AFS global name space.</para>
3878 <para>You can use the Update Server to distribute the central copy of the server
3879 <emphasis role="bold">CellServDB</emphasis> file, which is stored on the cell's system control machine, to your other server machines.
3880 For instructions on configuring the Update Server, see the <emphasis>OpenAFS Quick Beginnings</emphasis>.</para>
3882 <para>To avoid introducing formatting errors into the file, always use the <emphasis role="bold">bos addhost</emphasis> and
3883 <emphasis role="bold">bos removehost</emphasis> commands, rather than editing the file directly. You must also restart the
3884 database server processes running on the machine, to initiate a coordinator election among the new set of database server
3885 machines. This step is included in the instructions that appear in <link linkend="HDRWQ121">To add a database server machine
3886 to the CellServDB file</link> and <link linkend="HDRWQ122">To remove a database server machine from the CellServDB
3887 file</link>. For instructions on displaying the contents of the file, see <link linkend="HDRWQ120">To display a cell's
3888 database server machines</link>.</para>
3890 <para>If you make your cell accessible to foreign users as part of the AFS global name space, you also need to inform other
3891 cells when you change your cell's database server machines. The AFS Support group maintains a <emphasis
3892 role="bold">CellServDB</emphasis> file that lists all cells that participate in the AFS global name space, and can change your
3893 cell's entry at your request. For further details, see <link linkend="HDRWQ38">Making Your Cell Visible to
3894 Others</link>.</para>
3896 <para>Another way to advertise your cell's database server machines is to maintain a copy of the file at the conventional
3897 location in your AFS filespace, <emphasis role="bold">/afs/</emphasis><emphasis>cellname</emphasis><emphasis
3898 role="bold">/service/etc/CellServDB.local</emphasis>. For further discussion, see <link linkend="HDRWQ43">The Third
3899 Level</link>.</para>
3902 <primary>bos commands</primary>
3904 <secondary>listhosts</secondary>
3908 <primary>commands</primary>
3910 <secondary>bos listhosts</secondary>
3914 <primary>CellServDB file (server)</primary>
3916 <secondary>displaying</secondary>
3920 <primary>displaying</primary>
3922 <secondary>CellServDB file (server)</secondary>
3926 <primary>database server machine</primary>
3928 <secondary>displaying list in server CellServDB file</secondary>
3932 <primary>displaying</primary>
3934 <secondary>database server machines in server CellServDB file</secondary>
3938 <sect2 id="HDRWQ120">
3939 <title>To display a cell's database server machines</title>
3943 <para>Issue the <emphasis role="bold">bos listhosts</emphasis> command. If you have maintained the file properly, the
3944 output is the same on every server machine, but the <emphasis>machine name</emphasis> argument enables you to check
3945 various machines if you wish. <programlisting>
3946 % <emphasis role="bold">bos listhosts</emphasis> &lt;<replaceable>machine name</replaceable>&gt; [&lt;<replaceable>cell name</replaceable>&gt;]
3947 </programlisting></para>
3949 <para>where <variablelist>
3951 <term><emphasis role="bold">listh</emphasis></term>
3954 <para>Is the shortest acceptable abbreviation of <emphasis role="bold">listhosts</emphasis>.</para>
3959 <term><emphasis role="bold">machine name</emphasis></term>
3962 <para>Specifies the server machine from which to display the <emphasis
3963 role="bold">/usr/afs/etc/CellServDB</emphasis> file.</para>
3968 <term><emphasis role="bold">cell name</emphasis></term>
3971 <para>Specifies the complete Internet domain name of a foreign cell. You must already know the name of at least
3972 one server machine in the cell, to provide as the <emphasis role="bold">machine name</emphasis> argument.</para>
3975 </variablelist></para>
3979 <para>The output lists the machines in the order they appear in the <emphasis role="bold">CellServDB</emphasis> file on the
3980 specified server machine. It assigns each one a <computeroutput>Host</computeroutput> index number, as in the following
3981 example. There is no implied relationship between the index and a machine's IP address, name, or role as Ubik coordinator or
3982 secondary site.</para>
3985 % <emphasis role="bold">bos listhosts fs1.example.com</emphasis>
3986 Cell name is example.com
3987 Host 1 is fs1.example.com
3988 Host 2 is fs7.example.com
3989 Host 3 is fs4.example.com
3992 <para>The output lists machines by name rather than IP address as long as the naming service (such as the Domain Name Service
3993 or local host table) is functioning properly. To display IP addresses, log in to a server machine as the local superuser
3994 <emphasis role="bold">root</emphasis> and use a text editor or display command, such as the <emphasis
3995 role="bold">cat</emphasis> command, to view the <emphasis role="bold">/usr/afs/etc/CellServDB</emphasis> file.</para>
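<para>If you want to process the list in a script, the output is simple to parse. The following Python sketch assumes that the output has exactly the form shown in the example above; the function name <emphasis role="bold">parse_listhosts</emphasis> is hypothetical, and lines in any other form are silently ignored.</para>

```python
import re


def parse_listhosts(output):
    """Return (cell name, [(index, host), ...]) from 'bos listhosts' style output."""
    cell = None
    hosts = []
    for raw in output.splitlines():
        line = raw.strip()
        cell_match = re.fullmatch(r"Cell name is (\S+)", line)
        if cell_match:
            cell = cell_match.group(1)
            continue
        host_match = re.fullmatch(r"Host (\d+) is (\S+)", line)
        if host_match:
            hosts.append((int(host_match.group(1)), host_match.group(2)))
    return cell, hosts


# Sample text copied from the example output in this section.
SAMPLE = """\
Cell name is example.com
    Host 1 is fs1.example.com
    Host 2 is fs7.example.com
    Host 3 is fs4.example.com
"""

cell, hosts = parse_listhosts(SAMPLE)
print(cell)
for index, host in hosts:
    print(index, host)
```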
3998 <primary>adding</primary>
4000 <secondary>database server machine</secondary>
4002 <tertiary>to server CellServDB file</tertiary>
4006 <primary>database server machine</primary>
4008 <secondary>adding</secondary>
4010 <tertiary>to server CellServDB file</tertiary>
4014 <primary>CellServDB file (server)</primary>
4016 <secondary>adding database server machine</secondary>
4020 <primary>adding</primary>
4022 <secondary>CellServDB file (server) entry for database server machine</secondary>
4026 <primary>database server machine</primary>
4028 <secondary>CellServDB file (server) entry</secondary>
4030 <tertiary>adding</tertiary>
4034 <primary>database server process</primary>
4036 <secondary>restarting after adding entry to server CellServDB file</secondary>
4040 <primary>Protection Server</primary>
4042 <secondary>restarting after adding entry to server CellServDB file</secondary>
4046 <primary>Authentication Server</primary>
4048 <secondary>restarting after adding entry to server CellServDB file</secondary>
4052 <primary>VL Server</primary>
4054 <secondary>restarting after adding entry to server CellServDB file</secondary>
4058 <primary>Backup Server</primary>
4060 <secondary>restarting after adding entry to server CellServDB file</secondary>
4064 <primary>bos commands</primary>
4066 <secondary>addhost</secondary>
4070 <primary>commands</primary>
4072 <secondary>bos addhost</secondary>
4076 <sect2 id="HDRWQ121">
4077 <title>To add a database server machine to the CellServDB file</title>
4081 <para>Verify that you are listed in the <emphasis role="bold">/usr/afs/etc/UserList</emphasis> file. If necessary, issue
4082 the <emphasis role="bold">bos listusers</emphasis> command, which is fully described in <link linkend="HDRWQ593">To
4083 display the users in the UserList file</link>. <programlisting>
4084 % <emphasis role="bold">bos listusers</emphasis> &lt;<replaceable>machine name</replaceable>&gt;
4085 </programlisting></para>
4089 <para>Issue the <emphasis role="bold">bos addhost</emphasis> command to add each new database server machine to the
4090 <emphasis role="bold">CellServDB</emphasis> file. Specify the system control
4091 machine as <emphasis>machine name</emphasis>. (If you have forgotten which machine is the system control machine, see
4092 <link linkend="HDRWQ99">The Output on the System Control Machine</link>.)
4094 % <emphasis role="bold">bos addhost</emphasis> &lt;<replaceable>machine name</replaceable>&gt; &lt;<replaceable>host name</replaceable>&gt;+
4095 </programlisting></para>
4097 <para>where <variablelist>
4099 <term><emphasis role="bold">addh</emphasis></term>
4102 <para>Is the shortest acceptable abbreviation of <emphasis role="bold">addhost</emphasis>.</para>
4107 <term><emphasis role="bold">machine name</emphasis></term>
4110 <para>Names the system control machine.</para>
4115 <term><emphasis role="bold">host name</emphasis></term>
4118 <para>Specifies the fully qualified hostname of each database server machine to add to the <emphasis
4119 role="bold">CellServDB</emphasis> file (for example: <emphasis role="bold">fs4.example.com</emphasis>). The BOS Server
4120 uses the <emphasis role="bold">gethostbyname()</emphasis> routine to obtain each machine's IP address and records
4121 both the name and address automatically.</para>
4124 </variablelist></para>
4128 <para>Restart the Authentication Server, Backup Server, Protection Server, and VL Server on every database server machine,
4129 so that the new set of machines participate in the election of a new Ubik coordinator. The instruction uses the
4130 conventional names for the processes; make the appropriate substitution if you use different process names. For complete
4131 syntax, see <link linkend="HDRWQ170">Stopping and Immediately Restarting Processes</link>.</para>
4133 <para><emphasis role="bold">Important:</emphasis> Repeat the following command in quick succession on all of the database
4134 server machines.</para>
4137 % <emphasis role="bold">bos restart</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">buserver kaserver ptserver vlserver</emphasis>
4142 <para>Edit the <emphasis role="bold">/usr/vice/etc/CellServDB</emphasis> file on each of your cell's client machines. For
4143 instructions, see <link linkend="HDRWQ406">Maintaining Knowledge of Database Server Machines</link>.</para>
4147 <para>If you participate in the AFS global name space, please have one of your cell's designated site contacts register
4148 the changes you have made with the AFS Product Support group.</para>
4150 <para>If you maintain a central copy of your cell's server <emphasis role="bold">CellServDB</emphasis> file in the
4151 conventional location (<emphasis role="bold">/afs/</emphasis><emphasis>cellname</emphasis><emphasis
4152 role="bold">/service/etc/CellServDB.local</emphasis>), edit the file to reflect the change.</para>
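<para>The <emphasis role="bold">bos</emphasis> commands in this procedure can be assembled ahead of time so that the restarts follow one another quickly. The Python sketch below only builds and prints the command lines in the positional form shown above; it does not run them. The helper name <emphasis role="bold">plan_addhost</emphasis> is hypothetical, as are the choice of <emphasis role="bold">fs1.example.com</emphasis> as the system control machine and <emphasis role="bold">fs5.example.com</emphasis> as the new machine. It uses the conventional process names.</para>

```python
import shlex

# Conventional names of the database server processes, as used in this section.
DB_PROCESSES = ["buserver", "kaserver", "ptserver", "vlserver"]


def plan_addhost(control_machine, new_hosts, db_servers):
    """Build the argument lists for one 'bos addhost' and the follow-up restarts."""
    if not new_hosts:
        raise ValueError("name at least one host to add")
    commands = [["bos", "addhost", control_machine, *new_hosts]]
    # Restart the database server processes on every database server machine,
    # including the newly added ones, so that all of them join the election.
    for machine in db_servers:
        commands.append(["bos", "restart", machine, *DB_PROCESSES])
    return commands


plan = plan_addhost(
    "fs1.example.com",                      # hypothetical system control machine
    ["fs5.example.com"],                    # hypothetical new database server machine
    ["fs1.example.com", "fs7.example.com", "fs4.example.com", "fs5.example.com"],
)
for argv in plan:
    print(shlex.join(argv))
```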
4157 <primary>removing</primary>
4159 <secondary>database server machine</secondary>
4161 <tertiary>from server CellServDB file</tertiary>
4165 <primary>database server machine</primary>
4167 <secondary>removing</secondary>
4169 <tertiary>from server CellServDB file</tertiary>
4173 <primary>CellServDB file (server)</primary>
4175 <secondary>removing database server machine</secondary>
4179 <primary>database server machine</primary>
4181 <secondary>CellServDB file (server) entry</secondary>
4183 <tertiary>removing</tertiary>
4187 <primary>database server process</primary>
4189 <secondary>restarting after removing entry from server CellServDB file</secondary>
4193 <primary>Protection Server</primary>
4195 <secondary>restarting after removing entry from server CellServDB file</secondary>
4199 <primary>Authentication Server</primary>
4201 <secondary>restarting after removing entry from server CellServDB file</secondary>
4205 <primary>VL Server</primary>
4207 <secondary>restarting after removing entry from server CellServDB file</secondary>
4211 <primary>Backup Server</primary>
4213 <secondary>restarting after removing entry from server CellServDB file</secondary>
4217 <primary>bos commands</primary>
4219 <secondary>removehost</secondary>
4223 <primary>commands</primary>
4225 <secondary>bos removehost</secondary>
4229 <sect2 id="HDRWQ122">
4230 <title>To remove a database server machine from the CellServDB file</title>
4234 <para>Verify that you are listed in the <emphasis role="bold">/usr/afs/etc/UserList</emphasis> file. If necessary, issue
4235 the <emphasis role="bold">bos listusers</emphasis> command, which is fully described in <link linkend="HDRWQ593">To
4236 display the users in the UserList file</link>. <programlisting>
4237 % <emphasis role="bold">bos listusers</emphasis> &lt;<replaceable>machine name</replaceable>&gt;
4238 </programlisting></para>
4242 <para>Issue the <emphasis role="bold">bos removehost</emphasis> command to remove each database server machine from the
4243 <emphasis role="bold">CellServDB</emphasis> file. Specify the system control
4244 machine as <emphasis>machine name</emphasis>. (If you have forgotten which machine is the system control machine, see
4245 <link linkend="HDRWQ99">The Output on the System Control Machine</link>.)
4247 % <emphasis role="bold">bos removehost</emphasis> &lt;<replaceable>machine name</replaceable>&gt; &lt;<replaceable>host name</replaceable>&gt;+
4248 </programlisting></para>
4250 <para>where <variablelist>
4252 <term><emphasis role="bold">removeh</emphasis></term>
4255 <para>Is the shortest acceptable abbreviation of <emphasis role="bold">removehost</emphasis>.</para>
4260 <term><emphasis role="bold">machine name</emphasis></term>
4263 <para>Names the system control machine.</para>
4268 <term><emphasis role="bold">host name</emphasis></term>
4271 <para>Specifies the fully qualified hostname of each database server machine to remove from the <emphasis
4272 role="bold">CellServDB</emphasis> file (for example: <emphasis role="bold">fs4.example.com</emphasis>).</para>
4275 </variablelist></para>
4279 <para>Restart the Authentication Server, Backup Server, Protection Server, and VL Server on every database server machine,
4280 so that the new set of machines participate in the election of a new Ubik coordinator. The instruction uses the
4281 conventional names for the processes; make the appropriate substitution if you use different process names. For complete
4282 syntax, see <link linkend="HDRWQ170">Stopping and Immediately Restarting Processes</link>.</para>
4284 <para><emphasis role="bold">Important:</emphasis> Repeat the following command in quick succession on all of the database
4285 server machines.</para>
4288 % <emphasis role="bold">bos restart</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">buserver kaserver ptserver vlserver</emphasis>
4293 <para>Edit the <emphasis role="bold">/usr/vice/etc/CellServDB</emphasis> file on each of your cell's client machines. For
4294 instructions, see <link linkend="HDRWQ406">Maintaining Knowledge of Database Server Machines</link>.</para>
4298 <para>If you participate in the AFS global name space, please have one of your cell's designated site contacts register
4299 the changes you have made with the AFS Product Support group.</para>
4301 <para>If you maintain a central copy of your cell's server <emphasis role="bold">CellServDB</emphasis> file in the
4302 conventional location (<emphasis role="bold">/afs/</emphasis><emphasis>cellname</emphasis><emphasis
4303 role="bold">/service/etc/CellServDB.local</emphasis>), edit the file to reflect the change.</para>
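<para>After a removal, it can be worth confirming that no server machine still lists the removed machine. The Python sketch below assumes you have already collected the host list reported by <emphasis role="bold">bos listhosts</emphasis> for each server machine; the function names and the sample data are hypothetical.</para>

```python
def machines_still_listing(host_lists, removed_host):
    """Return the server machines whose CellServDB still names removed_host."""
    return sorted(m for m, hosts in host_lists.items() if removed_host in hosts)


def lists_agree(host_lists):
    """Return True if every server machine reports the same set of hosts."""
    distinct = {frozenset(hosts) for hosts in host_lists.values()}
    return len(distinct) <= 1


# Hypothetical results gathered after removing fs4.example.com:
# the copy on fs4 itself has not been updated yet.
observed = {
    "fs1.example.com": ["fs1.example.com", "fs7.example.com"],
    "fs7.example.com": ["fs1.example.com", "fs7.example.com"],
    "fs4.example.com": ["fs1.example.com", "fs7.example.com", "fs4.example.com"],
}

stale = machines_still_listing(observed, "fs4.example.com")
consistent = lists_agree(observed)
print("machines with a stale file:", stale)
print("all files agree:", consistent)
```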
4309 <sect1 id="HDRWQ123">
4310 <title>Managing Authentication and Authorization Requirements</title>
4312 <para>This section describes how the AFS server processes guarantee that only properly authorized users perform privileged
4313 commands, by performing authorization checking and mutually authenticating with their clients. It explains how you can control
4314 authorization checking requirements on a per-machine or per-cell basis, and how to bypass mutual authentication when issuing commands.</para>
4318 <primary>authorization checking</primary>
4320 <secondary>compared to authentication</secondary>
4324 <primary>authentication</primary>
4326 <secondary>compared to authorization checking</secondary>
4330 <primary>privileged commands</primary>
4334 <primary>commands</primary>
4336 <secondary>privileged, defined</secondary>
4340 <primary>anonymous user</primary>
4342 <secondary>identity assigned to unauthenticated user</secondary>
4346 <primary>authorization checking</primary>
4348 <secondary>defined</secondary>
4351 <sect2 id="HDRWQ124">
4352 <title>Authentication versus Authorization</title>
4354 <para>Many AFS commands are <emphasis>privileged</emphasis> in that the AFS server process invoked by the command performs it
4355 only for a properly authorized user. The server process performs the following two tests to determine if someone is properly
4356 authorized: <itemizedlist>
4358 <para>In the <emphasis>authentication</emphasis> test, the server process mutually authenticates with the command
4359 interpreter, Cache Manager, or other client process that is acting on behalf of a user or application. The goal of this
4360 test is to determine who is issuing the command. The server process verifies that the issuer really is who he or she
4361 claims to be, by examining the server ticket and other components of the issuer's token. (Secondarily, it allows the
4362 client process to verify that the server process is genuine.) If the issuer has no token or otherwise fails the test,
4363 the server process assigns him or her the identity <emphasis role="bold">anonymous</emphasis>, a completely unprivileged
4364 user. For a more complete description of mutual authentication, see <link linkend="HDRWQ75">A More Detailed Look at
4365 Mutual Authentication</link>.</para>
4367 <para>Many individual commands enable you to bypass the authentication test by assuming the <emphasis
4368 role="bold">anonymous</emphasis> identity without even attempting to mutually authenticate. Note, however, that this is
4369 futile if the command is privileged and the server process is still performing the <emphasis>authorization</emphasis>
4370 test, because in that case the process refuses to execute privileged commands for the <emphasis
4371 role="bold">anonymous</emphasis> user.</para>
4375 <para>In the authorization test, the server process determines if the issuer is authorized to use the command by
4376 consulting a list of privileged users. The goal of this test is to determine what the issuer is allowed to do. Different
4377 server processes consult different lists of users, as described in <link linkend="HDRWQ581">Managing Administrative
4378 Privilege</link>. The server process refuses to execute any privileged command for an unauthorized issuer. If a command
4379 has no privilege requirements, the server process skips this step and executes the command immediately.</para>
4382 <para>Never place the <emphasis role="bold">anonymous</emphasis> user or the <emphasis
4383 role="bold">system:anyuser</emphasis> group on a privilege list; it makes authorization checking meaningless.</para>
4385 <para>You can use the <emphasis role="bold">bos setauth</emphasis> command to control whether the server processes on
4386 a server machine check for authorization. Other server machines are not affected. Keep in mind that turning off
4387 authorization checking is a grave security risk, because the server processes on that machine perform any action for any user.</para>
4391 </itemizedlist></para>
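<para>The way the two tests combine can be modeled in a few lines of Python. This is a conceptual sketch only, not the server processes' actual code: the function names are hypothetical, a boolean stands in for the real token verification, and <emphasis role="bold">USERLIST</emphasis> is an invented privilege list.</para>

```python
ANONYMOUS = "anonymous"


def authenticate(claimed_user, token_is_valid):
    """Authentication test: establish who is issuing the command."""
    return claimed_user if token_is_valid else ANONYMOUS


def is_permitted(identity, command_is_privileged, privilege_list, checking_enabled=True):
    """Authorization test: establish what the issuer is allowed to do."""
    if not command_is_privileged:
        return True  # no privilege requirements, so the test is skipped
    if not checking_enabled:
        return True  # authorization checking is off: any user may act
    return identity in privilege_list


USERLIST = {"admin"}  # hypothetical privilege list

admin = authenticate("admin", token_is_valid=True)
no_token = authenticate("admin", token_is_valid=False)

admin_privileged = is_permitted(admin, True, USERLIST)
anonymous_privileged = is_permitted(no_token, True, USERLIST)
anonymous_unprivileged = is_permitted(no_token, False, USERLIST)
anonymous_unchecked = is_permitted(no_token, True, USERLIST, checking_enabled=False)

print(admin_privileged, anonymous_privileged, anonymous_unprivileged, anonymous_unchecked)
```

<para>The last case shows why turning off authorization checking is dangerous: even the <emphasis role="bold">anonymous</emphasis> identity passes.</para>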
4394 <primary>controlling</primary>
4396 <secondary>authorization checking for entire cell</secondary>
4400 <primary>authorization checking</primary>
4402 <secondary>controlling cell-wide</secondary>
4406 <primary>restarting</primary>
4408 <secondary>server process</secondary>
4410 <tertiary>when changing authorization checking</tertiary>
4414 <primary>authorization checking</primary>
4416 <secondary>and restarting processes</secondary>
4420 <sect2 id="HDRWQ125">
4421 <title>Controlling Authorization Checking on a Server Machine</title>
4423 <para>Disabling authorization checking is a serious breach of security because it means that the AFS server processes on a
4424 file server machine perform any action for any user, even the <emphasis role="bold">anonymous</emphasis> user.</para>
4426 <para>The only time it is common to disable authorization checking is when installing a new file server machine (see the
4427 <emphasis>OpenAFS Quick Beginnings</emphasis>). It is necessary then because it is not possible to configure all of the necessary security mechanisms
4428 before performing other actions that normally make use of them. For greatest security, work at the console of the machine you
4429 are installing and enable authorization checking as soon as possible.</para>
4431 <para>During normal operation, the only reason to disable authorization checking is if an error occurs with the server
4432 encryption keys, leaving the servers unable to authenticate users properly. For instructions on handling key-related
4433 emergencies, see <link linkend="HDRWQ370">Handling Server Encryption Key Emergencies</link>.</para>
4435 <para>You control authorization checking on each file server machine separately; turning it on or off on one machine does not
4436 affect the others. Because client machines generally choose a server process at random, it is hard to predict what
4437 authorization checking conditions prevail for a given command unless you make the requirement the same on all machines. To
4438 turn authorization checking on or off for the entire cell, you must repeat the appropriate command on every file server machine.</para>
4441 <para>The server processes constantly monitor the directory <emphasis role="bold">/usr/afs/local</emphasis> on their local
4442 disks to determine if they need to check for authorization. If the file called <emphasis role="bold">NoAuth</emphasis> appears
4443 in that directory, then the servers do not check for authorization. When it is not present (the usual case), they perform
4444 authorization checking.</para>
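<para>A monitoring script can apply the same rule to report a machine's current state. The Python sketch below assumes the default directory named above and only tests for the file's presence; the function name is hypothetical. The demonstration runs against a scratch directory, because on a real server machine the file is meant to be created and removed through the BOS Server rather than by hand.</para>

```python
import os
import tempfile


def authorization_checking_enabled(local_dir="/usr/afs/local"):
    """Return False if a NoAuth file is present in local_dir, True otherwise."""
    return not os.path.exists(os.path.join(local_dir, "NoAuth"))


# Demonstration in a scratch directory standing in for /usr/afs/local.
with tempfile.TemporaryDirectory() as scratch:
    before = authorization_checking_enabled(scratch)
    # Simulate the zero-length NoAuth file appearing.
    open(os.path.join(scratch, "NoAuth"), "w").close()
    after = authorization_checking_enabled(scratch)

print("checking enabled without NoAuth:", before)
print("checking enabled with NoAuth:", after)
```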
4446 <para>Control the presence of the <emphasis role="bold">NoAuth</emphasis> file through the BOS Server. When you disable
4447 authorization checking with the <emphasis role="bold">bos setauth</emphasis> command (or, during installation, by putting the
4448 <emphasis role="bold">-noauth</emphasis> flag on the command that starts up the BOS Server), the BOS Server creates the
4449 zero-length <emphasis role="bold">NoAuth</emphasis> file. When you reenable authorization checking, the BOS Server removes the file.</para>
4453 <primary>bos commands</primary>
4455 <secondary>setauth</secondary>
4459 <primary>commands</primary>
4461 <secondary>bos setauth</secondary>
4465 <primary>authorization checking</primary>
4467 <secondary>disabling</secondary>
4471 <primary>turning off authorization checking</primary>
4475 <primary>disabling</primary>
4477 <secondary>authorization checking</secondary>
4481 <sect2 id="HDRWQ126">
4482 <title>To disable authorization checking on a server machine</title>
4486 <para>Verify that you are listed in the <emphasis role="bold">/usr/afs/etc/UserList</emphasis> file. If necessary, issue
4487 the <emphasis role="bold">bos listusers</emphasis> command, which is fully described in <link linkend="HDRWQ593">To
4488 display the users in the UserList file</link>. <programlisting>
4489 % <emphasis role="bold">bos listusers</emphasis> &lt;<replaceable>machine name</replaceable>&gt;
4490 </programlisting></para>
4494 <para>Issue the <emphasis role="bold">bos setauth</emphasis> command to disable authorization checking. <programlisting>
4495 % <emphasis role="bold">bos setauth</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">off</emphasis>
4496 </programlisting></para>
4498 <para>where <variablelist>
4500 <term><emphasis role="bold">seta</emphasis></term>
4503 <para>Is the shortest acceptable abbreviation of <emphasis role="bold">setauth</emphasis>.</para>
4508 <term><emphasis role="bold">machine name</emphasis></term>
4511 <para>Specifies the file server machine on which server processes do not check for authorization.</para>
4514 </variablelist></para>
4519 <primary>authorization checking</primary>
4521 <secondary>enabling</secondary>
4525 <primary>enabling authorization checking</primary>
4529 <primary>turning on authorization checking</primary>
4533 <sect2 id="HDRWQ127">
4534 <title>To enable authorization checking on a server machine</title>
4538 <para>Reenable authorization checking. (No privilege is required because the machine is not currently checking for
4539 authorization.) For detailed syntax information, see the preceding section. <programlisting>
4540 % <emphasis role="bold">bos setauth</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">on</emphasis>
4541 </programlisting></para>
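<para>Because the setting applies to one machine at a time, making it uniform across a cell means repeating the command for every file server machine. The Python sketch below only builds and prints those command lines in the positional form shown above; the helper name <emphasis role="bold">plan_setauth</emphasis> and the machine list are hypothetical.</para>

```python
import shlex


def plan_setauth(machines, state):
    """Build one 'bos setauth' argument list per machine for the given state."""
    if state not in ("on", "off"):
        raise ValueError("state must be 'on' or 'off'")
    return [["bos", "setauth", machine, state] for machine in machines]


# Hypothetical list of the cell's file server machines.
FILE_SERVERS = ["fs1.example.com", "fs4.example.com", "fs7.example.com"]

enable_plan = plan_setauth(FILE_SERVERS, "on")
for argv in enable_plan:
    print(shlex.join(argv))
```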
4546 <primary>mutual authentication</primary>
4548 <secondary>preventing</secondary>
4552 <primary>preventing</primary>
4554 <secondary>mutual authentication</secondary>
4558 <sect2 id="HDRWQ128">
4559 <title>Bypassing Mutual Authentication for an Individual Command</title>
4561 <para>Several of the server processes allow any user (not just system administrators) to disable mutual authentication when
4562 issuing a command. The server process treats the issuer as the unauthenticated user <emphasis
4563 role="bold">anonymous</emphasis>.</para>
4565 <para>The facilities for preventing mutual authentication are provided for use in emergencies (such as the key emergency
4566 discussed in <link linkend="HDRWQ370">Handling Server Encryption Key Emergencies</link>). During normal circumstances,
4567 authorization checking is turned on, making it useless to prevent authentication: the server processes refuse to perform
4568 privileged commands for the user <emphasis role="bold">anonymous</emphasis>.</para>
4570 <para>It can be useful to prevent authentication when authorization checking is turned off. The very act of trying to
4571 authenticate can cause problems if the server cannot understand a particular encryption key, as is likely to happen in a key emergency.</para>
4575 <primary>bos commands</primary>
4577 <secondary>mutual authentication, bypassing</secondary>
4581 <primary>kas commands</primary>
4583 <secondary>mutual authentication, bypassing</secondary>
4587 <primary>pts commands</primary>
4589 <secondary>mutual authentication, bypassing</secondary>
4593 <primary>vos commands</primary>
4595 <secondary>mutual authentication, bypassing</secondary>
4599 <primary>kas commands</primary>
4601 <secondary>interactive</secondary>
4605 <primary>commands</primary>
4607 <secondary>kas interactive</secondary>
4611 <primary>entering</primary>
4613 <secondary>kas interactive mode</secondary>
4617 <primary>interactive mode (kas commands)</primary>
4621 <sect2 id="HDRWQ129">
4622 <title>To bypass mutual authentication for bos, kas, pts, and vos commands</title>
4624 <para>Provide the <emphasis role="bold">-noauth</emphasis> flag, which is available on many of the commands in the suites. To
4625 verify that a command accepts the flag, issue the <emphasis role="bold">help</emphasis> command in its suite, or consult the
4626 command's reference page in the <emphasis>OpenAFS Administration Reference</emphasis> (the reference page also specifies the
4627 shortest acceptable abbreviation for the flag on each command). The suites' <emphasis role="bold">apropos</emphasis> and
4628 <emphasis role="bold">help</emphasis> commands do not themselves accept the flag.</para>
4630 <para>You can bypass mutual authentication for all <emphasis role="bold">kas</emphasis> commands issued during an interactive
4631 session by including the <emphasis role="bold">-noauth</emphasis> flag on the <emphasis role="bold">kas interactive</emphasis>
4632 command. If you have already entered interactive mode with an authenticated identity, issue the <emphasis role="bold">(kas)
4633 noauthentication</emphasis> command to assume the <emphasis role="bold">anonymous</emphasis> identity.</para>
4636 <primary>fs commands</primary>
4638 <secondary>mutual authentication, bypassing</secondary>
4642 <sect2 id="Header_151">
4643 <title>To bypass mutual authentication for fs commands</title>
4645 <para>This is not possible, except by issuing the <emphasis role="bold">unlog</emphasis> command to discard your tokens before
4646 issuing the <emphasis role="bold">fs</emphasis> command.</para>
4650 <sect1 id="HDRWQ130">
4651 <title>Adding or Removing Disks and Partitions</title>
4653 <para>AFS makes it very easy to add storage space to your cell, just by adding disks to existing file server machines. This
4654 section explains how to install or remove a disk used to store AFS volumes. (Another way to add storage space is to install
4655 additional server machines, as instructed in the <emphasis>OpenAFS Quick Beginnings</emphasis>.)</para>
4657 <para>Both adding and removing a disk cause at least a brief file system outage, because you must restart the <emphasis
4658 role="bold">fs</emphasis> process to have it recognize the new set of server partitions. Some operating systems require that you
4659 shut the machine off before adding or removing a disk, in which case you must shut down all of the AFS server processes first.
4660 Otherwise, the AFS-related aspects of adding or removing a disk are not complicated, so the duration of the outage depends
4661 mostly on how long it takes to install or remove the disk itself.</para>
4663 <para>The following instructions for installing a new disk completely prepare it to house AFS volumes. You can then use the
4664 <emphasis role="bold">vos create</emphasis> command to create new volumes, or the <emphasis role="bold">vos move</emphasis>
4665 command to move existing ones from other partitions. For instructions, see <link linkend="HDRWQ185">Creating Read/write
4666 Volumes</link> and <link linkend="HDRWQ226">Moving Volumes</link>. The instructions for removing a disk are basically the
4667 reverse of the installation instructions, but include extra steps that protect against data loss.</para>
4669 <para>A server machine can house up to 256 AFS server partitions, each one mounted at a directory with a name of the form <emphasis
4670 role="bold">/vicep</emphasis><emphasis>index</emphasis>, where <emphasis>index</emphasis> is one or two lowercase letters. By
4671 convention, the first partition on a machine is mounted at <emphasis role="bold">/vicepa</emphasis>, the second at <emphasis
4672 role="bold">/vicepb</emphasis>, and so on to the twenty-sixth at <emphasis role="bold">/vicepz</emphasis>. Additional partitions
4673 are mounted at <emphasis role="bold">/vicepaa</emphasis> through <emphasis role="bold">/vicepaz</emphasis> and so on up to
4674 <emphasis role="bold">/vicepiv</emphasis>. Using the letters consecutively is not required, but is simpler.</para>
4676 <para>Mount each <emphasis role="bold">/vicep</emphasis> directory directly under the local file system's root directory (
4677 <emphasis role="bold">/</emphasis> ), not as a subdirectory of any other directory; for example, <emphasis
4678 role="bold">/usr/vicepa</emphasis> is not an acceptable location. You must also map the directory to the partition's device name
4679 in the file server machine's file systems registry file (<emphasis role="bold">/etc/fstab</emphasis> or equivalent).</para>
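<para>The naming convention and the root-directory rule can be checked mechanically. The following Python sketch is an
illustration only; the function names <computeroutput>partition_dir</computeroutput> and
<computeroutput>is_valid_mount_point</computeroutput> are hypothetical and are not part of AFS. It maps a zero-based partition index
to its conventional mount directory and accepts only names from <emphasis role="bold">/vicepa</emphasis> through <emphasis
role="bold">/vicepiv</emphasis>.</para>
<programlisting>
```python
import string

MAX_PARTITIONS = 256  # /vicepa through /vicepiv


def partition_dir(index):
    """Map a zero-based partition index (0..255) to its /vicep directory."""
    if index not in range(MAX_PARTITIONS):
        raise ValueError("index must be between 0 and 255")
    letters = string.ascii_lowercase
    if index in range(26):
        # Single-letter names: /vicepa .. /vicepz
        return "/vicep" + letters[index]
    # Two-letter names start at index 26: /vicepaa, /vicepab, ...
    first, second = divmod(index - 26, 26)
    return "/vicep" + letters[first] + letters[second]


# Every acceptable mount point, all directly under the root directory.
VALID_DIRS = {partition_dir(i) for i in range(MAX_PARTITIONS)}


def is_valid_mount_point(path):
    """True only for a conventional /vicep directory directly under /."""
    return path in VALID_DIRS


if __name__ == "__main__":
    print(partition_dir(0), partition_dir(26), partition_dir(255))
    print(is_valid_mount_point("/usr/vicepa"))
```
</programlisting>
<para>Index 255 yields <emphasis role="bold">/vicepiv</emphasis>, which matches the limit of 256 partitions: 26 single-letter names
plus 230 two-letter names.</para>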
4681 <para>These instructions assume that the machine's AFS initialization file includes the following command to restart the BOS
4682 Server after each reboot. The BOS Server starts the other AFS server processes listed in the local <emphasis
4683 role="bold">/usr/afs/local/BosConfig</emphasis> file. For information on the <emphasis role="bold">bosserver</emphasis>
4684 command's optional arguments, see its reference page in the <emphasis>OpenAFS Administration Reference</emphasis>.</para>
4687 /usr/afs/bin/bosserver
4691 <primary>adding</primary>
4693 <secondary>disk to file server machine</secondary>
4697 <primary>installing</primary>
4699 <secondary>disk on file server machine</secondary>
4703 <primary>disk</primary>
4705 <secondary>file server machine</secondary>
4707 <tertiary>adding/installing</tertiary>
4711 <primary>file server machine</primary>
4713 <secondary>disk</secondary>
4715 <tertiary>adding/installing</tertiary>
4719 <primary>mounting</primary>
4721 <secondary>disk on file server machine</secondary>
4725 <primary>commands</primary>
4727 <secondary>vos listpart</secondary>
4731 <primary>vos commands</primary>
4733 <secondary>listpart</secondary>
4736 <sect2 id="HDRWQ131">
4737 <title>To add and mount a new disk to house AFS volumes</title>
4741 <para>Become the local superuser <emphasis role="bold">root</emphasis> on the machine, if you are not already, by issuing
4742 the <emphasis role="bold">su</emphasis> command. <programlisting>
4743 % <emphasis role="bold">su root</emphasis>
4744 Password: <replaceable>root_password</replaceable>
4745 </programlisting></para>
4749 <para>Decide how many AFS partitions to divide the new disk into and the names of the directories at which to mount them
4750 (the introduction to this section describes the naming conventions). To display the names of the existing server
4751 partitions on the machine, issue the <emphasis role="bold">vos listpart</emphasis> command. Include the <emphasis
4752 role="bold">-localauth</emphasis> flag because you are logged in as the local superuser <emphasis
4753 role="bold">root</emphasis> but do not necessarily have administrative tokens. <programlisting>
4754 # <emphasis role="bold">vos listpart</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">-localauth</emphasis>
4755 </programlisting></para>
4757 <para>where <variablelist>
4759 <term><emphasis role="bold">listp</emphasis></term>
4762 <para>Is the shortest acceptable abbreviation of <emphasis role="bold">listpart</emphasis>.</para>
4767 <term><emphasis role="bold">machine name</emphasis></term>
4770 <para>Names the local file server machine.</para>
4775 <term><emphasis role="bold">-localauth</emphasis></term>
4778 <para>Constructs a server ticket using a key from the local <emphasis role="bold">/usr/afs/etc/KeyFile</emphasis>
4779 file. The <emphasis role="bold">vos</emphasis> command interpreter presents it to the Volume Server and Volume Location
4780 (VL) Server during mutual authentication.</para>
4783 </variablelist></para>
4787 <para>Create each directory at which to mount a partition. <programlisting>
4788 # <emphasis role="bold">mkdir /vicep</emphasis><replaceable>x</replaceable>[<replaceable>x</replaceable>]
4789 </programlisting></para>
4792 <primary>files</primary>
4794 <secondary>file systems registry (fstab)</secondary>
4798 <primary>file systems registry file</primary>
4800 <secondary>adding new disk to file server machine</secondary>
4804 <primary>etc/fstab file</primary>
4806 <secondary></secondary>
4808 <see>file systems registry file</see>
4812 <primary>fstab file</primary>
4814 <secondary></secondary>
4816 <see>file systems registry file</see>
4820 <listitem id="LIWQ132">
4821 <para>Using a text editor, create an entry in the machine's file systems registry file (<emphasis
4822 role="bold">/etc/fstab</emphasis> or equivalent) for each new disk partition, mapping its device name to the directory you
4823 created in the previous step. Refer to existing entries in the file to learn the proper format, which varies for different
4824 operating systems.</para>
4827 <listitem id="LIWQ133">
4828 <para>If the operating system requires that you shut off the machine to install a new disk, issue
4829 the <emphasis role="bold">bos shutdown</emphasis> command to shut down all AFS server processes other than the BOS Server
4830 (it terminates safely when you shut off the machine). Include the <emphasis role="bold">-localauth</emphasis> flag because
4831 you are logged in as the local superuser <emphasis role="bold">root</emphasis> but do not necessarily have administrative
4832 tokens. For a complete description of the command, see <link linkend="HDRWQ168">To stop processes temporarily</link>.
4834 # <emphasis role="bold">bos shutdown</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">-localauth</emphasis> [<emphasis
4835 role="bold">-wait</emphasis>]
4836 </programlisting></para>
4839 <listitem id="LIWQ134">
4840 <para>If necessary, shut off the machine. Install and format the new disk according to the
4841 instructions provided by the disk and operating system vendors. If necessary, edit the disk's partition table to reflect
4842 the changes you made to the file systems registry file in step <link linkend="LIWQ132">4</link>; consult the operating
4843 system documentation for instructions.</para>
4847 <para>If you shut the machine off in step <link linkend="LIWQ134">6</link>, turn it on. Otherwise, issue the
4848 <emphasis role="bold">bos restart</emphasis> command to restart the <emphasis role="bold">fs</emphasis> process, forcing
4849 it to recognize the new set of server partitions. Include the <emphasis role="bold">-localauth</emphasis> flag because you
4850 are logged in as the local superuser <emphasis role="bold">root</emphasis> but do not necessarily have administrative
4851 tokens. For complete instructions for the <emphasis role="bold">bos restart</emphasis> command, see <link
4852 linkend="HDRWQ170">Stopping and Immediately Restarting Processes</link>. <programlisting>
4853 # <emphasis role="bold">bos restart</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">fs -localauth</emphasis>
4854 </programlisting></para>
4858 <para>Issue the <emphasis role="bold">bos status</emphasis> command to verify that all server processes are running
4859 correctly. For more detailed instructions, see <link linkend="HDRWQ158">Displaying Process Status and Information from the
4860 BosConfig File</link>. <programlisting>
4861 # <emphasis role="bold">bos status</emphasis> &lt;<replaceable>machine name</replaceable>&gt;
4862 </programlisting></para>
4867 <primary>removing</primary>
4869 <secondary>disk from file server machine</secondary>
4873 <primary>disk</primary>
4875 <secondary>file server machine</secondary>
4877 <tertiary>removing</tertiary>
4881 <primary>file server machine</primary>
4883 <secondary>disk</secondary>
4885 <tertiary>removing</tertiary>
4889 <primary>unmounting</primary>
4891 <secondary>file server machine disk</secondary>
4895 <primary>vos commands</primary>
4897 <secondary>move</secondary>
4899 <tertiary>when removing file server machine disk</tertiary>
4903 <sect2 id="HDRWQ135">
4904 <title>To unmount and remove a disk housing AFS volumes</title>
4908 <para>Verify that you are listed in the <emphasis role="bold">/usr/afs/etc/UserList</emphasis> file. If necessary, issue
4909 the <emphasis role="bold">bos listusers</emphasis> command, which is fully described in <link linkend="HDRWQ593">To
4910 display the users in the UserList file</link>. <programlisting>
4911 % <emphasis role="bold">bos listusers</emphasis> &lt;<replaceable>machine name</replaceable>&gt;
4912 </programlisting></para>
4916 <para>Issue the <emphasis role="bold">vos listvol</emphasis> command to list the volumes housed on each partition of each
4917 disk you are about to remove, in preparation for removing them or moving them to other partitions. For detailed
4918 instructions, see <link linkend="HDRWQ219">Displaying Volume Headers</link>. <programlisting>
4919 % <emphasis role="bold">vos listvol</emphasis> &lt;<replaceable>machine name</replaceable>&gt; [&lt;<replaceable>partition name</replaceable>&gt;]
4920 </programlisting></para>
4924 <para>Move any volume you wish to retain in the file system to another partition. You can move only read/write volumes.
4925 For more detailed instructions, and for instructions on moving read-only and backup volumes, see <link
4926 linkend="HDRWQ226">Moving Volumes</link>. <programlisting>
4927 % <emphasis role="bold">vos move</emphasis> &lt;<replaceable>volume name or ID</replaceable>&gt; \
4928 &lt;<replaceable>machine name on source</replaceable>&gt; &lt;<replaceable>partition name on source</replaceable>&gt; \
4929 &lt;<replaceable>machine name on destination</replaceable>&gt; &lt;<replaceable>partition name on destination</replaceable>&gt;
4930 </programlisting></para>
4934 <para><emphasis role="bold">(Optional)</emphasis> If there are any volumes you do not wish to retain, back them up using
4935 the <emphasis role="bold">vos dump</emphasis> command or the AFS Backup System. See <link linkend="HDRWQ240">Dumping and
4936 Restoring Volumes</link> or <link linkend="HDRWQ296">Backing Up Data</link>, respectively.</para>
4940 <para>Become the local superuser <emphasis role="bold">root</emphasis> on the machine, if you are not already, by issuing
4941 the <emphasis role="bold">su</emphasis> command. <programlisting>
4942 % <emphasis role="bold">su root</emphasis>
4943 Password: <replaceable>root_password</replaceable>
4944 </programlisting></para>
4947 <primary>umount command</primary>
4951 <primary>commands</primary>
4953 <secondary>umount</secondary>
4958 <para>Issue the <emphasis role="bold">umount</emphasis> command, repeating it for each partition on the disk to be
4959 removed. <programlisting>
4960 # <emphasis role="bold">cd /</emphasis>
4961 # <emphasis role="bold">umount /dev/</emphasis>&lt;<replaceable>partition_block_device_name</replaceable>&gt;
4962 </programlisting></para>
4965 <primary>file systems registry file</primary>
4967 <secondary>removing disk from file server machine</secondary>
4971 <listitem id="LIWQ136">
4972 <para>Using a text editor, remove or comment out each partition's entry from the machine's file
4973 systems registry file (<emphasis role="bold">/etc/fstab</emphasis> or equivalent).</para>
4977 <para>Remove the <emphasis role="bold">/vicep</emphasis> directory associated with each partition. <programlisting>
4978 # <emphasis role="bold">rmdir /vicep</emphasis><replaceable>xx</replaceable>
4979 </programlisting></para>
4983 <para>If the operating system requires that you shut off the machine to remove a disk, issue the <emphasis role="bold">bos
4984 shutdown</emphasis> command to shut down all AFS server processes other than the BOS Server (it terminates safely when you
4985 shut off the machine). Include the <emphasis role="bold">-localauth</emphasis> flag because you are logged in as the local
4986 superuser <emphasis role="bold">root</emphasis> but do not necessarily have administrative tokens. For a complete
4987 description of the command, see <link linkend="HDRWQ168">To stop processes temporarily</link>. <programlisting>
4988 # <emphasis role="bold">bos shutdown</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">-localauth</emphasis> [<emphasis
4989 role="bold">-wait</emphasis>]
4990 </programlisting></para>
4993 <listitem id="LIWQ137">
4994 <para>If necessary, shut off the machine. Remove the disk according to the instructions provided by
4995 the disk and operating system vendors. If necessary, edit the disk's partition table to reflect the changes you made to
4996 the file systems registry file in step <link linkend="LIWQ136">7</link>; consult the operating system documentation for
4997 instructions.</para>
5001 <para>If you shut the machine off in step <link linkend="LIWQ137">10</link>, turn it on. Otherwise, issue the
5002 <emphasis role="bold">bos restart</emphasis> command to restart the <emphasis role="bold">fs</emphasis> process, forcing
5003 it to recognize the new set of server partitions. Include the <emphasis role="bold">-localauth</emphasis> flag because you
5004 are logged in as the local superuser <emphasis role="bold">root</emphasis> but do not necessarily have administrative
5005 tokens. For complete instructions for the <emphasis role="bold">bos restart</emphasis> command, see <link
5006 linkend="HDRWQ170">Stopping and Immediately Restarting Processes</link>. <programlisting>
5007 # <emphasis role="bold">bos restart</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">fs -localauth</emphasis>
5008 </programlisting></para>
5012 <para>Issue the <emphasis role="bold">bos status</emphasis> command to verify that all server processes are running
5013 correctly. For more detailed instructions, see <link linkend="HDRWQ158">Displaying Process Status and Information from the
5014 BosConfig File</link>. <programlisting>
5015 # <emphasis role="bold">bos status</emphasis> &lt;<replaceable>machine name</replaceable>&gt;
5016 </programlisting></para>
5021 <primary>File Server</primary>
5023 <secondary>use of NetInfo file</secondary>
5027 <primary>File Server</primary>
5029 <secondary>use of NetRestrict file</secondary>
5033 <primary>File Server</primary>
5035 <secondary>use of sysid file</secondary>
5039 <primary>Ubik</primary>
5041 <secondary>use of NetInfo and NetRestrict files</secondary>
5045 <primary>database server machine</primary>
5047 <secondary>use of NetInfo and NetRestrict files</secondary>
5051 <primary>File Server</primary>
5053 <secondary>interfaces registered in VLDB server entry</secondary>
5057 <primary>setting</primary>
5059 <secondary>server machine interfaces registered in VLDB</secondary>
5063 <primary>controlling</primary>
5065 <secondary>server machine interfaces registered in VLDB</secondary>
5069 <primary>displaying</primary>
5071 <secondary>server entries from VLDB</secondary>
5075 <primary>displaying</primary>
5077 <secondary>VLDB server entries</secondary>
5081 <primary>server entry in VLDB</primary>
5086 <sect1 id="HDRWQ138">
5087 <title>Managing Server IP Addresses and VLDB Server Entries</title>
5089 <para>The AFS support for multihomed file server machines is largely automatic. The File Server process records the IP addresses
5090 of its file server machine's network interfaces in the local <emphasis role="bold">/usr/afs/local/sysid</emphasis> file and also
5091 registers them in a <emphasis>server entry</emphasis> in the Volume Location Database (VLDB). The <emphasis
5092 role="bold">sysid</emphasis> file and server entry are identified by the same unique number, which creates an association
5093 between them.</para>
5095 <para>When the Cache Manager requests volume location information, the Volume Location (VL) Server provides all of the
5096 interfaces registered for each server machine that houses the volume. This enables the Cache Manager to make use of multiple
5097 addresses when accessing AFS data stored on a multihomed file server machine.</para>
5099 <para>If you wish, you can control which interfaces the File Server registers in its VLDB server entry by creating two files in
5100 the local <emphasis role="bold">/usr/afs/local</emphasis> directory: <emphasis role="bold">NetInfo</emphasis> and <emphasis
5101 role="bold">NetRestrict</emphasis>. Each time the File Server restarts, it builds a list of the local machine's interfaces by
5102 reading the <emphasis role="bold">NetInfo</emphasis> file, if it exists. If you do not create the file, the File Server uses the
5103 list of network interfaces configured with the operating system. It then removes from the list any addresses that appear in the
5104 <emphasis role="bold">NetRestrict</emphasis> file, if it exists. The File Server records the resulting list in the <emphasis
5105 role="bold">sysid</emphasis> file and registers the interfaces in the VLDB server entry that has the same unique
5108 <para>On database server machines, the <emphasis role="bold">NetInfo</emphasis> and <emphasis role="bold">NetRestrict</emphasis>
5109 files also determine which interfaces the Ubik database synchronization library uses when communicating with the database server
5110 processes running on other database server machines.</para>
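<para>The way the interface list is derived from the two files can be sketched in Python as follows. This is a simplified model of
the behavior described above, not the File Server's actual code; the function name
<computeroutput>registered_interfaces</computeroutput> is hypothetical, a missing file is modeled as
<computeroutput>None</computeroutput>, and the sketch does not model the maximum number of addresses per server entry.</para>
<programlisting>
```python
import ipaddress


def registered_interfaces(os_interfaces, netinfo=None, netrestrict=None):
    """Return the addresses that would be registered, in candidate order.

    os_interfaces -- addresses configured with the operating system
    netinfo       -- lines of the NetInfo file, or None if it does not exist
    netrestrict   -- lines of the NetRestrict file, or None if it does not exist
    """
    # NetInfo, when present, replaces the operating system's list.
    candidates = netinfo if netinfo is not None else os_interfaces
    # A NetRestrict line is a single address or an address/prefix-length range.
    restricted = [
        ipaddress.ip_network(entry, strict=False)
        for entry in (netrestrict or [])
    ]
    result = []
    for addr in candidates:
        ip = ipaddress.ip_address(addr)
        if not any(ip in net for net in restricted):
            result.append(addr)
    return result


if __name__ == "__main__":
    print(registered_interfaces(
        ["192.12.105.4", "192.12.107.33"],
        netrestrict=["192.12.105.0/24"],
    ))
```
</programlisting>
<para>In this example only <computeroutput>192.12.107.33</computeroutput> remains, because the other address falls inside the
restricted 192.12.105 subnet.</para>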
5112 <para>There is a maximum number of IP addresses in each server entry, as documented in the <emphasis>OpenAFS Release
5113 Notes</emphasis>. If a multihomed file server machine has more interfaces than the maximum, AFS simply ignores the excess ones.
5114 On such machines, consider using the <emphasis role="bold">NetInfo</emphasis> and <emphasis
5115 role="bold">NetRestrict</emphasis> files to control which interfaces are registered.</para>
5117 <para>If for some reason the <emphasis role="bold">sysid</emphasis> file no longer exists, the File Server creates a new one
5118 with a new unique identifier. When the File Server registers the contents of the new file, the Volume Location (VL) Server
5119 normally recognizes automatically that the new file corresponds to an existing server entry, and overwrites the existing server
5120 entry with the new file contents and identifier. However, it is best not to remove the <emphasis role="bold">sysid</emphasis>
5121 file if that can be avoided.</para>
5123 <para>Similarly, it is important not to copy the <emphasis role="bold">sysid</emphasis> file from one file server machine to
5124 another. If you commonly copy the contents of the <emphasis role="bold">/usr/afs</emphasis> directory from an existing machine
5125 as part of installing a new file server machine, be sure to remove the <emphasis role="bold">sysid</emphasis> file from the
5126 <emphasis role="bold">/usr/afs/local</emphasis> directory on the new machine before starting the File Server.</para>
5128 <para>There are certain cases where the VL Server cannot determine whether it is appropriate to overwrite an existing server
5129 entry with a new <emphasis role="bold">sysid</emphasis> file's contents and identifier. It then refuses to allow the File Server
5130 to register the interfaces, which prevents the File Server from starting. This can happen if, for example, a new <emphasis
5131 role="bold">sysid</emphasis> file includes two interfaces that currently are registered by themselves in separate server
5132 entries. In such cases, error messages in the <emphasis role="bold">/usr/afs/log/VLLog</emphasis> file on the VL Server machine
5133 and in the <emphasis role="bold">/usr/afs/log/FileLog</emphasis> file on the file server machine indicate that you need to use
5134 the <emphasis role="bold">vos changeaddr</emphasis> command to resolve the problem. Contact the AFS Product Support group for
5135 instructions and assistance.</para>
5137 <para>Except in this type of rare error case, the only appropriate use of the <emphasis role="bold">vos changeaddr</emphasis>
5138 command is to remove a VLDB server entry completely when you remove a file server machine from service. The VLDB can accommodate
5139 a maximum number of server entries, as specified in the <emphasis>OpenAFS Release Notes</emphasis>. Removing obsolete entries
5140 makes it possible to allocate server entries for new file server machines as required. See the instructions that follow.</para>
5142 <para>Do not use the <emphasis role="bold">vos changeaddr</emphasis> command to change the list of interfaces registered in a
5143 VLDB server entry. To change a file server machine's IP addresses and server entry, see the instructions that follow.</para>
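<para>When the machine whose addresses change is a database server machine, the address-change procedure later in this section
restarts processes on several machines in quick succession. The following Python sketch lists those commands without running them.
The function name <computeroutput>restart_commands</computeroutput> and the hostnames are hypothetical; the process names and the
<emphasis role="bold">-all</emphasis> flag are taken from that procedure.</para>
<programlisting>
```python
# Database server processes named in the procedure.
DATABASE_PROCESSES = ["kaserver", "buserver", "ptserver", "vlserver"]


def restart_commands(changed_machine, database_servers):
    """List the bos restart commands to issue after an address change.

    changed_machine  -- the database server machine whose address changed
    database_servers -- every database server machine in the cell
    """
    # Restart all server processes on the machine that changed.
    commands = ["bos restart " + changed_machine + " -all"]
    # Restart only the database server processes everywhere else.
    for other in database_servers:
        if other != changed_machine:
            commands.append(
                "bos restart " + other + " " + " ".join(DATABASE_PROCESSES)
            )
    return commands


if __name__ == "__main__":
    cell = ["db1.example.com", "db2.example.com", "db3.example.com"]
    for command in restart_commands("db1.example.com", cell):
        print(command)
```
</programlisting>
<para>Generating the full list in advance makes it easier to issue the commands quickly, one after another.</para>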
5146 <primary>NetInfo file (server version)</primary>
5148 <secondary>creating/editing</secondary>
5152 <primary>creating</primary>
5154 <secondary>NetInfo file (server version)</secondary>
5158 <primary>editing</primary>
5160 <secondary>NetInfo file (server version)</secondary>
5163 <sect2 id="Header_156">
5164 <title>To create or edit the server NetInfo file</title>
5168 <para>Become the local superuser <emphasis role="bold">root</emphasis> on the machine, if you are not already, by issuing
5169 the <emphasis role="bold">su</emphasis> command. <programlisting>
5170 % <emphasis role="bold">su root</emphasis>
5171 Password: <replaceable>root_password</replaceable>
5172 </programlisting></para>
5176 <para>Using a text editor, open the <emphasis role="bold">/usr/afs/local/NetInfo</emphasis> file. Place one IP address in
5177 dotted decimal format (for example, <computeroutput>192.12.107.33</computeroutput>) on each line. The order of entries is
5178 not significant.</para>
5182 <para>If you want the File Server to start using the revised list immediately, use the <emphasis role="bold">bos
5183 restart</emphasis> command to restart the <emphasis role="bold">fs</emphasis> process. For instructions, see <link
5184 linkend="HDRWQ170">Stopping and Immediately Restarting Processes</link>.</para>
5189 <primary>NetRestrict file (server version)</primary>
5191 <secondary>creating/editing</secondary>
5195 <primary>creating</primary>
5197 <secondary>NetRestrict file (server version)</secondary>
5201 <primary>editing</primary>
5203 <secondary>NetRestrict file (server version)</secondary>
5207 <sect2 id="Header_157">
5208 <title>To create or edit the server NetRestrict file</title>
5212 <para>Become the local superuser <emphasis role="bold">root</emphasis> on the machine, if you are not already, by issuing
5213 the <emphasis role="bold">su</emphasis> command. <programlisting>
5214 % <emphasis role="bold">su root</emphasis>
5215 Password: <replaceable>root_password</replaceable>
5216 </programlisting></para>
5220 <para>Using a text editor, open the <emphasis role="bold">/usr/afs/local/NetRestrict</emphasis> file. Place one IP address
5221 in dotted decimal format on each line. The order of the addresses is not significant. Use a slash (<emphasis
5222 role="bold">/</emphasis>) followed by a subnet length to represent all possible addresses in a range. For example, the entry
5223 <computeroutput>192.12.105.0/24</computeroutput> indicates that the File Server does not register any of the addresses in
5224 the 192.12.105 subnet.</para>
5228 <para>If you want the File Server to start using the revised list immediately, use the <emphasis role="bold">bos
5229 restart</emphasis> command to restart the <emphasis role="bold">fs</emphasis> process. For instructions, see <link
5230 linkend="HDRWQ170">Stopping and Immediately Restarting Processes</link>.</para>
5235 <primary>vos commands</primary>
5237 <secondary>listaddrs</secondary>
5241 <primary>commands</primary>
5243 <secondary>vos listaddrs</secondary>
5247 <sect2 id="Header_158">
5248 <title>To display all server entries from the VLDB</title>
5252 <para>Issue the <emphasis role="bold">vos listaddrs</emphasis> command to display all server entries from the VLDB.
5254 % <emphasis role="bold">vos listaddrs</emphasis>
5255 </programlisting></para>
5257 <para>where <emphasis role="bold">lista</emphasis> is the shortest acceptable abbreviation of <emphasis
5258 role="bold">listaddrs</emphasis>.</para>
5260 <para>The output displays all server entries from the VLDB, each on its own line. If a file server machine is multihomed,
5261 all of its registered addresses appear on the line. The first one is the one reported as a volume's site in the output
5262 from the <emphasis role="bold">vos examine</emphasis> and <emphasis role="bold">vos listvldb</emphasis> commands.</para>
5264 <para>VLDB server entries record IP addresses, and the command interpreter has the local name service (either a process
5265 like the Domain Name Service or a local host table) translate them to hostnames before displaying them. If an IP address
5266 appears in the output, the name service was unable to translate it.</para>
5268 <para>The existence of an entry does not necessarily indicate that the machine is still an active file server
5269 machine. To remove obsolete server entries, see the following instructions.</para>
5274 <primary>vos commands</primary>
5276 <secondary>changeaddr</secondary>
5280 <primary>commands</primary>
5282 <secondary>vos changeaddr</secondary>
5286 <sect2 id="Header_159">
5287 <title>To remove obsolete server entries from the VLDB</title>
5291 <para>Verify that you are listed in the <emphasis role="bold">/usr/afs/etc/UserList</emphasis> file. If necessary, issue
5292 the <emphasis role="bold">bos listusers</emphasis> command, which is fully described in <link linkend="HDRWQ593">To
5293 display the users in the UserList file</link>. <programlisting>
5294 % <emphasis role="bold">bos listusers</emphasis> &lt;<replaceable>machine name</replaceable>&gt;
5295 </programlisting></para>
5299 <para>Issue the <emphasis role="bold">vos changeaddr</emphasis> command to remove a server entry from the VLDB.
5301 % <emphasis role="bold">vos changeaddr</emphasis> &lt;<replaceable>original IP address</replaceable>&gt; <emphasis role="bold">-remove</emphasis>
5302 </programlisting></para>
5304 <para>where <variablelist>
5306 <term><emphasis role="bold">ch</emphasis></term>
5309 <para>Is the shortest acceptable abbreviation of <emphasis role="bold">changeaddr</emphasis>.</para>
5314 <term><emphasis role="bold">original IP address</emphasis></term>
5317 <para>Specifies one of the IP addresses currently registered for the file server machine in the VLDB. Any one of a
5318 multihomed file server machine's addresses is acceptable to identify it.</para>
5323 <term><emphasis role="bold">-remove</emphasis></term>
5326 <para>Removes the server entry.</para>
5329 </variablelist></para>
5334 <sect2 id="Header_160">
5335 <title>To change a server machine's IP addresses</title>
5339 <para>Verify that you are listed in the <emphasis role="bold">/usr/afs/etc/UserList</emphasis> file. If necessary, issue
5340 the <emphasis role="bold">bos listusers</emphasis> command, which is fully described in <link linkend="HDRWQ593">To
5341 display the users in the UserList file</link>. <programlisting>
5342 % <emphasis role="bold">bos listusers</emphasis> &lt;<replaceable>machine name</replaceable>&gt;
5343 </programlisting></para>
5347 <para>If the machine is the system control machine or a binary distribution machine, and you are also changing its
5348 hostname, redefine all relevant <emphasis role="bold">upclient</emphasis> processes on other server machines to refer to
5349 the new hostname. Use the <emphasis role="bold">bos delete</emphasis> and <emphasis role="bold">bos create</emphasis>
5350 commands as instructed in <link linkend="HDRWQ161">Creating and Removing Processes</link>.</para>
5354 <para>If the machine is a database server machine, edit its entry in the <emphasis
5355 role="bold">/usr/afs/etc/CellServDB</emphasis> file on every server machine in the cell to list one of the new IP
5356 addresses. You can edit the file on the system control machine and wait the
5357 required time (by default, five minutes) for the Update Server to distribute the changed file to all server
5362 <para>If the machine is a database server machine, issue the <emphasis role="bold">bos shutdown</emphasis> command to stop
5363 all server processes. If the machine is also a file server, the volumes on it are inaccessible during this time. For a
5364 complete description of the command, see <link linkend="HDRWQ168">To stop processes temporarily</link>. <programlisting>
5365 % <emphasis role="bold">bos shutdown</emphasis> &lt;<replaceable>machine name</replaceable>&gt;
5366 </programlisting></para>
5370 <para>Use the utilities provided with the operating system to change one or more of the machine's IP addresses.</para>
5374 <para>If appropriate, edit the <emphasis role="bold">/usr/afs/local/NetInfo</emphasis> file, the <emphasis
5375 role="bold">/usr/afs/local/NetRestrict</emphasis> file, or both, to reflect the changed addresses. Instructions appear
5376 earlier in this section.</para>
5380 <para>If the machine is a database server machine, issue the <emphasis role="bold">bos restart</emphasis> command to
5381 restart all server processes on the machine. For complete instructions for the <emphasis role="bold">bos
5382 restart</emphasis> command, see <link linkend="HDRWQ170">Stopping and Immediately Restarting Processes</link>.
5384 % <emphasis role="bold">bos restart</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">-all</emphasis>
5385 </programlisting></para>
5387 <para>At the same time, issue the <emphasis role="bold">bos restart</emphasis> command on all other database server
5388 machines in the cell to restart the database server processes only (the Authentication, Backup, Protection, and Volume
5389 Location Servers). Issue the commands in quick succession so that all of the database server processes vote in the quorum
5393 % <emphasis role="bold">bos restart</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">kaserver buserver ptserver vlserver</emphasis>
5396 <para>If you are changing IP addresses on every database server machine in the cell, you must also issue the <emphasis
5397 role="bold">bos restart</emphasis> command on every file server machine in the cell to restart the <emphasis
5398 role="bold">fs</emphasis> process.</para>
5402 <para>If the machine is not a database server machine, issue the <emphasis role="bold">bos restart</emphasis> command to
5403 restart the <emphasis role="bold">fs</emphasis> process (if the machine is a database server, you already restarted the
5404 process in the previous step). The File Server automatically compiles a new list of interfaces, records them in the
5405 <emphasis role="bold">/usr/afs/local/sysid</emphasis> file, and registers them in its VLDB server entry. <programlisting>
5406 % <emphasis role="bold">bos restart</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">fs</emphasis>
5407 </programlisting></para>
5411 <para>If the machine is a database server machine, edit its entry in the <emphasis
5412 role="bold">/usr/vice/etc/CellServDB</emphasis> file on every client machine in the cell to list one of the new IP
5413 addresses. Instructions appear in <link linkend="HDRWQ406">Maintaining Knowledge of Database Server
5414 Machines</link>.</para>
5418 <para>If there are machine entries in the Protection Database for the machine's previous IP addresses, use the <emphasis
5419 role="bold">pts rename</emphasis> command to change them to the new addresses. For instructions, see <link
5420 linkend="HDRWQ556">Changing a Protection Database Entry's Name</link>.</para>
5425 <primary>rebooting</primary>
5427 <secondary>server machine, instructions</secondary>
5431 <primary>server machine</primary>
5433 <secondary>rebooting</secondary>
5437 <primary>BOS Server</primary>
5439 <secondary>role in reboot of server machine</secondary>
5444 <sect1 id="HDRWQ139">
5445 <title>Rebooting a Server Machine</title>
5447 <para>You can reboot a server machine either by typing the appropriate commands at its console or by issuing the <emphasis
5448 role="bold">bos exec</emphasis> command on a remote machine. Remote rebooting can be more convenient, because you do not need to
5449 leave your present location, but you cannot track the progress of the reboot as you can at the console. Remote rebooting is
5450 possible because the server machine's operating system recognizes the BOS Server, which executes the <emphasis role="bold">bos
5451 exec</emphasis> command, as the local superuser <emphasis role="bold">root</emphasis>.</para>
5453 <para>Rebooting server machines is part of routine maintenance in some cells, and some instructions in the AFS documentation
5454 include it as a step. However, rebooting is not intended to be the standard method for recovering from AFS-related problems.
5455 Treat it as a last resort, for use only when the machine is unresponsive and you have tried all other reasonable options.</para>
5457 <para>Rebooting causes a service outage. If the machine stores volumes, they are all inaccessible until the reboot completes and
5458 the File Server reattaches them. If the machine is a database server machine, information from the databases can become
5459 unavailable during the reelection of the synchronization site for each database server process; the VL Server outage generally
5460 has the greatest impact, because the Cache Manager must be able to access the VLDB to fetch AFS data.</para>
5462 <para>By convention, a server machine's AFS initialization file includes the following command to restart the BOS Server after
5463 each reboot. The BOS Server then starts the other AFS server processes listed in the local <emphasis
5464 role="bold">/usr/afs/local/BosConfig</emphasis> file. These instructions assume that the initialization file includes the command.
5468 /usr/afs/bin/bosserver
5471 <sect2 id="HDRWQ140">
5472 <title>To reboot a file server machine from its console</title>
5476 <para>Become the local superuser <emphasis role="bold">root</emphasis> on the machine, if you are not already, by issuing
5477 the <emphasis role="bold">su</emphasis> command. <programlisting>
5478 % <emphasis role="bold">su root</emphasis>
5479 Password: <replaceable>root_password</replaceable>
5480 </programlisting></para>
5484 <para>Issue the <emphasis role="bold">bos shutdown</emphasis> command to shut down all AFS server processes other than the
5485 BOS Server, which terminates safely when you reboot the machine. Include the <emphasis role="bold">-localauth</emphasis>
5486 flag because you are logged in as the local superuser <emphasis role="bold">root</emphasis> but do not necessarily have
5487 administrative tokens. For a complete description of the command, see <link linkend="HDRWQ168">To stop processes
5488 temporarily</link>. <programlisting>
5489 # <emphasis role="bold">bos shutdown</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <emphasis role="bold">-localauth</emphasis> [<emphasis
5490 role="bold">-wait</emphasis>]
5491 </programlisting></para>
5495 <para>Reboot the machine. On many system types, the appropriate command is <emphasis role="bold">shutdown</emphasis>, but
5496 the appropriate options vary; consult your UNIX administrator's guide. <programlisting>
5497 # <emphasis role="bold">shutdown</emphasis>
5498 </programlisting></para>
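<para>The console procedure above can be sketched as a small shell function. This is an illustration under stated assumptions,
not a supplied AFS tool: the function name <emphasis role="bold">console_reboot</emphasis>, the <emphasis
role="bold">RUN</emphasis> variable, and the machine name are hypothetical, and <emphasis role="bold">shutdown -r now</emphasis>
is only one example of a reboot command. By default the function echoes the commands instead of running them.</para>
<programlisting>
```shell
# Hypothetical helper: shut down AFS server processes, then reboot.
# Intended to be run as the local superuser root on the server machine itself.
# RUN defaults to "echo" (dry run); set RUN to the empty string to execute.
console_reboot() {
    if [ "$#" -lt 2 ]; then
        echo "usage: console_reboot machine reboot_command [options]"
        return 2
    fi
    machine=$1
    shift
    run=${RUN-echo}
    # Stop every AFS server process except the BOS Server and wait for them to exit.
    $run bos shutdown "$machine" -localauth -wait || return 1
    # Reboot only if the shutdown step succeeded; the remaining arguments
    # are the reboot command and its options for this operating system.
    $run "$@"
}

# Dry run; the reboot command shown is an example, check your system documentation.
console_reboot fs1.example.com shutdown -r now
```
</programlisting>
<para>Requiring the reboot command as an argument, rather than supplying a default, reflects the fact that the appropriate command
and options vary by system type.</para>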
5503 <primary>commands</primary>
5505 <secondary>bos exec</secondary>
5509 <primary>bos commands</primary>
5511 <secondary>exec</secondary>
5515 <sect2 id="HDRWQ141">
5516 <title>To reboot a file server machine remotely</title>
5520 <para>Verify that you are listed in the <emphasis role="bold">/usr/afs/etc/UserList</emphasis> file on the machine you are
5521 rebooting. If necessary, issue the <emphasis role="bold">bos listusers</emphasis> command, which is fully described in
5522 <link linkend="HDRWQ593">To display the users in the UserList file</link>. <programlisting>
5523 % <emphasis role="bold">bos listusers</emphasis> &lt;<replaceable>machine name</replaceable>&gt;
5524 </programlisting></para>
5528 <para>Issue the <emphasis role="bold">bos shutdown</emphasis> command to halt AFS server processes other than the BOS Server,
5529 which terminates safely when you reboot the machine. For a complete description of the command, see <link
5530 linkend="HDRWQ168">To stop processes temporarily</link>. <programlisting>
5531 % <emphasis role="bold">bos shutdown</emphasis> &lt;<replaceable>machine name</replaceable>&gt; [<emphasis role="bold">-wait</emphasis>]
5532 </programlisting></para>
5536 <para>Issue the <emphasis role="bold">bos exec</emphasis> command to reboot the machine remotely. <programlisting>
5537 % <emphasis role="bold">bos exec</emphasis> &lt;<replaceable>machine name</replaceable>&gt; <replaceable>reboot_command</replaceable>
5538 </programlisting></para>
5540 <para>where <variablelist>
5542 <term><emphasis role="bold">machine name</emphasis></term>
5545 <para>Names the file server machine to reboot.</para>
5550 <term><emphasis role="bold">reboot_command</emphasis></term>
5553 <para>Is the command that reboots the machine under its operating system. The <emphasis role="bold">shutdown</emphasis>
5554 command is appropriate on many system types, but the required options vary; consult your operating system documentation.</para>
5557 </variablelist></para>
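<para>The remote procedure can be sketched in the same way. The function name <emphasis role="bold">remote_reboot</emphasis>, the
<emphasis role="bold">RUN</emphasis> variable, the machine name, and the example reboot command are assumptions made for
illustration. The function echoes the three commands by default so that you can confirm them before anything runs.</para>
<programlisting>
```shell
# Hypothetical helper: reboot a file server machine from a remote machine.
# Requires administrative tokens and membership in the target machine's UserList.
# RUN defaults to "echo" (dry run); set RUN to the empty string to execute.
remote_reboot() {
    if [ "$#" -ne 2 ]; then
        echo "usage: remote_reboot machine 'reboot_command'"
        return 2
    fi
    machine=$1
    reboot_cmd=$2
    run=${RUN-echo}
    # Display the UserList so that you can confirm you are listed.
    $run bos listusers "$machine" || return 1
    # Halt AFS server processes other than the BOS Server and wait for them to exit.
    $run bos shutdown "$machine" -wait || return 1
    # Pass the reboot command to bos exec as a single argument.
    # (The dry-run output does not show the quoting.)
    $run bos exec "$machine" "$reboot_cmd"
}

# Dry run; the reboot command shown is an example, check your system documentation.
remote_reboot fs1.example.com "shutdown -r now"
```
</programlisting>
<para>Because you cannot track the progress of a remote reboot as you can at the console, the sketch stops as soon as any step
fails instead of continuing to the <emphasis role="bold">bos exec</emphasis> command.</para>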