1 <?xml version="1.0" encoding="UTF-8"?>
4 <title>An Overview of OpenAFS Administration</title>
6 <para>This chapter provides a broad overview of the concepts and
7 organization of AFS. It is strongly recommended that anyone involved in
administering an AFS cell read this chapter before beginning to issue
commands.</para>
12 <title>A Broad Overview of AFS</title>
14 <para>This section introduces most of the key terms and concepts
15 necessary for a basic understanding of AFS. For a more detailed
16 discussion, see <link linkend="HDRWQ7">More Detailed Discussions of
17 Some Basic Concepts</link>.</para>
19 <sect2 renderas="sect3">
20 <title>AFS: A Distributed File System</title>
22 <para>AFS is a distributed file system that enables users to share
23 and access all of the files stored in a network of computers as
24 easily as they access the files stored on their local machines. The
25 file system is called distributed for this exact reason: files can
26 reside on many different machines (be distributed across them), but
27 are available to users on every machine.</para>
30 <sect2 renderas="sect3">
31 <title>Servers and Clients</title>
33 <para>AFS stores files on file server machines. File server machines
34 provide file storage and delivery service, along with other
35 specialized services, to the other subset of machines in the
36 network, the client machines. These machines are called clients
37 because they make use of the servers' services while doing their own
38 work. In a standard AFS configuration, clients provide computational
39 power, access to the files in AFS and other "general purpose" tools
40 to the users seated at their consoles. There are generally many more
41 client workstations than file server machines.</para>
43 <para>AFS file server machines run a number of server processes, so
44 called because each provides a distinct specialized service: one
45 handles file requests, another tracks file location, a third manages
46 security, and so on. To avoid confusion, AFS documentation always
47 refers to server machines and server processes, not simply to
48 servers. For a more detailed description of the server processes,
49 see <link linkend="HDRWQ17">AFS Server Processes and the Cache
50 Manager</link>.</para>
53 <sect2 renderas="sect3">
56 <para>A cell is an administratively independent site running AFS. As
57 a cell's system administrator, you make many decisions about
58 configuring and maintaining your cell in the way that best serves
59 its users, without having to consult the administrators in other
60 cells. For example, you determine how many clients and servers to
have, where to put files, and how to allocate client machines to
users.</para>
65 <sect2 renderas="sect3">
66 <title>Transparent Access and the Uniform Namespace</title>
68 <para>Although your AFS cell is administratively independent, you
69 probably want to organize the local collection of files (your
70 filespace or tree) so that users from other cells can also access
71 the information in it. AFS enables cells to combine their local
72 filespaces into a global filespace, and does so in such a way that
73 file access is transparent--users do not need to know anything about
74 a file's location in order to access it. All they need to know is
75 the pathname of the file, which looks the same in every cell. Thus
76 every user at every machine sees the collection of files in the same
way, meaning that AFS provides a uniform namespace to its
users.</para>
81 <sect2 renderas="sect3">
82 <title>Volumes</title>
84 <para>AFS groups files into volumes, making it possible to
85 distribute files across many machines and yet maintain a uniform
86 namespace. A volume is a unit of disk space that functions like a
87 container for a set of related files, keeping them all together on
88 one partition. Volumes can vary in size, but are (by definition)
89 smaller than a partition.</para>
91 <para>Volumes are important to system administrators and users for
92 several reasons. Their small size makes them easy to move from one
93 partition to another, or even between machines. The system
94 administrator can maintain maximum efficiency by moving volumes to
95 keep the load balanced evenly. In addition, volumes correspond to
96 directories in the filespace--most cells store the contents of each
97 user home directory in a separate volume. Thus the complete contents
98 of the directory move together when the volume moves, making it easy
99 for AFS to keep track of where a file is at a certain time.</para>
101 <para>Volume moves are recorded automatically, so users do not have
102 to keep track of file locations. Volumes can be moved from server to
103 server by a cell administrator without notifying clients, even while
104 the volume is in active use by a client machine. Volume moves are
105 transparent to client machines apart from a brief interruption in
106 file service for files in that volume.</para>
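<para>For illustration, such a move is performed with the <emphasis
role="bold">vos move</emphasis> command. The volume, machine, and
partition names in this sketch are hypothetical, not values from any
actual cell:</para>

<programlisting>
   % vos move user.smith fs1.example.com /vicepa fs2.example.com /vicepb
</programlisting>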
109 <sect2 renderas="sect3">
110 <title>Efficiency Boosters: Replication and Caching</title>
112 <para>AFS incorporates special features on server machines and
113 client machines that help make it efficient and reliable.</para>
115 <para>On server machines, AFS enables administrators to replicate
116 commonly-used volumes, such as those containing binaries for popular
117 programs. Replication means putting an identical read-only copy
118 (sometimes called a clone) of a volume on more than one file server
119 machine. The failure of one file server machine housing the volume
120 does not interrupt users' work, because the volume's contents are
121 still available from other machines. Replication also means that one
122 machine does not become overburdened with requests for files from a
123 popular volume.</para>
125 <para>On client machines, AFS uses caching to improve efficiency.
126 When a user on a client machine requests a file, the Cache Manager
127 on the client sends a request for the data to the File Server
128 process running on the proper file server machine. The user does not
129 need to know which machine this is; the Cache Manager determines
130 file location automatically. The Cache Manager receives the file
131 from the File Server process and puts it into the cache, an area of
132 the client machine's local disk or memory dedicated to temporary
133 file storage. Caching improves efficiency because the client does
134 not need to send a request across the network every time the user
135 wants the same file. Network traffic is minimized, and subsequent
136 access to the file is especially fast because the file is stored
137 locally. AFS has a way of ensuring that the cached file stays
138 up-to-date, called a callback.</para>
141 <sect2 renderas="sect3">
<title>Security: Mutual Authentication and Access Control
Lists</title>
145 <para>Even in a cell where file sharing is especially frequent and
146 widespread, it is not desirable that every user have equal access to
147 every file. One way AFS provides adequate security is by requiring
148 that servers and clients prove their identities to one another
149 before they exchange information. This procedure, called mutual
150 authentication, requires that both server and client demonstrate
151 knowledge of a "shared secret" (like a password) known only to the
152 two of them. Mutual authentication guarantees that servers provide
153 information only to authorized clients and that clients receive
154 information only from legitimate servers.</para>
156 <para>Users themselves control another aspect of AFS security, by
157 determining who has access to the directories they own. For any
158 directory a user owns, he or she can build an access control list
159 (ACL) that grants or denies access to the contents of the
160 directory. An access control list pairs specific users with specific
161 types of access privileges. There are seven separate permissions and
162 up to twenty different people or groups of people can appear on an
163 access control list.</para>
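<para>As a sketch, a user might grant a colleague read access to a
directory and then display the resulting ACL with the <emphasis
role="bold">fs setacl</emphasis> and <emphasis role="bold">fs
listacl</emphasis> commands (the directory and user names here are
illustrative only; <emphasis role="bold">rl</emphasis> grants the
read and lookup permissions):</para>

<programlisting>
   % fs setacl -dir ~/notes -acl pat rl
   % fs listacl ~/notes
</programlisting>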
165 <para>For a more detailed description of AFS's mutual authentication
166 procedure, see <link linkend="HDRWQ75">A More Detailed Look at
167 Mutual Authentication</link>. For further discussion of ACLs, see
<link linkend="HDRWQ562">Managing Access Control
Lists</link>.</para>
174 <title>More Detailed Discussions of Some Basic Concepts</title>
176 <para>The previous section offered a brief overview of the many
177 concepts that an AFS system administrator needs to understand. The
178 following sections examine some important concepts in more
179 detail. Although not all concepts are new to an experienced
180 administrator, reading this section helps ensure a common
understanding of terms and concepts.</para>
184 <title>Networks</title>
187 <primary>network</primary>
189 <secondary>defined</secondary>
192 <para>A <emphasis>network</emphasis> is a collection of
193 interconnected computers able to communicate with each other and
194 transfer information back and forth.</para>
196 <para>A network can connect computers of any kind, but the typical
197 network running AFS connects servers or high-function personal
198 workstations with AFS file server machines. For more about the
199 classes of machines used in an AFS environment, see <link
200 linkend="HDRWQ10">Servers and Clients</link>.</para>
204 <title>Distributed File Systems</title>
207 <primary>file system</primary>
209 <secondary>defined</secondary>
213 <primary>distributed file system</primary>
216 <para>A <emphasis>file system</emphasis> is a collection of files
217 and the facilities (programs and commands) that enable users to
access the information in the files. All computing environments have
file systems.</para>
221 <para>Networked computing environments often use
222 <emphasis>distributed file systems</emphasis> like AFS. A
223 distributed file system takes advantage of the interconnected nature
224 of the network by storing files on more than one computer in the
network and making them accessible from all of the computers. In
other words,
226 the responsibility for file storage and delivery is "distributed"
227 among multiple machines instead of relying on only one. Despite the
228 distribution of responsibility, a distributed file system like AFS
229 creates the illusion that there is a single filespace.</para>
233 <title>Servers and Clients</title>
236 <primary>server/client model</primary>
240 <primary>server</primary>
242 <secondary>definition</secondary>
246 <primary>client</primary>
248 <secondary>definition</secondary>
251 <para>AFS uses a server/client model. In general, a server is a
252 machine, or a process running on a machine, that provides
253 specialized services to other machines. A client is a machine or
254 process that makes use of a server's specialized service during the
255 course of its own work, which is often of a more general nature than
the server's. The functional distinction between clients and servers
257 is not always strict, however--a server can be considered the client
258 of another server whose service it is using.</para>
260 <para>AFS divides the machines on a network into two basic classes,
261 <emphasis>file server machines</emphasis> and <emphasis>client
262 machines</emphasis>, and assigns different tasks and
263 responsibilities to each.</para>
266 <title>File Server Machines</title>
269 <primary>file server machine</primary>
273 <primary>server</primary>
275 <secondary>process</secondary>
277 <tertiary>definition</tertiary>
280 <para><emphasis>File server machines</emphasis> store the files in
281 the distributed file system, and a <emphasis>server
282 process</emphasis> running on the file server machine delivers and
283 receives files. AFS file server machines run a number of
284 <emphasis>server processes</emphasis>. Each process has a special
285 function, such as maintaining databases important to AFS
286 administration, managing security or handling volumes. This
287 modular design enables each server process to specialize in one
288 area, and thus perform more efficiently. For a description of the
289 function of each AFS server process, see <link
290 linkend="HDRWQ17">AFS Server Processes and the Cache
291 Manager</link>.</para>
294 <para>Not all AFS server machines must run all of the server
295 processes. Some processes run on only a few machines because the
296 demand for their services is low. Other processes run on only one
297 machine in order to act as a synchronization site. See <link
298 linkend="HDRWQ90">The Four Roles for File Server
299 Machines</link>.</para>
302 <title>Client Machines</title>
305 <primary>client</primary>
307 <secondary>machine</secondary>
309 <tertiary>definition</tertiary>
<para>The other class of machines are the <emphasis>client
machines</emphasis>, which generally work directly for users,
providing computational power and other general purpose tools; a
client machine can also be another server that uses data stored in
AFS to provide further services. Clients also provide users with
access to the
317 files stored on the file server machines. Clients run a Cache
318 Manager, which is normally a combination of a kernel module and a
319 running process that enables them to communicate with the AFS
320 server processes running on the file server machines and to cache
321 files. See <link linkend="HDRWQ28">The Cache Manager</link> for
322 more information. There are usually many more client machines in a
323 cell than file server machines.</para>
331 <primary>cell</primary>
334 <para>A <emphasis>cell</emphasis> is an independently administered
335 site running AFS. In terms of hardware, it consists of a collection
336 of file server machines defined as belonging to the cell. To say
337 that a cell is administratively independent means that its
338 administrators determine many details of its configuration without
339 having to consult administrators in other cells or a central
340 authority. For example, a cell administrator determines how many
341 machines of different types to run, where to put files in the local
342 tree, how to associate volumes and directories, and how much space
343 to allocate to each user.</para>
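<para>One such decision, the space allocated to each user, is
expressed as a volume quota. As an illustrative sketch (the pathname
and the quota, in kilobyte blocks, are hypothetical), an
administrator might set and then verify a quota as follows:</para>

<programlisting>
   % fs setquota /afs/example.com/usr/smith -max 10000
   % fs listquota /afs/example.com/usr/smith
</programlisting>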
345 <para>The terms <emphasis>local cell</emphasis> and <emphasis>home
346 cell</emphasis> are equivalent, and refer to the cell in which a
347 user has initially authenticated during a session, by logging onto a
348 machine that belongs to that cell. All other cells are referred to
349 as <emphasis>foreign</emphasis> from the user's perspective. In
350 other words, throughout a login session, a user is accessing the
351 filespace through a single Cache Manager--the one on the machine to
352 which he or she initially logged in--and that Cache Manager is
353 normally configured to have a default local cell. All other cells
354 are considered foreign during that login session, even if the user
355 authenticates in additional cells or uses the <emphasis
356 role="bold">cd</emphasis> command to change directories into their
file trees. This distinction is mostly invisible and irrelevant to
358 users. For most purposes, users will see no difference between local
359 and foreign cells.</para>
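<para>A user can check which cell the Cache Manager treats as local
by issuing the <emphasis role="bold">fs wscell</emphasis> command on
the client machine (the cell name shown is an example):</para>

<programlisting>
   % fs wscell
   This workstation belongs to cell 'example.com'
</programlisting>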
362 <primary>local cell</primary>
366 <primary>cell</primary>
368 <secondary>local</secondary>
372 <primary>foreign cell</primary>
376 <primary>cell</primary>
378 <secondary>foreign</secondary>
381 <para>It is possible to maintain more than one cell at a single
382 geographical location. For instance, separate departments on a
383 university campus or in a corporation can choose to administer their
384 own cells. It is also possible to have machines at geographically
385 distant sites belong to the same cell; only limits on the speed of
386 network communication determine how practical this is.</para>
388 <para>Despite their independence, AFS cells generally agree to make
389 their local filespace visible to other AFS cells, so that users in
390 different cells can share files if they choose. If your cell is to
391 participate in the "global" AFS namespace, it must comply with a few
392 basic conventions governing how the local filespace is configured
393 and how the addresses of certain file server machines are advertised
394 to the outside world.</para>
398 <title>The Uniform Namespace and Transparent Access</title>
401 <primary>transparent access as AFS feature</primary>
405 <primary>access</primary>
407 <secondary>transparent (AFS feature)</secondary>
410 <para>One of the features that makes AFS easy to use is that it
411 provides transparent access to the files in a cell's
412 filespace. Users do not have to know which file server machine
413 stores a file in order to access it; they simply provide the file's
pathname, which AFS automatically translates into a machine
location.</para>
417 <para>In addition to transparent access, AFS also creates a
418 <emphasis>uniform namespace</emphasis>--a file's pathname is
419 identical regardless of which client machine the user is working
420 on. The cell's file tree looks the same when viewed from any client
421 because the cell's file server machines store all the files
centrally and present them in an identical manner to all
clients.</para>
425 <para>To enable the transparent access and the uniform namespace
426 features, the system administrator must follow a few simple
427 conventions in configuring client machines and file trees. For
428 details, see <link linkend="HDRWQ39">Making Other Cells Visible in
429 Your Cell</link>.</para>
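<para>As a sketch of the resulting uniform namespace, a file in a
hypothetical cell named <emphasis
role="bold">example.com</emphasis> has the same pathname on every
AFS client machine, in any cell:</para>

<programlisting>
   /afs/example.com/usr/smith/doc/notes.txt
</programlisting>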
433 <title>Volumes</title>
436 <primary>volume</primary>
438 <secondary>definition</secondary>
441 <para>A <emphasis>volume</emphasis> is a conceptual container for a
442 set of related files that keeps them all together on one file server
443 machine partition. Volumes can vary in size, but are (by definition)
444 smaller than a partition. Volumes are the main administrative unit
445 in AFS, and have several characteristics that make administrative
446 tasks easier and help improve overall system
447 performance. <itemizedlist>
449 <para>The relatively small size of volumes makes them easy to
move from one partition to another, or even between
machines.</para>
455 <para>You can maintain maximum system efficiency by moving
456 volumes to keep the load balanced evenly among the different
457 machines. If a partition becomes full, the small size of
458 individual volumes makes it easy to find enough room on other
459 machines for them.</para>
462 <primary>volume</primary>
464 <secondary>in load balancing</secondary>
469 <para>Each volume corresponds logically to a directory in the
470 file tree and keeps together, on a single partition, all the
471 data that makes up the files in the directory (including
472 possible subdirectories). By maintaining (for example) a
473 separate volume for each user's home directory, you keep all
474 of the user's files together, but separate from those of other
475 users. This is an administrative convenience that is
impossible if the partition is the smallest unit of
storage.</para>
480 <primary>volume</primary>
482 <secondary>correspondence with directory</secondary>
486 <primary>directory</primary>
488 <secondary>correspondence with volume</secondary>
492 <primary>correspondence</primary>
494 <secondary>of volumes and directories</secondary>
499 <para>The directory/volume correspondence also makes
500 transparent file access possible, because it simplifies the
501 process of file location. All files in a directory reside
502 together in one volume and in order to find a file, a file
503 server process need only know the name of the file's parent
504 directory, information which is included in the file's
505 pathname. AFS knows how to translate the directory name into
506 a volume name, and automatically tracks every volume's
507 location, even when a volume is moved from machine to
508 machine. For more about the directory/volume correspondence,
509 see <link linkend="HDRWQ14">Mount Points</link>.</para>
<para>Volumes increase file availability through replication
and backup.</para>
517 <primary>volume</primary>
519 <secondary>as unit of</secondary>
521 <tertiary>replication</tertiary>
525 <primary>volume</primary>
527 <secondary>as unit of</secondary>
529 <tertiary>backup</tertiary>
534 <para>Replication (placing copies of a volume on more than one
535 file server machine) makes the contents more reliably
536 available; for details, see <link
537 linkend="HDRWQ15">Replication</link>. Entire sets of volumes
538 can be backed up as dump files (possibly to tape) and restored
539 to the file system; see <link linkend="HDRWQ248">Configuring
540 the AFS Backup System</link> and <link
541 linkend="HDRWQ283">Backing Up and Restoring AFS
542 Data</link>. In AFS, backup also refers to recording the state
543 of a volume at a certain time and then storing it (either on
544 tape or elsewhere in the file system) for recovery in the
545 event files in it are accidentally deleted or changed. See
546 <link linkend="HDRWQ201">Creating Backup
547 Volumes</link>.</para>
551 <para>Volumes are the unit of resource management. A space
552 quota associated with each volume sets a limit on the maximum
553 volume size. See <link linkend="HDRWQ234">Setting and
554 Displaying Volume Quota and Current Size</link>.</para>
557 <primary>volume</primary>
559 <secondary>as unit of</secondary>
561 <tertiary>resource management</tertiary>
569 <title>Mount Points</title>
572 <primary>mount point</primary>
574 <secondary>definition</secondary>
577 <para>The previous section discussed how each volume corresponds
578 logically to a directory in the file system: the volume keeps
579 together on one partition all the data in the files residing in the
580 directory. The directory that corresponds to a volume is called its
581 <emphasis>root directory</emphasis>, and the mechanism that
582 associates the directory and volume is called a <emphasis>mount
583 point</emphasis>. A mount point is similar to a symbolic link in the
584 file tree that specifies which volume contains the files kept in a
585 directory. A mount point is not an actual symbolic link; its
586 internal structure is different.</para>
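<para>For example, an administrator might mount a volume and then
verify the mount point with the <emphasis role="bold">fs
mkmount</emphasis> and <emphasis role="bold">fs lsmount</emphasis>
commands (the directory and volume names in this sketch are
illustrative):</para>

<programlisting>
   % fs mkmount /afs/example.com/usr/smith user.smith
   % fs lsmount /afs/example.com/usr/smith
</programlisting>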
<para>In AFS, you must not create a symbolic link to a file whose
590 name begins with the number sign (#) or the percent sign (%),
591 because the Cache Manager interprets such a link as a mount point
592 to a regular or read/write volume, respectively.</para>
596 <primary>root directory</primary>
600 <primary>directory</primary>
602 <secondary>root</secondary>
606 <primary>volume</primary>
608 <secondary>root directory of</secondary>
612 <primary>volume</primary>
614 <secondary>mounting</secondary>
617 <para>The use of mount points means that many of the elements in an
618 AFS file tree that look and function just like standard UNIX file
619 system directories are actually mount points. In form, a mount point
620 is a symbolic link in a special format that names the volume
621 containing the data for files in the directory. When the Cache
622 Manager (see <link linkend="HDRWQ28">The Cache Manager</link>)
623 encounters a mount point--for example, in the course of interpreting
624 a pathname--it looks in the volume named in the mount point. In the
625 volume the Cache Manager finds an actual UNIX-style directory
626 element--the volume's root directory--that lists the files contained
in the directory/volume. The next element in the pathname appears in
that list.</para>
630 <para>A volume is said to be <emphasis>mounted</emphasis> at the
631 point in the file tree where there is a mount point pointing to the
632 volume. A volume's contents are not visible or accessible unless it
633 is mounted. Unlike some other file systems, AFS volumes can be
mounted at multiple locations in the file system at the same
time.</para>
639 <title>Replication</title>
642 <primary>replication</primary>
644 <secondary>definition</secondary>
648 <primary>clone</primary>
651 <para><emphasis>Replication</emphasis> refers to making a copy, or
652 <emphasis>clone</emphasis>, of a source read/write volume and then
653 placing the copy on one or more additional file server machines in a
654 cell. One benefit of replicating a volume is that it increases the
655 availability of the contents. If one file server machine housing the
656 volume fails, users can still access the volume on a different
657 machine. No one machine need become overburdened with requests for a
popular file, either, because the file is available from several
machines.</para>
661 <para>Replication is not necessarily appropriate for cells with
662 limited disk space, nor are all types of volumes equally suitable
663 for replication (replication is most appropriate for volumes that
664 contain popular files that do not change very often). For more
665 details, see <link linkend="HDRWQ50">When to Replicate
666 Volumes</link>.</para>
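<para>As an illustrative sketch, an administrator replicates a
volume by defining read-only sites with the <emphasis
role="bold">vos addsite</emphasis> command and then releasing the
volume's current contents to those sites with <emphasis
role="bold">vos release</emphasis> (the machine, partition, and
volume names here are hypothetical):</para>

<programlisting>
   % vos addsite fs2.example.com /vicepa pub.software
   % vos release pub.software
</programlisting>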
670 <title>Caching and Callbacks</title>
673 <primary>caching</primary>
676 <para>Just as replication increases system availability,
677 <emphasis>caching</emphasis> increases the speed and efficiency of
678 file access in AFS. Each AFS client machine dedicates a portion of
679 its local disk or memory to a cache where it stores data
680 temporarily. Whenever an application program (such as a text editor)
681 running on a client machine requests data from an AFS file, the
682 request passes through the Cache Manager. The Cache Manager is a
683 portion of the client machine's kernel that translates file requests
684 from local application programs into cross-network requests to the
685 <emphasis>File Server process</emphasis> running on the file server
686 machine storing the file. When the Cache Manager receives the
687 requested data from the File Server, it stores it in the cache and
688 then passes it on to the application program.</para>
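<para>On a client machine, the cache's current size and usage can be
displayed with the <emphasis role="bold">fs getcacheparms</emphasis>
command (the figures shown are examples only):</para>

<programlisting>
   % fs getcacheparms
   AFS using 4125 of the cache's available 50000 1K byte blocks.
</programlisting>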
690 <para>Caching improves the speed of data delivery to application
691 programs in the following ways:</para>
695 <para>When the application program repeatedly asks for data from
696 the same file, it is already on the local disk. The application
697 does not have to wait for the Cache Manager to request and
698 receive the data from the File Server.</para>
702 <para>Caching data eliminates the need for repeated request and
703 transfer of the same data, so network traffic is reduced. Thus,
initial requests and other traffic can get through more
quickly.</para>
708 <primary>AFS</primary>
710 <secondary>reducing traffic in</secondary>
714 <primary>network</primary>
716 <secondary>reducing traffic through caching</secondary>
720 <primary>slowed performance</primary>
722 <secondary>preventing in AFS</secondary>
728 <primary>callback</primary>
732 <primary>consistency guarantees</primary>
734 <secondary>cached data</secondary>
737 <para>While caching provides many advantages, it also creates the
738 problem of maintaining consistency among the many cached copies of a
739 file and the source version of a file. This problem is solved using
740 a mechanism referred to as a <emphasis>callback</emphasis>.</para>
742 <para>A callback is a promise by a File Server to a Cache Manager to
743 inform the latter when a change is made to any of the data delivered
744 by the File Server. Callbacks are used differently based on the type
745 of file delivered by the File Server: <itemizedlist>
747 <para>When a File Server delivers a writable copy of a file
748 (from a read/write volume) to the Cache Manager, the File
749 Server sends along a callback with that file. If the source
750 version of the file is changed by another user, the File
751 Server breaks the callback associated with the cached version
752 of that file--indicating to the Cache Manager that it needs to
753 update the cached copy.</para>
757 <para>When a File Server delivers a file from a read-only
758 volume to the Cache Manager, the File Server sends along a
759 callback associated with the entire volume (so it does not
760 need to send any more callbacks when it delivers additional
761 files from the volume). Only a single callback is required per
762 accessed read-only volume because files in a read-only volume
763 can change only when a new version of the complete volume is
764 released. All callbacks associated with the old version of the
765 volume are broken at release time.</para>
770 <para>The callback mechanism ensures that the Cache Manager always
771 requests the most up-to-date version of a file. However, it does not
772 ensure that the user necessarily notices the most current version as
773 soon as the Cache Manager has it. That depends on how often the
application program requests additional data from the File Server or
775 how often it checks with the Cache Manager.</para>
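<para>If cached data is ever suspected of being stale despite the
callback mechanism, a user can force the Cache Manager to discard
its cached copy and fetch a fresh one with the <emphasis
role="bold">fs flush</emphasis> command (the filename is an
example):</para>

<programlisting>
   % fs flush ~/notes/schedule.txt
</programlisting>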
780 <title>AFS Server Processes and the Cache Manager</title>
783 <primary>AFS</primary>
785 <secondary>server processes used in</secondary>
789 <primary>server</primary>
791 <secondary>process</secondary>
793 <tertiary>list of AFS</tertiary>
796 <para>As mentioned in <link linkend="HDRWQ10">Servers and
797 Clients</link>, AFS file server machines run a number of processes,
798 each with a specialized function. One of the main responsibilities of
799 a system administrator is to make sure that processes are running
800 correctly as much of the time as possible, using the administrative
801 services that the server processes provide.</para>
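<para>For example, an administrator commonly verifies that the
processes on a file server machine are running correctly with the
<emphasis role="bold">bos status</emphasis> command (the machine
name here is hypothetical):</para>

<programlisting>
   % bos status fs1.example.com
</programlisting>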
803 <para>The following list briefly describes the function of each server
804 process and the Cache Manager; the following sections then discuss the
805 important features in more detail.</para>
807 <para>The <emphasis>File Server</emphasis>, the most fundamental of
808 the servers, delivers data files from the file server machine to local
809 workstations as requested, and stores the files again when the user
810 saves any changes to the files.</para>
812 <para>The <emphasis>Basic OverSeer Server (BOS Server)</emphasis>
813 ensures that the other server processes on its server machine are
814 running correctly as much of the time as possible, since a server is
815 useful only if it is available. The BOS Server relieves system
administrators of much of the responsibility for overseeing system
operations.</para>
<para>The <emphasis>Protection Server</emphasis> helps users control
who has access to
820 their files and directories. It is responsible for mapping Kerberos
821 principals to AFS identities. Users can also grant access to several
822 other users at once by putting them all in a group entry in the
823 Protection Database maintained by the Protection Server.</para>
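<para>As a sketch, a user might create a group, add a member to it,
and then place the group on a directory's ACL with the <emphasis
role="bold">pts</emphasis> and <emphasis role="bold">fs</emphasis>
commands (the group, user, and directory names are
illustrative):</para>

<programlisting>
   % pts creategroup smith:friends
   % pts adduser -user pat -group smith:friends
   % fs setacl -dir ~/notes -acl smith:friends rl
</programlisting>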
825 <para>The <emphasis>Volume Server</emphasis> performs all types of
826 volume manipulation. It helps the administrator move volumes from one
server machine to another to balance the workload among the various
machines.</para>
830 <para>The <emphasis>Volume Location Server (VL Server)</emphasis>
831 maintains the Volume Location Database (VLDB), in which it records the
832 location of volumes as they move from file server machine to file
server machine. This service is the key to transparent file access for
users.</para>
836 <para>The <emphasis>Salvager</emphasis> is not a server in the sense
837 that others are. It runs only after the File Server or Volume Server
838 fails; it repairs any inconsistencies caused by the failure. The
839 system administrator can invoke it directly if necessary.</para>
841 <para>The <emphasis>Update Server</emphasis> distributes new versions
842 of AFS server process software and configuration information to all
843 file server machines. It is crucial to stable system performance that
844 all server machines run the same software.</para>
846 <para>The <emphasis>Backup Server</emphasis> maintains the Backup
847 Database, in which it stores information related to the Backup
848 System. It enables the administrator to back up data from volumes to
849 tape. The data can then be restored from tape in the event that it is
850 lost from the file system. The Backup Server is optional and is only
one of several ways that the data in an AFS cell can be backed
up.</para>
854 <para>The <emphasis>Cache Manager</emphasis> is the one component in
855 this list that resides on AFS client rather than file server
856 machines. It is not a process per se, but rather a part of the kernel on
857 AFS client machines that communicates with AFS server processes. Its
858 main responsibilities are to retrieve files for application programs
859 running on the client and to maintain the files in the cache.</para>
861 <para>AFS also relies on two other services that are not part of AFS
862 and need to be installed separately:</para>
864 <para>AFS requires a <emphasis>Kerberos KDC</emphasis> to use for user
865 authentication. It verifies user identities at login and provides the
866 facilities through which participants in transactions prove their
867 identities to one another (mutually authenticate). AFS uses Kerberos
868 for all of its authentication. The Kerberos KDC replaces the old
869 <emphasis>Authentication Server</emphasis> included in OpenAFS. The
870 Authentication Server is still available for sites that need it, but
871 is now deprecated and should not be used for any new
872 installations.</para>
874 <para>The <emphasis>Network Time Protocol Daemon (NTPD)</emphasis> is
875 not an AFS server process, but plays a vital role nonetheless. It
876 synchronizes the internal clock on a file server machine with those on
877 other machines. Synchronized clocks are particularly important for
878 correct functioning of the AFS distributed database technology (known
879 as Ubik); see <link linkend="HDRWQ103">Configuring the Cell for Proper
880 Ubik Operation</link>. The NTPD is usually provided with the operating system.</para>
884 <title>The File Server</title>
887 <primary>File Server</primary>
889 <secondary>description</secondary>
892 <para>The <emphasis>File Server</emphasis> is the most fundamental
893 of the AFS server processes and runs on each file server machine. It
894 provides the same services across the network that the UNIX file
895 system provides on the local disk: <itemizedlist>
897 <para>Delivering programs and data files to client
898 workstations as requested and storing them again when the
899 client workstation finishes with them.</para>
903 <para>Maintaining the hierarchical directory structure that
904 users create to organize their files.</para>
908 <para>Handling requests for copying, moving, creating, and
909 deleting files and directories.</para>
913 <para>Keeping track of status information about each file and
914 directory (including its size and latest modification time).</para>
919 <para>Making sure that users are authorized to perform the
920 actions they request on particular files or directories.</para>
925 <para>Creating symbolic and hard links between files.</para>
929 <para>Granting advisory locks (corresponding to UNIX locks) on
937 <title>The Basic OverSeer Server</title>
940 <primary>BOS Server</primary>
942 <secondary>description</secondary>
945 <para>The <emphasis>Basic OverSeer Server (BOS Server)</emphasis>
946 reduces the demands on system administrators by constantly
947 monitoring the processes running on its file server machine. It can
948 restart failed processes automatically and provides a convenient
949 interface for administrative tasks.</para>
951 <para>The BOS Server runs on every file server machine. Its primary
952 function is to minimize system outages. It also</para>
956 <para>Constantly monitors the other server processes (on the
957 local machine) to make sure they are running correctly.</para>
961 <para>Automatically restarts failed processes, without
962 contacting a human operator. When restarting multiple server
963 processes simultaneously, the BOS server takes interdependencies
964 into account and initiates restarts in the correct order.</para>
967 <primary>system outages</primary>
969 <secondary>reducing</secondary>
973 <primary>outages</primary>
975 <secondary>BOS Server role in</secondary>
980 <para>Accepts requests from the system administrator. Common
981 reasons to contact BOS are to verify the status of server
982 processes on file server machines, install and start new
983 processes, stop processes either temporarily or permanently, and
984 restart dead processes manually.</para>
988 <para>Helps system administrators to manage system configuration
989 information. The BOS Server provides a simple interface for
990 modifying two files that contain information about privileged
991 users and certain special file server machines. It also
992 automates the process of adding and changing <emphasis>server
993 encryption keys</emphasis>, which are important in mutual
994 authentication. This key-management function applies only when the
995 Authentication Server is still in use, and is itself deprecated. For more
996 details about these configuration files, see <link
997 linkend="HDRWQ85">Common Configuration Files in the /usr/afs/etc
998 Directory</link>.</para>
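The administrative tasks above are driven through the bos command suite. As an illustration only (the hostname and instance names below are hypothetical, and the exact instances present depend on the cell's configuration), an administrator might interact with the BOS Server like this:

```shell
# Query the status of all server processes that the BOS Server
# monitors on a given file server machine.
bos status fs1.example.com -long

# Restart one process under BOS Server control.
bos restart fs1.example.com -instance vlserver

# Stop a process temporarily, then start it again.
bos shutdown fs1.example.com -instance ptserver
bos startup fs1.example.com -instance ptserver
```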
1003 <sect2 id="HDRWQ21">
1004 <title>The Protection Server</title>
1007 <primary>protection</primary>
1009 <secondary>in AFS</secondary>
1013 <primary>Protection Server</primary>
1015 <secondary>description</secondary>
1019 <primary>protection</primary>
1021 <secondary>in UNIX</secondary>
1024 <para>The <emphasis>Protection Server</emphasis> is the key to AFS's
1025 refinement of the normal UNIX methods for protecting files and
1026 directories from unauthorized use. The refinements include the
1027 following: <itemizedlist>
1029 <para>Defining associations between Kerberos principals and
1030 AFS identities. Normally, this is a simple mapping between
1031 principal names in the Kerberos realm associated with an AFS
1032 cell to AFS identities in that cell, but the Protection Server
1033 also manages mappings for users who authenticate cross-realm from
1034 a different Kerberos realm.</para>
1036 <para>Defining seven access permissions rather than the
1037 standard UNIX file system's three. In conjunction with the
1038 UNIX mode bits associated with each file and directory
1039 element, AFS associates an <emphasis>access control list
1040 (ACL)</emphasis> with each directory. The ACL specifies which
1041 users have which of the seven specific permissions for the
1042 directory and all the files it contains. For a definition of
1043 AFS's seven access permissions and how users can set them on
1044 access control lists, see <link linkend="HDRWQ562">Managing
1045 Access Control Lists</link>.</para>
1048 <primary>access</primary>
1050 <secondary></secondary>
1057 <para>Enabling users to grant permissions to numerous
1058 individual users--a different combination to each individual
1059 if desired. UNIX protection distinguishes only three classes of
1060 user: the owner of the file, members of a single specified group,
1061 and everyone else who can access the local file system.</para>
1066 <para>Enabling users to define their own groups of users,
1067 recorded in the <emphasis>Protection Database</emphasis>
1068 maintained by the Protection Server. The groups then appear on
1069 directories' access control lists as though they were
1070 individuals, which enables the granting of permissions to many
1071 users simultaneously.</para>
1075 <para>Enabling system administrators to create groups
1076 containing client machine IP addresses to permit access when
1077 it originates from the specified client machines. These types
1078 of groups are useful when it is necessary to adhere to
1079 machine-based licensing restrictions or where it is difficult
1080 for some reason to obtain Kerberos credentials for processes
1081 running on those systems that need access to AFS.</para>
1087 <primary>group</primary>
1089 <secondary>definition</secondary>
1093 <primary>Protection Database</primary>
1096 <para>The Protection Server's main duty is to help the File Server
1097 determine if a user is authorized to access a file in the requested
1098 manner. The Protection Server creates a list of all the groups to
1099 which the user belongs. The File Server then compares this list to
1100 the ACL associated with the file's parent directory. A user thus
1101 acquires access both as an individual and as a member of any groups.</para>
1104 <para>The Protection Server also maps Kerberos principals to
1105 <emphasis>AFS user ID</emphasis> numbers (<emphasis>AFS
1106 UIDs</emphasis>). These UIDs are functionally equivalent to UNIX
1107 UIDs, but operate in the domain of AFS rather than in the UNIX file
1108 system on a machine's local disk. This conversion service is
1109 essential because the tickets that the Kerberos KDC gives to
1110 authenticated users are stamped with principal names (to comply with
1111 Kerberos standards). The AFS server processes identify users by AFS
1112 UID, not by username. Before they can understand whom the token
1113 represents, they need the Protection Server to translate the
1114 username into an AFS UID. For further discussion of the
1115 authentication process, see <link linkend="HDRWQ75">A More Detailed
1116 Look at Mutual Authentication</link>.</para>
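The group and ACL features described above are managed with the pts and fs command suites. A hedged sketch (user, group, cell, and directory names are hypothetical):

```shell
# Create a group owned by user pat and add a member to it.
pts creategroup pat:friends -owner pat
pts adduser -user terry -group pat:friends

# Grant the group read and lookup permissions on a directory's ACL.
fs setacl -dir /afs/example.com/usr/pat/public -acl pat:friends rl

# Inspect the resulting access control list.
fs listacl /afs/example.com/usr/pat/public
```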
1119 <sect2 id="HDRWQ22">
1120 <title>The Volume Server</title>
1123 <primary>Volume Server</primary>
1125 <secondary>description</secondary>
1128 <para>The <emphasis>Volume Server</emphasis> provides the interface
1129 through which you create, delete, move, and replicate volumes, as
1130 well as prepare them for archiving to disk, tape, or other media
1131 (backing up). <link linkend="HDRWQ13">Volumes</link> explained the
1132 advantages gained by storing files in volumes. Creating and deleting
1133 volumes are necessary when adding and removing users from the
1134 system; volume moves are done for load balancing; and replication
1135 enables volume placement on multiple file server machines (for more
1136 on replication, see <link
1137 linkend="HDRWQ15">Replication</link>).</para>
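These volume operations are issued through the vos command suite, which in turn contacts the Volume Server. A sketch for illustration (server, partition, and volume names are hypothetical):

```shell
# Create a new user volume on a server partition, with a quota
# expressed in kilobyte blocks.
vos create fs1.example.com /vicepa user.pat -maxquota 5000

# Move the volume to another machine to balance the workload.
vos move user.pat fs1.example.com /vicepa fs2.example.com /vicepb

# Define a read-only replication site, then release the volume to it.
vos addsite fs2.example.com /vicepb root.cell
vos release root.cell
```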
1140 <sect2 id="HDRWQ23">
1141 <title>The Volume Location (VL) Server</title>
1144 <primary>VL Server</primary>
1146 <secondary>description</secondary>
1150 <primary>VLDB</primary>
1153 <para>The <emphasis>VL Server</emphasis> maintains a complete list
1154 of volume locations in the <emphasis>Volume Location Database
1155 (VLDB)</emphasis>. When the Cache Manager (see <link
1156 linkend="HDRWQ28">The Cache Manager</link>) begins to fill a file
1157 request from an application program, it first contacts the VL Server
1158 in order to learn which file server machine currently houses the
1159 volume containing the file. The Cache Manager then requests the file
1160 from the File Server process running on that file server machine.</para>
1163 <para>The VLDB and VL Server make it possible for AFS to take
1164 advantage of the increased system availability gained by using
1165 multiple file server machines, because the Cache Manager knows where
1166 to find a particular file. Indeed, in a certain sense the VL Server
1167 is the keystone of the entire file system--when the information in
1168 the VLDB is inaccessible, the Cache Manager cannot retrieve files,
1169 even if the File Server processes are working properly. A list of
1170 the information stored in the VLDB about each volume is provided in
1171 <link linkend="HDRWQ180">Volume Information in the VLDB</link>.</para>
1175 <primary>VL Server</primary>
1177 <secondary>importance to transparent access</secondary>
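The contents of the VLDB can be inspected directly with the vos command suite; a brief illustration (the volume name is hypothetical):

```shell
# Display the VLDB entry for a volume: its sites, volume IDs,
# and replication state.
vos listvldb -name user.pat

# Display both the VLDB entry and the volume header stored on
# the file server machine.
vos examine user.pat
```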
1181 <sect2 id="HDRWQ26">
1182 <title>The Salvager</title>
1185 <primary>Salvager</primary>
1187 <secondary>description</secondary>
1190 <para>The <emphasis>Salvager</emphasis> differs from other AFS
1191 Servers in that it runs only at selected times. The BOS Server
1192 invokes the Salvager when the File Server, Volume Server, or both
1193 fail. The Salvager attempts to repair disk corruption that can
1194 result from a failure.</para>
1196 <para>As a system administrator, you can also invoke the Salvager as
1197 necessary, even if the File Server or Volume Server has not
1198 failed. See <link linkend="HDRWQ232">Salvaging
1199 Volumes</link>.</para>
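Manual salvages are requested through the BOS Server; for illustration (hostname, partition, and volume name are hypothetical):

```shell
# Salvage every volume on one partition; the BOS Server stops the
# File Server, runs the Salvager, and restarts the File Server.
bos salvage -server fs1.example.com -partition /vicepa

# Salvage a single volume only.
bos salvage -server fs1.example.com -partition /vicepa -volume user.pat
```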
1202 <sect2 id="HDRWQ24">
1203 <title>The Update Server</title>
1206 <primary>Update Server</primary>
1208 <secondary>description</secondary>
1211 <para>The <emphasis>Update Server</emphasis> is an optional process
1212 that helps guarantee that all file server machines are running the
1213 same version of a server process. System performance can be
1214 inconsistent if some machines are running one version of the File
1215 Server (for example) and other machines are running another version.</para>
1218 <para>To ensure that all machines run the same version of a process,
1219 install new software on a single file server machine of each system
1220 type, called the <emphasis>binary distribution machine</emphasis>
1221 for that type. The binary distribution machine runs the server
1222 portion of the Update Server, whereas all the other machines of that
1223 type run the client portion of the Update Server. The client
1224 portions check frequently with the <emphasis>server
1225 portion</emphasis> to see if they are running the right version of
1226 every process; if not, the <emphasis>client portion</emphasis>
1227 retrieves the right version from the binary distribution machine and
1228 installs it locally. The system administrator does not need to
1229 remember to install new software individually on all the file server
1230 machines: the Update Server does it automatically. For more on
1231 binary distribution machines, see <link linkend="HDRWQ93">Binary
1232 Distribution Machines</link>.</para>
1235 <primary>Update Server</primary>
1237 <secondary>server portion</secondary>
1241 <primary>Update Server</primary>
1243 <secondary>client portion</secondary>
1246 <para>The Update Server also distributes configuration files that
1247 all file server machines need to store on their local disks (for a
1248 description of the contents and purpose of these files, see <link
1249 linkend="HDRWQ85">Common Configuration Files in the /usr/afs/etc
1250 Directory</link>). As with server process software, the need for
1251 consistent system performance demands that all the machines have the
1252 same version of these files. The system administrator needs to make
1253 changes to these files on one machine only, the cell's
1254 <emphasis>system control machine</emphasis>, which runs a server
1255 portion of the Update Server. All other machines in the cell run a
1256 client portion that accesses the correct versions of these
1257 configuration files from the system control machine. For more
1258 information, see <link linkend="HDRWQ94">The System Control
1259 Machine</link>.</para>
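Both portions of the Update Server are themselves started as BOS Server instances. The sketch below follows the conventions used in OpenAFS installation documentation, but the hostnames, paths, and flags shown are illustrative assumptions, not a definitive configuration:

```shell
# On the binary distribution machine: run the server portion.
bos create fs1.example.com upserver simple \
    "/usr/afs/bin/upserver -clear /usr/afs/bin"

# On each other machine of that system type: run a client portion
# that pulls binaries from the distribution machine.
bos create fs2.example.com upclientbin simple \
    "/usr/afs/bin/upclient fs1.example.com -clear /usr/afs/bin"
```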
1262 <sect2 id="HDRWQ25">
1263 <title>The Backup Server</title>
1266 <primary>Backup System</primary>
1268 <secondary>Backup Server described</secondary>
1272 <primary>Backup Server</primary>
1274 <secondary>description</secondary>
1277 <para>The <emphasis>Backup Server</emphasis> is an optional process
1278 that maintains the information in the <emphasis>Backup
1279 Database</emphasis>. The Backup Server and the Backup Database
1280 enable administrators to back up data from AFS volumes to tape and
1281 restore it from tape to the file system if necessary. The server and
1282 database together are referred to as the Backup System. This Backup
1283 System is only one way to back up AFS, and many AFS cells use
1284 different methods.</para>
1286 <para>Administrators who wish to use the Backup System initially
1287 configure it by defining sets of volumes to be dumped together and
1288 the schedule by which the sets are to be dumped. They also install
1289 the system's tape drives and define the drives' <emphasis>Tape
1290 Coordinators</emphasis>, which are the processes that control the tape drives.</para>
1293 <para>Once the Backup System is configured, user and system data can
1294 be dumped from volumes to tape or disk. In the event that data is
1295 ever lost from the system (for example, if a system or disk failure
1296 causes data to be lost), administrators can restore the data from
1297 tape. If tapes are periodically archived, or saved, data can also be
1298 restored to its state at a specific time. Additionally, because
1299 Backup System data is difficult to reproduce, the Backup Database
1300 itself can be backed up to tape and restored if it ever becomes
1301 corrupted. For more information on configuring and using the Backup
1302 System, and on other AFS backup options, see <link
1303 linkend="HDRWQ248">Configuring the AFS Backup System</link> and
1304 <link linkend="HDRWQ283">Backing Up and Restoring AFS Data</link>.</para>
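The configuration steps described above map onto the backup command suite. A hedged sketch (the volume set name, the regular expressions, and the dump level are hypothetical):

```shell
# Define a volume set, then add an entry matching the volumes to
# be dumped together.
backup addvolset -name user
backup addvolentry -name user -server ".*" -partition ".*" \
    -volumes "user.*"

# Define a dump level and dump the volume set at that level.
backup adddump -dump /weekly
backup dump -volumeset user -dump /weekly
```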
1308 <sect2 id="HDRWQ28">
1309 <title>The Cache Manager</title>
1312 <primary>Cache Manager</primary>
1314 <secondary>functions of</secondary>
1317 <para>As already mentioned in <link linkend="HDRWQ16">Caching and
1318 Callbacks</link>, the <emphasis>Cache Manager</emphasis> is the one
1319 component in this section that resides on client machines rather
1320 than on file server machines. It is a combination of a daemon
1321 process and a set of extensions or modifications in the client
1322 machine's kernel, usually implemented as a loadable kernel module,
1323 that enable communication with the server processes running on
1324 server machines. Its main duty is to translate file requests (made
1325 by application programs on client machines) into <emphasis>remote
1326 procedure calls (RPCs)</emphasis> to the File Server. (The Cache
1327 Manager first contacts the VL Server to find out which File Server
1328 currently houses the volume that contains a requested file, as
1329 mentioned in <link linkend="HDRWQ23">The Volume Location (VL)
1330 Server</link>). When the Cache Manager receives the requested file,
1331 it caches it before passing the data on to the application program.</para>
1334 <para>The Cache Manager also tracks the state of files in its cache
1335 compared to the version at the File Server by storing the callbacks
1336 sent by the File Server. When the File Server breaks a callback,
1337 indicating that a file or volume changed, the Cache Manager requests
1338 a copy of the new version before providing more data to application programs.</para>
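Users and administrators can observe and adjust the Cache Manager's behavior with the fs command suite on a client machine; for illustration (the file path is hypothetical):

```shell
# Report the current size and usage of the client's cache.
fs getcacheparms

# Discard the cached copy of a file so the next access fetches a
# fresh copy from the File Server.
fs flush /afs/example.com/usr/pat/notes.txt

# Check which file server machines the Cache Manager can reach.
fs checkservers
```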
1342 <sect2 id="HDRWQ20">
1343 <title>The Kerberos KDC</title>
1346 <primary>Kerberos KDC</primary>
1347 <secondary>description</secondary>
1350 <primary>Authentication Server</primary>
1351 <secondary>description</secondary>
1352 <seealso>Kerberos KDC</seealso>
1355 <primary>Active Directory</primary>
1356 <secondary>Kerberos KDC</secondary>
1359 <primary>MIT Kerberos</primary>
1360 <secondary>Kerberos KDC</secondary>
1363 <primary>Heimdal</primary>
1364 <secondary>Kerberos KDC</secondary>
1367 <para>The <emphasis>Kerberos KDC</emphasis> (Key Distribution
1368 Center) performs two main functions related to network security:
1371 <para>Verifying the identity of users as they log into the
1372 system by requiring that they provide a password or some other
1373 form of authentication credentials. The Kerberos KDC grants
1374 the user a ticket, which is converted into a token to prove to
1375 AFS server processes that the user has authenticated. For more
1376 on tokens, see <link linkend="HDRWQ76">Complex Mutual
1377 Authentication</link>.</para>
1381 <para>Providing the means through which server and client
1382 processes prove their identities to each other (mutually
1383 authenticate). This helps to create a secure environment in
1384 which to send cross-network messages.</para>
1389 <para>The Kerberos KDC is a required service, but does not come with
1390 OpenAFS. One Kerberos KDC may provide authentication services for
1391 multiple AFS cells. Each AFS cell must be associated with a Kerberos
1392 realm with one or more Kerberos KDCs supporting version 4 or 5 of
1393 the Kerberos protocol. Kerberos version 4 is not secure and is
1394 supported only for backwards compatibility; Kerberos 5 should be
1395 used for any new installation.</para>
1397 <para>A Kerberos KDC maintains a database in which it stores
1398 encryption keys for users and for services, including the AFS server
1399 encryption key. For users, these encryption keys are normally formed
1400 by converting a user password to a key, but Kerberos KDCs also
1401 support other authentication mechanisms. To learn more about the
1402 procedures AFS uses to verify user identity and during mutual
1403 authentication, see <link linkend="HDRWQ75">A More Detailed Look at
1404 Mutual Authentication</link>.</para>
1406 <para>Kerberos KDC software is included with some operating systems
1407 or may be acquired separately. MIT Kerberos, Heimdal, and Microsoft
1408 Active Directory are known to work with OpenAFS as a Kerberos
1409 server. Kerberos was originally developed by the Massachusetts
1410 Institute of Technology's Project Athena.</para>
1413 <para>The <emphasis>Authentication Server</emphasis>, or kaserver,
1414 was a Kerberos version 4 KDC. It is obsolete and should no longer
1415 be used. A third-party Kerberos version 5 KDC should be used
1416 instead. The Authentication Server is still provided with OpenAFS,
1417 but only for backward compatibility and legacy support for sites
1418 that have not yet migrated to a Kerberos version 5 KDC. All
1419 references to the <emphasis>Kerberos
1420 KDC</emphasis> in this guide refer to a Kerberos 5 server.</para>
1424 <primary>AFS</primary>
1426 <secondary></secondary>
1432 <primary>username</primary>
1434 <secondary>use by Kerberos</secondary>
1438 <primary>UNIX</primary>
1440 <secondary>UID</secondary>
1442 <tertiary>functional difference from AFS UID</tertiary>
1446 <primary>Kerberos</primary>
1448 <secondary>use of usernames</secondary>
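The authentication sequence described above looks like this from a client machine (the principal and realm names are hypothetical):

```shell
# Obtain a Kerberos ticket-granting ticket from the KDC.
kinit pat@EXAMPLE.COM

# Convert the Kerberos ticket into an AFS token for the local cell.
aklog

# List the tokens the Cache Manager now holds.
tokens
```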
1452 <sect2 id="HDRWQ27">
1453 <title>The Network Time Protocol Daemon</title>
1456 <primary>ntpd</primary>
1458 <secondary>description</secondary>
1461 <para>The <emphasis>Network Time Protocol Daemon (NTPD)</emphasis>
1462 is not an AFS server process, but plays an important role. It helps
1463 guarantee that all of the file server machines and client machines
1464 agree on the time. The NTPD on all file server machines learns the
1465 correct time from a parent NTPD source, which may be located inside
1466 or outside the cell.</para>
1468 <para>Keeping clocks synchronized is particularly important to the
1469 correct operation of AFS's distributed database technology, which
1470 coordinates the copies of the Backup, Protection, and Volume
1471 Location Databases; see <link linkend="HDRWQ52">Replicating the
1472 OpenAFS Administrative Databases</link>. Client machines may also
1473 refer to these clocks for the correct time; therefore, it is less
1474 confusing if all file server machines have the same time. For more
1475 technical detail about the NTPD, see <ulink
1476 url="http://www.ntp.org/">The NTP web site</ulink> or the
1477 documentation for your operating system.</para>
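An administrator can verify that a machine's clock is being synchronized with standard NTP tools; availability of these commands depends on the operating system:

```shell
# Query the local NTP daemon for its peers; an asterisk marks the
# currently selected time source.
ntpq -p

# On systemd-based Linux distributions, a quick overall check.
timedatectl status
```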
1479 <important><title>Clock Skew Impact</title> <para>Authentication to
1480 an OpenAFS cell can fail even with valid credentials when the clocks
1481 of the client machine, the Kerberos KDC, and the file server
1482 machines are not in sync.</para></important>