3 \page title AFS-3 Programmer's Reference: Volume Server/Volume Location
6 \author Edward R. Zayas
9 \date 29 August 1991 14:48 Copyright 1991 Transarc Corporation All Rights
13 \page chap1 Chapter 1: Overview
15 \section sec1-1 Section 1.1: Introduction
18 This document describes the architecture and interfaces for two of the
19 important agents of the AFS distributed file system, the Volume Server and the
20 Volume Location Server. The Volume Server allows operations affecting entire
21 AFS volumes to be executed, while the Volume Location Server provides a lookup
22 service for volumes, identifying the server or set of servers on which volume instances reside.
25 \section sec1-2 Section 1.2: Volumes
27 \subsection sec1-2-1 Section 1.2.1: Definition
30 The underlying concept manipulated by the two AFS servers examined by this
31 document is the volume. Volumes are the basic mechanism for organizing the data
32 stored within the file system. They provide the foundation for addressing,
33 storing, and accessing file data, along with serving as the administrative
34 units for replication, backup, quotas, and data motion between File Servers.
36 Specifically, a volume is a container for a hierarchy of files, a connected
37 file system subtree. In this respect, a volume is much like a traditional unix
38 file system partition. Like a partition, a volume can be mounted in the sense
39 that the root directory of the volume can be named within another volume at an
40 AFS mount point. The entire file system hierarchy is built up in this manner,
41 using mount points to glue together the individual subtrees resident within
42 each volume. The root of this hierarchy is then mounted by each AFS client
43 machine using a conventional unix mount point within the workstation's local
44 file system. By convention, this entryway into the AFS domain is mounted on the
45 /afs local directory. From a user's point of view, there is only a single mount
46 point to the system; the internal mount points are generally transparent.
48 \subsection sec1-2-2 Section 1.2.2: Volume Naming
51 There are two methods by which volumes may be named. The first is via a
52 human-readable string name, and the second is via a 32-bit numerical
53 identifier. Volume identifiers, whether string or numerical, must be unique
54 within any given cell. AFS mount points may use either representation to
55 specify the volume whose root directory is to be accessed at the given
56 position. Internally, however, AFS agents use the numerical form of
57 identification exclusively, having to translate names to the corresponding numerical identifiers.
60 \subsection sec1-2-3 Section 1.2.3: Volume Types
63 There are three basic volume types: read-write, read-only, and backup volumes.
64 \li Read-write: The data in this volume may be both read and written by those
65 clients authorized to do so.
66 \li Read-only: It is possible to create one or more read-only snapshots of
67 read-write volumes. The read-write volume serving as the source image is
68 referred to as the parent volume. Each read-only clone, or child, instance must
69 reside on a different unix disk partition than the other clones. Every clone
70 instance generated from the same parent read-write volume has the identical
71 volume name and numerical volume ID. This is the reason why no two clones may
72 appear on the same disk partition, as there would be no way to differentiate
73 the two. AFS clients are allowed to read files and directories from read-only
74 volumes, but cannot overwrite them individually. However, it is possible to
75 make changes to the read-write parent and then release the contents of the
76 entire volume to all the read-only replicas. The release operation fails if it
77 does not reach the appropriate replication sites.
78 \li Backup: A backup volume is a special instance of a read-only volume. While
79 it is also a read-only snapshot of a given read-write volume, only one instance
80 is allowed to exist at any one time. Also, the backup volume must reside on the
81 same partition as the parent read-write volume from which it was created. It is
82 from a backup volume that the AFS backup system writes file system data to
83 tape. In addition, backup volumes may be mounted into the file tree just like
84 the other volume types. In fact, by convention, the backup volume for each
85 user's home directory subtree is typically mounted as OldFiles in that
86 directory. If a user accidentally deletes a file that resides in the backup
87 snapshot, the user may simply copy it out of the backup directly without the
88 assistance of a system administrator, or any kind of tape restore operation.
89 Backup volumes are implemented in a copy-on-write fashion. Thus, backup volumes
90 may be envisioned as consisting of a set of pointers to the true data objects
91 in the base read-write volume when they are first created. When a file is
92 overwritten in the read-write version for the first time after the backup
93 volume was created, the original data is physically written to the backup
94 volume, breaking the copy-on-write link. With this mechanism, backup volumes
95 maintain the image of the read-write volume at the time the snapshot was taken
96 using the minimum amount of additional disk space.
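The copy-on-write behavior described for backup volumes can be sketched as follows. This is a minimal illustration under stated assumptions, not the Volume Server's implementation: the types toy_volume and toy_backup, the fixed block array, and every function name are hypothetical and were invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_NBLOCKS 4   /* hypothetical: blocks per toy volume */
#define TOY_BLOCKSZ 16  /* hypothetical: bytes per block */

/* A toy read-write volume: a fixed array of data blocks. */
struct toy_volume {
    char block[TOY_NBLOCKS][TOY_BLOCKSZ];
};

/* A toy backup snapshot: initially it holds no data of its own. */
struct toy_backup {
    const struct toy_volume *parent;       /* shared data lives here */
    int copied[TOY_NBLOCKS];               /* 1 once the copy-on-write link is broken */
    char saved[TOY_NBLOCKS][TOY_BLOCKSZ];  /* private copies of overwritten data */
};

/* Take a snapshot: only a reference to the parent is recorded, no data is copied. */
void toy_backup_create(struct toy_backup *b, const struct toy_volume *v)
{
    memset(b, 0, sizeof(*b));
    b->parent = v;
}

/* Overwrite block i of the read-write volume. The first overwrite after the
 * snapshot saves the old contents into the backup (b may be NULL). */
void toy_volume_write(struct toy_volume *v, struct toy_backup *b, int i, const char *data)
{
    assert(i >= 0 && i < TOY_NBLOCKS);
    if (b != NULL && !b->copied[i]) {
        memcpy(b->saved[i], v->block[i], TOY_BLOCKSZ);
        b->copied[i] = 1;
    }
    strncpy(v->block[i], data, TOY_BLOCKSZ - 1);
    v->block[i][TOY_BLOCKSZ - 1] = '\0';
}

/* Read block i as it looked when the snapshot was taken. */
const char *toy_backup_read(const struct toy_backup *b, int i)
{
    assert(i >= 0 && i < TOY_NBLOCKS);
    return b->copied[i] ? b->saved[i] : b->parent->block[i];
}
```

Only blocks that are overwritten after the snapshot consume extra space in the backup, which is the property the text attributes to backup volumes.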
98 \section sec1-3 Section 1.3: Scope
101 This paper is a member of a documentation suite providing specifications of the
102 operation and interfaces offered by the various AFS servers and agents. The
103 scope of this work is to provide readers with a sufficiently detailed
104 description of the Volume Location Server and the Volume Server so that they
105 may construct client applications which call their RPC interface routines.
107 \section sec1-4 Section 1.4: Document Layout
110 After this introductory portion of the document, Chapters 2 and 3 examine the
111 architecture and RPC interface of the Volume Location Server and its replicated
112 database. Similarly, Chapters 4 and 5 describe the architecture and RPC
113 interface of the Volume Server.
115 \page chap2 Chapter 2: Volume Location Server Architecture
117 \section sec2-1 Section 2.1: Introduction
120 The Volume Location Server allows AFS agents to query the location and basic
121 status of volumes resident within the given cell. Volume Location Server
122 functions may be invoked directly from authorized users via the vos utility.
124 This chapter briefly discusses various aspects of the Volume Location Server's
125 architecture. First, the need for volume location is examined, and the specific
126 parties that call the Volume Location Server interface routines are identified.
127 Then, the database maintained to provide volume location service, the Volume
128 Location Database (VLDB), is examined. Finally, the vlserver process which
129 implements the Volume Location Server is considered.
131 As with all AFS servers, the Volume Location Server uses the Rx remote
132 procedure call package for communication with its clients.
134 \section sec2-2 Section 2.2: The Need For Volume Location
137 The Cache Manager agent is the primary consumer of AFS volume location service,
138 on which it is critically dependent for its own operation. The Cache Manager
139 needs to map volume names or numerical identifiers to the set of File Servers
140 on which its instances reside in order to satisfy the file system requests it
141 is processing on behalf of its clients. Each time a Cache Manager encounters a
142 mount point for which it does not have location information cached, it must
143 acquire this information before the pathname resolution may be successfully
144 completed. Once the File Server set is known for a particular volume, the Cache
145 Manager may then select the proper site among them (e.g. choosing the single
146 home for a read-write volume, or randomly selecting a site from a read-only
147 volume's replication set) and begin addressing its file manipulation operations
148 to that specific server.
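The site-selection step described above might look like the following sketch. The structure toy_location, the enum values, and the use of rand() are assumptions made for illustration; the Cache Manager's actual data structures and selection policy are not shown in this document.

```c
#include <assert.h>
#include <stdlib.h>

enum toy_voltype { TOY_RWVOL, TOY_ROVOL };  /* hypothetical type tags */

/* Hypothetical cached location record for one volume. */
struct toy_location {
    enum toy_voltype type;
    int nsites;             /* number of valid entries in site[] */
    unsigned long site[8];  /* File Server addresses */
};

/* Pick the server to contact: the single home of a read-write volume,
 * or a randomly chosen member of a read-only replication set. */
unsigned long toy_select_site(const struct toy_location *loc)
{
    assert(loc->nsites > 0 && loc->nsites <= 8);
    if (loc->type == TOY_RWVOL)
        return loc->site[0];
    return loc->site[rand() % loc->nsites];
}
```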
150 While the Cache Manager consults the volume location service, it is not capable
151 of changing the location of volumes and hence modifying the information
152 contained therein. This capability to perform acts which change volume location
153 is concentrated within the Volume Server. The Volume Server process running on
154 each server machine manages all volume operations affecting that platform,
155 including creations, deletions, and movements between servers. It must update
156 the volume location database every time it performs one of these actions.
158 None of the other AFS system agents has a need to access the volume location
159 database for its site. Surprisingly, this also applies to the File Server
160 process. It is only aware of the specific set of volumes that reside on the set
161 of physical disks directly attached to the machine on which it executes. It
162 has no knowledge of the universe of volumes resident on other servers, either
163 within its own cell or in foreign cells.
165 \section sec2-3 Section 2.3: The VLDB
168 The Volume Location Database (VLDB) is used to allow AFS application programs
169 to discover the location of any volume within their cell, along with select
170 information about the nature and state of that volume. It is organized in a
171 very straightforward fashion, and uses the ubik [4] [5] facility to provide
172 replication across multiple server sites.
174 \subsection sec2-3-1 Section 2.3.1: Layout
177 The VLDB itself is a very simple structure, and synchronized copies may be
178 maintained at two or more sites. Basically, each copy consists of header
179 information, followed by a linear (yet unbounded) array of entries. There are
180 several associated hash tables used to perform lookups into the VLDB. The first
181 hash table looks up volume location information based on the volume's name.
182 There are three other hash tables used for lookup, based on volume ID/type
183 pairs, one for each possible volume type.
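The layout just described (an entry array threaded by one name hash table and three per-type ID hash tables) can be sketched in memory as follows. The field names VolnameHash, VolidHash, nextNameHash and nextIdHash follow the structures documented in Chapter 3; everything prefixed toy_ or TOY_, the table sizes, the use of array indices instead of file offsets, and the hash functions are assumptions made for this example only.

```c
#include <assert.h>
#include <string.h>

#define TOY_HASHSIZE   31   /* hypothetical small prime; not the real HASHSIZE */
#define TOY_MAXTYPES   3    /* read-write, read-only, backup */
#define TOY_MAXENTRIES 16   /* hypothetical capacity of the toy entry array */
#define TOY_NIL        (-1)

struct toy_entry {
    char name[65];
    unsigned long volumeId[TOY_MAXTYPES];
    int nextNameHash;              /* next entry on the same name chain */
    int nextIdHash[TOY_MAXTYPES];  /* next entry on each per-type ID chain */
};

struct toy_vldb {
    int VolnameHash[TOY_HASHSIZE];
    int VolidHash[TOY_MAXTYPES][TOY_HASHSIZE];
    struct toy_entry entry[TOY_MAXENTRIES];
    int nentries;
};

/* Hypothetical string hash; not the function vlserver uses. */
static unsigned toy_name_hash(const char *s)
{
    unsigned h = 0;
    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return h % TOY_HASHSIZE;
}

void toy_vldb_init(struct toy_vldb *db)
{
    int t, i;
    db->nentries = 0;
    for (i = 0; i < TOY_HASHSIZE; i++) {
        db->VolnameHash[i] = TOY_NIL;
        for (t = 0; t < TOY_MAXTYPES; t++)
            db->VolidHash[t][i] = TOY_NIL;
    }
}

/* Append an entry and thread it onto the name chain and all three ID chains. */
int toy_vldb_add(struct toy_vldb *db, const char *name,
                 const unsigned long ids[TOY_MAXTYPES])
{
    int t, idx = db->nentries;
    unsigned b;
    struct toy_entry *e;
    assert(idx < TOY_MAXENTRIES && strlen(name) < sizeof(db->entry[0].name));
    e = &db->entry[idx];
    strcpy(e->name, name);
    b = toy_name_hash(name);
    e->nextNameHash = db->VolnameHash[b];
    db->VolnameHash[b] = idx;
    for (t = 0; t < TOY_MAXTYPES; t++) {
        e->volumeId[t] = ids[t];
        b = (unsigned)(ids[t] % TOY_HASHSIZE);
        e->nextIdHash[t] = db->VolidHash[t][b];
        db->VolidHash[t][b] = idx;
    }
    db->nentries++;
    return idx;
}

/* Both lookups return the entry index, or TOY_NIL if nothing matches. */
int toy_lookup_by_name(const struct toy_vldb *db, const char *name)
{
    int i = db->VolnameHash[toy_name_hash(name)];
    while (i != TOY_NIL && strcmp(db->entry[i].name, name) != 0)
        i = db->entry[i].nextNameHash;
    return i;
}

int toy_lookup_by_id(const struct toy_vldb *db, int type, unsigned long id)
{
    int i = db->VolidHash[type][id % TOY_HASHSIZE];
    while (i != TOY_NIL && db->entry[i].volumeId[type] != id)
        i = db->entry[i].nextIdHash[type];
    return i;
}
```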
185 The VLDB for a large site may grow to contain tens of thousands of entries, so
186 some attempts were made to make each entry as small as possible. For example,
187 server addresses within VLDB entries are represented as single-byte indices
188 into a table containing the full longword IP addresses.
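A minimal sketch of this space-saving indirection follows. The field name IpMappedAddr is taken from struct vlheader (Section 3.3.4); the bound TOY_MAXSERVERID, the sentinel TOY_BADSERVERID, the convention that 0 marks an unused slot, and both function names are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAXSERVERID 30   /* hypothetical table bound */
#define TOY_BADSERVERID 255  /* hypothetical "no such server" index */

struct toy_header {
    uint32_t IpMappedAddr[TOY_MAXSERVERID + 1];  /* 0 marks an unused slot */
};

/* Return the one-byte index for addr, registering it in the first free
 * slot if it is not yet known. Returns TOY_BADSERVERID when the table is full. */
uint8_t toy_addr_to_index(struct toy_header *h, uint32_t addr)
{
    int i, free_slot = -1;
    assert(addr != 0);
    for (i = 0; i <= TOY_MAXSERVERID; i++) {
        if (h->IpMappedAddr[i] == addr)
            return (uint8_t)i;
        if (h->IpMappedAddr[i] == 0 && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return TOY_BADSERVERID;
    h->IpMappedAddr[free_slot] = addr;
    return (uint8_t)free_slot;
}

/* Expand a stored index back into the full address (0 if out of range). */
uint32_t toy_index_to_addr(const struct toy_header *h, uint8_t idx)
{
    return idx <= TOY_MAXSERVERID ? h->IpMappedAddr[idx] : 0;
}
```

With this arrangement each replication site costs one byte per entry instead of four, at the price of one table lookup when the address is needed.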
190 A free list is kept for deleted VLDB entries. The VLDB will not grow unless all
191 the entries on the free list have been exhausted, keeping it as compact as possible.
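The free-list policy can be illustrated with the sketch below. The names freePtr and eofPtr echo fields of struct vital_vlheader (Section 3.3.3), but here they are array indices rather than file offsets; the slot structure, the capacity, and the function names are hypothetical.

```c
#include <assert.h>

#define TOY_CAPACITY 8    /* hypothetical upper bound for the toy array */
#define TOY_NIL      (-1)

struct toy_slot {
    int in_use;
    int nextFree;  /* link to the next free slot, valid only when not in use */
};

struct toy_db {
    struct toy_slot slot[TOY_CAPACITY];
    int freePtr;  /* head of the free list, or TOY_NIL */
    int eofPtr;   /* first slot that has never been used */
};

void toy_db_init(struct toy_db *db)
{
    db->freePtr = TOY_NIL;
    db->eofPtr = 0;
}

/* Reuse a freed slot if one exists; grow the database only otherwise. */
int toy_alloc(struct toy_db *db)
{
    int i;
    if (db->freePtr != TOY_NIL) {
        i = db->freePtr;
        db->freePtr = db->slot[i].nextFree;
    } else {
        if (db->eofPtr >= TOY_CAPACITY)
            return TOY_NIL;
        i = db->eofPtr++;
    }
    db->slot[i].in_use = 1;
    db->slot[i].nextFree = TOY_NIL;
    return i;
}

/* Return a slot to the head of the free list. */
void toy_free(struct toy_db *db, int i)
{
    assert(i >= 0 && i < db->eofPtr && db->slot[i].in_use);
    db->slot[i].in_use = 0;
    db->slot[i].nextFree = db->freePtr;
    db->freePtr = i;
}
```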
194 \subsection sec2-3-2 Section 2.3.2: Database Replication
197 The VLDB, along with other important AFS databases, may be replicated to
198 multiple sites to improve its availability. The ubik replication package is
199 used to implement this functionality for the VLDB. A full description of ubik
200 and of the quorum completion algorithm it implements may be found in [4] and
201 [5]. The basic abstraction provided by ubik is that of a disk file replicated
202 to multiple server locations. One machine is considered to be the
203 synchronization site, handling all write operations on the database file. Read
204 operations may be directed to any of the active members of the quorum, namely a
205 subset of the replication sites large enough to ensure integrity across such
206 failures as individual server crashes and network partitions. All of the quorum
207 members participate in regular elections to determine the current
208 synchronization site. The ubik algorithms allow server machines to enter and
209 exit the quorum in an orderly and consistent fashion. All operations to one of
210 these replicated "abstract files" are performed as part of a transaction. If
211 all the related operations performed under a transaction are successful, then
212 the transaction is committed, and the changes are made permanent. Otherwise,
213 the transaction is aborted, and all of the operations for that transaction are undone.
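The all-or-nothing transaction semantics can be illustrated with a single-site sketch. This is not the ubik API and it models neither replication nor quorum election; the shadow-copy technique and all toy_ names are assumptions used purely to show commit versus abort behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_FILESZ 32  /* hypothetical size of the toy "abstract file" */

struct toy_file {
    char data[TOY_FILESZ];
};

/* A transaction works on a private shadow copy of the file. */
struct toy_trans {
    struct toy_file *target;
    struct toy_file shadow;
    int failed;  /* set once any operation in the transaction fails */
};

void toy_begin(struct toy_trans *t, struct toy_file *f)
{
    assert(t != NULL && f != NULL);
    t->target = f;
    t->shadow = *f;
    t->failed = 0;
}

/* Write into the shadow copy; an out-of-range write marks the transaction failed. */
void toy_write(struct toy_trans *t, int off, const char *buf, int len)
{
    if (off < 0 || len < 0 || off + len > TOY_FILESZ) {
        t->failed = 1;
        return;
    }
    memcpy(t->shadow.data + off, buf, (size_t)len);
}

/* Commit if every operation succeeded (returns 1); otherwise abort,
 * leaving the target file untouched (returns 0). */
int toy_end(struct toy_trans *t)
{
    if (t->failed)
        return 0;
    *t->target = t->shadow;
    return 1;
}
```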
216 \section sec2-4 Section 2.4: The vlserver Process
219 The user-space vlserver process is in charge of providing volume location
220 service for AFS clients. This program maintains the VLDB replica at its
221 particular server, and cooperates with all other vlserver processes running in
222 the given cell to propagate updates to the database. It implements the RPC
223 interface defined in the vldbint.xg definition file for the rxgen RPC stub
224 generator program. As part of its startup sequence, it must discover the VLDB
225 version it has on its local disk, move to join the quorum of replication sites
226 for the VLDB, and get the latest version if the one it came up with was out of
227 date. Eventually, it will synchronize with the other VLDB replication sites,
228 and it will begin accepting calls.
230 The vlserver program uses at most three Rx worker threads to listen for
231 incoming Volume Location Server calls. It has a single, optional command line
232 argument. If the string "-noauth" appears when the program is invoked, then
233 vlserver will run in an unauthenticated mode where any individual is considered
234 authorized to perform any VLDB operation. This mode is necessary when first
235 bootstrapping an AFS installation.
237 \page chap3 Chapter 3: Volume Location Server Interface
239 \section sec3-1 Section 3.1: Introduction
242 This chapter documents the API for the Volume Location Server facility, as
243 defined by the vldbint.xg Rxgen interface file and the vldbint.h include file.
244 Descriptions of all the constants, structures, macros, and interface functions
245 available to the application programmer appear here.
247 It is expected that Volume Location Server client programs run in user space,
248 as does the associated vos volume utility. However, the kernel-resident Cache
249 Manager agent also needs to call a subset of the Volume Location Server's RPC
250 interface routines. Thus, a second Volume Location Server interface is
251 available, built exclusively to satisfy the Cache Manager's limited needs. This
252 subset interface is defined by the afsvlint.xg Rxgen interface file, and is
253 examined in the final section of this chapter.
255 \section sec3-2 Section 3.2: Constants
258 This section covers the basic constant definitions of interest to the Volume
259 Location Server application programmer. These definitions appear in the
260 vldbint.h file, automatically generated from the vldbint.xg Rxgen interface
261 file, and in vlserver.h.
263 Each subsection is devoted to describing the constants falling into the
264 following categories:
265 \li Configuration and boundary quantities
266 \li Update entry bits
267 \li List-by-attribute bits
268 \li Volume type indices
269 \li States for struct vlentry
270 \li States for struct vldbentry
271 \li ReleaseType argument values
272 \li Miscellaneous items
274 \subsection sec3-2-1 Section 3.2.1: Configuration and Boundary Quantities
278 These constants define some basic system values, including configuration
286 Maximum size of various character strings, including volume name fields in
287 structures and host names.
294 Maximum number of replication sites for a volume.
301 Maximum number of volume types.
308 VLDB database version number
315 Size of internal Volume Location Server volume name and volume ID hash tables.
316 This must always be a prime number.
323 Specifies a null pointer value.
330 Value used when allocating memory internally for VLDB entry records.
337 Illegal Volume Location Server host ID.
344 Maximum number of servers appearing in the VLDB.
351 First unused flag value in such fields as serverFlags in struct vldbentry and
352 RepsitesNewFlags in struct VldbUpdateEntry.
359 Maximum number of AFS disk partitions for any one server.
366 Maximum interval that the current high-watermark value for a volume ID can be
367 increased in one operation.
374 Maximum number of seconds that any VLDB entry can remain locked.
381 Maximum size of the name field within a struct.
383 \subsection sec3-2-2 Section 3.2.2: Update Entry Bits
386 These constants define bit values for the Mask field in the struct
387 VldbUpdateEntry. Specifically, setting these bits is equivalent to declaring
388 that the corresponding field within an object of type struct VldbUpdateEntry
389 has been set. For example, setting the VLUPDATE_VOLUMENAME flag in Mask
390 indicates that the name field contains a valid value.
397 If set, indicates that the name field is valid.
404 If set, indicates that the volumeType field is valid.
411 If set, indicates that the flags field is valid.
418 If set, indicates that the ReadOnlyId field is valid.
425 If set, indicates that the BackupId field is valid.
432 If set, indicates that the nModifiedRepsites field is valid.
439 If set, indicates that the cloneId field is valid.
446 Is the replica being deleted?
453 Is the replica being added?
456 VLUPDATE_REPS_MODSERV
460 Is the server part of the replica location correct?
463 VLUPDATE_REPS_MODPART
467 Is the partition part of the replica location correct?
470 VLUPDATE_REPS_MODFLAG
474 Various modification flag values.
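The way Mask bits select which fields of an update are honored can be sketched as follows. The bit values below are placeholders chosen for this example (the real values are defined in vldbint.h), and toy_update is a reduced stand-in for struct VldbUpdateEntry; toy_record and toy_apply_update() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder bit values for this sketch only; consult vldbint.h for the real ones. */
#define TOY_VLUPDATE_VOLUMENAME 0x1UL
#define TOY_VLUPDATE_READONLYID 0x2UL
#define TOY_VLUPDATE_BACKUPID   0x4UL

/* Reduced stand-in for struct VldbUpdateEntry. */
struct toy_update {
    unsigned long Mask;
    char name[65];
    unsigned long ReadOnlyId;
    unsigned long BackupId;
};

/* Hypothetical stored record being updated. */
struct toy_record {
    char name[65];
    unsigned long ReadOnlyId;
    unsigned long BackupId;
};

/* Copy only the fields whose Mask bit is set; all other fields are ignored. */
void toy_apply_update(struct toy_record *r, const struct toy_update *u)
{
    assert(r != NULL && u != NULL);
    if (u->Mask & TOY_VLUPDATE_VOLUMENAME) {
        strncpy(r->name, u->name, sizeof(r->name) - 1);
        r->name[sizeof(r->name) - 1] = '\0';
    }
    if (u->Mask & TOY_VLUPDATE_READONLYID)
        r->ReadOnlyId = u->ReadOnlyId;
    if (u->Mask & TOY_VLUPDATE_BACKUPID)
        r->BackupId = u->BackupId;
}
```

Because each field has its own bit, a caller can change several fields in one call by ORing the corresponding bits together.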
476 \subsection sec3-2-3 Section 3.2.3: List-By-Attribute Bits
479 These constants define bit values for the Mask field in the struct
480 VldbListByAttributes, selecting which fields are to be used in a match. Specifically, setting these bits
481 is equivalent to declaring that the corresponding field within an object of
482 type struct VldbListByAttributes is set. For example, setting the VLLIST_SERVER
483 flag in Mask indicates that the server field contains a valid value.
490 If set, indicates that the server field is valid.
497 If set, indicates that the partition field is valid.
504 If set, indicates that the volumetype field is valid.
511 If set, indicates that the volumeid field is valid.
518 If set, indicates that the flag field is valid.
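Attribute matching driven by such a mask might be sketched as shown below. The bit values are placeholders, toy_attrs is a reduced stand-in for struct VldbListByAttributes, and toy_instance and toy_matches() are hypothetical; the server's actual matching logic is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder bit values for this sketch only. */
#define TOY_VLLIST_SERVER    0x1UL
#define TOY_VLLIST_PARTITION 0x2UL
#define TOY_VLLIST_VOLUMEID  0x4UL

/* Reduced stand-in for struct VldbListByAttributes. */
struct toy_attrs {
    unsigned long Mask;
    long server;
    long partition;
    long volumeid;
};

/* Hypothetical description of one volume instance. */
struct toy_instance {
    long server;
    long partition;
    long volumeid;
};

/* Return 1 if v satisfies every attribute selected in a->Mask.
 * Fields whose bit is clear are not compared at all. */
int toy_matches(const struct toy_instance *v, const struct toy_attrs *a)
{
    assert(v != NULL && a != NULL);
    if ((a->Mask & TOY_VLLIST_SERVER) && v->server != a->server)
        return 0;
    if ((a->Mask & TOY_VLLIST_PARTITION) && v->partition != a->partition)
        return 0;
    if ((a->Mask & TOY_VLLIST_VOLUMEID) && v->volumeid != a->volumeid)
        return 0;
    return 1;
}
```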
520 \subsection sec3-2-4 Section 3.2.4: Volume Type Indices
523 These constants specify the order of entries in the volumeid array in an object
524 of type struct vldbentry. They also identify the three different types of
548 \subsection sec3-2-5 Section 3.2.5: States for struct vlentry
551 The following constants appear in the flags field in objects of type struct
552 vlentry. The first three values listed specify the state of the entry, while
553 all the rest stamp the entry with the type of an ongoing volume operation, such
554 as a move, clone, backup, deletion, and dump. These volume operations are the
555 legal values to provide to the voloper parameter of the VL_SetLock() interface routine.
558 For convenience, the constant VLOP_ALLOPERS is defined as the inclusive OR of
559 the above values from VLOP_MOVE through VLOP_DUMP.
566 Entry is in the free list.
573 Entry is soft-deleted.
580 Advisory lock held on the entry.
587 The associated volume is being moved between servers.
594 The associated volume is being cloned to its replication sites.
601 A backup volume is being created for the associated volume.
608 The associated volume is being deleted.
615 A dump is being taken of the associated volume.
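The relationship between the state bits, the operation bits and the combined all-operations mask can be sketched as follows. All numeric values are placeholders invented for this example, and the helper functions are hypothetical; they only illustrate how a mask built as an inclusive OR is typically used to validate and replace an operation stamp.

```c
#include <assert.h>

/* Placeholder values for this sketch only; see vlserver.h and vldbint.h for the real ones. */
#define TOY_VLFREE      0x001L
#define TOY_VLDELETED   0x002L
#define TOY_VLLOCKED    0x004L
#define TOY_VLOP_MOVE   0x010L
#define TOY_VLOP_CLONE  0x020L
#define TOY_VLOP_BACKUP 0x040L
#define TOY_VLOP_DELETE 0x080L
#define TOY_VLOP_DUMP   0x100L

/* Inclusive OR of every operation bit, in the manner of VLOP_ALLOPERS. */
#define TOY_VLOP_ALLOPERS \
    (TOY_VLOP_MOVE | TOY_VLOP_CLONE | TOY_VLOP_BACKUP | TOY_VLOP_DELETE | TOY_VLOP_DUMP)

/* Return 1 if voloper names exactly one legal volume operation. */
int toy_valid_voloper(long voloper)
{
    return voloper != 0
        && (voloper & ~TOY_VLOP_ALLOPERS) == 0
        && (voloper & (voloper - 1)) == 0;
}

/* Stamp an entry's flags with a new operation, replacing any previous one
 * while leaving the state bits untouched. */
long toy_set_operation(long flags, long voloper)
{
    assert(toy_valid_voloper(voloper));
    return (flags & ~TOY_VLOP_ALLOPERS) | voloper;
}

/* Extract the operation currently stamped on the entry (0 if none). */
long toy_current_operation(long flags)
{
    return flags & TOY_VLOP_ALLOPERS;
}
```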
617 \subsection sec3-2-6 Section 3.2.6: States for struct vldbentry
620 Of the following constants, the first three appear in the flags field within an
621 object of type struct vldbentry, advising of the existence of the basic volume
622 types for the given volume, and hence the validity of the entries in the
623 volumeId array field. The rest of the values provided in this table appear in
624 the serverFlags array field, and apply to the instances of the volume appearing
625 in the various replication sites.
627 This structure appears in numerous Volume Location Server interface calls,
628 namely VL_CreateEntry(), VL_GetEntryByID(), VL_GetEntryByName(),
629 VL_ReplaceEntry() and VL_ListEntry().
636 The read-write volume ID is valid.
643 The read-only volume ID is valid.
650 The backup volume ID is valid.
657 Not used; originally intended to mark an entry as belonging to a
658 partially-created volume instance.
665 A read-only version of the volume appears at this server.
672 A read-write version of the volume appears at this server.
679 A backup version of the volume appears at this server.
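A client would normally consult the flags field before trusting a slot of the volumeId array, and scan serverFlags to find sites of a given kind. The sketch below shows that pattern on a reduced stand-in for struct vldbentry. The flag names and values, the type indices, and both helper functions are placeholders assumed for the example.

```c
#include <assert.h>

#define TOY_MAXNSERVERS 8
#define TOY_MAXTYPES    3

/* Placeholder volume type indices for this sketch. */
enum { TOY_RWVOL = 0, TOY_ROVOL = 1, TOY_BACKVOL = 2 };

/* Placeholder flag bits for this sketch only. */
#define TOY_F_RWEXISTS   0x1L
#define TOY_F_ROEXISTS   0x2L
#define TOY_F_BACKEXISTS 0x4L
#define TOY_SF_ROVOL     0x1L

/* Reduced stand-in for struct vldbentry. */
struct toy_vldbentry {
    long nServers;
    long serverFlags[TOY_MAXNSERVERS];
    unsigned long volumeId[TOY_MAXTYPES];
    long flags;
};

/* Store the ID for the given type in *id and return 1, but only if the
 * matching "exists" bit says that slot of volumeId[] is valid. */
int toy_get_volume_id(const struct toy_vldbentry *e, int type, unsigned long *id)
{
    static const long exists_bit[TOY_MAXTYPES] = {
        TOY_F_RWEXISTS, TOY_F_ROEXISTS, TOY_F_BACKEXISTS
    };
    assert(type >= 0 && type < TOY_MAXTYPES);
    if ((e->flags & exists_bit[type]) == 0)
        return 0;
    *id = e->volumeId[type];
    return 1;
}

/* Count the replication sites that hold a read-only instance. */
int toy_count_readonly_sites(const struct toy_vldbentry *e)
{
    long i;
    int n = 0;
    for (i = 0; i < e->nServers && i < TOY_MAXNSERVERS; i++)
        if (e->serverFlags[i] & TOY_SF_ROVOL)
            n++;
    return n;
}
```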
681 \subsection sec3-2-7 Section 3.2.7: ReleaseType Argument Values
684 The following values are used in the ReleaseType argument to various Volume
685 Location Server interface routines, namely VL_ReplaceEntry(), VL_UpdateEntry()
686 and VL_ReleaseLock().
693 Is the LockTimestamp field valid?
700 Are any of the bits valid in the flags field?
707 Is the LockAfsId field valid?
709 \subsection sec3-2-8 Section 3.2.8: Miscellaneous
712 Miscellaneous values.
718 Has a replication site gotten a new release of a volume?
720 A synonym for this constant is VLSF_NEWREPSITE.
722 \section sec3-3 Section 3.3: Structures and Typedefs
725 This section describes the major exported Volume Location Server data
726 structures of interest to application programmers, along with the typedefs
727 based upon those structures.
729 \subsection sec3-3-1 Section 3.3.1: struct vldbentry
732 This structure represents an entry in the VLDB as made visible to Volume
733 Location Server clients. It appears in numerous Volume Location Server
734 interface calls, namely VL_CreateEntry(), VL_GetEntryByID(),
735 VL_GetEntryByName(), VL_ReplaceEntry() and VL_ListEntry().
737 \li char name[] - The string name for the volume, with a maximum length of
738 MAXNAMELEN (65) characters, including the trailing null.
739 \li long volumeType - The volume type, one of RWVOL, ROVOL, or BACKVOL.
740 \li long nServers - The number of servers that have an instance of this volume.
741 \li long serverNumber[] - An array of indices into the table of servers,
742 identifying the sites holding an instance of this volume. There are at most
743 MAXNSERVERS (8) of these server sites allowed by the Volume Location Server.
744 \li long serverPartition[] - An array of partition identifiers, corresponding
745 directly to the serverNumber array, specifying the partition on which each of
746 those volume instances is located. As with the serverNumber array,
747 serverPartition has up to MAXNSERVERS (8) entries.
748 \li long serverFlags[] - This array holds one flag value for each of the
749 servers in the previous arrays. Again, there are MAXNSERVERS (8) slots in this
751 \li u_long volumeId[] - An array of volume IDs, one for each volume type. There
752 are MAXTYPES slots in this array.
753 \li long cloneId - This field is used during a cloning operation.
754 \li long flags - Flags concerning the status of the fields within this
755 structure; see Section 3.2.6 for the bit values that apply.
757 \subsection sec3-3-2 Section 3.3.2: struct vlentry
760 This structure is used internally by the Volume Location Server to fully
761 represent a VLDB entry. The client-visible struct vldbentry represents merely a
762 subset of the information contained herein.
764 \li u_long volumeId[] - An array of volume IDs, one for each of the MAXTYPES of
766 \li long flags - Flags concerning the status of the fields within this
767 structure; see Section 3.2.5 for the bit values that apply.
768 \li long LockAfsId - The individual who locked the entry. This feature has not
769 yet been implemented.
770 \li long LockTimestamp - Time stamp on the entry lock.
771 \li long cloneId - This field is used during a cloning operation.
772 \li long AssociatedChain - Pointer to the linked list of associated VLDB
774 \li long nextIdHash[] - Array of MAXTYPES next pointers for the ID hash
775 tables, one for each related volume ID.
776 \li long nextNameHash - Next pointer for the volume name hash table.
777 \li long spares1[] - Two longword spare fields.
778 \li char name[] - The volume's string name, with a maximum of MAXNAMELEN (65)
779 characters, including the trailing null.
780 \li u_char volumeType - The volume's type, one of RWVOL, ROVOL, or BACKVOL.
781 \li u_char serverNumber[] - An array of indices into the table of servers,
782 identifying the sites holding an instance of this volume. There are at most
783 MAXNSERVERS (8) of these server sites allowed by the Volume Location Server.
784 \li u_char serverPartition[] - An array of partition identifiers, corresponding
785 directly to the serverNumber array, specifying the partition on which each of
786 those volume instances is located. As with the serverNumber array,
787 serverPartition has up to MAXNSERVERS (8) entries.
788 \li u_char serverFlags[] - This array holds one flag value for each of the
789 servers in the previous arrays. Again, there are MAXNSERVERS (8) slots in this
791 \li u_char RefCount - Only valid for read-write volumes, this field serves as a
792 reference count, basically the number of dependent children volumes.
793 \li char spares2[] - This field is used for 32-bit alignment.
795 \subsection sec3-3-3 Section 3.3.3: struct vital_vlheader
798 This structure defines the leading section of the VLDB header, of type struct
799 vlheader. It contains frequently-used global variables and general statistics
802 \li long vldbversion - The VLDB version number. This field must appear first in
804 \li long headersize - The total number of bytes in the header.
805 \li long freePtr - Pointer to the first free entry in the free list, if any.
806 \li long eofPtr - Pointer to the first free byte in the header file.
807 \li long allocs - The total number of calls to the internal AllocBlock()
808 function directed at this file.
809 \li long frees - The total number of calls to the internal FreeBlock() function
810 directed at this file.
811 \li long MaxVolumeId - The largest volume ID ever granted for this cell.
812 \li long totalEntries[] - The total number of VLDB entries by volume type in
813 the VLDB. This array has MAXTYPES slots, one for each volume type.
815 \subsection sec3-3-4 Section 3.3.4: struct vlheader
818 This is the layout of the information stored in the VLDB header. Notice it
819 includes an object of type struct vital_vlheader described above (see Section
820 3.3.3) as the first field.
822 \li struct vital_vlheader vital_header - Holds critical VLDB header
824 \li u_long IpMappedAddr[] - Keeps MAXSERVERID+1 mappings of IP addresses to
826 \li long VolnameHash[] - The volume name hash table, with HASHSIZE slots.
827 \li long VolidHash[][] - The volume ID hash table. The first dimension in this
828 array selects which of the MAXTYPES volume types is desired, and the second
829 dimension actually implements the HASHSIZE hash table buckets for the given
832 \subsection sec3-3-5 Section 3.3.5: struct VldbUpdateEntry
835 This structure is used as an argument to the VL_UpdateEntry() routine (see
836 Section 3.6.7). Please note that multiple entries can be updated at once by
837 setting the appropriate Mask bits. The bit values for this purpose are defined
840 \li u_long Mask - Bit values determining which fields are to be affected by the
842 \li char name[] - The volume name, up to MAXNAMELEN (65) characters including
844 \li long volumeType - The volume type.
845 \li long flags - This field is used in conjunction with Mask (in fact, one of
846 the Mask bits determines if this field is valid) to choose the valid fields in
848 \li u_long ReadOnlyId - The read-only ID.
849 \li u_long BackupId - The backup ID.
850 \li long cloneId - The clone ID.
851 \li long nModifiedRepsites - Number of replication sites whose entry is to be
853 \li u_long RepsitesMask[] - Array of bit masks applying to the up to
854 MAXNSERVERS (8) replication sites involved.
855 \li long RepsitesTargetServer[] - Array of target servers for the operation, at
856 most MAXNSERVERS (8) of them.
857 \li long RepsitesTargetPart[] - Array of target server partitions for the
858 operation, at most MAXNSERVERS (8) of them.
859 \li long RepsitesNewServer[] - Array of new server sites, at most MAXNSERVERS
861 \li long RepsitesNewPart[] - Array of new server partitions for the operation,
862 at most MAXNSERVERS (8) of them.
863 \li long RepsitesNewFlags[] - Flags applying to each of the new sites, at most
864 MAXNSERVERS (8) of them.
866 \subsection sec3-3-6 Section 3.3.6: struct VldbListByAttributes
869 This structure is used by the VL_ListAttributes() routine (see Section 3.6.11).
871 \li u_long Mask - Bit mask used to select the following attribute fields on
873 \li long server - The server address to match.
874 \li long partition - The partition ID to match.
875 \li long volumetype - The volume type to match.
876 \li long volumeid - The volume ID to match.
877 \li long flag - Flags concerning these values.
879 \subsection sec3-3-7 Section 3.3.7: struct single_vldbentry
882 This structure is used to construct the vldblist object (see Section 3.3.12),
883 which basically generates a queueable (singly-linked) version of struct
886 \li vldbentry VldbEntry - The VLDB entry to be queued.
887 \li vldblist next_vldb - The next pointer in the list.
889 \subsection sec3-3-8 Section 3.3.8: struct vldb_list
892 This structure defines the item returned in linked list form from the
893 VL_LinkedList() function (see Section 3.6.12). This same object is also returned
894 in bulk form in calls to the VL_ListAttributes() routine (see Section 3.6.11).
896 \li vldblist node - The body of the first object in the linked list.
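Walking a list of this shape can be sketched as follows. The member names VldbEntry, next_vldb and node follow the descriptions above; the toy_ prefixed types are reduced stand-ins (the entry body is cut down to a name), and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Heavily reduced stand-in for struct vldbentry. */
struct toy_vldbentry {
    char name[65];
};

/* Stand-ins mirroring vldblist, struct single_vldbentry and struct vldb_list. */
typedef struct toy_single_vldbentry *toy_vldblist;

struct toy_single_vldbentry {
    struct toy_vldbentry VldbEntry;  /* the queued entry */
    toy_vldblist next_vldb;          /* next element, or NULL at the end */
};

struct toy_vldb_list {
    toy_vldblist node;  /* first element of the list */
};

/* Count the entries on the list. */
int toy_count_entries(const struct toy_vldb_list *l)
{
    int n = 0;
    toy_vldblist p;
    assert(l != NULL);
    for (p = l->node; p != NULL; p = p->next_vldb)
        n++;
    return n;
}

/* Return the first entry with the given volume name, or NULL if absent. */
const struct toy_vldbentry *toy_find_entry(const struct toy_vldb_list *l, const char *name)
{
    toy_vldblist p;
    assert(l != NULL && name != NULL);
    for (p = l->node; p != NULL; p = p->next_vldb)
        if (strcmp(p->VldbEntry.name, name) == 0)
            return &p->VldbEntry;
    return NULL;
}
```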
898 \subsection sec3-3-9 Section 3.3.9: struct vldstats
901 This structure defines fields to record statistics on opcode hit frequency. The
902 MAX_NUMBER_OPCODES constant has been defined as the maximum number of opcodes
903 supported by this structure, and is set to 30.
905 \li unsigned long start_time - Clock time when opcode statistics were last
907 \li long requests[] - Number of requests received for each of the MAX_NUMBER_OPCODES
908 opcode types.
909 \li long aborts[] - Number of aborts experienced for each of the MAX_NUMBER_OPCODES
910 opcode types.
911 \li long reserved[] - These five longword fields are reserved for future use.
913 \subsection sec3-3-10 Section 3.3.10: bulk
916 typedef opaque bulk<DEFAULTBULK>;
919 This typedef may be used to transfer an uninterpreted set of bytes across the
920 Volume Location Server interface. It may carry up to DEFAULTBULK (10,000)
923 \li bulk_len - The number of bytes contained within the data pointed to by the next field.
925 \li bulk_val - A pointer to a sequence of bulk_len bytes.
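In C, a length-plus-pointer pair of this kind is typically handled as in the sketch below. The member names bulk_len and bulk_val come from the description above; the type name toy_bulk, the helper functions, and the choice to copy the caller's bytes are assumptions made for illustration, not the code emitted by Rxgen.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_DEFAULTBULK 10000  /* mirrors the documented DEFAULTBULK limit */

/* Stand-in for the counted byte sequence behind the bulk typedef. */
typedef struct toy_bulk {
    unsigned int bulk_len;  /* number of valid bytes at bulk_val */
    char *bulk_val;         /* heap buffer owned by this object */
} toy_bulk;

/* Copy len bytes from src into a freshly allocated buffer.
 * Returns 0 on success, -1 if len exceeds the limit or allocation fails;
 * on failure *b is left unchanged. */
int toy_bulk_fill(toy_bulk *b, const void *src, unsigned int len)
{
    char *copy;
    assert(b != NULL);
    if (len > TOY_DEFAULTBULK)
        return -1;
    copy = malloc(len > 0 ? len : 1);
    if (copy == NULL)
        return -1;
    if (len > 0)
        memcpy(copy, src, len);
    b->bulk_val = copy;
    b->bulk_len = len;
    return 0;
}

/* Release the buffer and reset the object to the empty state. */
void toy_bulk_release(toy_bulk *b)
{
    free(b->bulk_val);
    b->bulk_val = NULL;
    b->bulk_len = 0;
}
```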
927 \subsection sec3-3-11 Section 3.3.11: bulkentries
930 typedef vldbentry bulkentries<>;
933 This typedef is used to transfer an unbounded number of struct vldbentry
934 objects. It appears in the parameter list for the VL_ListAttributes() interface
937 \li bulkentries_len - The number of vldbentry structures contained within the
938 data pointed to by the next field.
939 \li bulkentries_val - A pointer to a sequence of bulkentries_len vldbentry
942 \subsection sec3-3-12 Section 3.3.12: vldblist
945 typedef struct single_vldbentry *vldblist;
948 This typedef defines a queueable struct vldbentry object, referenced by the
949 struct single_vldbentry as well as struct vldb_list.
951 \subsection sec3-3-13 Section 3.3.13: vlheader
954 typedef struct vlheader vlheader;
957 This typedef provides a short name for objects of type struct vlheader (see
960 \subsection sec3-3-14 Section 3.3.14: vlentry
963 typedef struct vlentry vlentry;
966 This typedef provides a short name for objects of type struct vlentry (see
969 \section sec3-4 Section 3.4: Error Codes
972 This section covers the set of error codes exported by the Volume Location
973 Server, displaying the printable phrases with which they are associated.
980 Volume Id entry exists in vl database.
994 Volume name entry exists in vl database.
1001 Internal creation failure.
1015 Vl database is empty.
1022 Entry is deleted (soft delete).
1029 Volume name is illegal.
1036 Index is out of range.
1050 Illegal server number (out of range).
1057 Bad partition number.
1064 Run out of space for Replication sites.
1071 No such Replication server site exists.
1078 Replication site already exists.
1085 Parent R/W entry not found.
1092 Illegal Reference Count number.
1099 Vl size for attributes exceeded.
1106 Bad incoming vl entry.
1113 Illegal max volid increment.
1120 RO/BACK id already hashed.
1127 Vl entry is already locked.
1134 Bad volume operation code.
1141 Bad release lock type.
1148 Status report: last release was aborted.
1155 Invalid replication site server flag.
1162 No permission access.
1169 malloc(realloc) failed to alloc enough memory.
1171 \section sec3-5 Section 3.5: Macros
1174 The Volume Location Server defines a small number of macros, as described in
1175 this section. They are used to update the internal statistics variables and to
1176 compute offsets into character strings. All of these macros really refer to
1177 internal operations, and strictly speaking should not be exposed in this
1180 \subsection sec3-5-1 Section 3.5.1: COUNT REQ()
#define COUNT_REQ(op) \
    static int this_op = op-VL_LOWEST_OPCODE; \
    dynamic_statistics.requests[this_op]++
Bump the appropriate entry in the variable maintaining opcode usage statistics
for the Volume Location Server. Note that a static variable, this_op, is set up
to record the index into the opcode monitoring array. This static variable is
used by the related COUNT_ABO macro defined below.
1193 \subsection sec3-5-2 Section 3.5.2: COUNT ABO()
1196 #define COUNT_ABO dynamic_statistics.aborts[this_op]++
1199 Bump the appropriate entry in the variable maintaining opcode abort statistics
1200 for the Volume Location Server. Note that this macro does not take any
arguments. It expects to find a this_op variable in its environment, and thus
depends on its related macro, COUNT_REQ(), to define that variable.
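
The interplay between the two macros can be illustrated with a small, self-contained sketch. Only the macro bodies follow the definitions above. The value given to VL_LOWEST_OPCODE, the DEMO_ constants, the layout of the statistics structure, and the demo_ functions are assumptions made purely for this illustration.

```c
#include <assert.h>

/* Assumed values, for illustration only. */
#define VL_LOWEST_OPCODE 501
#define DEMO_CREATE_OP   501    /* hypothetical opcode of a create request */
#define DEMO_NUM_OPS     16

/* Stand-in for the server's statistics variable. */
struct demo_stats {
    long requests[DEMO_NUM_OPS];
    long aborts[DEMO_NUM_OPS];
};
static struct demo_stats dynamic_statistics;

#define COUNT_REQ(op) \
    static int this_op = op-VL_LOWEST_OPCODE; \
    dynamic_statistics.requests[this_op]++
#define COUNT_ABO dynamic_statistics.aborts[this_op]++

/* Hypothetical handler: count the request, then count an abort on failure. */
int demo_create_entry(int simulate_failure)
{
    COUNT_REQ(DEMO_CREATE_OP);  /* declares this_op and bumps requests[] */
    if (simulate_failure) {
        COUNT_ABO;              /* relies on this_op declared by COUNT_REQ */
        return -1;
    }
    return 0;
}

/* Accessors used to inspect the counters. */
long demo_request_count(int op)
{
    assert(op >= VL_LOWEST_OPCODE && op < VL_LOWEST_OPCODE + DEMO_NUM_OPS);
    return dynamic_statistics.requests[op - VL_LOWEST_OPCODE];
}

long demo_abort_count(int op)
{
    assert(op >= VL_LOWEST_OPCODE && op < VL_LOWEST_OPCODE + DEMO_NUM_OPS);
    return dynamic_statistics.aborts[op - VL_LOWEST_OPCODE];
}
```

Because COUNT_REQ() expands to a declaration, it has to appear where a declaration is legal, and COUNT_ABO can only be used within the same scope.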
1204 \subsection sec3-5-3 Section 3.5.3: DOFFSET()
1207 #define DOFFSET(abase, astr, aitem) ((abase)+(((char *)(aitem)) -((char
Compute the byte offset of character object aitem within the enclosing object
astr, also expressed as a character-based object, then offset the resulting
address by abase. This macro is used to compute locations within the VLDB when
actually writing out information.
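
As a hedged illustration of this computation, the sketch below writes the macro out in full from the prose description (an assumption, since it is not copied from the server sources) and applies it to a hypothetical record layout. Neither struct demo_entry nor demo_flags_position() is part of the interface.

```c
#include <assert.h>
#include <stddef.h>

/* Written out from the description above; treat the exact text as an assumption. */
#define DOFFSET(abase, astr, aitem) \
    ((abase) + (((char *)(aitem)) - ((char *)(astr))))

/* Hypothetical record layout, not the real VLDB entry. */
struct demo_entry {
    long volumeId;
    long flags;
    char name[65];
};

/*
 * Return the absolute position of the flags field for a record that is
 * stored at byte offset recordBase within the database.
 */
long demo_flags_position(long recordBase, struct demo_entry *ep)
{
    assert(ep != NULL);
    return (long)DOFFSET(recordBase, ep, &ep->flags);
}
```

The in-memory address of the record does not matter; only the distance between the field and the start of its enclosing structure contributes to the result.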
1216 \section sec3-6 Section 3.6: Functions
1219 This section covers the Volume Location Server RPC interface routines. The
1220 majority of them are generated from the vldbint.xg Rxgen file, and are meant to
1221 be used by user-space agents. There is also a subset interface definition
1222 provided in the afsvlint.xg Rxgen file. These routines, described in Section
1223 3.7, are meant to be used by a kernel-space agent when dealing with the Volume
1224 Location Server; in particular, they are called by the Cache Manager.
1226 \subsection sec3-6-1 Section 3.6.1: VL CreateEntry - Create a VLDB
int VL_CreateEntry(IN struct rx_connection *z_conn,
                   IN vldbentry *newentry)
1234 This function creates a new entry in the VLDB, as specified in the newentry
argument. Both the name and numerical ID of the new volume must be unique
(i.e., neither may already appear in the VLDB). For non-read-write entries, the
1237 read-write parent volume is accessed so that its reference count can be
1238 updated, and the new entry is added to the parent's chain of associated
1240 The VLDB is write-locked for the duration of this operation.
VL_PERM The caller is not authorized to execute this function.
\n VL_NAMEEXIST The volume name already appears in the VLDB.
\n VL_CREATEFAIL Space for the new entry cannot be allocated within the VLDB.
\n VL_BADNAME The volume name is invalid.
\n VL_BADVOLTYPE The volume type is invalid.
\n VL_BADSERVER The indicated server information is invalid.
\n VL_BADPARTITION The indicated partition information is invalid.
\n VL_BADSERVERFLAG The server flag field is invalid.
\n VL_IO An error occurred while writing to the VLDB.
1250 \subsection sec3-6-2 Section 3.6.2: VL DeleteEntry - Delete a VLDB
int VL_DeleteEntry(IN struct rx_connection *z_conn,
1259 Delete the entry matching the given volume identifier and volume type as
1260 specified in the Volid and voltype arguments. For a read-write entry whose
1261 reference count is greater than 1, the entry is not actually deleted, since at
1262 least one child (read-only or backup) volume still depends on it. For cases of
1263 non-read-write volumes, the parent's reference count and associated chains are
1266 If the associated VLDB entry is already marked as deleted (i.e., its flags
field has the VLDELETED bit set), then no further action is taken, and
VL_ENTDELETED is returned. The VLDB is write-locked for the duration of this
VL_PERM The caller is not authorized to execute this function.
\n VL_BADVOLTYPE An illegal volume type has been specified by the voltype argument.
\n VL_NOENT This volume instance does not appear in the VLDB.
\n VL_ENTDELETED The given VLDB entry has already been marked as deleted.
\n VL_IO An error occurred while writing to
1277 \subsection sec3-6-3 Section 3.6.3: VL GetEntryByID - Get VLDB entry by
int VL_GetEntryByID(IN struct rx_connection *z_conn,
                    IN long Volid,
                    IN long voltype,
                    OUT vldbentry *entry)
1285 Given a volume's numerical identifier (Volid) and type (voltype), return a
1286 pointer to the entry in the VLDB describing the given volume instance.
1288 The VLDB is read-locked for the duration of this operation.
VL_BADVOLTYPE An illegal volume type has been specified by the voltype
\n VL_NOENT This volume instance does not appear in the VLDB.
\n VL_ENTDELETED The given VLDB entry has already been marked as deleted.
1295 \subsection sec3-6-4 Section 3.6.4: VL GetEntryByName - Get VLDB entry
int VL_GetEntryByName(IN struct rx_connection *z_conn,
                      IN char *volumename,
                      OUT vldbentry *entry)
1304 Given the volume name in the volumename parameter, return a pointer to the
1305 entry in the VLDB describing the given volume. The name in volumename may be no
1306 longer than MAXNAMELEN (65) characters, including the trailing null. Note that
1307 it is legal to use the volume's numerical identifier (in string form) as the
1310 The VLDB is read-locked for the duration of this operation.
This function is closely related to the VL_GetEntryByID() routine, as might be
1313 expected. In fact, the by-ID routine is called if the volume name provided in
1314 volumename is the string version of the volume's numerical identifier.
VL_BADVOLTYPE An illegal volume type has been specified by the voltype
\n VL_NOENT This volume instance does not appear in the VLDB.
\n VL_ENTDELETED The given VLDB entry has already been marked as deleted.
\n VL_BADNAME The volume name is invalid.
1322 \subsection sec3-6-5 Section 3.6.5: VL GetNewVolumeId - Generate a new
int VL_GetNewVolumeId(IN struct rx_connection *z_conn,
                      OUT long *newvolumid)
1331 Acquire bumpcount unused, consecutively-numbered volume identifiers from the
1332 Volume Location Server. The lowest-numbered of the newly-acquired set is placed
1333 in the newvolumid argument. The largest number of volume IDs that may be
1334 generated with any one call is bounded by the MAXBUMPCOUNT constant defined in
1335 Section 3.2.1. Currently, there is (effectively) no restriction on the number
1336 of volume identifiers that may thus be reserved in a single call.
1338 The VLDB is write-locked for the duration of this operation.
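
The reservation behavior described above can be modeled in a few lines of C. This is only a sketch under stated assumptions: the demo_ names, the in-memory counter standing in for the VLDB header, and the numeric values given to the limit and the error code are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values; the real ones come from the interface definition. */
#define DEMO_MAXBUMPCOUNT    0x7fffffffL
#define DEMO_VL_BADVOLIDBUMP 1

/* Stand-in for the part of the VLDB header that tracks volume IDs. */
struct demo_vldb_header {
    long maxVolumeId;   /* next unused volume ID */
};

/*
 * Reserve bumpcount consecutive volume IDs and report the lowest one.
 * Returns 0 on success or DEMO_VL_BADVOLIDBUMP if the request is too large.
 */
int demo_get_new_volume_id(struct demo_vldb_header *hdr, long bumpcount,
                           long *newvolumid)
{
    assert(hdr != NULL && newvolumid != NULL);
    if (bumpcount > DEMO_MAXBUMPCOUNT)
        return DEMO_VL_BADVOLIDBUMP;
    *newvolumid = hdr->maxVolumeId;   /* lowest ID of the reserved range */
    hdr->maxVolumeId += bumpcount;    /* the whole range is now taken */
    return 0;
}
```

A caller that asks for five identifiers and receives 100 in newvolumid may use 100 through 104; the next caller starts at 105.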
VL_PERM The caller is not authorized to execute this function.
\n VL_BADVOLIDBUMP The value of the bumpcount parameter exceeds the system
limit of MAXBUMPCOUNT.
\n VL_IO An error occurred while writing to the VLDB.
1345 \subsection sec3-6-6 Section 3.6.6: VL ReplaceEntry - Replace entire
1346 contents of VLDB entry
int VL_ReplaceEntry(IN struct rx_connection *z_conn,
                    IN vldbentry *newentry,
                    IN long ReleaseType)
1356 Perform a wholesale replacement of the VLDB entry corresponding to the volume
1357 instance whose identifier is Volid and type voltype with the information
contained in the newentry argument. This call cannot change individual VLDB
entry fields selectively while preserving the others; VL_UpdateEntry() should
be used for that purpose. The permissible values for the ReleaseType parameter
are defined in Section 3.2.7.
1363 The VLDB is write-locked for the duration of this operation. All of the hash
1364 tables impacted are brought up to date to incorporate the new information.
VL_PERM The caller is not authorized to execute this function.
\n VL_BADVOLTYPE An illegal volume type has been specified by the voltype
\n VL_BADRELLOCKTYPE An illegal release lock has been specified by the
ReleaseType argument.
\n VL_NOENT This volume instance does not appear in the VLDB.
\n VL_BADENTRY An attempt was made to change a read-write volume ID.
\n VL_IO An error occurred while writing to the VLDB.
1375 \subsection sec3-6-7 Section 3.6.7: VL UpdateEntry - Update contents of
int VL_UpdateEntry(IN struct rx_connection *z_conn,
                   IN VldbUpdateEntry *UpdateEntry,
                   IN long ReleaseType)
1386 Update the VLDB entry corresponding to the volume instance whose identifier is
1387 Volid and type voltype with the information contained in the UpdateEntry
argument. Most of the entry's fields can be modified in a single call to
VL_UpdateEntry(). The Mask field within the UpdateEntry parameter selects the
1390 fields to update with the values stored within the other UpdateEntry fields.
1391 Permissible values for the ReleaseType parameter are defined in Section 3.2.7.
1393 The VLDB is write-locked for the duration of this operation.
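
The mask-driven update described above can be sketched as follows. The mask bit names and values, the two structures, and demo_apply_update() are hypothetical stand-ins chosen for this illustration; the real VldbUpdateEntry carries more fields and its own mask constants.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mask bits; the real values belong to the interface definition. */
#define DEMO_UPDATE_NAME   0x1
#define DEMO_UPDATE_FLAGS  0x2

#define DEMO_NAMELEN 65

struct demo_vldb_entry {
    char name[DEMO_NAMELEN];
    long flags;
};

struct demo_update {
    long Mask;                  /* selects which fields below are applied */
    char name[DEMO_NAMELEN];
    long flags;
};

/* Copy only the fields selected by Mask; all other fields are preserved. */
void demo_apply_update(struct demo_vldb_entry *e, const struct demo_update *u)
{
    assert(e != NULL && u != NULL);
    if (u->Mask & DEMO_UPDATE_NAME) {
        strncpy(e->name, u->name, DEMO_NAMELEN - 1);
        e->name[DEMO_NAMELEN - 1] = '\0';
    }
    if (u->Mask & DEMO_UPDATE_FLAGS)
        e->flags = u->flags;
}
```

Fields whose mask bit is clear are ignored even if the update structure happens to carry a value for them, which is what distinguishes this style of call from a wholesale replacement.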
VL_PERM The caller is not authorized to execute this function.
\n VL_BADVOLTYPE An illegal volume type has been specified by the voltype
\n VL_BADRELLOCKTYPE An illegal release lock has been specified by the
ReleaseType argument.
\n VL_NOENT This volume instance does not appear in the VLDB.
\n VL_IO An error occurred while writing to the VLDB.
1403 \subsection sec3-6-8 Section 3.6.8: VL SetLock - Lock VLDB entry
int VL_SetLock(IN struct rx_connection *z_conn,
1412 Lock the VLDB entry matching the given volume ID (Volid) and type (voltype) for
volume operation voloper (e.g., VLOP_MOVE and VLOP_RELEASE). If the entry is
1414 currently unlocked, then its LockTimestamp will be zero. If the lock is
1415 obtained, the given voloper is stamped into the flags field, and the
1416 LockTimestamp is set to the time of the call.
When the caller attempts to lock the entry for a release operation, special
care is taken to abort the operation if the entry has already been locked for
a release and that existing lock has timed out. In this case, VL_SetLock()
returns VL_RERELEASE.
1423 The VLDB is write-locked for the duration of this operation.
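
The locking rules above can be summarized by the following model. It is a sketch, not the server's code: the operation bits, the return codes, and above all the lock lifetime are hypothetical values, and granting the lock when a stale lock for a different operation is found is an assumption the text does not spell out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values: operation bits, return codes, and lock lifetime. */
#define DEMO_VLOP_MOVE      0x10
#define DEMO_VLOP_RELEASE   0x20
#define DEMO_VLOP_ALLOPERS  (DEMO_VLOP_MOVE | DEMO_VLOP_RELEASE)
#define DEMO_OK             0
#define DEMO_ENTRYLOCKED    1
#define DEMO_RERELEASE      2
#define DEMO_LOCK_LIFETIME  3600L   /* seconds; assumed */

struct demo_lock_entry {
    long flags;           /* holds the operation bit while locked */
    long LockTimestamp;   /* zero when the entry is unlocked */
};

int demo_set_lock(struct demo_lock_entry *e, long voloper, long now)
{
    assert(e != NULL);
    if (e->LockTimestamp != 0) {
        /* A lock that has not yet timed out blocks every caller. */
        if (now - e->LockTimestamp < DEMO_LOCK_LIFETIME)
            return DEMO_ENTRYLOCKED;
        /* A stale release lock met by a new release request is reported. */
        if ((voloper & DEMO_VLOP_RELEASE) && (e->flags & DEMO_VLOP_RELEASE))
            return DEMO_RERELEASE;
    }
    /* Grant the lock: stamp the operation into flags and record the time. */
    e->flags = (e->flags & ~DEMO_VLOP_ALLOPERS) | voloper;
    e->LockTimestamp = now;
    return DEMO_OK;
}
```

The distinct return code for the stale-release case lets a caller tell a previously aborted release apart from an ordinary lock conflict.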
VL_PERM The caller is not authorized to execute this function.
\n VL_BADVOLTYPE An illegal volume type has been specified by the voltype
\n VL_BADVOLOPER An illegal volume operation was specified in the voloper
argument. Legal values are defined in the latter part of the table in Section
\n VL_ENTDELETED The given VLDB entry has already been marked as deleted.
\n VL_ENTRYLOCKED The given VLDB entry has already been locked (which has not
\n VL_RERELEASE A VLDB entry locked for release has timed out, and the caller
also wanted to perform a release operation on it.
\n VL_IO An error was experienced while attempting to write to the VLDB.
1438 \subsection sec3-6-9 Section 3.6.9: VL ReleaseLock - Unlock VLDB entry
int VL_ReleaseLock(IN struct rx_connection *z_conn,
                   IN long ReleaseType)
1447 Unlock the VLDB entry matching the given volume ID (Volid) and type (voltype).
1448 The ReleaseType argument determines which VLDB entry fields from flags and
1449 LockAfsId will be cleared along with the lock timestamp in LockTimestamp.
1450 Permissible values for the ReleaseType parameter are defined in Section 3.2.7.
1452 The VLDB is write-locked for the duration of this operation.
VL_PERM The caller is not authorized to execute this function.
\n VL_BADVOLTYPE An illegal volume type has been specified by the voltype
\n VL_BADRELLOCKTYPE An illegal release lock has been specified by the
ReleaseType argument.
\n VL_NOENT This volume instance does not appear in the VLDB.
\n VL_ENTDELETED The given VLDB entry has already been marked as deleted.
\n VL_IO An error was experienced while attempting to write to the VLDB.
1463 \subsection sec3-6-10 Section 3.6.10: VL ListEntry - Get contents of
int VL_ListEntry(IN struct rx_connection *z_conn,
                 IN long previous_index,
                 OUT long *next_index,
                 OUT vldbentry *entry)
1474 This function assists in the task of enumerating the contents of the VLDB.
Given an index into the database, previous_index, this call returns the single
VLDB entry at that offset, placing it in the entry argument. The number of VLDB
entries left to list is placed in count, and the index of the next entry to
request is returned in next_index. If an illegal index is provided, count is
1481 The VLDB is read-locked for the duration of this operation.
1485 \subsection sec3-6-11 Section 3.6.11: VL ListAttributes - List all VLDB
1486 entry matching given attributes, single return object
int VL_ListAttributes(IN struct rx_connection *z_conn,
                      IN VldbListByAttributes *attributes,
                      OUT bulkentries *blkentries)
1495 Retrieve all the VLDB entries that match the attributes listed in the
1496 attributes parameter, placing them in the blkentries object. The number of
1497 matching entries is placed in nentries. Matching can be done by server number,
1498 partition, volume type, flag, or volume ID. The legal values to use in the
attributes argument are listed in Section 3.2.3. Note that if the
VLLIST_VOLUMEID bit is set in attributes, all other bit values are ignored and
the volume ID provided is the sole search criterion.
1503 The VLDB is read-locked for the duration of this operation.
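
The matching rule, including the special treatment of the volume ID bit, can be sketched as a predicate applied during a sequential scan. The mask values, both structures, and the demo_ functions are hypothetical; in particular, a real VLDB entry describes several sites per volume, whereas this model flattens each record to a single site.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mask bits standing in for the values of Section 3.2.3. */
#define DEMO_LIST_SERVER     0x01
#define DEMO_LIST_PARTITION  0x02
#define DEMO_LIST_VOLTYPE    0x04
#define DEMO_LIST_VOLUMEID   0x08
#define DEMO_LIST_FLAG       0x10

struct demo_attributes {
    long Mask, server, partition, voltype, volumeid, flag;
};

struct demo_record {
    long server, partition, voltype, volumeid, flags;
};

/* Return 1 if record r satisfies the criteria selected by a->Mask. */
int demo_matches(const struct demo_attributes *a, const struct demo_record *r)
{
    assert(a != NULL && r != NULL);
    if (a->Mask & DEMO_LIST_VOLUMEID)       /* sole criterion when set */
        return r->volumeid == a->volumeid;
    if ((a->Mask & DEMO_LIST_SERVER) && r->server != a->server)
        return 0;
    if ((a->Mask & DEMO_LIST_PARTITION) && r->partition != a->partition)
        return 0;
    if ((a->Mask & DEMO_LIST_VOLTYPE) && r->voltype != a->voltype)
        return 0;
    if ((a->Mask & DEMO_LIST_FLAG) && !(r->flags & a->flag))
        return 0;
    return 1;
}

/* Sequential scan over all records, counting those that match. */
size_t demo_count_matches(const struct demo_attributes *a,
                          const struct demo_record *db, size_t ndb)
{
    size_t i, n = 0;

    for (i = 0; i < ndb; i++)
        if (demo_matches(a, &db[i]))
            n++;
    return n;
}
```

Every record is visited once per query in this model, which is why a lookup by attributes costs time proportional to the size of the database.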
Note that VL_ListAttributes() is a potentially expensive function, as a
sequential search through all of the VLDB entries is performed in most cases.
VL_NOMEM Memory for the blkentries object could not be allocated.
\n VL_NOENT The specified volume instance does not appear in the VLDB.
\n VL_SIZEEXCEEDED Ran out of room in the blkentries object.
\n VL_IO Error while reading from the VLDB.
1513 \subsection sec3-6-12 Section 3.6.12: VL LinkedList - List all VLDB
1514 entry matching given attributes, linked list return object
int VL_LinkedList(IN struct rx_connection *z_conn,
                  IN VldbListByAttributes *attributes,
                  OUT vldb_list *linkedentries)
1523 Retrieve all the VLDB entries that match the attributes listed in the
1524 attributes parameter, creating a linked list of entries based in the
1525 linkedentries object. The number of matching entries is placed in nentries.
1526 Matching can be done by server number, partition, volume type, flag, or volume
1527 ID. The legal values to use in the attributes argument are listed in Section
3.2.3. Note that if the VLLIST_VOLUMEID bit is set in attributes, all other bit
values are ignored and the volume ID provided is the sole search criterion.
The VL_LinkedList() function is identical to VL_ListAttributes(), except
for the method of delivering the VLDB entries to the caller.
1534 The VLDB is read-locked for the duration of this operation.
VL_NOMEM Memory for an entry in the list based at linkedentries object could
\n VL_NOENT The specified volume instance does not appear in the VLDB.
\n VL_SIZEEXCEEDED Ran out of room in the current list object.
\n VL_IO Error while reading from the VLDB.
1542 \subsection sec3-6-13 Section 3.6.13: VL GetStats - Get Volume Location
int VL_GetStats(IN struct rx_connection *z_conn,
                OUT vldstats *stats,
                OUT vital_vlheader *vital_header)
1551 Collect the different types of VLDB statistics. Part of the VLDB header is
returned in vital_header, which includes such information as the number of
1553 allocations and frees performed, and the next volume ID to be allocated. The
1554 dynamic per-operation stats are returned in the stats argument, reporting the
1555 number and types of operations and aborts.
1557 The VLDB is read-locked for the duration of this operation.
VL_PERM The caller is not authorized to execute this function.
1561 \subsection sec3-6-14 Section 3.6.14: VL Probe - Verify Volume Location
1562 Server connectivity/status
int VL_Probe(IN struct rx_connection *z_conn)
1568 This routine serves a 'pinging' function to determine whether the Volume
1569 Location Server is still running. If this call succeeds, then the Volume
1570 Location Server is shown to be capable of responding to RPCs, thus confirming
1571 connectivity and basic operation.
1573 The VLDB is not locked for this operation.
1577 \section sec3-7 Section 3.7: Kernel Interface Subset
1580 The interface described by this document so far applies to user-level clients,
1581 such as the vos utility. However, some volume location operations must be
1582 performed from within the kernel. Specifically, the Cache Manager must find out
1583 where volumes reside and otherwise gather information about them in order to
1584 conduct its business with the File Servers holding them. In order to support
1585 Volume Location Server interconnection for agents operating within the kernel,
1586 the afsvlint.xg Rxgen interface was built. It is a minimal subset of the
user-level vldbint.xg definition. Within afsvlint.xg, there are duplicate
definitions for such constants as MAXNAMELEN, MAXNSERVERS, MAXTYPES,
VLF_RWEXISTS, VLF_ROEXISTS, VLF_BACKEXISTS, VLSF_NEWREPSITE, VLSF_ROVOL,
VLSF_RWVOL, and VLSF_BACKVOL. Since the only operations the Cache Manager must
perform are locating a volume given its ID or name and finding out about
unresponsive Volume Location Servers, the following interface routines
are duplicated in afsvlint.xg, along with the struct vldbentry declaration:
\li VL_GetEntryByID()
\li VL_GetEntryByName()
1598 \page chap4 Chapter 4: Volume Server Architecture
1600 \section sec4-1 Section 4.1: Introduction
1603 The Volume Server allows administrative tasks and probes to be performed on the
1604 set of AFS volumes residing on the machine on which it is running. As described
in Chapter 2, a distributed database holding volume location information, the
VLDB, is used by client applications to locate these volumes. Volume Server
functions are typically invoked either directly by authorized users via the
vos utility or by the AFS backup system.
This chapter briefly discusses various aspects of the Volume Server's
architecture. First, the high-level on-disk representation of volumes is
covered. Next, the transactions used in conjunction with volume operations are
examined, followed by the program implementing the Volume Server, volserver.
The nature and format of the log file kept by the Volume Server rounds out the
description.
1616 As with all AFS servers, the Volume Server uses the Rx remote procedure call
1617 package for communication with its clients.
1619 \section sec4-2 Section 4.2: Disk Representation
1622 For each volume on an AFS partition, there exists a file visible in the unix
1623 name space which describes the contents of that volume. By convention, each of
1624 these files is named by concatenating a prefix string, "V", the numerical
1625 volume ID, and the postfix string ".vol". Thus, file V0536870918.vol describes
1626 the volume whose numerical ID is 0536870918. Internally, each per-volume
1627 descriptor file has such fields as a version number, the numerical volume ID,
1628 and the numerical parent ID (useful for read-only or backup volumes). It also
1629 has a list of related inodes, namely files which are not visible from the unix
1630 name space (i.e., they do not appear as entries in any unix directory object).
The important related inodes are:
\li Volume info inode: This field identifies the inode which hosts the on-disk
representation of the volume's header. It is very similar to the information
pointed to by the volume field of the struct volser_trans defined in Section
5.4.1, recording important status information for the volume.
1636 \li Large vnode index inode: This field identifies the inode which holds the
1637 list of vnode identifiers for all directory objects residing within the volume.
1638 These are "large" since they must also hold the Access Control List (ACL)
1639 information for the given AFS directory.
1640 \li Small vnode index inode: This field identifies the inode which holds the
1641 list of vnode identifiers for all non-directory objects hosted by the volume.
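
The file naming convention described at the start of this section can be sketched in C as follows. The ten-digit, zero-padded field width is inferred from the V0536870918.vol example rather than stated by the text, and the function name demo_volume_header_name() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build "V<volume ID>.vol" in buf. The ID is zero-padded to ten digits,
 * an assumption based on the example name given in the text.
 * Returns 0 on success, -1 if buf is too small to hold the name.
 */
int demo_volume_header_name(unsigned long volid, char *buf, size_t buflen)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, buflen, "V%010lu.vol", volid);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Under these assumptions, volume ID 536870918 yields the name V0536870918.vol, matching the example above.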
1643 All of the actual files and directories residing within an AFS volume, as
1644 identified by the contents of the large and small vnode index inodes, are also
1645 free-floating inodes, not appearing in the conventional unix name space. This
1646 is the reason the vendor-supplied fsck program should not be run on partitions
1647 containing AFS volumes. Since the inodes making up AFS files and directories,
1648 as well as the inodes serving as volume indices for them, are not mapped to any
1649 directory, the standard fsck program would throw away all of these
1650 "unreferenced" inodes. Thus, a special version of fsck is provided that
1651 recognizes partitions containing AFS volumes as well as standard unix
1654 \section sec4-3 Section 4.3: Transactions
1657 Each individual volume operation is carried out by the Volume Server as a
1658 transaction, but not in the atomic sense of the word. Logically, creating a
1659 Volume Server transaction can be equated with performing an "exclusive open" on
1660 the given volume before beginning the actual work of the desired volume
1661 operation. No other Volume Server (or File Server) operation is allowed on the
1662 opened volume until the transaction is terminated. Thus, transactions in the
1663 context of the Volume Server serve to provide mutual exclusion without any of
1664 the normal atomicity guarantees. Volumes maintain enough internal state to
1665 enable recovery from interrupted or failed operations via use of the salvager
1666 program. Whenever volume inconsistencies are detected, this salvager program is
1667 run, which then attempts to correct the problem.
1669 Volume transactions have timeouts associated with them. This guarantees that
1670 the death of the agent performing a given volume operation cannot result in the
1671 volume being permanently removed from circulation. There are actually two
1672 timeout periods defined for a volume transaction. The first is the warning
1673 time, defined to be 5 minutes. If a transaction lasts for more than this time
1674 period without making progress, the Volume Server prints a warning message to
1675 its log file (see Section 4.5). The second time value associated with a volume
1676 transaction is the hard timeout, defined to occur 10 minutes after any progress
1677 has been made on the given operation. After this period, the transaction will
1678 be unconditionally deleted, and the volume freed for any other operations.
1679 Transactions are reference-counted. Progress will be deemed to have occurred
1680 for a transaction, and its internal timeclock field will be updated, when:
1681 \li 1 The transaction is first created.
1682 \li 2 A reference is made to the transaction, causing the Volume Server to look
1683 it up in its internal tables.
1684 \li 3 The transaction's reference count is decremented.
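
The two timeout periods can be modeled with a small classification routine such as the one below. The structure, the enumeration, and the function names are hypothetical, and treating the limits as strict ("more than" five or ten minutes) is an assumption about boundary behavior that the text leaves open.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define DEMO_WARN_SECS (5 * 60)     /* warning time from the text */
#define DEMO_HARD_SECS (10 * 60)    /* hard timeout from the text */

enum demo_trans_state { DEMO_TRANS_OK, DEMO_TRANS_WARN, DEMO_TRANS_EXPIRED };

/* Hypothetical transaction record holding only what the model needs. */
struct demo_trans {
    time_t lastProgress;    /* the "timeclock": time of last progress */
    int refCount;
};

/* Called on creation, on lookup, and when the reference count drops. */
void demo_note_progress(struct demo_trans *t, time_t now)
{
    assert(t != NULL);
    t->lastProgress = now;
}

/* Classify a transaction by how long it has gone without progress. */
enum demo_trans_state demo_check(const struct demo_trans *t, time_t now)
{
    double idle;

    assert(t != NULL);
    idle = difftime(now, t->lastProgress);
    if (idle > DEMO_HARD_SECS)
        return DEMO_TRANS_EXPIRED;  /* delete it and free the volume */
    if (idle > DEMO_WARN_SECS)
        return DEMO_TRANS_WARN;     /* log a warning message */
    return DEMO_TRANS_OK;
}
```

A background thread could call demo_check() periodically on each active transaction, logging a warning or deleting the transaction according to the result.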
1686 \section sec4-4 Section 4.4: The volserver Process
1689 The volserver user-level program is run on every AFS server machine, and
1690 implements the Volume Server agent. It is responsible for providing the Volume
1691 Server interface as defined by the volint.xg Rxgen file.
1693 The volserver process defines and launches five threads to perform the bulk of
1694 its duties. One thread implements a background daemon whose job it is to
1695 garbage-collect timed-out transaction structures. The other four threads are
1696 RPC interface listeners, primed to accept remote procedure calls and thus
1697 perform the defined set of volume operations.
1699 Certain non-standard configuration settings are made for the RPC subsystem by
1700 the volserver program. For example, it chooses to extend the length of time
1701 that an Rx connection may remain idle from the default 12 seconds to 120
1702 seconds. The reasoning here is that certain volume operations may take longer
1703 than 12 seconds of processing time on the server, and thus the default setting
1704 for the connection timeout value would incorrectly terminate an RPC when in
1705 fact it was proceeding normally and correctly.
1707 The volserver program takes a single, optional command line argument. If a
1708 positive integer value is provided on the command line, then it shall be used
1709 to set the debugging level within the Volume Server. By default, a value of
1710 zero is used, specifying that no special debugging output will be generated and
1711 fed to the Volume Server log file described below.
1713 \section sec4-5 Section 4.5: Log File
1716 The Volume Server keeps a log file, recording the set of events of special
1717 interest it has encountered. The file is named VolserLog, and is stored in the
1718 /usr/afs/logs directory on the local disk of the server machine on which the
1719 Volume Server runs. This is a human-readable file, with every entry
1722 Whenever the volserver program restarts, it renames the current VolserLog file
1723 to VolserLog.old, and starts up a fresh log. A properly-authorized individual
1724 can easily inspect the log file residing on any given server machine. This is
1725 made possible by the BOS Server AFS agent running on the machine, which allows
1726 the contents of this file to be fetched and displayed on the caller's machine
1727 via the bos getlog command.
1729 An excerpt from a Volume Server log file follows below. The numbers appearing
1730 in square brackets at the beginning of each line have been inserted so that we
1731 may reference the individual lines of the log excerpt in the following
1734 [1] Wed May 8 06:03:00 1991 AttachVolume: Error attaching volume
1735 /vicepd/V1969547815.vol; volume needs salvage
1736 [2] Wed May 8 06:03:01 1991 Volser: ListVolumes: Could not attach volume
1738 [3] Wed May 8 07:36:13 1991 Volser: Clone: Cloning volume 1969541499 to new
1740 [4] Wed May 8 11:25:05 1991 AttachVolume: Cannot read volume header
1741 /vicepd/V1969547415.vol
1742 [5] Wed May 8 11:25:06 1991 Volser: CreateVolume: volume 1969547415
1743 (bld.dce.s3.dv.pmax_ul3) created
1746 Line [1] indicates that the volume whose numerical ID is 1969547815 could not
be attached on partition /vicepd. This error is probably the result of an
aborted transaction which left the volume in an inconsistent state, or of
actual damage to the volume structure or data. In this case, the Volume Server
1750 recommends that the salvager program be run on this volume to restore its
1751 integrity. Line [2] records the operation which revealed this situation, namely
1752 the invocation of an AFSVolListVolumes() RPC.
1754 Line [4] reveals that the volume header file for a specific volume could not be
1755 read. Line [5], as with line [2] in the above paragraph, indicates why this is
1756 true. Someone had called the AFSVolCreateVolume() interface function, and as a
1757 precaution, the Volume Server first checked to see if such a volume was already
1758 present by attempting to read its header.
Having thus verified that the volume did not previously exist, the Volume Server
1761 allowed the AFSVolCreateVolume() call to continue its processing, creating and
1762 initializing the proper volume file, V1969547415.vol, and the associated header
1765 \page chap5 Chapter 5: Volume Server Interface
\section sec5-1 Section 5.1: Introduction
1770 This chapter documents the API for the Volume Server facility, as defined by
1771 the volint.xg Rxgen interface file and the volser.h include file. Descriptions
1772 of all the constants, structures, macros, and interface functions available to
1773 the application programmer appear here.
1775 \section sec5-2 Section 5.2: Constants
1778 This section covers the basic constant definitions of interest to the Volume
1779 Server application programmer. These definitions appear in the volint.h file,
1780 automatically generated from the volint.xg Rxgen interface file, and in
1783 Each subsection is devoted to describing the constants falling into the
1784 following categories:
1785 \li Configuration and boundary values
1786 \li Interface routine opcodes
1787 \li Transaction Flags
1790 \li States for struct vldbentry
1794 \subsection sec5-2-1 Section 5.2.1: Configuration and Boundary Values
1797 These constants define some basic system configuration values, along with such
1798 things as maximum sizes of important arrays.
1800 MyPort 5,003 The Rx UDP port on which the Volume Server service may be
1807 Used by the vos utility to define maximum lengths for internal filename
1815 Maximum number of server agents implementing the AFS Volume Location Database
1816 (VLDB) for the cell.
1823 The Rx service number on the given UDP port (MyPort) above.
1830 Used as an invalid read-only or backup volume ID.
1837 The number of characters in the longest possible volume name, including the
1838 trailing null. Note: this is only used by the vos utility; the Volume Server
1839 uses the "old" value below.
VOLSER_OLDMAXVOLNAME
1846 The "old" maximum number of characters in an AFS volume name, including the
1847 trailing null. In reality, it is also the current maximum.
1854 The maximum number of replication sites for a volume.
1861 Size in bytes of the name field in struct volintInfo (see Section 5.4.6).
1864 \subsection sec5-2-2 Section 5.2.2: Interface Routine Opcodes
1867 These constants, appearing in the volint.xg Rxgen interface file for the Volume
1868 Server, define the opcodes for the RPC routines. Every Rx call on this
1869 interface contains this opcode, and the dispatcher uses it to select the proper
1870 code at the server site to carry out the call.
1877 Opcode for AFSVolCreateVolume()
1884 Opcode for AFSVolDeleteVolume()
1891 Opcode for AFSVolRestoreVolume()
1898 Opcode for AFSVolForward()
1905 Opcode for AFSVolEndTrans()
Opcode for AFSVolClone()
1919 Opcode for AFSVolSetFlags()
1926 Opcode for AFSVolGetFlags()
1933 Opcode for AFSVolTransCreate()
1940 Opcode for AFSVolDump()
1947 Opcode for AFSVolGetNthVolume()
1954 Opcode for AFSVolSetForwarding()
1961 Opcode for AFSVolGetName()
1968 Opcode for AFSVolGetStatus()
1975 Opcode for AFSVolSignalRestore()
1982 Opcode for AFSVolListPartitions()
1989 Opcode for AFSVolListVolumes()
1996 Opcode for AFSVolSetIdsTypes()
2003 Opcode for AFSVolMonitor()
2010 Opcode for AFSVolPartitionInfo()
2017 Opcode for AFSVolReClone()
2024 Opcode for AFSVolListOneVolume()
2031 Opcode for AFSVolNukeVolume()
2038 Opcode for AFSVolSetDate()
2040 \subsection sec5-2-3 Section 5.2.3: Transaction Flags
These constants define the various flags the Volume Server uses in association
with volume transactions, keeping track of volumes upon which operations are
currently proceeding. There are three sets of flag values, stored in three
different fields within a struct volser_trans: general volume state, attachment
modes, and specific transaction states.
\subsubsection sec5-2-3-1 Section 5.2.3.1: vflags
2052 These values are used to represent the general state of the associated volume.
They appear in the vflags field within a struct volser_trans.
2060 The volume should be deleted on next salvage.
2067 This volume should never be put online.
2074 This volume has been deleted (via AFSVolDeleteVolume()), and thus should not
2077 \subsubsection sec5-2-3-2 Section 5.2.3.2: iflags
2080 These constants represent the desired attachment mode for a volume at the start
2081 of a transaction. Once attached, the volume header is marked to reflect this
2082 mode. Attachment modes are useful in salvaging partitions, as they indicate
2083 whether the operations being performed on individual volumes at the time the
2084 crash occurred could have introduced inconsistencies in their metadata
2085 descriptors. If a volume was attached in a read-only fashion, then the salvager
2086 may decide (taking other factors into consideration) that the volume doesn't
2087 need attention as a result of the crash.
2089 These values appear in the iflags field within a struct volser_trans.
2096 Volume offline on server (returns VOFFLINE).
2103 Volume busy on server (returns VBUSY).
2110 Volume is read-only on client, read-write on server -DO NOT USE.
2117 Volume does not exist correctly yet.
2126 \subsubsection sec5-2-3-3 Section 5.2.3.3: tflags
2129 This value is used to represent the transaction state of the associated volume,
2130 and appears in the tflags field within a struct volser_trans.
2137 Delete transaction not yet freed due to high reference count.
2139 \subsection sec5-2-4 Section 5.2.4: Volume Types
2142 The following constants may be supplied as values for the type argument to the
2143 AFSVolCreateVolume() interface call. They are just synonyms for the three
2144 values RWVOL, ROVOL,
2151 Specifies a read-write volume type.
2158 Specifies a read-only volume type.
2165 Specifies a backup volume type.
2167 \subsection sec5-2-5 Section 5.2.5: LWP State
2170 This set of exported definitions refers to objects internal to the Volume
2171 Server, and strictly speaking should not be visible to other agents.
2172 Specifically, a busyFlags array keeps a set of flags referenced by the set of
2173 lightweight threads running within the Volume Server. These flags reflect and
2174 drive the state of each of these worker LWPs.
2181 Volume Server LWP is idle, waiting for new work.
2188 A work item has been queued.
2190 \subsection sec5-2-6 Section 5.2.6: States for struct vldbentry
2193 The Volume Server defines a collection of synonyms for certain values defined
2194 by the Volume Location Server. These particular constants are used within the
2195 flags field in objects of type struct vldbentry. The equivalent Volume Location
2196 Server values are described in Section 3.2.6.
2203 Synonym for VLF_RWEXISTS.
2210 Synonym for VLF_ROEXISTS.
2217 Synonym for VLF_BACKEXISTS.
2224 Synonym for VLSF_NEWREPSITE.
2231 Synonym for VLSF_ROVOL.
2238 Synonym for VLSF_RWVOL.
2245 Synonym for VLSF_BACKVOL.
2247 \subsection sec5-2-7 Section 5.2.7: Validity Checks
2250 These values are used for performing validity checks. The first one appears
2251 only within the partFlags field within objects of type partList (see Section
2252 5.4.3). The rest (except VOK and VBUSY) appear in the volFlags field within an
2253 object of type struct volDescription. These latter definitions are used within
2254 the volFlags field to mark whether the rest of the fields within the struct
2255 volDescription are valid. Note that while several constants are defined, only
2256 some are actually used internally by the Volume Server code.
2263 The indicated partition is valid.
2270 The indicated clone (field volCloneId) is a valid one.
2277 The indicated clone volume (field volCloneId) has been deleted.
2284 The indicated volume ID (field volId) is valid.
2291 The indicated volume name (field volName) is valid. Not used internally by the
2299 The indicated volume size (field volSize) is valid. Not used internally by the
2307 The struct volDescription refers to a valid volume.
2314 The indicated clone ID (field volCloneId) should be reused.
2321 Used in the status field of struct volintInfo to show that everything is OK.
2328 Used in the status field of struct volintInfo to show that the volume is
2331 \subsection sec5-2-8 Section 5.2.8: Miscellaneous
2334 This section covers the set of exported Volume Server definitions that don't
2335 easily fall into the above categories.
2342 Not used internally by the Volume Server; used as a maximum size for internal
2350 Size of an internal Volume Server character array (busyFlags), it marks the
2351 maximum number of threads within the server.
2358 Synonym for the unix standard input file descriptor.
2365 Synonym for the unix standard output file descriptor.
2367 \section sec5-3 Section 5.3: Exported Variables
2370 This section describes the single variable that the Volume Server exports to
2373 The QI_GlobalWriteTrans exported variable represents a pointer to the head of
2374 the global queue of transaction structures for operations being handled by a
2375 Volume Server. Each object in this list is of type struct volser_trans (see
2376 Section 5.4.1 below).
2378 \section sec5-4 Section 5.4: Structures and Typedefs
2381 This section describes the major exported Volume Server data structures of
2382 interest to application programmers, along with some of the typedefs based
2383 on those structures. Please note that typedefs in whose definitions angle
2384 brackets appear are those fed through the Rxgen RPC stub generator. Rxgen uses
2385 these angle brackets to specify an array of indefinite size.
2387 \subsection sec5-4-1 Section 5.4.1: struct volser_trans
2390 This structure defines the transaction record for all volumes upon which an
2391 active operation is proceeding.
2393 \li struct volser_trans *next - Pointer to the next transaction structure in
2395 \li long tid - Transaction ID.
2396 \li long time - The time this transaction was last active, for timeout
2398 \li This is the standard unix time format.
2399 \li long creationTime - The time at which this transaction started.
2400 \li long returnCode - The overall transaction error code.
2401 \li struct Volume *volume - Pointer to the low-level object describing the
2402 associated volume. This is included here for the use of lower-level support
2404 \li long volid - The associated volume's numerical ID.
2405 \li long partition - The partition on which the given volume resides.
2406 \li long dumpTransId - Not used.
2407 \li long dumpSeq - Not used.
2408 \li short refCount - Reference count on this structure.
2409 \li short iflags - Initial attach mode flags.
2410 \li char vflags - Current volume status flags.
2411 \li char tflags - Transaction flags.
2412 \li char incremental - If non-zero, indicates that an incremental restore
2413 operation should be performed.
2414 \li char lastProcName[] - Name of the last internal Volume Server procedure
2415 that used this transaction. This field may be up to 30 characters long,
2416 including the trailing null, and is intended for debugging purposes only.
2417 \li struct rx_call *rxCallPtr - Pointer to the latest associated rx_call. This
2418 field is intended for debugging purposes only.
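The field list above can be collected into a C declaration. The following is a sketch reconstructed only from the fields and sizes named in this section, not a copy of the Volume Server's own header. The struct Volume and struct rx_call types are left opaque, and the find_trans() helper is a hypothetical name added for illustration.

```c
#include <stddef.h>

struct Volume;   /* opaque here: low-level volume object */
struct rx_call;  /* opaque here: Rx call descriptor */

/* Sketch of the transaction record, following the field list above. */
struct volser_trans {
    struct volser_trans *next;   /* next transaction structure */
    long tid;                    /* transaction ID */
    long time;                   /* last active time (unix format) */
    long creationTime;           /* time the transaction started */
    long returnCode;             /* overall transaction error code */
    struct Volume *volume;       /* low-level volume object */
    long volid;                  /* associated volume's numerical ID */
    long partition;              /* partition holding the volume */
    long dumpTransId;            /* not used */
    long dumpSeq;                /* not used */
    short refCount;              /* reference count on this structure */
    short iflags;                /* initial attach mode flags */
    char vflags;                 /* current volume status flags */
    char tflags;                 /* transaction flags */
    char incremental;            /* non-zero: incremental restore */
    char lastProcName[30];       /* debugging only */
    struct rx_call *rxCallPtr;   /* debugging only */
};

/* Hypothetical helper: walk a chain of records looking for a transaction ID. */
static struct volser_trans *find_trans(struct volser_trans *head, long tid)
{
    struct volser_trans *tt;
    for (tt = head; tt != NULL; tt = tt->next) {
        if (tt->tid == tid)
            return tt;
    }
    return NULL;
}
```

A caller holding the head of such a chain could use find_trans() to locate the record for a given transaction ID before inspecting its flags.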
2420 \subsection sec5-4-2 Section 5.4.2: struct volDescription
2423 This structure is used by the AFS backup system to group certain key fields of
2426 \li char volName[] - The name of the given volume; maximum length of this string
2427 is VOLSER_MAXVOLNAME characters, including the trailing null.
2428 \li long volId - The volume's numerical ID.
2429 \li int volSize - The size of the volume, in bytes.
2430 \li long volFlags - Keeps validity information on the given volume and its
2431 clones. This field takes on values from the set defined in Section 5.2.7.
2432 \li long volCloneId - The volume's current clone ID.
2434 \subsection sec5-4-3 Section 5.4.3: struct partList
2437 This structure is used by the backup system and the vos tool to keep track of
2438 the state of the AFS disk partitions on a given server.
2440 \li long partId[] - Set of 26 partition IDs.
2441 \li long partFlags[] - Set to PARTVALID if the associated partition slot
2442 corresponds to a valid partition. There are 26 entries in this array.
2444 \subsection sec5-4-4 Section 5.4.4: struct volser_status
2447 This structure holds the status of a volume as it is known to the Volume
2448 Server, and is passed to clients through the AFSVolGetStatus() interface call.
2450 Two fields appearing in this structure, accessDate and updateDate, deserve a
2451 special note. In particular, it is important to observe that these fields are
2452 not kept in full synchrony with reality. When a File Server provides one of its
2453 client Cache Managers with a chunk of a file on which to operate, it is
2454 incapable of determining exactly when the data in that chunk is accessed, or
2455 exactly when it is updated. This is because the manipulations occur on the
2456 client machine, without any information on these accesses or updates passed
2457 back to the server. The only time these fields can be modified is when the
2458 chunk of a file resident within the given volume is delivered to a client (in
2459 the case of accessDate), or when a client writes back a dirty chunk to the File
2460 Server (in the case of updateDate).
2462 \li long volID - The volume's numerical ID, unique within the cell.
2463 \li long nextUnique - Next value to use for a vnode uniquifier within this
2465 \li int type - Basic volume class, one of RWVOL, ROVOL, or BACKVOL.
2466 \li long parentID - Volume ID of the parent, if this volume is of type ROVOL or
2468 \li long cloneID - ID of the latest read-only clone, valid iff the type field
2470 \li long backupID - Volume ID of the latest backup of this read-write volume.
2471 \li long restoredFromID - The volume ID contained in the dump from which this
2472 volume was restored. This field is used to simply make sure that an incremental
2473 dump is not restored on top of something inappropriate. Note that this field
2474 itself is not dumped.
2475 \li long maxQuota - The volume's maximum quota, in 1Kbyte blocks.
2476 \li long minQuota - The volume's minimum quota, in 1Kbyte blocks.
2477 \li long owner - The user ID of the person responsible for this volume.
2478 \li long creationDate - For a volume of type RWVOL, this field marks its
2479 creation date. For the original copy of a clone, this field represents the
2481 \li long accessDate - Last access time by a user for this volume. This value is
2482 expressed as a standard unix longword date quantity.
2483 \li long updateDate - Last modification time by a user for this volume. This
2484 value is expressed as a standard unix longword date quantity.
2485 \li long expirationDate - Expiration date for this volume. If the volume never
2486 expires, then this field is set to zero.
2487 \li long backupDate - The last time a backup clone was created for this volume.
2488 \li long copyDate - The time that this copy of this volume was created.
2490 \subsection sec5-4-5 Section 5.4.5: struct destServer
2493 Used to specify the destination server in an AFSVolForward() invocation (see
2496 \li long destHost - The IP address of the destination server.
2497 \li long destPort - The UDP port for the Volume Server Rx service there.
2498 \li long destSSID - Currently, this field is always set to 1.
2500 \subsection sec5-4-6 Section 5.4.6: struct volintInfo
2503 This structure is used to communicate volume information to the Volume Server's
2504 RPC clients. It is used to build the volEntries object, which appears as a
2505 parameter to the AFSVolListVolumes() call.
2507 The comments in Section 5.4.4 concerning the accessDate and updateDate fields
2508 are equally valid for the analogue fields in this structure.
2510 \li char name[] - The null-terminated name for the volume, which can be no
2511 longer than VNAMESIZE (32) characters, including the trailing null.
2512 \li long volid - The volume's numerical ID.
2513 \li long type - The volume's basic class, one of RWVOL, ROVOL, or BACKVOL.
2514 \li long backupID - The latest backup volume's ID.
2515 \li long parentID - The parent volume's ID.
2516 \li long cloneID - The latest clone volume's ID.
2517 \li long status - Status of the volume; may be one of VOK or VBUSY.
2518 \li long copyDate - The time that this copy of this volume was created.
2519 \li unsigned char inUse - If non-zero, an indication that this volume is
2521 \li unsigned char needsSalvaged - If non-zero, an indication that this volume
2522 needs to be salvaged.
2523 \li unsigned char destroyMe - If non-zero, an indication that this volume
2524 should be destroyed.
2525 \li long creationDate - Creation date for a read/write volume; cloning date for
2526 the original copy of a read-only volume.
2527 \li long accessDate - Last access time by a user for this volume.
2528 \li long updateDate - Last modification time by a user for this volume.
2529 \li long backupDate - Last time a backup copy was made of this volume.
2530 \li int dayUse - Number of times this volume was accessed since midnight of the
2532 \li int filecount - The number of file system objects contained within the
2534 \li int maxquota - The upper limit on the number of 1-Kbyte disk blocks of
2535 storage that this volume may obtain.
2536 \li int size - Not known.
2537 \li long flags - Values used by the backup system are stored here.
2538 \li long spare1 through spare3 - Spare fields, reserved for future use.
2540 \subsection sec5-4-7 Section 5.4.7: struct transDebugInfo
2543 This structure is provided for monitoring and debugging purposes. It is used to
2544 compose the transDebugEntries variable-sized object, which in turn appears as a
2545 parameter to the AFSVolMonitor() interface call.
2547 \li long tid - The transaction ID.
2548 \li long time - The time when the transaction was last active, for timeout
2550 \li long creationTime - The time the transaction started.
2551 \li long returnCode - The overall transaction error code.
2552 \li long volid - The open volume's ID.
2553 \li long partition - The open volume's partition.
2554 \li short iflags - Initial attach mode flags (IT*).
2555 \li char vflags - Current volume status flags (VT*).
2556 \li char tflags - Transaction flags (TT*).
2557 \li char lastProcName[] - The string name of the last procedure which used this
2558 transaction. This field may be up to 30 characters long, including the trailing
2559 null, and is intended for debugging purposes only.
2560 \li int callValid - Flag which determines if the following fields are valid.
2561 \li long readNext - Sequence number of the next Rx packet to be read.
2562 \li long transmitNext - Sequence number of the next Rx packet to be
2564 \li int lastSendTime - The last time anything was sent over the wire for this
2566 \li int lastReceiveTime - The last time anything was received over the wire for
2569 \subsection sec5-4-8 Section 5.4.8: struct pIDs
2572 Used by the AFSVolListPartitions() interface call, this structure stores
2573 information on all of the partitions on a given Volume Server.
2575 \li long partIds[] - One per letter of the alphabet (/vicepa through /vicepz).
2576 Filled with 0 for "/vicepa", 25 for "/vicepz". Invalid partition slots are
2577 filled in with a -1.
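The slot convention described above (0 for /vicepa through 25 for /vicepz, with -1 marking an invalid slot) can be sketched in C as follows. The NUM_PARTITION_SLOTS constant and the two helper functions are hypothetical names introduced for this illustration; only the partIds field and its encoding come from the text.

```c
#include <stdio.h>

#define NUM_PARTITION_SLOTS 26   /* hypothetical name: one slot per letter a..z */

struct pIDs {
    long partIds[NUM_PARTITION_SLOTS];
};

/* Build "/vicepX" for a partition ID in 0..25.
 * Returns 0 on success, -1 if the ID or the buffer is unsuitable. */
static int partition_name(long partId, char *buf, size_t buflen)
{
    if (partId < 0 || partId >= NUM_PARTITION_SLOTS || buflen < 8)
        return -1;
    snprintf(buf, buflen, "/vicep%c", (char)('a' + partId));
    return 0;
}

/* Count the slots that describe a valid partition (anything other than -1). */
static int count_valid_partitions(const struct pIDs *p)
{
    int i, n = 0;
    for (i = 0; i < NUM_PARTITION_SLOTS; i++) {
        if (p->partIds[i] != -1)
            n++;
    }
    return n;
}
```

A client that has received a struct pIDs could loop over the slots, skip those holding -1, and pass each remaining value to partition_name() to obtain the conventional mount name.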
2579 \subsection sec5-4-9 Section 5.4.9: struct diskPartition
2582 This structure contains information regarding an individual AFS disk partition.
2583 It is returned as a parameter to the AFSVolPartitionInfo() call.
2585 \li char name[] - Mounted partition name, up to 32 characters long including the
2587 \li char devName[] - Device name on which the partition lives, up to 32
2588 characters long including the trailing null.
2589 \li int lock_fd - A lock used for mutual exclusion to the named partition. A
2590 value of -1 indicates the lock is not currently being held. Otherwise, it has
2591 the file descriptor resulting from the unix open() call on the file specified
2592 in the name field used to "acquire" the lock.
2593 \li int totalUsable - The number of blocks within the partition which are
2595 \li int free - The number of free blocks in the partition.
2596 \li int minFree - The minimum number of blocks that must remain free regardless
2597 of allocation requests.
2599 \subsection sec5-4-10 Section 5.4.10: struct restoreCookie
2602 Used as a parameter to both AFSVolRestore() and AFSVolForward(), a restoreCookie
2603 keeps information that must be preserved between various Volume Server
2606 \li char name[] - The volume name, up to 32 characters long including the
2608 \li long type - The volume type, one of RWVOL, ROVOL, and BACKVOL.
2609 \li long clone - The current read-only clone ID for this volume.
2610 \li long parent - The parent ID for this volume.
2612 \subsection sec5-4-11 Section 5.4.11: transDebugEntries
2615 typedef transDebugInfo transDebugEntries<>;
2618 This typedef is used to generate a variable-length object which is passed as a
2619 parameter to the AFSVolMonitor() interface function. Thus, it may carry any
2620 number of descriptors for active transactions on the given Volume Server.
2621 Specifically, it causes a C structure of the same name to be defined with the
2624 \li u_int transDebugEntries_len - The number of struct transDebugInfo (see
2625 Section 5.4.7) objects appearing at the memory location pointed to by the
2626 transDebugEntries_val field.
2627 \li transDebugInfo *transDebugEntries_val - A pointer to a region of memory
2628 containing an array of transDebugEntries_len objects of type struct
2631 \subsection sec5-4-12 Section 5.4.12: volEntries
2634 typedef volintInfo volEntries<>;
2637 This typedef is used to generate a variable-length object which is passed as a
2638 parameter to AFSVolListVolumes(). Thus, it may carry any number of descriptors
2639 for volumes on the given Volume Server. Specifically, it causes a C structure
2640 of the same name to be defined with the following fields:
2642 \li u_int volEntries_len - The number of struct volintInfo (see Section 5.4.6)
2643 objects appearing at the memory location pointed to by the volEntries_val
2645 \li volintInfo *volEntries_val - A pointer to a region of memory containing an
2646 array of volEntries_len objects of type struct volintInfo.
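The length/pointer pair that Rxgen derives from the angle-bracket typedef can be pictured with the sketch below. The volintInfo shown here is deliberately abridged to three of the fields listed in Section 5.4.6, and the two helper functions are hypothetical names added for illustration; the generated code uses u_int for the length field.

```c
#include <stddef.h>

/* Abridged stand-in for struct volintInfo: only the fields used below. */
typedef struct volintInfo {
    char name[32];
    long volid;
    unsigned char needsSalvaged;
} volintInfo;

/* Shape of the structure derived from "typedef volintInfo volEntries<>;". */
typedef struct volEntries {
    unsigned int volEntries_len;  /* number of elements (u_int in generated code) */
    volintInfo *volEntries_val;   /* array of volEntries_len elements */
} volEntries;

/* Hypothetical helper: find the descriptor for a given volume ID, or NULL. */
static const volintInfo *find_volume(const volEntries *e, long volid)
{
    unsigned int i;
    for (i = 0; i < e->volEntries_len; i++) {
        if (e->volEntries_val[i].volid == volid)
            return &e->volEntries_val[i];
    }
    return NULL;
}

/* Hypothetical helper: count the volumes flagged as needing a salvage. */
static unsigned int count_needing_salvage(const volEntries *e)
{
    unsigned int i, n = 0;
    for (i = 0; i < e->volEntries_len; i++) {
        if (e->volEntries_val[i].needsSalvaged)
            n++;
    }
    return n;
}
```

The same pattern applies to transDebugEntries, with transDebugInfo as the element type.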
2648 \section sec5-5 Section 5.5: Error Codes
2651 The Volume Server advertises two groups of error codes. The first set consists
2652 of the standard error codes defined by the package. The second is a collection
2653 of lower-level return values which are exported here for convenience.
2660 internal error releasing transaction.
2667 unknown internal error.
2670 VOLSERREAD_DUMPERROR
2674 badly formatted dump.
2681 badly formatted dump(2).
2688 could not attach volume.
2691 VOLSERILLEGAL_PARTITION
2702 could not detach volume.
2709 insufficient privilege for volume operation.
2716 error from volume location database.
2737 illegal volume operation.
2744 volume release failed.
2751 volume still in use by volserver.
2758 out of virtual memory in volserver.
2772 more than one read/write volume.
2779 failed volume server operation.
2781 \subsection sec5-5-1 Section 5.5.1: Standard
2784 The error codes described in this section were defined by the Volume Server to
2785 describe exceptional conditions arising in the course of RPC call handling.
2787 \subsection sec5-5-2 Section 5.5.2: Low-Level
2790 These error codes are duplicates of those defined by a package which is
2791 internal to the Volume Server. They are re-defined here to make them visible to
2792 Volume Server clients.
2799 Volume needs to be salvaged.
2806 Bad vnode number encountered.
2813 The given volume is either not attached, doesn't exist, or is not online.
2820 The given volume already exists.
2827 The volume is currently not in service.
2834 The specified volume is offline, for the reason given in the offline message
2835 field (a subfield within the volume field in struct volser_trans).
2842 Volume is already online.
2849 The disk partition is full.
2856 The given volume's maximum quota, as expressed in the maxQuota field of the
2857 struct volintInfo, has been exceeded.
2864 The named volume is temporarily unavailable, and the client is encouraged to
2865 retry the operation shortly.
2872 The given volume has moved to a new server.
2875 The VICE_SPECIAL_ERRORS constant is defined to be the lowest of these error
2878 \section sec5-6 Section 5.6: Macros
2881 The Volume Server defines a small number of macros, as described in this
2883 \subsection sec5-6-1 Section 5.6.1: THOLD()
2886 #define THOLD(tt) ((tt)->refCount++)
2889 This macro is used to increment the reference count field, refCount, in an
2890 object of type struct volser_trans. Thus, the associated transaction is
2891 effectively "held", ensuring it won't be garbage-collected. The counterpart to
2892 this operation, TRELE(), is implemented by the Volume Server as a function.
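A minimal sketch of how the macro behaves is given below. The struct is abridged to the one field the macro touches, and release_ref() is a hypothetical stand-in for the releasing side, written only to make the example complete; it is not the Volume Server's TRELE() function, whose behavior is not reproduced here.

```c
#define THOLD(tt) ((tt)->refCount++)

/* Abridged: only the field that THOLD() manipulates. */
struct volser_trans {
    short refCount;
};

/* Hypothetical stand-in for the releasing side.
 * Returns the remaining reference count, or -1 if there was nothing to release. */
static int release_ref(struct volser_trans *tt)
{
    if (tt->refCount <= 0)
        return -1;
    tt->refCount--;
    return tt->refCount;
}
```

Each THOLD() must eventually be balanced by a release; a record whose count never returns to zero cannot be reclaimed.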
2894 \subsection sec5-6-2 Section 5.6.2: ISNAMEVALID()
2897 #define ISNAMEVALID(name) (strlen(name) < (VOLSER_OLDMAXVOLNAME -9))
2900 This macro checks whether the given name argument is of legal length. The name
2901 must be strictly shorter than the size of the container, which is at most
2902 VOLSER_OLDMAXVOLNAME characters, minus the length of the longest standardized volume
2903 name postfix known to the system. That postfix is the 9-character .restored
2904 string, which is tacked on to the name of a volume that has been restored from
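The length arithmetic can be checked with a short sketch. The value 32 for VOLSER_OLDMAXVOLNAME is an assumption made for this example (the constant's value is not stated in this section), and make_restored_name() is a hypothetical helper showing why the 9-character allowance matters.

```c
#include <string.h>

#define VOLSER_OLDMAXVOLNAME 32   /* assumed value, for illustration only */
#define ISNAMEVALID(name) (strlen(name) < (VOLSER_OLDMAXVOLNAME - 9))

/* Hypothetical helper: append ".restored" to a name that passes the check.
 * Returns 0 on success, -1 if the name is too long to carry the postfix. */
static int make_restored_name(const char *name, char out[VOLSER_OLDMAXVOLNAME])
{
    if (!ISNAMEVALID(name))
        return -1;
    strcpy(out, name);          /* fewer than 23 characters under the assumption */
    strcat(out, ".restored");   /* 9 more characters, plus the trailing null */
    return 0;
}
```

Under the assumed size, the longest accepted name has 22 characters, so the name plus postfix plus trailing null fills the 32-byte container exactly.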
2907 \section sec5-7 Section 5.7: Functions
2910 This section covers the Volume Server RPC interface routines, defined by and
2911 generated from the volint.xg Rxgen file. The following is a summary of the
2912 interface functions and their purpose:
2927 Obliterate a volume completely.
2932 Dump (i.e., save) the contents of a volume.
2937 Show intention to call AFSVolRestore().
2942 Recreate a volume from a dump.
2947 Dump a volume, then restore to a given server and volume.
2952 Clone (and optionally purge) a volume.
2962 Set forwarding info for a moved volume.
2967 Create transaction for a [volume, partition].
2977 Get volume flags for a transaction.
2982 Set volume flags for a transaction.
2987 Get the volume name associated with a transaction.
2992 Get status of a transaction/volume.
2997 Set header info for a volume.
3002 Set creation date in a volume.
3005 AFSVolListPartitions
3007 Return a list of AFS partitions on a server.
3012 Get partition information.
3017 Return a list of volumes on the server.
3022 Return header info for a single volume.
3027 Get volume header given its index.
3032 Collect server transaction state.
3036 There are two general comments that apply to most of the Volume Server
3038 \li 1. AFS partitions are identified by integers ranging from 0 to 25,
3039 corresponding to the letters "a" through "z". By convention, AFS partitions are
3040 named /vicepx, where x is any lower-case letter.
3041 \li 2. Legal volume types to pass as parameters are RWVOL, ROVOL, and BACKVOL,
3042 as defined in Section 3.2.4.
3044 \subsection sec5-7-1 Section 5.7.1: AFSVolCreateVolume - Create a
3048 int AFSVolCreateVolume(IN struct rx_connection *z_conn,
3057 Create a volume named name, with numerical identifier volid, and of type type.
3058 The new volume is to be placed in the specified partition for the server
3059 machine as identified by the Rx connection information pointed to by z_conn. If
3060 a value of 0 is provided for the parent argument, it will be set by the Volume
3061 Server to the value of volid itself. The trans parameter is set to the Volume
3062 Location Server transaction ID corresponding to the volume created by this
3063 call, if successful.
3064 The numerical volume identifier supplied in the volid parameter must be
3065 generated beforehand by calling VL_GetNewVolumeID() (see Section 3.6.5). After
3066 AFSVolCreateVolume() completes correctly, the new volume is marked as offline.
3067 It must be explicitly brought online through a call to AFSVolSetFlags() (see
3068 Section 5.7.14) while passing the trans transaction ID generated by
3069 AFSVolCreateVolume(). The "hold" on the new volume guaranteed by the trans
3070 transaction may be "released" by calling AFSVolEndTrans(). Until then, no
3071 other process may operate on the volume.
3072 Upon creation, a volume's maximum quota (as specified in the maxquota field of
3073 a struct volintInfo) is set to 5,000 1-Kbyte blocks.
3074 Note that the AFSVolCreateVolume() routine is the only Volume Server function
3075 that manufactures its own transaction. All others must have already acquired a
3076 transaction ID via either a previous call to AFSVolCreateVolume() or
3077 AFSVolTransCreate().
3079 VOLSERBADNAME The volume name parameter was longer than 31 characters plus the
3081 \n VOLSERBAD_ACCESS The caller is not authorized to create a volume.
3082 \n EINVAL The type parameter was illegal. \n E2BIG A value of 0 was provided in
3083 the volid parameter. \n VOLSERVOLBUSY A transaction could not be created, thus the
3084 given volume was busy.
3085 \n EIO The new volume entry could not be created.
3086 \n VOLSERTRELE_ERROR The trans transaction's reference count could not be
3087 dropped to the proper level.
3088 \n <misc> If the partition parameter is unintelligible, this routine will
3089 return a low-level unix error.
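The life cycle described above (the volume is created offline and held by a transaction, then brought online, then released) can be modeled with a small self-contained simulation. Everything in this sketch is hypothetical: the sim_ functions and struct sim_volume are stand-ins that only mirror the documented state changes, and they do not reproduce the argument lists of the real RPC routines.

```c
#include <string.h>

#define DEFAULT_MAXQUOTA 5000   /* 1-Kbyte blocks, per the text above */

/* Hypothetical in-memory model of one volume; not a Volume Server structure. */
struct sim_volume {
    char name[32];
    long volid;
    long parent;
    int maxquota;
    int online;   /* 0 right after creation */
    int held;     /* non-zero while the creating transaction is open */
};

/* Models creation: rejects over-long names and a zero volume ID. */
static int sim_create(struct sim_volume *v, const char *name,
                      long volid, long parent)
{
    if (strlen(name) > 31)
        return -1;                       /* stands in for VOLSERBADNAME */
    if (volid == 0)
        return -1;                       /* stands in for E2BIG */
    strcpy(v->name, name);
    v->volid = volid;
    v->parent = (parent == 0) ? volid : parent;  /* 0 means "use volid" */
    v->maxquota = DEFAULT_MAXQUOTA;
    v->online = 0;
    v->held = 1;
    return 0;
}

/* Models the explicit step that brings the new volume online. */
static int sim_bring_online(struct sim_volume *v)
{
    if (!v->held)
        return -1;   /* in this model, an open transaction is required */
    v->online = 1;
    return 0;
}

/* Models ending the transaction, which releases the hold. */
static void sim_end_trans(struct sim_volume *v)
{
    v->held = 0;
}
```

A caller would run sim_create(), then sim_bring_online(), then sim_end_trans(), matching the order in which the text introduces AFSVolCreateVolume(), AFSVolSetFlags(), and AFSVolEndTrans().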
3091 \subsection sec5-7-2 Section 5.7.2: AFSVolDeleteVolume - Delete a
3095 int AFSVolDeleteVolume(IN struct rx_connection *z_conn, IN long trans)
3098 Delete the volume associated with the open transaction ID specified within
3099 trans. All of the file system objects contained within the given volume are
3100 destroyed, and the on-disk volume metadata structures are reclaimed. In
3101 addition, the in-memory volume descriptor's vflags field is set to VTDeleted,
3102 indicating that it has been deleted.
3104 Under some circumstances, a volume should be deleted by calling
3105 AFSVolNukeVolume() instead of this routine. See Section 5.7.3 for more details.
3107 VOLSERBAD_ACCESS The caller is not authorized to delete a volume.
3108 \n ENOENT The trans transaction was not found.
3109 \n VOLSERTRELE_ERROR The trans transaction's reference count could not be
3110 dropped to the proper level.
3112 \subsection sec5-7-3 Section 5.7.3: AFSVolNukeVolume - Obliterate a
3116 int AFSVolNukeVolume(IN struct rx_connection *z_conn,
3121 Completely obliterate the volume living on partition partID whose ID is volID.
3122 This involves scanning all inodes on the given partition and removing those
3123 marked with the specified volID. If the volume is a read-only clone, only the
3124 header inodes are removed, since they are the only ones stamped with the
3125 read-only ID. To reclaim the space taken up by the actual data referenced
3126 through a read-only clone, this routine should be called on the read-write
3127 master. Note that calling AFSVolNukeVolume() on a read-write volume effectively
3128 destroys all the read-only volumes cloned from it, since everything except for
3129 their indices to the (now-deleted) data will be gone.
3131 Under normal circumstances, it is preferable to use AFSVolDeleteVolume()
3132 instead of AFSVolNukeVolume() to delete a volume. The former is much more
3133 efficient, as it only touches those objects in the partition that belong to the
3134 named volume, walking the on-disk volume metadata structures. However,
3135 AFSVolNukeVolume() must be used in situations where the volume metadata
3136 structures are known to be damaged. Since a complete scan of all inodes in the
3137 partition is performed, all disconnected or unreferenced portions of the given
3138 volume will be reclaimed.
3140 VOLSERBAD_ACCESS The caller is not authorized to call this routine.
3141 \n VOLSERNOVOL The partition specified by the partID argument is illegal.
3143 \subsection sec5-7-4 Section 5.7.4: AFSVolDump - Dump (i.e., save) the
3144 contents of a volume
3147 int AFSVolDump(IN struct rx_connection *z_conn,
3152 Generate a canonical dump of the contents of the volume associated with
3153 transaction fromTrans as of calendar time fromDate. If the given fromDate is
3154 zero, then a full dump will be carried out. Otherwise, the resulting dump will
3155 be an incremental one.
3157 This is specified as a split function within the volint.xg Rxgen interface
3158 file. This specifies that two routines are generated, namely StartAFSVolDump()
3159 and EndAFSVolDump(). The former is used to marshall the IN arguments, and the
3160 latter is used to unmarshall the return value of the overall operation. The
3161 actual dump data appears in the Rx stream for the call (see the section
3162 entitled Example Server and Client in the companion AFS-3 Programmer's
3163 Reference: Specification for the Rx Remote Procedure Call Facility document).
3165 VOLSERBAD_ACCESS The caller is not authorized to dump a volume.
3166 \n ENOENT The fromTrans transaction was not found.
3167 \n VOLSERTRELE_ERROR The trans transaction's reference count could not be
3168 dropped to the proper level.
3170 \subsection sec5-7-5 Section 5.7.5: AFSVolSignalRestore - Show
3171 intention to call AFSVolRestore()
3174 int AFSVolSignalRestore(IN struct rx_connection *z_conn,
3181 Show an intention to the Volume Server that the client will soon call
3182 AFSVolRestore(). The parameters, namely the volume name, type, parent ID pid
3183 and clone ID cloneid are stored in a well-known set of global variables. These
3184 values are used to set the restored volume's header, overriding those values
3185 present in the dump from which the volume will be resurrected.
3187 VOLSERBAD_ACCESS The caller is not authorized to call this routine.
3188 \n VOLSERBADNAME The volume name contained in name was longer than 31
3189 characters plus the trailing null.
3191 \subsection sec5-7-6 Section 5.7.6: AFSVolRestore - Recreate a volume
3195 int AFSVolRestore(IN struct rx_connection *z_conn,
3198 IN struct restoreCookie *cookie)
3201 Interpret a canonical volume dump (generated as the result of calling
3202 AFSVolDump()), passing it to the volume specified by the toTrans
3203 transaction. Only the low bit in the flags argument is inspected. If this low
3204 bit is turned on, the dump will be restored as incremental; otherwise, a full
3205 restore will be carried out.
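The rule that only the low bit of flags matters can be expressed as a one-line test. The function name below is hypothetical and exists only to illustrate the bit check; it is not part of the Volume Server interface.

```c
/* Hypothetical helper: only bit 0 of the flags argument is consulted.
 * Returns non-zero for an incremental restore, zero for a full restore. */
static int restore_is_incremental(long flags)
{
    return (flags & 1L) != 0;
}
```

Any other bits set in flags leave the outcome unchanged, so values such as 1 and 3 both request an incremental restore, while 0 and 2 both request a full one.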
3207 All callbacks to the restored volume are broken.
3209 This is specified as a split function within the volint.xg Rxgen interface
3210 file. This specifies that two routines are generated, namely
3211 StartAFSVolRestore() and EndAFSVolRestore(). The former is used to marshall
3212 the IN arguments, and the latter is used to unmarshall the return value of the
3213 overall operation. The actual dump data flows over the Rx stream for the call
3214 (see the section entitled Example Server and Client in the companion AFS-3
3215 Programmer's Reference: Specification for the Rx Remote Procedure Call Facility
3218 The AFSVolSignalRestore() routine (see Section 5.7.5) should be called before
3219 invoking this function in order to signal the intention to restore a particular
3222 VOLSERREAD_DUMPERROR Dump data being restored is corrupt.
3223 \n VOLSERBAD_ACCESS The caller is not authorized to restore a volume.
3224 \n ENOENT The toTrans transaction was not found.
3225 \n VOLSERTRELE_ERROR The trans transaction's reference count could not be
3226 dropped to the proper level.
3228 \subsection sec5-7-7 Section 5.7.7: AFSVolForward - Dump a volume, then
3229 restore to given server and volume
3232 int AFSVolForward(IN struct rx_connection *z_conn,
3235 IN struct destServer *destination,
3237 IN struct restoreCookie *cookie)
3240 Dumps the volume associated with transaction fromTrans from the given fromDate.
3241 The dump itself is sent to the server described by destination, where it is
3242 restored as the volume associated with transaction destTrans. In reality, an Rx
3243 connection is set up to the destServer, StartAFSVolRestore() directs writing to
3244 the Rx call's stream, and then EndAFSVolRestore() is used to deliver the dump
3245 for the volume corresponding to fromTrans. If a non-zero fromDate is provided,
3246 then the dump will be incremental from that date. Otherwise, a full dump will
3249 The Rx connection set up for this task is always destroyed before the function
3250 returns. The destination volume should exist before carrying out this
3251 operation, and the invoking process should have started transactions on both
3252 participating volumes.
3254 VOLSERBAD_ACCESS The caller is not authorized to forward a volume.
3255 \n ENOENT The fromTrans transaction was not found.
3256 \n VOLSERTRELE_ERROR The trans transaction's reference count could not be
3257 dropped to the proper level.
3258 \n ENOTCONN An Rx connection to the destination server could not be
3261 \subsection sec5-7-8 Section 5.7.8: AFSVolClone - Clone (and optionally
3265 int AFSVolClone(IN struct rx_connection *z_conn,
3273 Make a clone of the read-write volume associated with transaction trans, giving
3274 the cloned volume a name of newName. The newType parameter specifies the type
3275 for the new clone, and may be either ROVOL or BACKVOL. If purgeVol is set to a
3276 non-zero value, then that volume will be purged during the clone operation.
3277 This may be more efficient than separate clone and purge calls when making
3278 backup volumes. The newVol parameter sets the new clone's ID. It is illegal to
3279 pass a zero in newVol.
3281 VOLSERBADNAME The volume name contained in newName was longer than 31
3282 characters plus the trailing null.
3283 \n VOLSERBAD_ACCESS The caller is not authorized to clone a volume.
3284 \n ENOENT The trans transaction was not found.
3285 \n VOLSERTRELE_ERROR The trans transaction's reference count could not be
3286 dropped to the proper level.
3287 \n VBUSY The given transaction was already in use, indicating that someone else
3288 is currently manipulating the specified clone.
3289 \n EROFS The volume associated with the given trans is read-only (either ROVOL
3291 \n EXDEV The volume associated with the trans transaction and the one specified
3292 by purgeVol must be on the same disk device, and they must be cloned from the
3294 \n EINVAL The purgeVol must be read-only, i.e. either type ROVOL or BACKVOL.
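The argument constraints described for AFSVolClone() can be checked on the caller's side before the RPC is issued. The following is a minimal sketch under stated assumptions, not the Volume Server's own code: check_clone_args() is a hypothetical helper, the SKETCH_ constants are placeholder values standing in for the real definitions, and EINVAL is used as a generic rejection because this section does not state which code the server returns for a bad newType or a zero newVol.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Placeholder values for illustration only; the real constants are
 * defined by the Volume Server headers. */
enum { SKETCH_RWVOL = 0, SKETCH_ROVOL = 1, SKETCH_BACKVOL = 2 };
enum { SKETCH_VOLSERBADNAME = -1 };  /* hypothetical stand-in error value */
enum { SKETCH_MAX_NAME = 31 };       /* characters, excluding trailing null */

/* Hypothetical helper: returns 0 when the arguments satisfy the
 * constraints stated in the text, otherwise a non-zero code. */
static int check_clone_args(const char *newName, long newType, long newVol)
{
    assert(newName != NULL);

    /* Names longer than 31 characters plus the trailing null are rejected. */
    if (strlen(newName) > SKETCH_MAX_NAME)
        return SKETCH_VOLSERBADNAME;

    /* A clone may only be a read-only or a backup volume. */
    if (newType != SKETCH_ROVOL && newType != SKETCH_BACKVOL)
        return EINVAL;  /* assumption: generic rejection code */

    /* It is illegal to pass a zero in newVol. */
    if (newVol == 0)
        return EINVAL;  /* assumption: generic rejection code */

    return 0;
}
```

Checking locally only avoids a wasted round trip; the server remains the authority on whether a clone request is acceptable.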
3296 \subsection sec5-7-9 Section 5.7.9: AFSVolReClone - Re-clone a volume
3299 int AFSVolReClone(IN struct rx_connection *z_conn,
3304 Recreate an existing clone, with identifier cloneID, from the volume associated
3305 with transaction tid.
3307 VOLSERBAD_ACCESS The caller is not authorized to clone a volume.
3308 \n ENOENT The tid transaction was not found.
3309 \n VOLSERTRELE_ERROR The tid transaction's reference count could not be dropped
3310 to the proper level.
3311 \n VBUSY The given transaction was already in use, indicating that someone else
3312 is currently manipulating the specified clone.
3313 \n EROFS The volume to be cloned must be read-write (of type RWVOL).
3314 \n EXDEV The volume to be cloned and the named clone itself must be on the same
3315 device. Also, cloneID must have been cloned from the volume associated with
3317 \n EINVAL The target clone must be a read-only volume (i.e., of type ROVOL or
3320 \subsection sec5-7-10 Section 5.7.10: AFSVolSetForwarding - Set
3321 forwarding info for a moved volume
3324 int AFSVolSetForwarding(IN struct rx_connection *z_conn,
3329 Record the IP address specified within newsite as the location of the host
3330 that now holds the volume associated with transaction tid, formerly resident
3331 on the current host. This is intended to gently guide Cache Managers that have
3332 stale volume location information cached to the volume's new site, ensuring
3333 the move is transparent to clients using that volume.
3335 VOLSERBAD_ACCESS The caller is not authorized to create a forwarding address.
3336 \n ENOENT The tid transaction was not found.
3338 \subsection sec5-7-11 Section 5.7.11: AFSVolTransCreate - Create
3339 transaction for a [volume, partition]
3342 int AFSVolTransCreate(IN struct rx_connection *z_conn,
3349 Create a new Volume Server transaction associated with volume ID volume on
3350 partition partition. The type of volume transaction is specified by the flags
3351 parameter. The values in flags specify whether the volume should be treated as
3352 busy (ITBusy), offline (ITOffline), or in shared read-only mode (ITReadOnly).
3353 The identifier for the new transaction built by this function is returned in
3356 Creating a transaction signals to any other agent interested in accessing the
3357 volume that it is unavailable while the Volume Server is manipulating it.
3358 This prevents the corruption that could result from multiple simultaneous
3359 operations on a volume.
3361 EINVAL Illegal value encountered in flags.
3362 \n VOLSERVOLBUSY A transaction could not be created because the given [volume,
3363 partition] pair was busy.
3364 \n VOLSERTRELE_ERROR The trans transaction's reference count could not be
3365 dropped to the proper level after creation.
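The exclusion that transactions provide can be illustrated with a small in-memory table keyed on the [volume, partition] pair. This is only a sketch under stated assumptions: every sk_ name is hypothetical, the flag and error values are placeholders rather than the real definitions, the sketch accepts exactly one of the three named modes, and the ENOMEM returned when the table is full is an invention of the sketch rather than documented server behavior.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder values chosen only for this sketch; the real ones are
 * defined by the Volume Server headers. */
enum { SK_ITOffline = 0x1, SK_ITBusy = 0x2, SK_ITReadOnly = 0x8 };
enum { SK_VOLSERVOLBUSY = -2 };
enum { SK_MAX_TRANS = 8 };

struct sk_trans {
    int  in_use;
    long tid;
    long volume;
    long partition;
    long flags;
};

static struct sk_trans sk_table[SK_MAX_TRANS];
static long sk_next_tid = 1;

/* Create a transaction unless one already covers [volume, partition]. */
static int sk_trans_create(long volume, long partition, long flags,
                           long *out_tid)
{
    struct sk_trans *slot = NULL;
    size_t i;

    assert(out_tid != NULL);
    if (flags != SK_ITOffline && flags != SK_ITBusy && flags != SK_ITReadOnly)
        return EINVAL;              /* illegal value in flags */

    for (i = 0; i < SK_MAX_TRANS; i++) {
        if (!sk_table[i].in_use) {
            if (slot == NULL)
                slot = &sk_table[i];    /* remember first free slot */
        } else if (sk_table[i].volume == volume &&
                   sk_table[i].partition == partition) {
            return SK_VOLSERVOLBUSY;    /* pair already being manipulated */
        }
    }
    if (slot == NULL)
        return ENOMEM;              /* sketch-only: table is full */

    slot->in_use = 1;
    slot->tid = sk_next_tid++;
    slot->volume = volume;
    slot->partition = partition;
    slot->flags = flags;
    *out_tid = slot->tid;
    return 0;
}

/* End a transaction, making its [volume, partition] pair eligible again. */
static int sk_end_trans(long tid)
{
    size_t i;

    for (i = 0; i < SK_MAX_TRANS; i++) {
        if (sk_table[i].in_use && sk_table[i].tid == tid) {
            sk_table[i].in_use = 0;
            return 0;
        }
    }
    return ENOENT;                  /* transaction not found */
}
```

A second create on the same pair fails until the first transaction is ended, which is the behavior the description relies on to prevent simultaneous operations on one volume.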
3367 \subsection sec5-7-12 Section 5.7.12: AFSVolEndTrans - End a
3371 int AFSVolEndTrans(IN struct rx_connection *z_conn,
3376 End the transaction identified by trans, returning its final error code in
3377 rcode. This makes the associated [volume, partition] pair eligible for further
3378 Volume Server operations.
3380 VOLSERBAD_ACCESS The caller is not authorized to create a transaction.
3381 \n ENOENT The trans transaction was not found.
3383 \subsection sec5-7-13 Section 5.7.13: AFSVolGetFlags - Get volume flags
3387 int AFSVolGetFlags(IN struct rx_connection *z_conn,
3392 Return the value of the vflags field of the struct volser_trans object
3393 describing the transaction identified as trans. The set of values placed in the
3394 flags parameter is described in Section 5.2.3.1. Briefly, they indicate whether
3395 the volume has been deleted (VTDeleted), is out of service (VTOutOfService), or
3396 is marked delete-on-salvage (VTDeleteOnSalvage).
3398 ENOENT The trans transaction was not found.
3399 \n VOLSERTRELE_ERROR The trans transaction's reference count could not be
3400 dropped to the proper level.
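A caller will typically test the value returned in flags against the three named conditions. The sketch below assumes that the conditions are independent bits that can be combined; the SK_ bit values are placeholders, not the definitions from Section 5.2.3.1, and sk_describe_vflags() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Placeholder bit values for illustration; see Section 5.2.3.1 for the
 * real definitions. */
enum {
    SK_VTDeleteOnSalvage = 0x1,
    SK_VTOutOfService    = 0x2,
    SK_VTDeleted         = 0x4
};

/* Hypothetical helper: write a space-separated description of the flag
 * bits into buf and return the number of characters the full description
 * needs (as snprintf does). */
static int sk_describe_vflags(long flags, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "%s%s%s",
                    (flags & SK_VTDeleted)         ? "deleted "           : "",
                    (flags & SK_VTOutOfService)    ? "out-of-service "    : "",
                    (flags & SK_VTDeleteOnSalvage) ? "delete-on-salvage " : "");
}
```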
3402 \subsection sec5-7-14 Section 5.7.14: AFSVolSetFlags - Set volume flags
3406 int AFSVolSetFlags(IN struct rx_connection *z_conn,
3411 Set the value of the vflags field of the struct volser_trans object describing
3412 the transaction identified as trans to the contents of flags. The set of legal
3413 values for the flags parameter is described in Section 5.2.3.1. Briefly, they
3414 indicate whether the volume has been deleted (VTDeleted), is out of service
3415 (VTOutOfService), or is marked delete-on-salvage (VTDeleteOnSalvage).
3417 ENOENT The trans transaction was not found.
3418 \n EROFS Updates to this volume are not allowed.
3419 \n VOLSERTRELE_ERROR The trans transaction's reference count could not be
3420 dropped to the proper level.
3422 \subsection sec5-7-15 Section 5.7.15: AFSVolGetName - Get the volume
3423 name associated with a transaction
3426 int AFSVolGetName(IN struct rx_connection *z_conn,
3431 Given a transaction identifier tid, return the name of the volume associated
3432 with the given transaction. The tname parameter is set to point to a string
3433 buffer of at most 256 chars, created for this purpose, that contains the volume
3434 name. Note: the caller is responsible for freeing the buffer pointed to by
3435 tname when its information is no longer needed.
3437 ENOENT The tid transaction was not found, or a volume was not associated with
3438 it (VSrv internal error).
3439 \n E2BIG The volume name was too big (greater than or equal to SIZE (1,024)
3441 \n VOLSERTRELE_ERROR The trans transaction's reference count could not be
3442 dropped to the proper level.
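The ownership rule for tname, where the callee allocates and the caller frees, can be sketched with a local stand-in for the RPC. Everything named sk_ here is hypothetical: sk_get_name() merely imitates the calling convention, the stored name is made up, and releasing the buffer with free() assumes it was obtained from malloc().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum { SK_NAME_BUF = 256 };   /* buffer size quoted in the description */

/* Hypothetical local stand-in for the RPC: only transaction 1 exists. */
static int sk_get_name(long tid, char **tname)
{
    static const char stored[] = "user.example";   /* made-up volume name */

    assert(tname != NULL);
    *tname = NULL;
    if (tid != 1)
        return ENOENT;              /* transaction not found */

    *tname = malloc(SK_NAME_BUF);   /* buffer created for this purpose */
    if (*tname == NULL)
        return ENOMEM;
    strncpy(*tname, stored, SK_NAME_BUF - 1);
    (*tname)[SK_NAME_BUF - 1] = '\0';
    return 0;
}

/* Caller side: use the name, then release the buffer exactly once. */
static size_t sk_name_length(long tid)
{
    char *name = NULL;
    size_t len = 0;

    if (sk_get_name(tid, &name) == 0) {
        len = strlen(name);
        free(name);                 /* the caller owns the buffer */
    }
    return len;
}
```

On failure no buffer is handed back, so the caller frees only after a successful return.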
3444 \subsection sec5-7-16 Section 5.7.16: AFSVolGetStatus - Get status of a
3448 int AFSVolGetStatus(IN struct rx_connection *z_conn,
3450 OUT struct volser_status *status)
3453 This routine fills the status structure passed as a parameter with status
3454 information for the volume associated with the transaction identified by tid, if
3455 it exists. Included in this status information are the volume's ID, its type,
3456 disk quotas, the IDs of its clones and backup volumes, and several other
3457 administrative details.
3459 ENOENT The tid transaction was not found.
3460 \n VOLSERTRELE_ERROR The tid transaction's reference count could not be dropped
3461 to the proper level.
3463 \subsection sec5-7-17 Section 5.7.17: AFSVolSetIdsTypes - Set header
3467 int AFSVolSetIdsTypes(IN struct rx_connection *z_conn,
3476 The transaction identified by tId is located, and the values supplied for the
3477 volume name, volume type, parent ID pId, clone ID cloneId and backup ID
3478 backupId are recorded into the given transaction.
3480 ENOENT The tId transaction was not found.
3481 \n VOLSERBAD_ACCESS The caller is not authorized to call this routine.
3482 \n VOLSERBADNAME The volume name contained in name was longer than 31
3483 characters plus the trailing null.
3484 \n VOLSERTRELE_ERROR The tId transaction's reference count could not be dropped
3485 to the proper level.
3487 \subsection sec5-7-18 Section 5.7.18: AFSVolSetDate - Set creation date
3491 int AFSVolSetDate(IN struct rx_connection *z_conn,
3496 Set the creationDate of the struct volintInfo describing the volume associated
3497 with transaction tid to newDate.
3499 VOLSERBAD_ACCESS The caller is not authorized to call this routine.
3500 \n ENOENT The tid transaction was not found.
3501 \n VOLSERTRELE_ERROR The tid transaction's reference count could not be dropped
3502 to the proper level.
3504 \subsection sec5-7-19 Section 5.7.19: AFSVolListPartitions - Return a
3505 list of AFS partitions on a server
3508 int AFSVolListPartitions(IN struct rx_connection *z_conn,
3509 OUT struct pIDs *partIDs)
3512 Return a list of AFS partitions in use by the server processing this call. The
3513 output parameter is the fixed-length partIDs array, with one slot for each of
3514 26 possible partitions. By convention, AFS partitions are named /vicepx, where
3515 x is any letter. The /vicepa partition is represented by a zero in this array,
3516 /vicepb by a 1, and so on. Unused partitions are represented by slots filled with a
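The correspondence between a partition's numeric slot value and its /vicepx name can be sketched as a pair of conversion helpers. Both functions are hypothetical and not part of the interface; they assume that the letters a through z are contiguous in the execution character set, as they are in ASCII, and they report any value outside 0..25 as an error without interpreting it further.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum { SK_MAX_PARTS = 26 };   /* one slot per possible partition */

/* Hypothetical helper: write "/vicepX" for slot value part (0..25) into
 * buf. Returns 0 on success, -1 if part is out of range or buf is too
 * small. */
static int sk_partition_name(long part, char *buf, size_t buflen)
{
    assert(buf != NULL);
    if (part < 0 || part >= SK_MAX_PARTS || buflen < sizeof "/vicepa")
        return -1;
    snprintf(buf, buflen, "/vicep%c", (char)('a' + part));
    return 0;
}

/* Hypothetical helper: map "/vicepa".."/vicepz" back to 0..25, or return
 * -1 for any other string. */
static long sk_partition_id(const char *name)
{
    assert(name != NULL);
    if (strncmp(name, "/vicep", 6) != 0 ||
        name[6] < 'a' || name[6] > 'z' || name[7] != '\0')
        return -1;
    return (long)(name[6] - 'a');
}
```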
3521 \subsection sec5-7-20 Section 5.7.20: AFSVolPartitionInfo - Get
3522 partition information
3525 int AFSVolPartitionInfo(IN struct rx_connection *z_conn,
3527 OUT struct diskPartition *partition)
3530 Collect information regarding the partition with the given character string
3531 name, and place it into the partition object provided.
3533 VOLSERBAD_ACCESS The caller is not authorized to call this routine.
3534 \n VOLSERILLEGAL_PARTITION An illegal partition was specified by name
3536 \subsection sec5-7-21 Section 5.7.21: AFSVolListVolumes - Return a list
3537 of volumes on the server
3540 int AFSVolListVolumes(IN struct rx_connection *z_conn,
3543 OUT volEntries *resultEntries)
3546 Sweep through all the volumes on the partition identified by partid, filling in
3547 consecutive records in the resultEntries object. If the flags parameter is set
3548 to a non-zero value, then full status information is gathered. Otherwise, just
3549 the volume ID field is written for each record. The fields for a volEntries
3550 object like the one pointed to by resultEntries are described in Section 5.4.6,
3551 which covers the struct volintInfo definition.
3553 VOLSERILLEGAL_PARTITION An illegal partition was specified by partID
3554 \n VOLSERNO_MEMORY Not enough memory was available to hold all the required
3555 entries within resultEntries.
3557 \subsection sec5-7-22 Section 5.7.22: AFSVolListOneVolume - Return
3558 header info for a single volume
3561 int AFSVolListOneVolume(IN struct rx_connection *z_conn,
3564 OUT volEntries *resultEntries)
3567 Find the information for the volume living on partition partID whose ID is
3568 volid, and place a single struct volintInfo entry within the variable-size
3569 resultEntries object.
3571 This is similar to the AFSVolListVolumes() call, which returns information on
3572 all volumes on the specified partition. The full volume information is always
3573 written into the returned entry (equivalent to setting the flags argument to
3574 AFSVolListVolumes() to a non-zero value).
3576 VOLSERILLEGAL_PARTITION An illegal partition was specified by partID
3577 \n ENODEV The given volume was not found on the given partition.
3579 \subsection sec5-7.23 Section 5.7.23: AFSVolGetNthVolume - Get volume
3580 header given its index
3583 int AFSVolGetNthVolume(IN struct rx_connection *z_conn,
3586 OUT long *partition)
3589 Using index as a zero-based index into the set of volumes hosted by the server
3590 chosen by the z_conn argument, return the volume ID and partition of residence
3591 for the given index.
3592 \note This functionality has not yet been implemented.
3594 VOLSERNO_OP Not implemented.
3596 \subsection sec5-7.24 Section 5.7.24: AFSVolMonitor - Collect server
3600 int AFSVolMonitor(IN struct rx_connection *z_conn,
3601 OUT transDebugEntries *result)
3604 This call allows the transaction state of a Volume Server to be monitored for
3605 debugging purposes. Anyone wishing to supervise this Volume Server state may
3606 call this routine, causing all active transactions to be recorded in the given
3611 \page biblio Bibliography
3613 \li [1] Transarc Corporation. AFS 3.0 System Administrator's Guide,
3614 F-30-0-D102, Pittsburgh, PA, April 1990.
3615 \li [2] Transarc Corporation. AFS 3.0 Command Reference Manual, F-30-0-D103,
3616 Pittsburgh, PA, April 1990.
3617 \li [3] CMU Information Technology Center. Synchronization and Caching
3618 Issues in the Andrew File System, USENIX Proceedings, Dallas, TX, Winter 1988.
3619 \li [4] Information Technology Center, Carnegie Mellon University. Ubik - A
3620 Library For Managing Ubiquitous Data, ITCID, Pittsburgh, PA, Month, 1988.
3621 \li [5] Information Technology Center, Carnegie Mellon University. Quorum
3622 Completion, ITCID, Pittsburgh, PA, Month, 1988.