## 3 AFS administration

The Administration Section of the [[AFSFrequentlyAskedQuestions]].

[[PreambleFAQ]] [[GeneralFAQ]] [[UsageFAQ]]


[[ResourcesFAQ]] [[AboutTheFAQ]] [[FurtherReading]]

### 3.01 Is there a version of _program_ available with AFS authentication?

In general, not specifically; modern systems use authentication frameworks, so that, for example, an AFS plugin can be added to the framework and all programs will thereby be able to use it without modification.

On many systems, the authentication framework is PAM (Pluggable Authentication Modules). Acquiring AFS tokens via PAM can be done by several different PAM modules, including Russ Allbery's [pam-afs-session](http://www.eyrie.org/~eagle/software/pam-afs-session/) and Red Hat's `pam_krb5afs`.

### 3.02 What is `/afs/@cell`?

It is a commonly created symbolic link pointing at `/afs/$your_cell_name`. `@cell` is not something that is provided by AFS. You may decide it is useful in your cell and wish to create it yourself.

`/afs/@cell` is useful because:

- If you look after more than one AFS cell, you could create the link in each cell then set your `$PATH` as:

        PATH=$PATH:/afs/@cell/@sys/local/bin

- For most cells, it shortens the path names to be typed in, thus reducing typos and saving time.

A disadvantage of using this convention is that when you `cd` into `/afs/@cell` then type `pwd` you see `/afs/@cell` instead of the full name of your cell. This may appear confusing if a user wants to tell a user in another cell the pathname to a file.

You could create your own `/afs/@cell` with the following script (usable in `ksh` or any POSIX shell):

    #!/bin/ksh -
    # author: mpb
    [ -L /afs/@cell ] && echo We already have @cell! && exit
    cell=$(cat /usr/vice/etc/ThisCell)
    cd /afs/.${cell} && fs mkm temp root.afs
    cd temp
    ln -s /afs/${cell} @cell
    ln -s /afs/.${cell} .@cell    # .@cell for RW path
    cd /afs/.${cell} && fs rmm temp
    vos release root.afs; fs checkv

### 3.03 Given that AFS data is location independent, how does an AFS client determine which server houses the data its user is attempting to access?
The Volume Location Database (VLDB) is stored on AFS Database Servers and is ideally replicated across 3 or more Database Server machines. Replication of the Database ensures high availability and load balances the requests for the data. The VLDB maintains information regarding the current physical location of all volume data (files and directories) in the cell, including the IP address of the [[FileServer]], and the name of the disk partition the data is stored on.

A list of a cell's Database Servers is stored on the local disk of each AFS Client machine as `/usr/vice/etc/CellServDB`.

The Database Servers also house the Protection Database (user UID and protection group information) and the Backup Database (used by System Administrators to back up AFS file data to tape), and in older sites the Kerberos Authentication Database (encrypted user and server passwords).

### 3.04 How does AFS maintain consistency on read-write files?

AFS uses a mechanism called "callbacks".

A callback is a promise from the fileserver that the cached version of a file/directory is up-to-date. It is established by the fileserver with the caching of a file.

When a file is modified, the fileserver breaks the callback. When the user accesses the file again the Cache Manager fetches a new copy if the callback has been broken or it has expired (after 2 hours by default).

The following paragraphs describe the AFS callback mechanism in more detail:

If I `open()` `fileA` and start reading, and you then `open()` `fileA`, `write()` a change **and `close()` or `fsync()`** the file to get your changes back to the server - at the time the server accepts and writes your changes to the appropriate location on the server disk, the server also breaks callbacks to all clients to which it issued a copy of `fileA`.

So my client receives a message to break the callback on `fileA`, which it dutifully does.
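
As a rough illustration of the exchange described so far, here is a toy Python model of callbacks. It is only a sketch, not how OpenAFS is implemented: the class and method names (`FileServer`, `CacheManager`, `write_and_close`) are invented for this example, time is passed in by hand, and the only figure taken from the text is the 2-hour default expiry.

```python
CALLBACK_LIFETIME = 2 * 60 * 60  # seconds; the 2-hour default mentioned above


class FileServer:
    """Hypothetical stand-in for a fileserver that tracks callback promises."""

    def __init__(self):
        self.data = {}       # file name -> contents
        self.callbacks = {}  # file name -> set of clients holding a callback

    def fetch(self, client, name):
        # Handing out a copy also registers a callback promise for that client.
        self.callbacks.setdefault(name, set()).add(client)
        return self.data.get(name, "")

    def store(self, writer, name, contents):
        # Reached when a client close()s or fsync()s: save the data, then
        # break the callback of every other client that was given a copy.
        self.data[name] = contents
        for client in self.callbacks.get(name, set()) - {writer}:
            client.break_callback(name)
        self.callbacks[name] = {writer}


class CacheManager:
    """Hypothetical stand-in for a client-side Cache Manager."""

    def __init__(self, server):
        self.server = server
        self.cache = {}  # file name -> (contents, callback expiry time)

    def read(self, name, now):
        entry = self.cache.get(name)
        if entry is None or now >= entry[1]:
            # Callback broken or expired: fetch a fresh copy from the server.
            contents = self.server.fetch(self, name)
            self.cache[name] = (contents, now + CALLBACK_LIFETIME)
        return self.cache[name][0]

    def write_and_close(self, name, contents, now):
        self.server.store(self, name, contents)
        self.cache[name] = (contents, now + CALLBACK_LIFETIME)

    def break_callback(self, name):
        # The cached copy can no longer be trusted.
        self.cache.pop(name, None)


server = FileServer()
server.data["fileA"] = "v1"
mine, yours = CacheManager(server), CacheManager(server)

first = mine.read("fileA", now=0)             # served from the fileserver
yours.write_and_close("fileA", "v2", now=10)  # server breaks my callback
second = mine.read("fileA", now=20)           # forces a re-fetch
```

In this model the second read happens well inside the 2-hour lifetime, so it is the broken callback, not expiry, that triggers the re-fetch.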
But my application (editor, spreadsheet, whatever I'm using to read `fileA`) is still running, and doesn't really care that the callback has been broken.

When something causes the application to `read()` more of the file, the `read()` system call executes AFS cache manager code via the VFS switch, which does check the callback and therefore gets new copies of the data.

Of course, the application may not re-read data that it has already read, but that would also be the case if you were both using the same host. So, for both AFS and local files, I may not see your changes.

Now if I exit the application and start it again, or if the application does another `open()` on the file, then I will see the changes you've made.

This information tends to cause tremendous heartache and discontent - but unnecessarily so. People imagine rampant synchronization problems. In practice this rarely happens and in those rare instances, the data in question is typically not critical enough to cause real problems or crashing and burning of applications. Since 1985, we've found that the synchronization algorithm has been more than adequate in practice - but people still like to worry!

The source of worry is that, if I make changes to a file from my workstation, your workstation is not guaranteed to be notified until I `close` or `fsync` the file, at which point AFS guarantees that your workstation will be notified. This is a significant departure from NFS, in which no guarantees are provided.

### 3.05 Which protocols does AFS use?

AFS may be thought of as a collection of protocols and software processes, nested one on top of the other. The constant interaction between and within these levels makes AFS a very sophisticated software system.

At the lowest level is the UDP protocol. UDP is the connection to the actual network wire. The next protocol level is the remote procedure call (RPC).
In general, RPCs allow the developer to build applications using the client/server model, hiding the underlying networking mechanisms. AFS uses Rx, an RPC protocol developed specifically for AFS during its development phase at Carnegie Mellon University.

Above the RPC is a series of server processes and interfaces that all use Rx for communication between machines. Fileserver, volserver, upserver, upclient, and bosserver are server processes that export RPC interfaces to allow their user interface commands to request actions and get information. For example, a `bos status` command will examine the bos server process on the indicated file server machine.

Database servers use ubik, a replicated database mechanism which is implemented using RPC. Ubik guarantees that the copies of AFS databases on multiple server machines remain consistent. It provides an application programming interface (API) for database reads and writes, and uses RPCs to keep the database synchronized. The database server processes, `vlserver` and `ptserver`, reside above ubik. These processes export an RPC interface which allows user commands to control their operation. For instance, the `pts` command is used to communicate with the `ptserver`, while the command `vos` uses the `vlserver`'s RPC interface.

Some application programs are quite complex, and draw on RPC interfaces for communication with an assortment of processes. Scout utilizes the RPC interface to file server processes to display and monitor the status of file servers. The `uss` command interfaces with `ptserver`, `volserver`, and `vlserver` to create new user accounts.

The Cache Manager also exports an RPC interface. This interface is used principally by file server machines to break callbacks. It can also be used to obtain Cache Manager status information. The program `cmdebug` shows the status of a Cache Manager using this interface.
Section 1.5 of the AFS System Administrator's Guide and the April 1990 Cache Update contain more information on ubik. Udebug information and short descriptions of all debugging tools were included in the January 1991 Cache Update. Future issues will discuss other debugging tools in more detail.

[source: ] [Copyright 1991 Transarc Corporation]

### 3.06 Which TCP/IP ports and protocols do I need to enable in order to operate AFS through my Internet firewall?

Outbound destination ports for a client:

    kerberos         88/udp 88/tcp
    ntp              123/udp
    afs3-fileserver  7000/udp
    afs3-ptserver    7002/udp
    afs3-vlserver    7003/udp
    afs3-volserver   7005/udp

If you also plan to control AFS servers from a client, you will also need

    afs3-bosserver   7007/udp

You will also need to allow an inbound port

    afs3-callback    7001/udp

or if you are using Arla

    cachemanager     4711/udp

(Note: if you are using NAT, you should try to arrange for the UDP NAT timeout on port 7001 to be at least two hours. Recent [[OpenAFS]] server and client versions will try to send keepalives to keep the callback NAT entry open, but some consumer router/WiFi/NAT devices may have a timeout that is too short even for this keepalive. If the NAT entry expires, your cache manager will not be [[notified of file changes on the server|AdminFAQ#callbacks]] and you will only find out about file changes after approximately two hours, when the callback expires.)

You will also need to allow various ephemeral UDP source ports for outbound connections, but you will need to do this for DNS and [[NTP|AdminFAQ#ntp]] anyway.

### 3.07 Are setuid programs executable across AFS cell boundaries?

By default, the setuid bit is ignored but the program may be run (without setuid privilege).
It would be bad to allow arbitrary setuid programs in remote cells to run; consider that someone could put a setuid copy of `bash` in a personal cell, arrange for that to be visible via DNS `SRV` records, and then `fs mkmount` a reference to it in their AFS space on e.g. a school machine.

It is possible to configure an AFS client to honor the setuid bit. This is achieved by `root` (only) running:

    root@toontown # fs setcell -cell $cellname -suid

where `$cellname` is the name of the foreign cell. Use with care!

Note that making a program setuid (or setgid) in AFS does **not** mean that the program will get AFS permissions of a user or group. To become AFS authenticated, you have to `aklog`. If you are not authenticated to AFS, AFS treats you as `system:anyuser`. setuid only affects local Unix permissions (and is meaningless on Windows clients).

### 3.08 How can I run daemons with tokens that do not expire?

It is not a good idea to run with tokens that do not expire because this would weaken one of the security features of Kerberos. A (slightly) better approach is to re-authenticate just before the token expires. (Even more preferable would be to get a token for a particular operation, preferably from the user performing some operation, but this is not always possible, especially with services that are not aware of Kerberos.)

The most common way to achieve this these days is to generate a keytab containing the credentials you want the daemon to have, and use a program like [k5start](http://www.eyrie.org/~eagle/software/kstart/) to run the daemon with those credentials.

### 3.09 Can I check my users' passwords for security purposes?

The major Kerberos implementations (MIT Kerberos and Heimdal) all include ways to do password strength checking when a user chooses a password.

There are not currently any (public!)
utilities to check keys in a KDC (which are generated from passwords) against dictionaries; and you cannot (generally) generate an unencrypted KDC dump to check them (the KDC keys are double-encrypted: not only are they stored as encrypted keys instead of the original plaintext passwords, but the entire record is encrypted with the KDC's own master key).

### 3.10 Is there a way to automatically balance disk usage across fileservers?

Yes. There is a tool, `balance`, which does exactly this. It can be retrieved via anonymous ftp from . (It does not appear to have been updated since late 2003).

Actually, it is possible to write arbitrary balancing algorithms for this tool. The default set of "agents" provided with the current version of `balance` can balance by usage, number of volumes, and activity per week, the latter currently requiring a source patch to the AFS volserver. `balance` is highly configurable.

Author: Dan Lovinger
Contact: Derrick Brashear <shadow+@andrew.cmu.edu>

### 3.11 Can I shutdown an AFS fileserver without affecting users?

Yes, this is an example of the flexibility you have in managing AFS.

Before attempting to shut down an AFS fileserver, you have to arrange for any services it was providing to be moved to another AFS fileserver:

1. Move all AFS volumes to another fileserver. (Check you have the space!) This can be done "live" while users are actively using files in those volumes with no detrimental effects.
1. Make sure that critical services have been replicated on one (or more) other fileserver(s).
Such services include:

- `vlserver` - Volume Location server
- `ptserver` - Protection server
- `buserver` - Backup server
- `kaserver` - Old Kerberos Authentication server
- `fileserver`
- Kerberos KDCs, or `kaserver` on older installations

It is simple to test this before the real shutdown by issuing:

    bos shutdown $server $service

where `$server` is the name of the server to be shut down and `$service` is `-all` or the specific service to be shut down. Note that a service instance may *not* be the same as the service name; use `bos status $server` to check. (One common configuration uses short names like `pts` or even `pt` for the `ptserver` service, for example.)

Kerberos services are usually not managed via `bos`; check the OS's services manager for `krb5kdc`, `kadmind`, `kpasswdd`, and similar. (Different Kerberos implementations will have different service daemons.)

Other points to bear in mind:

- `vos remove` any RO volumes on the server to be shut down. Create corresponding RO volumes on the 2nd fileserver after moving the RW. There are two reasons for this:
    1. An RO on the same partition ("cheap replica") requires less space than a full-copy RO.
    2. Because AFS always accesses RO volumes in preference to RW, traffic will be directed to the RO and therefore quiesce the load on the fileserver to be shut down.
- If you are still using `kaserver` and the system to be shut down has the lowest IP address, there may be a brief delay in authenticating because of the timeout experienced before contacting a second `kaserver`.

### 3.12 How can I set up mail delivery to users with `$HOME`s in AFS?

Preferably, don't. This has been found to scale poorly because of high load on read-write servers; mail clients check for new mail every few minutes (or even seconds) and this will cause problems for any file server.
Additionally, as the mail server cannot authenticate to AFS as the receiving user, you need to carefully manage permissions on the receiving directory tree to avoid mail being lost or the directory being used as a general dropbox with potential security implications.

See [this message](http://www.openafs.org/pipermail/openafs-info/2007-June/026621.html) for more information about the scalability of mail delivery onto a shared fileserver. (Something to think about: very similar considerations are why the recommendation for Exchange mail servers is to only have a small number of users on each mailbox server.)

If you absolutely must do this for some reason, here's one way to do it.

First, you must have your mail delivery daemon AFS authenticated (probably as "`postman`" or similar). [`kstart`](http://www.eyrie.org/~eagle/software/kstart/) can be used to do this. (Note that the mail delivery agent cannot authenticate as the actual user! To do so, it would need access to keytabs for each possible recipient; and it is almost certainly a bad idea to give it access to such keytabs.)

Second, you need to set up the ACLs so that "`postman`" has lookup rights down to the user's `$HOME` and "`lik`" on the destination directory (for this example, we'll use `$HOME/Mail`).

### 3.13 Should I replicate a [[ReadOnly]] volume on the same partition and server as the [[ReadWrite]] volume?

Yes, absolutely! It improves the robustness of your served volumes.

If [[ReadOnly]] volumes _exist_ (_not_ just "are available"), Cache Managers will not utilize the [[ReadWrite]] version of the volume except via an explicit [[ReadWrite]] mountpoint. This means if **all** RO copies are on dead servers, are offline, are behind a network partition, etc, then clients will not be able to get the data, even if the RW version of the volume is healthy, on a healthy server and in a healthy network.
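
The selection rule just described can be sketched as a small Python function. This is a toy model for illustration only, not Cache Manager code: the function name `usable_sites`, the `(name, reachable)` tuples and the server names are all made up for the example.

```python
def usable_sites(ro_sites, rw_site, rw_mountpoint=False):
    """Return the names of the sites a client could read from (toy model).

    ro_sites      -- list of (name, reachable) tuples, one per RO copy
    rw_site       -- (name, reachable) tuple for the single RW volume
    rw_mountpoint -- True if the path goes through an explicit RW mountpoint
    """
    if rw_mountpoint or not ro_sites:
        # Explicit RW mountpoint, or no RO copies exist at all: use the RW.
        candidates = [rw_site]
    else:
        # RO copies exist, so the RW is never consulted, reachable or not.
        candidates = ro_sites
    return [name for name, reachable in candidates if reachable]


healthy_rw = ("fs1-rw", True)

# Every RO copy is unreachable: no data, although the RW is perfectly healthy.
stranded = usable_sites([("fs2-ro", False), ("fs3-ro", False)], healthy_rw)

# The same situation seen through an explicit RW mountpoint.
via_rw_mount = usable_sites([("fs2-ro", False)], healthy_rw, rw_mountpoint=True)

# At least one RO copy is reachable: clients use it.
normal = usable_sites([("fs1-ro", True), ("fs2-ro", False)], healthy_rw)
```

The first call returns an empty list, which is exactly the failure mode the paragraph above warns about.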
However, you are **very** strongly encouraged to keep one RO copy of a volume on the _same server and partition_ as the RW. There are two reasons for this:

1. The RO that is on the same server and partition as the RW is a clone (just a copy of the header, not a full copy of each file). It therefore is very small, but provides access to the same set of files that all other (full copy) [[ReadOnly]] volumes do. Transarc trainers referred to this as the "cheap replica"; some admins call it a "shadow", but this is not the same as a [[shadow volume|AdminFAQ#shadow volume]].
2. To prevent the frustration that occurs when all your ROs are unavailable but a perfectly healthy RW was accessible but not used. If you keep a "cheap replica", then by definition, if the RW is available, one of the ROs is also available, and clients will utilize that site.

### 3.14 Will AFS run on a multi-homed fileserver?

(multi-homed = host has more than one network interface.)

Yes, it will. Older AFS assumed that there is one address per host, but modern [[OpenAFS]] identifies servers and clients by UUIDs (universally unique identifiers) so that a fileserver will be recognized by any of its registered addresses.

See the documentation for the [`NetInfo`](http://docs.openafs.org/Reference/5/NetInfo.html) and [`NetRestrict`](http://docs.openafs.org/Reference/5/NetRestrict.html) files. The UUID for a fileserver is generated when the [`sysid`](http://docs.openafs.org/Reference/5/sysid.html) file is created.

If you have multiple addresses and must use only one of them (say, multiple addresses on the same subnet), you may need to use the `-rxbind` option to the network server processes `bosserver`, `kaserver`, `ptserver`, `vlserver`, `volserver`, `fileserver` as appropriate. (Note that some of these do not currently document `-rxbind`, notably `kaserver` because it is not being maintained. Again, the preferred solution here is to migrate off of `kaserver`, but the `-rxbind` option _will_ work if needed.)
Database servers can *not* safely operate multihomed; the Ubik replication protocol assumes a 1-to-1 mapping between addresses and servers. Use the [`NetInfo`](http://docs.openafs.org/Reference/5/NetInfo.html) and [`NetRestrict`](http://docs.openafs.org/Reference/5/NetRestrict.html) files to associate database servers with a single address.

### 3.15 Can I replicate my user's home directory AFS volumes?

No.

Users with `$HOME`s in `/afs` normally have an AFS [[ReadWrite]] volume mounted in their home directory.

You can replicate a RW volume, but only as a [[ReadOnly]] volume; there can only be one instance of a [[ReadWrite]] volume.

In theory you could have RO copies of a user's RW volume on a second server, but in practice this won't work for the following reasons:

a) AFS has a bias to always access the RO copy of a RW volume if one exists. So the user would have a [[ReadOnly]] `$HOME`, which is not very useful. (You could use an RW mountpoint to avoid this.)

b) [[ReadOnly]] volumes are not automatically updated; you would need to manually update each user volume (e.g. `vos release user.fred; fs checkv`).

The bottom line is: you cannot usefully replicate `$HOME`s across servers.

(That said, there is one potentially useful case: if there is an extended fileserver outage, you can use `vos convertROtoRW` to promote a [[ReadOnly]] volume to [[ReadWrite]]. You should only do this if the alternative is restoring the entire contents of the downed fileserver from a backup; should the fileserver return to service, the attempt to re-register additional [[ReadWrite]] volume instances will fail. As such, *if* you make sure to use a [[ReadWrite]] mountpoint for user volumes, replicating a user's `$HOME` may prove useful as an online backup.)

### 3.16 How can I list which clients have cached files from a server?
By using the following script, which should work in a POSIX-compliant shell, `ksh` or `bash` (but check the path to `rxdebug`, and you need `nslookup` to be installed):

    #!/bin/ksh -
    #
    # NAME       afsclients
    # AUTHOR     Rainer Toebbicke
    # DATE       June 1994
    # PURPOSE    Display AFS clients which have grabbed files from a server

    if [ $# = 0 ]; then
        echo "Usage: $0 ... "
        exit 1
    fi

    for n; do
        /usr/afsws/etc/rxdebug -servers $n -allconn
    done | grep '^Connection' |
    while read x y z ipaddr rest; do
        echo $ipaddr
    done | sort -u |
    while read ipaddr; do
        ipaddr=${ipaddr%%,}
        n="`nslookup $ipaddr`"
        n="${n##*Name: }"
        n="${n%%Address:*}"
        n="${n##*([ ])}"
        n="${n%?}"
        echo "$n ($ipaddr)"
    done

An alternative in Perl (still requires `rxdebug` but not `nslookup`):

    #! /usr/bin/perl -w
    use strict;
    use warnings;
    use Socket;

    my %client;
    for my $fs (@ARGV) {
        open my $rx, '-|', "rxdebug -server \Q$fs\E -allconn"
            or die "rxdebug: $!";
        while (<$rx>) {
            /^Connection from host (\S+),/ and $client{$1} = 1;
        }
    }
    for my $ip (keys %client) {
        my ($ia, $host);
        $ia = inet_aton($ip);
        if (defined ($host = gethostbyaddr($ia, AF_INET))) {
            $client{$ip} = "$host ($ip)";
        } else {
            $client{$ip} = $ip;
        }
    }
    for my $host (sort values %client) {
        print $host, "\n";
    }

### 3.17 Do Backup volumes require as much space as [[ReadWrite]] volumes?

Occasionally, but usually not. A backup volume consists of copy-on-write clones of the files in the original volume; if the file in the original is then modified, it will be copied first, leaving the backup volume pointing at the original version.

The BK volume is re-synchronised with the RW next time a `vos backup` or `vos backupsys` is run.

The space needed for the BK volume is directly related to the size of all files changed in the RW between runs of `vos backupsys`.

### 3.18 Should I run `ntpd` on my AFS client?

Yes. You should not rely on older time services such as `timed` or programs such as `ntpdate`, and should not use the legacy `settime` functionality of the AFS client.
You should also avoid using automatic time synchronization provided by virtual machine hypervisors (indeed, VMware specifically recommends disabling its time synchronization on Linux and using `ntpd`).

The AFS Servers make use of NTP [[[NTP|FurtherReading#NTP]]] to synchronise time with each other and typically with one or more external NTP servers. By default, clients synchronize their time with one of the servers in the local cell. Thus all the machines participating in the AFS cell have an accurate view of the time.

For further details on NTP see . The latest version is 4.2.6, dated December 2011, which is **much** more recent than the version packaged with Transarc AFS. OpenAFS no longer ships with `timed`, since it is assumed that all sites use NTP.

A list of NTP servers is available from . Note that you should prefer to have one or more master local servers sync to one of the "pool" servers for your continent, and other local clients sync to the master local server(s).

The default time setting behavior of the AFS client can be disabled by specifying the `-nosettime` argument to [afsd](http://www.transarc.ibm.com/Library/documentation/afs/3.5/unix/cmd/cmd53.htm). It is **strongly** recommended that you run NTP and use `-nosettime` on all machines (clients *and* servers).

### 3.19 Why and how should I keep `/usr/vice/etc/CellServDB` current?

On AFS clients, `/usr/vice/etc/CellServDB` defines the cells (and their db servers) that can be accessed via `/afs`.

Over time, site details change: servers are added/removed or moved onto new network addresses; new cells appear. While some of this can be handled by means of DNS `AFSDB` or `SRV` records, you must know about a cell to even be able to ask about it; the [[CellServDB]] acts as a central directory of cells. (Of course, it is sometimes a good idea to not advertise some internal cells; but that also means not putting them in public-facing DNS, so you will likely want a local [[CellServDB]].)
In order to keep up-to-date with such changes, the [[CellServDB]] file on each AFS client should be kept consistent with some master copy (at your site).

As well as updating [[CellServDB]], your AFS administrator should ensure that new cells are mounted in your cell's `root.afs` volume. If a cell is added to [[CellServDB]], either the **client** must be restarted or you must use [`fs newcell`](http://docs.openafs.org/Reference/1/fs_newcell.html) to register the new cell information with the running client.

The official public master [[CellServDB]] is maintained at `grand.central.org`, from or . You can send updates for this to .

The client [[CellServDB]] file must not reside under `/afs` (since it needs to exist before the client starts!) and is best located in local filespace.

After obtaining an updated [[CellServDB]] and distributing to clients, you will want to run a script similar to this Perl script. (It could be written in shell, but not comprehensibly. Feel free to reimplement in your preferred language.)

    #! /usr/bin/perl
    use strict;
    use warnings;

    #
    # Given a CellServDB file (may be local, may be master), issue "fs newcell" for each listed
    # cell.  We don't bother checking for changes, as that's much more expensive than making an
    # unnecessary change.
    #
    # Expected usage via puppet:  rather than making it the restart/reconfigure action for the
    # client (or server) we depend on an exec stanza which does so.  No point in running it if
    # what changed is something else that requires a full restart.
    #
    my $cell;
    my @srv = ();
    while (<>) {
        chomp;
        if (/^>(\S+)(?:\s|\Z)/) {
            # new cell header: flush the previous cell, if any
            if (defined $cell and @srv) {
                system 'fs', 'newcell', $cell, @srv and die "fs newcell failed";
            }
            $cell = $1;
            @srv = ();
        }
        # same rules as afsd: if the name doesn't resolve, use the IP
        elsif (defined $cell and /(^\d+\.\d+\.\d+\.\d+)\s*#(\S+)\s*$/) {
            if (defined gethostbyname($2)) {
                push @srv, $2;
            } else {
                push @srv, $1;
            }
        }
        else {
            warn "line ${.}: can't parse \"$_\"\n";
        }
    }
    # last entry
    if (defined $cell and @srv) {
        system 'fs', 'newcell', $cell, @srv and die "fs newcell failed";
    } else {
        warn "line ${.}: no valid cells found\n";
    }

### 3.20 How can I compile a list of AFS fileservers?

Here is a Bourne shell command to do it (it will work in GNU bash and the Korn shell, too, and even `csh`):

    stimpy@nick $ vos listvldb -cell `cat /usr/vice/etc/ThisCell` | awk '/server/ {print $2}' | sort -u

### 3.21 How can I set up anonymous FTP login to access `/afs`?

The easiest way on a primarily "normal" machine (where you don't want to have everything in AFS) is to actually mount `root.cell` under `~ftp`, and then symlink `/afs` to `~ftp/afs` or whatever. It's as simple as changing the mountpoint in `/usr/vice/etc/cacheinfo` and restarting `afsd`.

Note that when you do this, anon ftp users can go anywhere `system:anyuser` can (or worse, if you're using IP-based ACLs and the ftp host is listed in any PTS groups). The only "polite" solution I've arrived at is to have the ftp host machine run a minimal [[CellServDB]] and police my ACLs tightly.

Alternatively, you can make `~ftp` an AFS volume and just mount whatever you need under that - this works well if you can keep everything in AFS, and you don't have the same problems with anonymous "escapes" into `/afs`. (Note that you can often use host `tmpfs` mounts onto AFS directories to hide things or provide host-specific paths.)
Note that similar considerations apply to web access; it used to be not uncommon for accidental misconfigurations of MIT's web hosts to result in people's home directories showing up in Google searches. This **will** annoy people who do not think of their [[OpenAFS]] home directory as being world-visible! (even though they should realize it and set their ACL appropriately)

### 3.22 Is the data sent over the network encrypted in AFS?

[[OpenAFS]] has an `fs` subcommand to turn on encryption of regular file data sent and received by a client. This is a per-client setting that persists until reboot. No server actions are needed to support this change. The syntax is:

    fs setcrypt on
    fs setcrypt off
    fs getcrypt

Note that this only encrypts network traffic between the client and server. The data on the server's disk is not encrypted, nor is the data in the client's disk cache. The encryption algorithm used is [fcrypt](http://surfvi.com/~ota/fcrypt-paper.txt), which is a DES variant. Additionally, data read/written without a token is not encrypted over the wire. This (and the use of DES variants, both here and in general) is a shortcoming of AFS's security protocols and is being addressed by the development of a new `rxgk` protocol.

Enabling encryption by default:

- [[RedHat]] Linux: ([src](https://lists.openafs.org/pipermail/openafs-info/2002-July/005085.html)) change the last line of `/etc/sysconfig/afs` to `AFS_POST_INIT="/usr/bin/fs setcrypt on"`
- Windows: ([src](https://lists.openafs.org/pipermail/openafs-info/2003-June/009416.html)) set the registry value named `SecurityLevel` under `HKLM\SYSTEM\CurrentControlSet\Services\TransarcAFSDaemon\Parameters` to 2.

### 3.23 What underlying filesystems can I use for AFS?

See also [[SupportedConfigurations]].

What filesystems can be used for fileserver partitions depends on what `configure` switches were used during compilation from sources.
To always be on the safe side, use the `--enable-namei-fileserver` configure flag; that will give you a `fileserver` binary which can act on any `/vicep*` partition regardless of its filesystem type.

With the namei file server, you can basically use any filesystem you want. The namei file server does not do any fancy stuff behind the scenes but only accesses normal files (their names are a bit strange though).

Older versions of AFS also provided an inode fileserver. On older Solaris it once gave a 10% speedup over the namei fileserver; but with modern operating systems and disks, the performance difference is negligible. The inode fileserver cannot run on every filesystem, as it abuses the filesystem internals to store AFS metadata and opens files directly by inode number instead of going through the normal filesystem access mechanisms. The `fsck` distributed with the operating system will consider these inode-accessed files to be "dangling" and either link them into `lost+found` or delete them entirely; it will also often corrupt the AFS metadata, which it doesn't know about.

As of [[OpenAFS]] 1.6, inode fileservers are no longer supported; you can still build from source with inode support, but it has bugs and should only be used in a read-only configuration to copy volumes to a namei fileserver host.

On the client side, the cache partition requires a filesystem supporting the inode abstraction for the cache (usually `/var/vice/cache`) since the cache manager references files by their inode. Fortunately, it does not store metadata in "unused" parts of the filesystem, and cache creation always provides proper names for the cache files so they won't be damaged by `fsck`.
The following file systems have been reported _not_ to work for the AFS client cache:

- [[ReiserFS]]
- vxfs (HP-UX)
- advfs (Tru64), it initially works but eventually corrupts the cache
- efs (SGI) - Transarc AFS supported efs, but [[OpenAFS]] doesn't have a license to use the efs code
- zfs (Solaris, FreeBSD, other ports) - you can however use a zvolume with a ufs or other supported filesystem

The OpenAFS cache manager will detect an unsupported filesystem and refuse to start.

The following file systems have been reported to work for the AFS client cache:

- ext2
- ext3
- hfs (HP-UX)
- xfs (at least on IRIX 6.5)
- ufs (Solaris, [[Tru64Unix]])

### 3.24 Compiling [[OpenAFS]] from source

(Modern [[OpenAFS]] supports proper packaging for various systems; these notes are still somewhat applicable but mostly relevant for 1.2.x.)

The kernel component of [[OpenAFS]] must be compiled by the same compiler used to compile the kernel, e.g. Solaris must use the cc from SUNWspro and not gcc.

[[Tru64Unix]] doesn't support modules, so you have to edit kernel config files and link statically into the kernel. Dynamically loaded kernel modules work on Linux, Solaris, Irix ...

    ./configure --enable-transarc-paths=/usr/etc --with-afs-sysname=i386_linux24
    make dest
    cd dest/i386_linux24

... and continue the install process described in IBM AFS documentation. If you do `make install`, you will end up with some things installed into `/usr/local` and others not, regardless of the `--enable-transarc-paths` option; `make install` is messy.

### 3.25 Upgrading [[OpenAFS]]

(Modern [[OpenAFS]] supports proper packaging for various systems; these notes are still somewhat applicable but mostly relevant for 1.2.x.)

These instructions assume a "dest" tree (the output of `make dest`, and the contents of the official binary distribution tarballs). It is generally preferable to use native packages when they exist; the packaging will handle most of the details of upgrading for you.
#### Upgrade of AFS on Linux /etc/rc.d/init.d/afs stop cd root.client/usr/vice/etc tar cvf - . | (cd /usr/vice/etc; tar xfp -) cp -p afs.rc /etc/rc.d/init.d/afs cp ../../../../lib/pam_afs.krb.so.1 /lib/security cd ../../../../root.server/usr/afs tar cvf - . | (cd /usr/afs; tar xfp -) # echo "auth sufficient /lib/security/pam_afs.so try_first_pass \ ignore_root" >> /etc/pam.d/login cd /lib/security vim /etc/sysconfig/afs ln -s pam_afs.krb.so.1 pam_afs.so cd /etc/rc3.d ln -s ../init.d/afs S99afs cd ../rc0.d ln -s ../init.d/afs K01afs cp /usr/vice/etc/afs.conf /etc/sysconfig/afs /etc/rc.d/init.d/afs start #### Upgrade of AFS on Solaris 2.6 cd /etc/rc3.d/ mv S20afs aS20afs init 6 cd root.server/usr/afs tar cvf - ./bin | (cd /usr/afs; tar xfp -) cd ../../.. cp root.client/usr/vice/etc/modload/libafs.nonfs.o /kernel/fs/afs cp root.server/etc/vfsck /usr/lib/fs/afs/fsck cd root.client/usr/vice tar cvf - ./etc | (cd /usr/vice; tar xf -) cd ../../.. cp lib/pam_afs.krb.so.1 /usr/lib/security cp lib/pam_afs.so.1 /usr/lib/security cd /etc/rc3.d mv aS20afs S20afs init 6 #### Upgrade of AFS on Irix 6.5 /etc/chkconfig -f afsserver off /etc/chkconfig -f afsclient off /etc/chkconfig -f afsml off /etc/chkconfig -f afsxnfs off /etc/reboot cd root.server/usr/afs tar cvf - ./bin | (cd /usr/afs; tar xfp -) cd ../../.. cp root.client/usr/vice/etc/sgiload/libafs.IP22.nonfs.o /usr/vice/etc/sgiload echo "AFS will be compiled statically into kernel" echo "otherwise skip following lines and use chkconfig afsml on" cp root.client/bin/afs.sm /var/sysgen/system cp root.client/bin/afs /var/sysgen/master.d echo "The next file comes from openafs-*/src/libafs/STATIC.*" cp root.client/bin/libafs.IP22.nonfs.a /var/sysgen/boot/afs.a cp /unix /unix_orig /etc/autoconfig echo "end of static kernel modifications" cd root.client/usr/vice/etc echo "Delete any of the modload/ files which don't fit your platform if you need space" echo "These files originate from openafs-*/src/libafs/MODLOAD.*" tar cvf - . 
| (cd /usr/vice/etc; tar xf -) /etc/chkconfig -f afsserver on /etc/chkconfig -f afsclient on # /etc/chkconfig -f afsml on - afs is compiled statically into kernel, so leave afsml off /etc/chkconfig -f afsml off /etc/chkconfig -f afsxnfs off /etc/reboot #### Upgrade of AFS on [[Tru64Unix]] cd root.server/usr/afs/ tar cvf - ./bin | (cd /usr/afs; tar xfp -) cd ../../../root.client/bin cp ./libafs.nonfs.o /usr/sys/BINARY/afs.mod ls -la /usr/sys/BINARY/afs.mod doconfig -c FOO cd ../../root.client/usr/vice cp etc/afsd /usr/vice/etc/afsd cp etc/C/afszcm.cat /usr/vice/etc/C/afszcm.cat ### 3.26 Notes on debugging [[OpenAFS]] If you run into trouble and need only the `fileserver` process to run (so that you can debug it), run the `lwp` fileserver instead of the `pthreads` fileserver (`src/viced/fileserver` instead of `src/tviced/fileserver` if you have a buildtree handy): cp src/viced/fileserver /usr/afs/bin (or wherever) bos restart localhost fs -local then attach with `gdb`. (This may be less necessary with recent `gdb`; its `pthreads` support used to be quite abysmal. [*ed.*]) To check whether a client running the `afsd` kernel process is talking to the servers from [[CellServDB]], run: tcpdump -vv -s 1500 port 7001 Other ports are: afs3-fileserver 7000/udp # file server itself afs3-callback 7001/udp # callbacks to cache managers afs3-prserver 7002/udp # users & groups database afs3-vlserver 7003/udp # volume location database afs3-kaserver 7004/udp # AFS/Kerberos authentication service afs3-volser 7005/udp # volume management server afs3-errors 7006/udp # error interpretation service afs3-bos 7007/udp # basic overseer process afs3-update 7008/udp # server-to-server updater afs3-rmtsys 7009/udp # remote cache manager service When `tcpdump` doesn't help, try: fstrace setset cm -active # make your error happen fstrace dump cm ### 3.27 Tuning client cache for huge data On the `afsd` command line, use `-chunk 17` or greater. 
Be careful: with certain cache sizes `afsd` crashes on startup (on Linux and [[Tru64Unix]] at least), possibly when the dcache is too small. Go for: /usr/vice/etc/afsd -nosettime -stat 12384 -chunk 19 > So I ran the full suite of iozone tests (13), but at a single file > size (128M) and one record size (64K). I set the AFS cache size to > 80000K for both memcache and diskcache. Note that memcache size and diskcache size are different things. In the case of memcache, a fixed number of chunks are allocated in memory, such that numChunks * chunkSize = memCacheSize. In the case of disk cache, there are a lot more chunks, because the disk cache assumes not every chunk will be filled (the underlying filesystem handles disk block allocation for us). Thus, when you have small file segments, they use up an entire chunk worth of cache in the memcache case, but only their size worth of cache in the diskcache case. -- kolya ### 3.28 Setting up PAM with AFS Solaris auth sufficient /lib/security/pam_afs.so debug try_first_pass ignore_root debug auth required /lib/security/pam_env.so auth sufficient /lib/security/pam_unix.so likeauth nullok auth required /lib/security/pam_deny.so account required /lib/security/pam_unix.so password required /lib/security/pam_cracklib.so retry=3 type= password sufficient /lib/security/pam_unix.so nullok use_authtok md5 shadow password required /lib/security/pam_deny.so session sufficient /lib/security/pam_afs.so set_token session required /lib/security/pam_limits.so session required /lib/security/pam_unix.so # reafslog is to unlock dtlogin's screensaver other auth sufficient /usr/athena/lib/pam_krb4.so reafslog ### 3.29 How can I have a Kerberos realm different from the AFS cell name? How can I use an AFS cell across multiple Kerberos realms? OpenAFS defaults to using a Kerberos realm generated from the cell name by uppercasing. 
You can instead tell it the Kerberos realm to use with a truncated `krb.conf` file: /usr/afs/etc/krb.conf # Transarc paths /etc/openafs/server/krb.conf # FHS paths You do not list any KDCs in this file, just space-separated realms on a single line. See also [[below|AdminFAQ#multirealm]]. You can list a maximum of 2 realms in this file in older AFS, but [[OpenAFS]] 1.6 and later allow any number of realms. ### 3.30 What are the `bos` instance types? How do I use them? There are, as of this writing, 4 types of [`bos`](http://docs.openafs.org/Reference/8/bos_create.html) server: * `simple` - a single program which will be kept running as needed. * `cron` - a single program, plus a time at which it will be automatically run; typically used for cell backups. The time looks like `04:00` to run every day at a given time, or `sun 04:00` to run once a week. Times may be specified in 24-hour or 12-hour (with am/pm suffix); weekdays may be full or abbreviated to 3 characters. Case is ignored. (A legacy usage allows the string `now` to be used; use [`bos exec`](http://docs.openafs.org/Reference/8/bos_exec.html) instead.) * `fs` - a standard fileserver which has three programs that must be run together in a particular way. The `fs` server type will take care of starting, stopping, and restarting these programs in order to keep them working together. * `dafs` - demand attach fileservers are similar to standard fileservers, but have an additional component program to be synchronized. The `dafs` server type will ensure these are started, stopped, and restarted correctly while maintaining synchronization. ### 3.31 afsd gives me "`Error -1 in basic initialization.`" on startup When starting afsd, I get the following: # /usr/vice/etc/afsd -nosettime -debug afsd: My home cell is 'foo.bar.baz' ParseCacheInfoFile: Opening cache info file '/usr/vice/etc/cacheinfo'... 
ParseCacheInfoFile: Cache info file successfully parsed: cacheMountDir: '/afs' cacheBaseDir: '/usr/vice/cache' cacheBlocks: 50000 afsd: 5000 inode_for_V entries at 0x8075078, 20000 bytes SScall(137, 28, 17)=-1 afsd: Forking rx listener daemon. afsd: Forking rx callback listener. afsd: Forking rxevent daemon. SScall(137, 28, 48)=-1 SScall(137, 28, 0)=-1 SScall(137, 28, 36)=-1 afsd: Error -1 in basic initialization. Make sure the kernel module has been loaded. Modern [[OpenAFS]] startup scripts should ensure this and report an error if it cannot be loaded, but startup scripts from older versions or on systems which can't use loadable kernel modules (requiring the kernel to be relinked) will not catch this and you will get these errors from `SScall`. ### 3.32 Error "`afs: Tokens for user of AFS id 0 for cell foo.bar.baz are discarded (rxkad error=19270407)`" elmer@toontown ~$ translate_et 19270407 19270407 (rxk).7 = security object was passed a bad ticket or alternately elmer@toontown ~$ grep 19270407 /usr/afsws/include/rx/* /usr/afsws/include/rx/rxkad.h:#define RXKADBADTICKET (19270407L) A common cause of this problem (error 19270407) is the use of periods ("`.`") in Kerberos V principals. If you have a Kerberos principal such as `my.name@REALM.COM` and create the corresponding `pts` userid `my.name`, you will get the cryptic error above. If you want to use such principal names and have OpenAFS 1.4.7 or later, you can pass the option `-allow-dotted-principals` to all server daemons to allow their use. See the `-allow-dotted-principals` option in the fileserver (or any server daemon) documentation for more information. (The problem here is that for compatibility reasons, [[OpenAFS]] uses Kerberos 4 name rules internally; while "`.`" was the name component separator in Kerberos 4, in [[Kerberos5]] it is "`/`" so [[OpenAFS]] translates `.` to `/` when passing names to Kerberos for verification. 
This means that a Kerberos 5 name with an embedded period cannot be used directly without disabling the translation; but with the translation disabled, you cannot easily use Kerberos 5 names with components. There is ongoing work in this area because `rxgk` requires proper support for these names.) In general, the `translate_et` utility can be used to find out what an AFS error number means. This only works for AFS errors; some utilities may also report Kerberos errors in this way, and `translate_et` will not work for these. Some sites have alternative utilities that understand Kerberos as well as AFS errors (see for example [here](file:///afs/sinenomine.net/user/ballbery/public/translate_err)). ### 3.33 I have tickets and tokens, but still get `Permission denied` for some operations. This can be caused by the above, or by not being in a server `UserList` (`/usr/afs/etc/UserList` or `/etc/openafs/server/UserList`). Also beware that, as described [[above|AdminFAQ#translate_et]], `UserList` accepts only Kerberos 4 name syntax: use `joe.admin` instead of `joe/admin`. See `https://lists.openafs.org/pipermail/openafs-devel/2002-December/008673.html` and the rest of the thread. ### 3.34 Recovering broken AFS cache on clients >> Does anyone have a trick to force AFS to refresh its cache (for a >> particular directory or even for all files?) The only way I know >> how to accomplish this is to reboot, stop in single user mode, >> rm -rf the cache files and let AFS rebuild everything. > > fs flush and fs flushv have cured corruption problems in the past > on some of our clients. Thanks for the tip - I was not aware of the flush* subcommands. 
Here's a little of what I saw today: ls -la /bin/ls: asso.S14Q00246.all.log: Bad address /bin/ls: asso.S14Q00246.all.lst: Bad address /bin/ls: chr14markers.txt: Bad address /bin/ls: geno.summary.txt: Bad address /bin/ls: global.ind.S14Q00246.all.txt: No such device /bin/ls: global.S14Q00246.all.txt: No such device total 103 [ other ls results as usual ] Flushing a particular file had no effect (the same error as shown above appears). Flushvolume took a long time, but when it eventually completed, the ls -la behaved exactly as one would expect. Recent [[OpenAFS]] (1.6.4 and newer) has an [`fs flushall`](http://docs.openafs.org/Reference/1/fs_flushall.html) command in addition to the `flush` and `flushvol` commands. Older AFS versions sometimes corrupted their cache filesystems in ways that `fs flushvol` cannot fix. Sometimes this can be corrected with root@toontown ~# fs setca 1; fs setca 0 (set the cache to minimum size, and then back to normal; beware that in most versions of Transarc AFS, you will have to specify the actual cache size instead of `0`!). If this does not work, you can force a cache rebuild by shutting down [[OpenAFS]] and removing `/var/vice/cache/CacheItems` (it is not necessary to remove all cache files), although you may want to remake the filesystem (`mkfs` or `newfs`) instead in case there is actual filesystem corruption. If this happens regularly, please file an OpenAFS bug. ### 3.35 What does it mean for a volume to not be in the VLDB? If a volume is not in the VLDB, you will be unable to perform operations on it using its name; all "vos" operations will need to be done using its numerical id, server, and partition. Furthermore, if a volume is not in the VLDB, it cannot be reached via mountpoints. ### 3.36 What is a Volume Group? You can think of a Volume Group as an RW volume, and all of the clones of that RW (its RO clone, BK clone, and any other clones). 
All of the volumes in a Volume Group on the same fileserver can share storage for data that is the same between all of them. This is why, for example, an RO clone usually takes up very little disk space; since an RW and its RO clone are in the same Volume Group, they can share storage for unchanged data. All of the volumes in a group usually have very similar volume ID numbers. For example, if an RW volume has ID 536870915, its RO clone will typically be 536870916. However, this is not required, as volume ID numbers can be almost anything. You can even manually specify what volume ID number you want for a volume when you create the volume with "`vos create`". Currently, you can only have about 8 volumes in a Volume Group. However, this limitation is due to technical details of the fileserver "namei" disk backend. If that backend is improved in the future, or if different backends are developed, the number of volumes in a volume group could be much greater. Some low-level documentation may refer to a "volume group ID". This is always the same as the RW volume ID. ### 3.37 What is a Clone? A clone of a volume is a read-only copy of that volume which shares on-disk storage with the original volume. Backup volumes are a particular kind of clone volume. Read-only replicas which reside on the same partition as their read-write volume are another particular kind of clone volume. In some other storage systems this kind of volume is called a "snapshot". Clone volumes must belong to the same volume group (see [[previous question|AdminFAQ#volumegroup]]) as the volume which they are a clone of. In addition to backup and readonly clones, you may create up to three additional clones of a volume. To do this, use "`vos clone`". When you "`vos remove`" a volume, its "backup" clone will also be removed automatically. However, clones created with "`vos clone`" are **not** removed automatically. 
Unfortunately, these "dangling clones" will no longer be in the VLDB (see [[above|AdminFAQ#notinvldb]]). They belong to a volume group whose leader (RW volume) no longer exists, which is a somewhat undefined state for AFS. Such volumes should be manually deleted as soon as possible. ### 3.38 What is a Shadow? A shadow of a volume is a duplicate of that volume which resides on a different partition. Shadow volumes do not share storage with their original volumes (unlike clones). A readonly volume on a **different** partition than its readwrite volume could be considered one particular example of a shadow volume; however, in practice the term "shadow volume" is used to refer to volumes created with "vos shadow" and not to refer to readonly volumes. A shadow of any readwrite volume may be created using the "vos shadow" command. This will create a new volume which is a shadow of the original volume, and will copy the contents of the old volume to the new volume. This will also set a bit in the header of the new volume that identifies it as a shadow volume. Shadow volumes do not appear in the VLDB (see [[above|AdminFAQ#notinvldb]]) -- "`vos shadow`" does not create a VLDB entry and "`vos syncvldb`" ignores shadow volumes. You can "refresh" a shadow volume from its original with "`vos shadow -incremental`". This operation will first check to make sure that the target of the operation is a shadow volume, to prevent the administrator from accidentally corrupting a non-shadow volume. However, if you shadow from a readwrite volume to some shadow of **another** volume, the shadow will be corrupted and will have to be deleted. `vos shadow` will only copy data which has changed, so it is very efficient. You can remove the shadow bit from a volume's header with "`vos syncvldb -force`". This will remove the shadow bit and create a VLDB entry for the volume, deleting any previous entry for the RW volume. 
However, the RW volume itself will not be deleted; it will simply exist without a VLDB entry. Attempting to create shadows of two different RW volumes on the same partition with the same name is prohibited by the `volserver`. Technically it is possible to create two shadow volumes with the same name on different partitions; however, this is not advisable and may lead to undefined behavior. (Some AFS administrators may refer to an RO clone of an RW volume on the same server/partition as a "shadow"; this terminology predates the existence of shadow volumes and should be avoided.) ### 3.39 Can I authenticate to my AFS cell using multiple Kerberos realms? Yes. This can be useful if your organization has multiple Kerberos realms with identical user entries: for example, you might have an MIT Kerberos realm for Unix-like systems, and an Active Directory domain for Windows with synchronized accounts. In order to make this work, you need to do 4 things. 1. Add a key for the `afs` service to the additional realm and store it in a keytab (replace `$KVNO` with the key version number you have chosen): $ kadmin -q "ank -e des-cbc-crc:v4 -kvno $KVNO afs/your.cell.name@YOUR.SECOND.REALM.NAME" $ kadmin -q "ktadd -e des-cbc-crc:v4 -k /path/to/afs.keytab afs/your.cell.name@YOUR.SECOND.REALM.NAME" Note that a kvno must be specified for the key that is different from the kvno for your existing key(s) in the original realm. You can check on the kvno of the existing keys by running "`asetkey list`" on one of your servers. Since keys must be in ascending order in the AFS [[KeyFile]], it will be easiest if you make the new kvno higher than any existing key's kvno. It's also worth noting that the process of adding the key to a keytab (at least with MIT krb5) actually creates a new key first, so your kvno will end up being higher than what you specified when you added the principal. You can check on the current kvno by using the command "`kadmin -q "getprinc afs/your.cell.name@YOUR.SECOND.REALM.NAME"`". 2. 
Add the new key to the [[KeyFile]] on your AFS servers: $ asetkey add /path/to/afs.keytab afs/your.cell.name@YOUR.SECOND.REALM.NAME Note that the kvno here needs to be the same one as is reported by the `kadmin getprinc` command. 3. Create an AFS `krb.conf` with your additional realm's name in it, and place it on all of your AFS servers; see [[above|AdminFAQ#afskrbconf]]. 4. Restart your AFS servers. At this point you should be able to run: kinit you@YOUR.SECOND.REALM.NAME aklog and receive the same privileges as if you had run: kinit you@YOUR.CELL.NAME aklog ### 3.40 How can I ensure that the userids on client machines match the users' `pts` ids? You can use [libnss-afs](http://www.megacz.com/software/libnss-afs.html) for this. ### 3.41 What is Fast Restart? When compiled with `--enable-fast-restart`, the file server will start up immediately, without first salvaging any volumes which cannot be attached. Disadvantages to Fast Restart, [as noted here](http://lists.openafs.org/pipermail/openafs-info/2008-May/029386.html) include: 1. Volumes in need of salvage remain offline until an administrator intervenes manually 2. On an inode-based fileserver, salvaging a single volume crawls every inode; therefore, salvaging volumes individually (rather than partition-at-a-time) duplicates work. In [[OpenAFS]] 1.6 there is a [[demand attach fileserver|DemandAttach]] which provides even faster restart while reducing the drawbacks; you should use it instead. ### 3.42 Why does AFS reboot itself spontaneously at 4:00am every Sunday? This was made to be the default behavior back in the days when OpenAFS servers had problems with leaking memory and other resources. These days, it is generally seen as not necessary. You can disable this behavior with: bos setrestart $SERVER_NAME never bos setrestart $SERVER_NAME -newbinary never Newer versions of [[OpenAFS]] do not enable this by default. ### 3.43 Why do I get an error -1765328370 when authenticating? 
tweety@toontown ~$ translate_err -1765328370 krb5 error -1765328370 = KRB5KDC_ERR_ETYPE_NOSUPP (See [[here|AdminFAQ#translate_et]] for `translate_err`.) Usually this means that your KDC has support for `des-cbc-crc` and other weaker encryption types turned off. Re-enable support for DES encryption types and you will get further. Check `/etc/krb5.conf` and make sure it has something like the following in it: [libdefaults] allow_weak_crypto = true Also check `kdc.conf` (possibly located in `/var/kerberos/krb5kdc`; check the documentation for your Kerberos packages) and make sure that `des-cbc-crc:normal` is in the `supported_enctypes` list. There is ongoing work to remove the need for DES enctypes.
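The `krb5.conf` check described above can be scripted. The following is a minimal sketch, not a full parser: `weak_crypto_enabled` is a hypothetical helper, it only looks for an uncommented `allow_weak_crypto = true` line, and it assumes (without verifying) that the line sits in the `[libdefaults]` section and that the value is spelled `true`.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with any Kerberos package): succeed if
# the given krb5.conf contains an uncommented "allow_weak_crypto = true"
# line. Simplification: the enclosing section is not checked.
weak_crypto_enabled() {
    grep -Eq '^[[:space:]]*allow_weak_crypto[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$1"
}

# Example use:
#   weak_crypto_enabled /etc/krb5.conf || echo "allow_weak_crypto is not set" >&2
```

A passing check covers only the client-side file; the `supported_enctypes` list in `kdc.conf` still has to be inspected separately.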