## 1 General The General Section of the [[AFSFrequentlyAskedQuestions]]. - [[PreambleFAQ]]
- [[UsageFAQ]] - [[AdminFAQ]] - [[ResourcesFAQ]] - [[AboutTheFAQ]] - [[FurtherReading]] ### 1.01 What is AFS? AFS is a distributed filesystem that enables co-operating hosts (clients and servers) to efficiently share filesystem resources across both local area and wide area networks. AFS is based on a distributed file system originally developed at the Information Technology Center at Carnegie-Mellon University that was called the "Andrew File System". "Andrew" was the name of the research project at CMU - honouring the founders of the University. A spin-off company, Transarc Corporation (now part of IBM), started producing and marketing a commercial version of AFS in 1989. In November, 2000, IBM Open-Sourced AFS, creating [[OpenAFS]]. Development continues to this day. ### 1.02 Who supplies AFS? [[OpenAFS]] is available from the [[OpenAFS]] website. Additionally, an independant open source project, Arla, supports some clients that [[OpenAFS]] does not. Arla clients are completely compatible with [[OpenAFS]]. IBM no longer markets AFS and has declared an end-of-life for support.
  [[Main/OpenAFS]] WWW: http://www.openafs.org/
  Arla WWW: http://www.stacken.kth.se/projekt/arla/
### 1.03 What is /afs? The root of the AFS filetree is /afs. If you execute "ls /afs" you will see directories that correspond to AFS cells (see below). These cells may be local (on same LAN) or remote (eg halfway around the world). With AFS you can access all the filesystem space under /afs with commands you already use (eg: cd, cp, rm, and so on) provided you have been granted permission (see AFS ACL below). ### 1.04 What is an AFS cell? An AFS cell is a collection of servers grouped together administratively and presenting a single, cohesive filesystem. Typically, an AFS cell is a set of hosts that use the same Internet domain name. Normally, a variation of the domain name is used as the AFS cell name. Users log into AFS client workstations which request information and files from the cell's servers on behalf of the users. ### 1.05 What are the benefits of using AFS? The main strengths of AFS are its: - caching facility - security features - simplicity of addressing - scalability - communications protocol Here are some of the advantages of using AFS in more detail: #### 1.05.a Cache Manager AFS client machines run a Cache Manager process. The Cache Manager maintains information about the identities of the users logged into the machine, finds and requests data on their behalf, and keeps chunks of retrieved files on local disk. The effect of this is that as soon as a remote file is accessed a chunk of that file gets copied to local disk and so subsequent accesses (warm reads) are almost as fast as to local disk and considerably faster than a cold read (across the network). Local caching also significantly reduces the amount of network traffic, improving performance when a cold read is necessary. #### 1.05.b Location independence Unlike NFS, which makes use of /etc/filesystems (on a client) to map (mount) between a local directory name and a remote filesystem, AFS does its mapping (filename to location) at the server. This has the tremendous advantage of making the served filespace location independent. Location independence means that a user does not need to know which fileserver holds the file, the user only needs to know the pathname of a file. Of course, the user does need to know the name of the AFS cell to which the file belongs. Use of the AFS cellname as the second part of the pathname (eg: /afs/$AFSCELL/somefile) is helpful to distinguish between file namespaces of the local and non-local AFS cells. To understand why such location independence is useful, consider having 20 clients and two servers. Let's say you had to move a filesystem "/home" from server a to server b. Using NFS, you would have to change the /etc/filesystems file on 20 clients and take "/home" off-line while you moved it between servers. With AFS, you simply move the AFS volume(s) which constitute "/home" between the servers. You do this "on-line" while users are actively using files in "/home" with no disruption to their work. (Actually, the AFS equivalent of "/home" would be /afs/$AFSCELL/home where $AFSCELL is the AFS cellname.) #### 1.05.c Scalability With location independence comes scalability. An architectural goal of the AFS designers was client/server ratios of 200:1 which has been successfully exceeded at some sites. Some sites are exceeding this ratio. Exactly what ratio your cell can use depends on many factors including: number of AFS files, size of AFS files, rate at which changes are made, rate at which file are being accessed, speed of servers processor, I/O rates, and network bandwidth. AFS cells can range from the small (1 server/client) to the massive (with tens of servers and thousands of clients). Cells can be dynamic: it is simple to add new fileservers or clients and grow the computing resources to meet new user requirements. #### 1.05.d Improved security Firstly, AFS makes use of Kerberos to authenticate users. This improves security for several reasons: - passwords do not pass across the network in plaintext - encrypted passwords no longer need to be visible - You don't have to use NIS, aka yellow pages, to distribute /etc/passwd - thus "ypcat passwd" can be eliminated. - If you do choose to use NIS, you can replace the password field with "X" so the encrypted password is not visible. (These issues are discussed in detail in [[[AdminGuide|Main/FurtherReading#AdminGuide]]]). - AFS uses mutual authentication - both the service provider and service requester prove their identities Secondly, AFS uses access control lists (ACLs) to enable users to restrict access to their own directories. #### 1.05.e Single systems image (SSI) Establishing the same view of filestore from each client and server in a network of systems (that comprise an AFS cell) is an order of magnitude simpler with AFS than it is with, say, NFS. This is useful to do because it enables users to move from workstation to workstation and still have the same view of filestore. It also simplifies part of the systems management workload. In addition, because AFS works well over wide area networks the SSI is also accessible remotely. As an example, consider a company with two widespread divisions (and two AFS cells): ny.acme.com and sf.acme.com. Mr Fudd, based in the New York office, is visiting the San Francisco office. Mr. Fudd can then use any AFS client workstation in the San Francisco office that he can log into (a unprivileged guest account would suffice). He could authenticate himself to the ny.acme.com cell and securely access his New York filespace. For example: The following shows a guest in the sf.acme.com AFS cell: 1. add AFS executables directory to PATH 2. obtaining a PAG with pagsh command (see 2.06) 3. use the klog command to authenticate into the ny.acme.com AFS cell 4. making a HOME away from home 5. invoking a homely .profile guest@toontown.sf.acme.com $ PATH=/usr/afsws/bin:$PATH # {1} guest@toontown.sf.acme.com $ pagsh # {2} $ klog -cell ny.acme.com -principal elmer # {3} Password: $ HOME=/afs/ny.acme.com/user/elmer; export HOME # {4} $ cd $ . .profile # {5} you have new mail guest@toontown $ It is not necessary for the San Francisco sys admin to give Mr. Fudd an AFS account in the sf.acme.com cell. Mr. Fudd only needs to be able to log into an AFS client that is: 1. on the same network as his cell and 2. his ny.acme.com cell is mounted in the sf.acme.com cell (as would certainly be the case in a company with two cells). #### 1.05.f Replicated AFS volumes AFS files are stored in structures called Volumes. These volumes reside on the disks of the AFS file server machines. Volumes containing frequently accessed data can be read-only replicated on several servers. Cache managers (on users client workstations) will make use of replicate volumes to load balance. If accessing data from one replicate copy, and that copy becomes unavailable due to server or network problems, AFS will automatically start accessing the same data from a different replicate copy. An AFS client workstation will access the closest volume copy. By placing replicate volumes on servers closer to clients (eg on same physical LAN) access to those resources is improved and network traffic reduced. #### 1.05.g Improved robustness to server crash The Cache Manager maintains local copies of remotely accessed files. This is accomplished in the cache by breaking files into chunks of up to 64k (default chunk size). So, for a large file, there may be several chunks in the cache but a small file will occupy a single chunk (which will be only as big as is needed). A "working set" of files that have been accessed on the client is established locally in the client's cache (copied from fileserver(s)). If a fileserver crashes, the client's locally cached file copies remain readable but updates to cached files fail while the server is down. Also, if the AFS configuration has included replicated read-only volumes then alternate fileservers can satisfy requests for files from those volumes. #### 1.05.h "Easy to use" networking Accessing remote file resources via the network becomes much simpler when using AFS. Users have much less to worry about: want to move a file from a remote site? Just copy it to a different part of /afs. Once you have wide-area AFS in place, you don't have to keep local copies of files. Let AFS fetch and cache those files when you need them. #### 1.05.i Communications protocol AFS communications protocol is optimized for Wide Area Networks. Retransmitting only the single bad packet in a batch of packets and allowing the number of unacknowledged packets to be higher (than in other protocols, see [[[Johnson90|Main/FurtherReading#Johnson90]]]). #### 1.05.j Improved system management capability Systems administrators are able to make configuration changes from any client in the AFS cell (it is not necessary to login to a fileserver). With AFS it is simple to effect changes without having to take systems off-line. Example: A department (with its own AFS cell) was relocated to another office. The cell had several fileservers and many clients. How could they move their systems without causing disruption? First, the network infrastructure was established to the new location. The AFS volumes on one fileserver were migrated to the other fileservers. The "freed up" fileserver was moved to the new office and connected to the network. A second fileserver was "freed up" by moving its AFS volumes across the network to the first fileserver at the new office. The second fileserver was then moved. This process was repeated until all the fileservers were moved. All this happened with users on client workstations continuing to use the cell's filespace. Unless a user saw a fileserver being physically moved (s)he would have no way to tell the change had taken place. Finally, the AFS clients were moved - this was noticed! ### 1.06 Which systems is AFS available for? [[OpenAFS]] (as of the 1.4.2 release) is currently available in binary releases for: IBM AIX 5.1, 5.2, 5.3 Fedora Core 3, 4, 5 (Intel) [[FreeBSD]] 6.1 (Intel) HP/UX 11i (PA-RISC) SGI Irix 6.5 [[MacOS]] X 10.4 Tiger (Universal) [[OpenBSD]] 3.9 (Intel) RHEL 3, 4 (Intel) RHEL 4 (AMD64) Solaris 7, 8, 9, 10 (Sparc) Solaris 8, 9, 10 (Intel) Windows 2000/XP/2003 The sources are also known to create working binaries for [[NetBSD]] 2.\* and 3.0 (server only). Additional platforms may be available in the beta (1.5) tree. Arla (as of the 0.43 release) is currently supporting: [[FreeBSD]] 5.1-5.5 [[NetBSD]] 2.\*, 3.0 Linux 2.4.x and 2.6.x [[MacOSX]] 10.4 Please see the Arla web pages for more details, and remember that Arla is client only. ### 1.07 What does "ls /afs" display in the Internet AFS filetree? Essentially this displays the AFS cells that co-operate in the Internet AFS filetree. Note that the output of this will depend on the cell you do it from; a given cell may not have all the publicly advertised cells available, and it may have some cells that aren't advertised outside of the given site. The definitive source for this information is: - Note that it is also possible to use AFS "behind the firewall" within the confines of your organization's network - you don't have to participate in the Internet AFS filetree. Indeed, there are lots of benefits of using AFS on a local area network without using the WAN capabilities. ### 1.08 Why does AFS use Kerberos authentication? It improves security. Kerberos uses the idea of a trusted third party to prove identification. This is a bit like using a letter of introduction or quoting a referee who will vouch for you. When a user authenticates using the klog command (s)he is prompted for a password. If the password is accepted the Kerberos server provides the user with an encrypted token (containing a "ticket granting ticket"). From that point on, it is the encrypted token that is used to prove the user's identity. These tokens have a limited lifetime (typically a day) and are useless when expired. In AFS, it is possible to authenticate into multiple AFS cells. A summary of the current set of tokens held can be displayed by using the "tokens" command. For example: elmer@toontown $ tokens Tokens held by the Cache Manager: User's (AFS ID 9997) tokens for afs@ny.acme.com [Expires Sep 15 06:50] User's (AFS ID 5391) tokens for afs@sf.acme.com [Expires Sep 15 06:48] --End of list-- Kerberos improves security because a users's password need only be entered once (at klog time). AFS uses Kerberos to do complex mutual authentication which means that both the service requester and the service provider have to prove their identities before a service is granted. Originally AFS shipped with it's own version of a Kerberos, called "KAS." KAS still ships at this time (1.4.2 release) but is depricated in favor of using a true Kerberos 5 implimentation. [[OpenAFS]] does not currently ship with a K5 install; it is up to the administrator(s) to choose a version (either MIT's or KTH's "Heimdal") and install it. [[OpenAFS]] will happily work with either. For more detail on this and other Kerberos issues see the faq for Kerberos (posted to news.answers and comp.protocols.kerberos) [[[Jaspan|Main/FurtherReading#Jaspan]]]. (Also, see [[[Miller87|Main/FurtherReading#Miller87]]], [[[Bryant88|Main/FurtherReading#Bryant88]]], [[[Bellovin90|Main/FurtherReading#Bellovin90]]], [[[Steiner88|Main/FurtherReading#Steiner88]]]) ### 1.09 Does AFS work over protocols other than TCP/IP? No. AFS was designed to work over TCP/IP. ### 1.10 How can I access AFS from my PC? You can use [[OpenAFS]] for Windows client. In the past year it has become very stable and robust. [[OAfW]] works with Kerberos for Windows in much the same way the Unix clients work with Kerberos. There is also SAMBA (a SMB server for UNIX). - There are several ways to integrate AFS with SAMBA. See [[SMBtoAFS]]. ### 1.11 How does AFS compare with NFS?
  AFS NFS
File Access Common name space from all workstations Different file names from different workstations
File Location Tracking Automatic tracking by file system processes and databases Mountpoints to files set by administrators and users
Performance Client caching to reduce network load; callbacks to maintain cache consistency No local disk caching; limited cache consistency
Andrew Benchmark (5 phases, 8 clients) Average time of 210 seconds/client Average time of 280 seconds/client
Scaling capabilities Maintains performance in small and very large installations Best in small to mid-size installations
  Excellent performance on wide-area configuration Best in local-area configurations
Security Kerberos mutual authentication Security based on unencrypted user ID's
  Access control lists on directories for user and group access No access control lists
Availability Replicates read-mostly data and AFS system information No replication
Backup Operation No system downtime with specially developed AFS Backup System Standard UNIX backup system
Reconfiguration By volumes (groups of files) Per-file movement
  No user impact; files remain accessible during moves, and file names do not change Users lose access to files and filenames change (mountpoints need to be reset)
System Management Most tasks performed from any workstation Frequently involves telnet to other workstations
Autonomous Architecture Autonomous administrative units called cells, in addition to file servers and clients File servers and clients
  No trust required between cells No security distinctions between sites
[ source: ftp://ftp.transarc.com/pub/afsps/doc/afs-nfs.comparison ]
Other points: - Some vendors offer more secure versions of NFS but implementations vary. Many NFS ports have no extra security features (such as Kerberos). - The AFS Cache Manager can be configured to work with a RAM (memory) based cache. This offers signifigant performance benefits over a disk based cache. NFS has no such feature. Imagine how much faster it is to access files cached into RAM! - The Andrew benchmark demonstrates that AFS has better performance than NFS as the number of clients increases. A graph of this (taken from Andrew benchmark report) is available in: - ![20050131\_graph\_afs\_nfs.jpg](http://www.angelfire.com/hi/plutonic/images/20050131_graph_afs_nfs.jpg)