openafs.git
Jeffrey Altman [Tue, 21 Feb 2012 01:50:53 +0000]
Windows: AFSPerformObjectInvalidate hold ExtentsResource shared

AFSPerformObjectInvalidate() was obtaining exclusive
access to the Fcb ExtentsResource even though it was not
tearing down the extents list.  The ExtentsResource could
be held shared instead.  Doing so will avoid the following
deadlock:

Thread 1:
 nt!MmPurgeSection+0x403
 nt!CcPurgeCacheSection+0x100
 AFSRedirLib!AFSPerformObjectInvalidate+0xd4
 AFSRedirLib!AFSWorkerThread+0xa4
 nt!PspSystemThreadStartup+0x2e

Thread 2:
 AFSRedirLib!AFSAcquireShared+0x18
 AFSRedirLib!AFSMarkDirty+0x68
 AFSRedirLib!AFSNonCachedWrite+0x603
 AFSRedirLib!AFSCommonWrite+0x5fa
 AFSRedirLib!AFSWrite+0x20
 nt!IofCallDriver+0x45
 AFSRedir!AFSWrite+0x57
 nt!IofCallDriver+0x45
 fltMgr!FltpDispatch+0x6f
 nt!IofCallDriver+0x45
 AMFilter+0x2c6e
 nt!IofCallDriver+0x45
 PMDriver+0x112a
 nt!IofCallDriver+0x45
 OpLoader+0x1cd2
 nt!IofCallDriver+0x45
 savonaccesscontrol+0x6f15
 savonaccessfilter+0x2fa0
 nt!IofCallDriver+0x45
 nt!IoAsynchronousPageWrite+0xd0
 nt!MiMappedPageWriter+0x127
 nt!PspSystemThreadStartup+0x2e

Thread 1 is attempting to perform a cache purge which cannot complete
until Thread 2 is finished but Thread 2 requires the ExtentsResource
which is held by Thread 1.

Change-Id: I4582093cf973f61cf6aff0df5e23b6711ec708b3
Reviewed-on: http://gerrit.openafs.org/6744
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Jeffrey Altman [Mon, 20 Feb 2012 06:48:20 +0000]
Windows: fsLockCount not accurate

Prior to 1.6.2 the file server does not report an accurate value
for the lock state.  In addition, callbacks are not broken when
locks are freed due to lease expiration.

Change-Id: I5b79d1d59c2ace9834cf23dfbef33e343ce6dda0
Reviewed-on: http://gerrit.openafs.org/6741
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Jeffrey Altman [Mon, 20 Feb 2012 06:40:03 +0000]
viced: lockcount only valid if not expired

Locks are issued on a lease.  If the lock has expired, the lock
count is zero.
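
The rule above can be sketched in a few lines of C. This is only an
illustration, not the fileserver code: the struct and the function
name (demo_lock, demo_effective_lock_count) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, simplified lock record; not the fileserver's real struct. */
struct demo_lock {
    int count;       /* raw lock count as stored */
    time_t expires;  /* time at which the lease runs out */
};

/* Report the stored count only while the lease is still valid. */
int demo_effective_lock_count(const struct demo_lock *lock, time_t now)
{
    assert(lock != NULL);
    if (now >= lock->expires)
        return 0;    /* lease expired: treat the vnode as unlocked */
    return lock->count;
}
```

A caller would pass the current time and use the returned value in
place of the raw count.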

Change-Id: I628dd5b8b0d38694d653d9e8e82ff60ec2e1505c
Reviewed-on: http://gerrit.openafs.org/6740
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Marc Dionne [Mon, 20 Feb 2012 22:56:29 +0000]
volser: Remove unused variable

tid is now unused - remove it to avoid a warning.

Change-Id: If2d4fdf16415bbf19de3cd8a3e621d04d4d9b018
Reviewed-on: http://gerrit.openafs.org/6743
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Andrew Deason [Fri, 17 Feb 2012 23:12:46 +0000]
viced: Relax "h_TossStuff_r failed" warnings

Currently, h_TossStuff_r bails out and logs a message if we detect
that somebody grabbed a reference or locked the host while we tried to
h_NBLock_r. The reasoning for this is that it is not legal for anyone
to h_Hold_r a host that has HOSTDELETED set (but the error is
detectable and recoverable); callers are supposed to check for
HOSTDELETED and not hold a host in that case.

However, HOSTDELETED may not be set when h_TossStuff_r is called,
since we call it if either HOSTDELETED _or_ CLIENTDELETED are set. If
CLIENTDELETED is set and HOSTDELETED is not, it's perfectly fine (and
necessary) for callers to grab a reference to the host. So, if that's
what is going on, don't log a message, since that's normal behavior.

Check for HOSTDELETED before we h_NBLock_r, since it is technically
possible (and legal) for someone to grab a reference to the host and
somehow set HOSTDELETED while we wait for h_NBLock_r to return. Also
log the flags when we see this message.

Change-Id: Ie50a0617de094bb1c721da28f100ed4b31aa849f
Reviewed-on: http://gerrit.openafs.org/6733
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

Andrew Deason [Fri, 17 Feb 2012 22:24:16 +0000]
viced: Remove extraneous h_AHTAHT_r in h_GetHost_r

We added this address to the host with an addInterfaceAddr_r call just
a few lines before, which adds the host to the address hash table.
Another call to h_AddHostToAddrHashTable_r is pure overhead and
confusing.

Change-Id: Ib08817274e632f67776956ede8b56eaf0dce879e
Reviewed-on: http://gerrit.openafs.org/6729
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

Andrew Deason [Fri, 17 Feb 2012 21:46:50 +0000]
viced: Set h_GetHost_r probefail if MPAA_r fails

Currently, in h_GetHost_r, if we get a connection whose address does
not match an extant host, but the reported uuid does, we ProbeUuid the
old host. If it fails, we call MultiProbeAlternateAddress_r and set
'probefail'. Later on, if 'probefail' is set, we always add the
connection address to the host, and remove the host->host,host->port
address from the host.

However, this is not always correct. Consider the following situation.

We have an existing host that has primary address 1.1.1.1, and also
has addresses 1.1.1.2 and 1.1.1.3 on the interface list but not on the
hash table. Say that host A stops responding on 1.1.1.1, and a
connection comes in from 1.1.1.2. We ProbeUuid 1.1.1.1 and get a
failure, so we call MultiProbeAlternateAddress_r.
MultiProbeAlternateAddress_r probes via rx_Multi the addresses 1.1.1.2
and 1.1.1.3. Say that 1.1.1.3 responds first, and responds
successfully, so MultiProbeAlternateAddress_r sets 1.1.1.3 to be the
primary address for the host.

After MultiProbeAlternateAddress_r returns, 'probefail' is set. A few
lines down, we see that oldHost->host does not match haddr, and
'probefail' is set, so we add 1.1.1.2 to the interface list, and
remove 1.1.1.3, and set 1.1.1.2 to be the primary address, even though
1.1.1.3 is the address we most recently 'know' is correct.

To fix this, only set 'probefail' if MultiProbeAlternateAddress_r also
fails after the failed ProbeUuid call. Conceptually this makes sense,
since if MultiProbeAlternateAddress_r succeeds, it found an address
that responds successfully to ProbeUuid, and it sets that address to
be the primary address. Therefore, after MultiProbeAlternateAddress_r
returns success, the situation is the same as if the 'good' address
was already the primary address, and the ProbeUuid call succeeded, so
'probefail' should be cleared.
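
As a rough sketch of the decision described above (the struct and
function are hypothetical stand-ins, and the two booleans replace the
real ProbeUuid and MultiProbeAlternateAddress_r results):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the two probe outcomes. */
struct demo_probe_result {
    bool primary_ok;    /* probe of the current primary address succeeded */
    bool alternate_ok;  /* the multi-address probe found a responding address */
};

/* 'probefail' is set only when both the primary and the alternates fail. */
bool demo_probefail(const struct demo_probe_result *r)
{
    assert(r != NULL);
    if (r->primary_ok)
        return false;        /* primary answered: nothing failed */
    return !r->alternate_ok; /* primary failed: depends on the alternates */
}
```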

Change-Id: Id32817916a8a42db567ad099aae00745b79598c5
Reviewed-on: http://gerrit.openafs.org/6728
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

Andrew Deason [Fri, 17 Feb 2012 19:14:31 +0000]
viced: Correctly update addrs on alt addr probe

The functions MultiBreakCallBackAlternateAddress_r and
MultiProbeAlternateAddress_r try to find a valid address in a host's
interface list of addrs. If they find one, they update host->host and
host->port. However, they do so just by changing those fields directly
and by calling h_DeleteHostFromAddrHashTable_r and
h_AddHostToAddrHashTable_r. This leaves the old host->host, host->port
on the interface list, and leaves it marked as 'valid'. Similarly, the
new host and port may still be marked as not 'valid'.

This can result in the host being on the addr hash table via an
address that is not on the host's interface list. After the above
situation occurs, we may call

  removeInterfaceAddr_r(host, host->host, host->port);

and then update host->host and host->port, which happens in a variety
of places. Since host->host, host->port is not marked as valid in the
interface list, it is not removed from the addr hash table, but it is
removed from the interface list. Eventually, this can cause the host
to be referenced from the addr hash table even after it has been
freed.

Since this can result in hash table entries pointing to the 'wrong'
host, this can result in FileLog messages such as:

Sun Feb  5 03:16:35 2012 Removing address that does not belong to host 0xdeadbeefdead (1.2.3.4:7001).

And bogus instances of the message:

Sun Feb  5 03:16:36 2012 CB: new identity for host 0xdeadbeefdead (1.2.3.4:7001), deleting(1 baadcafe 12345678-9abc-def0-12-34-456789abcdef fedcba98-76543210f-ed-cb-a9876543210f)

To fix this, make MultiBreakCallBackAlternateAddress_r and
MultiProbeAlternateAddress_r update the address list the same way as
all of the code in host.c does: by adding the new address with
addInterfaceAddr_r, removing the old address with
removeInterfaceAddr_r, and updating host->host and host->port.
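
The add/remove/update ordering can be illustrated with a toy model.
Everything here is hypothetical (demo_host, demo_add_addr,
demo_remove_addr, demo_set_primary), and the address hash table that
the real addInterfaceAddr_r and removeInterfaceAddr_r also maintain is
left out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_ADDRS 8

struct demo_addr { unsigned int ip; unsigned short port; bool valid; };

struct demo_host {
    unsigned int host;    /* primary address */
    unsigned short port;  /* primary port */
    size_t naddrs;
    struct demo_addr addrs[DEMO_MAX_ADDRS];
};

/* Add an address to the interface list, or mark an existing entry valid. */
void demo_add_addr(struct demo_host *h, unsigned int ip, unsigned short port)
{
    assert(h != NULL);
    for (size_t i = 0; i < h->naddrs; i++) {
        if (h->addrs[i].ip == ip && h->addrs[i].port == port) {
            h->addrs[i].valid = true;
            return;
        }
    }
    assert(h->naddrs < DEMO_MAX_ADDRS);
    h->addrs[h->naddrs] = (struct demo_addr){ ip, port, true };
    h->naddrs++;
}

/* Remove an address from the interface list, if present. */
void demo_remove_addr(struct demo_host *h, unsigned int ip, unsigned short port)
{
    assert(h != NULL);
    for (size_t i = 0; i < h->naddrs; i++) {
        if (h->addrs[i].ip == ip && h->addrs[i].port == port) {
            h->naddrs--;
            h->addrs[i] = h->addrs[h->naddrs]; /* fill the hole with the last entry */
            return;
        }
    }
}

/* Switch the primary address: add new, remove old, then update the fields. */
void demo_set_primary(struct demo_host *h, unsigned int ip, unsigned short port)
{
    assert(h != NULL);
    if (h->host == ip && h->port == port)
        return;
    demo_add_addr(h, ip, port);
    demo_remove_addr(h, h->host, h->port);
    h->host = ip;
    h->port = port;
}
```

Keeping the list update and the field update in one helper means the
primary address is always present and valid on the interface list.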

Change-Id: I0a95e0186c03c1831c4df86daae901bf2462da0e
Reviewed-on: http://gerrit.openafs.org/6727
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

Andrew Deason [Thu, 16 Feb 2012 22:20:16 +0000]
viced: Delete dup host before probing old host

Currently, when the fileserver gets a new connection from an address
not on the addr hash table, we allocate a new host structure and add
that host to the addr hash table. If we then find that that host's
uuid matches the uuid of an extant host, we do the following:

 - probe the old host with the uuid, and MultiProbeAlternateAddress_r
   if the probe fails

 - mark the duplicate host as HOSTDELETED

 - manipulate the interface lists

Consider, for example, that we have an extant host ('oldHost') with
address 1.2.3.4:7001, but with 5.6.7.8:7001 on its alternate interface
list. At some point, the 1.2.3.4:7001 interface goes away or becomes
unreachable. A new connection comes in from that same host on
5.6.7.8:7001.

What will happen is we create a new host for address 5.6.7.8:7001, and
then detect the uuid collision. When we try to probe the old address
of 1.2.3.4:7001, it will fail, and we will try to
MultiProbeAlternateAddress_r. MultiProbeAlternateAddress_r will
determine that the alternate address 5.6.7.8:7001 responds
successfully to the probe, and it tries to set 5.6.7.8:7001 to be the
primary address of 'oldHost', and add 'oldHost' to the addr hash table
under 5.6.7.8:7001.

But the "new" host from the incoming connection is already hashed on
the address hash table under 5.6.7.8:7001, so the
h_AddHostToAddrHashTable_r call in MultiProbeAlternateAddress_r fails.
Since we later delete the new duplicate host, this results in
5.6.7.8:7001 being the primary address for the host, but that address
is not anywhere in the address hash table.

This behavior can be seen by the following pair of FileLog messages:

Wed Feb  1 11:02:38 2012 CB: ProbeUuid for 0xdeadbeefdead (1.2.3.4:7001) failed -01
Wed Feb  1 11:02:38 2012 h_AddHostToAddrHashTable_r: refusing to hash host beefdeadbaadcafe (5.6.7.8:7001) already hashed

While those messages do not necessarily indicate this problem, this
problem will result in those messages.

To fix this, mark the duplicate host as HOSTDELETED before we do any
probing on 'oldHost'. This way, if MultiProbeAlternateAddress_r tries
to add 'oldHost' to the addr hash table under 5.6.7.8:7001, it will be
able to do so successfully, since the old duplicate host is deleted.

Change-Id: Id3aaab0718425492dca1deba892725160677b85f
Reviewed-on: http://gerrit.openafs.org/6726
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

Derrick Brashear [Tue, 13 Dec 2011 17:46:36 +0000]
vos: allow releases without offline time

allow releases using dumps to clones to avoid offline time

Change-Id: I06ed71f12494e362aa10a851081c9dcaf8c9a1af
Reviewed-on: http://gerrit.openafs.org/6254
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Derrick Brashear [Tue, 13 Dec 2011 17:29:30 +0000]
vos: refactor code

change vos to remove lots of duplicated code for volume deletes and clones

Change-Id: I1f39e857de6eefa0d8897e4eb8ece49e4a72f518
Reviewed-on: http://gerrit.openafs.org/6253
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Andrew Deason [Mon, 13 Feb 2012 20:11:36 +0000]
Rx: Avoid lastBusy/PEER_BUSY discrepancy

If an rx call has the RX_CALL_PEER_BUSY flag set, but the call's
conn->lastBusy is not set, we can easily cause an rx caller to loop
infinitely. rx_NewCall will see that lastBusy for a call channel is
not set, and will use that call channel, but rxi_CheckBusy will note
that the call appears busy and that there are non-busy call channels
on the same conn, and so will return RX_CALL_BUSY.

This can currently happen in rxi_ResetCall, since we set
RX_CALL_PEER_BUSY on the call again if the call had that flag set when
rxi_ResetCall was called. If we are calling rxi_ResetCall with
'newcall' set, the passed in call is unrelated to the new call, since
it was obtained from the free list. Thus, the busy-ness of the call
should be ignored. Fix this by only paying attention to the incoming
RX_CALL_PEER_BUSY flag if 'newcall' is not set.

Also prevent this from happening by clearing RX_CALL_PEER_BUSY in
rx_NewCall when we select a call and clear lastBusy for that call.
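
A minimal sketch of the reset rule, assuming a simplified call
structure. The flag value and the names (DEMO_CALL_PEER_BUSY,
demo_call, demo_reset_call) are illustrative and are not the Rx
definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_CALL_PEER_BUSY 0x1u  /* stand-in flag; the value is illustrative */

struct demo_call { unsigned int flags; };

/* Reset a call.  The busy flag survives only when the same call is being
 * reused; a call taken from the free list ('newcall') starts out clean. */
void demo_reset_call(struct demo_call *call, bool newcall)
{
    assert(call != NULL);
    unsigned int busy = call->flags & DEMO_CALL_PEER_BUSY;
    call->flags = 0;
    if (!newcall && busy)
        call->flags |= DEMO_CALL_PEER_BUSY;
}
```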

Change-Id: Ic5a4709854b62d962ed91ee0103c6cbdd735d175
Reviewed-on: http://gerrit.openafs.org/6707
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Derrick Brashear [Tue, 13 Dec 2011 17:00:52 +0000]
vol: allow clones of readonly volumes

allow writing of data where it's not user data we're changing
(e.g. allow a vnode to be marked cloned in the vnode index)

Change-Id: If3338ab0474ddbfe895b705217d61c054c4cb696
Reviewed-on: http://gerrit.openafs.org/6251
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Derrick Brashear [Tue, 13 Dec 2011 16:24:16 +0000]
volser: allow clonevol purge id to be new id

effectively the same functionality that reclone already uses, but
for some reason we artificially limit it out of clone despite
the interface being there for it. it used to be there. put it back.

Change-Id: I22868c41f8d3b920ba61d01e5334ff2320b38376
Reviewed-on: http://gerrit.openafs.org/6250
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Derrick Brashear [Tue, 13 Dec 2011 16:22:38 +0000]
volser: allow cloning non-rw volumes

remove EROFS error which is the only thing preventing a working clone
on a non-RW.

Change-Id: Ic3d4d07519188712e9a38267fc74ebd1eaef7d8a
Reviewed-on: http://gerrit.openafs.org/6249
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Jeffrey Altman [Sun, 19 Feb 2012 00:57:25 +0000]
Windows: Dereg Lanman and Lsa reg values for afsredir

If the machine has been upgraded from an AFS SMB Server to the
AFS Redirector, the registry will have leftover configuration
for the "AFS" netbios name in the Lsa BackConnectionHostNames
value and the LanmanWorkstation ReconnectableServers and
ServersWithExtendedSessTimeout values.   These values are not
useful with the AFS Redirector since \\AFS is owned by afsredir.sys
and not the SMB redirector.  Remove the "AFS" netbios name from
these values when afsd_service.exe has started in redirector mode.

Change-Id: If8c100d3569595645c041ac58fedb1c835f9129f
Reviewed-on: http://gerrit.openafs.org/6737
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Ken Dreyer [Sat, 11 Feb 2012 16:43:30 +0000]
doc: replace hostnames with IETF example hostnames

There were several different real and made-up hostnames and company names used
throughout our documentation examples.

The IETF has reserved "example.com" and other "example" domains for use in
examples (RFC 2606). Replace almost all references to ABC Corporation, DEF
Corporation, and State University, as well as "abc.com", "bigcell.com",
"def.com", "def.gov", "ghi.com", "ghi.gov", "jkl.com", "mit.edu",
"stanford.edu", "state.edu", "stateu.edu", "uncc.edu", and "xyz.com".
Standardize on "Example Corporation", "Example Network", "Example
Organization" (example.com, example.net, and example.org).

The Scout documentation in the Admin Guide contains PNG images that contain
the old cell names, so I left those references until the images can be
replaced.

Change-Id: I4e44815b2d2ffe204810b7fd850842248f67c367
Reviewed-on: http://gerrit.openafs.org/6697
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Jeffrey Altman [Sat, 18 Feb 2012 03:21:00 +0000]
Windows: Explorer Shell Set Unix Mode bits

The Unix Mode bits were not being saved.  This patch permits
them to be saved.

FIXES 130572

Change-Id: I6bf96c04115ee0f01e84b44b9efaacb578d95cbc
(cherry picked from commit 534d95ef90ac5e5ebf5deb227008e0b023e7ef8b)
Reviewed-on: http://gerrit.openafs.org/6734
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Jeffrey Altman [Fri, 17 Feb 2012 15:37:34 +0000]
Windows: remove unnecessary DirectoryEnumEvent

The DirectoryEnumEvent is not required to implement:

  AFSSetEnumerationEvent
  AFSClearEnumerationEvent
  AFSIsEnumerationInProgress

The DirectoryEnumCount is modified by interlocked operations
and can be used as a marker for when an enumeration is in progress.
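
The idea can be sketched with C11 atomics standing in for the Windows
interlocked operations. The type and function names (demo_dir,
demo_set_enumeration, and so on) are hypothetical and only mirror the
three routines listed above:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical directory object carrying only the enumeration counter. */
struct demo_dir { atomic_long enum_count; };

/* Mark the start of an enumeration. */
void demo_set_enumeration(struct demo_dir *d)
{
    assert(d != NULL);
    atomic_fetch_add(&d->enum_count, 1);
}

/* Mark the end of an enumeration. */
void demo_clear_enumeration(struct demo_dir *d)
{
    assert(d != NULL);
    long prev = atomic_fetch_sub(&d->enum_count, 1);
    assert(prev > 0);  /* every clear must pair with an earlier set */
    (void)prev;
}

/* A non-zero count means at least one enumeration is in progress. */
bool demo_enumeration_in_progress(struct demo_dir *d)
{
    assert(d != NULL);
    return atomic_load(&d->enum_count) > 0;
}
```

With the counter as the only state, no separate event object needs to
be signalled or reset.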

Change-Id: I414ce2bc753b0fd60a3fac51c2cf3d264a32ab05
Reviewed-on: http://gerrit.openafs.org/6725
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Jeffrey Altman [Fri, 17 Feb 2012 04:50:18 +0000]
Windows: VolumeCB->ObjectInfoTree.TreeLock Deadlock

AFSPrimaryVolumeWorkerThread held the VolumeCB->ObjectInfoTree.TreeLock
exclusively across calls to AFSCleanupFcb() which in turn triggers
a file extent release to the service which can in turn result in
an object invalidation.  Processing the invalidation requires shared
access to VolumeCB->ObjectInfoTree.TreeLock which results in a deadlock.

This patch alters the processing of AFSPrimaryVolumeWorkerThread
so that the VolumeCB->ObjectInfoTree.TreeLock is not held across
the AFSCleanupFcb() calls.

FIXES 130431

Change-Id: I3726df02ab47d2dcc83a32c75957a5dafcfbf20e
Reviewed-on: http://gerrit.openafs.org/6724
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Michael Meffie [Thu, 16 Feb 2012 15:58:50 +0000]
volinfo: initialize vnode details

Clear the vnode details object. Fixes the path lookup in volscan.

Change-Id: I5176cf50bdb54529230fc72e4d1a65a20b4c14ba
Reviewed-on: http://gerrit.openafs.org/6722
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Derrick Brashear [Mon, 13 Feb 2012 21:11:19 +0000]
libafs: kill rxevent daemon even in upcall mode

the switch from rxk listener env to upcall env could leave the event
daemon running. fix that.

Change-Id: Ibe36e7473536c36a739c0ad1e18fcf6880c98021
Reviewed-on: http://gerrit.openafs.org/6713
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Ken Dreyer [Thu, 16 Feb 2012 03:12:56 +0000]
doc: refer to aklog instead of klog

klog (and kaserver) are deprecated. In generic examples, refer to the Kerberos
5 equivalents.

Change-Id: I95806a384686033fe2f03573017fc619c2a376c7
Reviewed-on: http://gerrit.openafs.org/6721
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

Jeffrey Altman [Wed, 15 Feb 2012 05:06:47 +0000]
Windows: disable afsdhook.dll reload by daemon

The daemon thread's loading and unloading of afsdhook.dll every
second prevents the disk drive from sleeping and forces a search
of the PATH.   Make the periodic reloading configurable and
disable it by default.

Change-Id: I7e1a5b2bc7e1c4d4ea39fc30cf34c1195a326ed2
Reviewed-on: http://gerrit.openafs.org/6715
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Jeffrey Altman [Wed, 15 Feb 2012 02:52:28 +0000]
Windows: remove install9x rules

Change-Id: I293f982d0f1466fd9bf213db055eedafc3c79977
Reviewed-on: http://gerrit.openafs.org/6712
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Jeffrey Altman [Tue, 14 Feb 2012 21:02:02 +0000]
Windows: remove AFS_WIN95_ENV

No longer build for Win9x.  Remove AFS_WIN95_ENV conditionals.

Change-Id: I7082017a3aaa9a30723549974c4d8af50025b923
Reviewed-on: http://gerrit.openafs.org/6711
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Jeffrey Altman [Tue, 14 Feb 2012 20:35:07 +0000]
Windows: add KTC_TOKEN_MUTEX_FAIL error code

If acquisition of the Global\AFS_KTC_Mutex fails, return a
different error code from a pioctl failure since the pioctl
was never issued.

Change-Id: I001227f87e97a06bf419c68d6579843e4f93f032
Reviewed-on: http://gerrit.openafs.org/6710
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Jeffrey Altman [Tue, 14 Feb 2012 17:01:38 +0000]
Windows: avoid GetComputerNameW call for all ioctl

Cache the value of GetComputerNameW() to avoid repeated calls
for each and every redirector ioctl request.
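
The caching pattern looks roughly like the sketch below. It is
portable C rather than Windows code: demo_query_name is a hypothetical
stand-in for the expensive system call, the returned name is made up,
and the sketch is not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

int demo_lookups;  /* counts how often the slow path runs */

/* Stand-in for an expensive system query such as GetComputerNameW(). */
void demo_query_name(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    demo_lookups++;
    strncpy(buf, "EXAMPLEHOST", len - 1);
    buf[len - 1] = '\0';
}

/* Return the name, querying the system only on the first call. */
const char *demo_computer_name(void)
{
    static char cached[64];
    static bool have_name = false;

    if (!have_name) {
        demo_query_name(cached, sizeof cached);
        have_name = true;
    }
    return cached;
}
```

Every request after the first returns the cached buffer, so the slow
query runs once per process instead of once per ioctl.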

Change-Id: I4476db982897a631510eba7d859385268b16ce34
Reviewed-on: http://gerrit.openafs.org/6708
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Andrew Deason [Wed, 8 Feb 2012 22:03:29 +0000]
RedHat: Fail openafs-client 'stop' on rmmod error

Currently, the openafs-client RPM init script ignores any error
reported by rmmod. If 'umount /afs' succeeds but rmmod does not, the
client may panic the machine if the client is started again (from e.g.
running the 'restart' init script method), since afsd will try to
initialize AFS with a libafs that has been shut down.

So, do not ignore errors from 'rmmod', and instead fail the 'stop'
method from the init script if we get an error.

Change-Id: Id4a07703fb4df69ad3a6a3569c91e48f73a0d309
Reviewed-on: http://gerrit.openafs.org/6709
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Jeffrey Altman [Sun, 12 Feb 2012 03:14:23 +0000]
doc: fix AdminGuide

The AdminGuide was broken by e99224f2fe049bc339e87c8b6c195de67dca2f08.

Change-Id: I4fc67d36857d62b562092b9892636f3e4c6d6623
Reviewed-on: http://gerrit.openafs.org/6703
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Jeffrey Altman [Sat, 11 Feb 2012 22:31:00 +0000]
Windows: default cell grand.central.org

Change the default cell from openafs.org to grand.central.org
since there is no openafs.org cell.  All openafs software is
distributed from the grand.central.org cell.

Change-Id: I21ea2c5a9b55fbe3bb4ea19ae34ecf0e5a38084f
Reviewed-on: http://gerrit.openafs.org/6699
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Jeffrey Altman [Sat, 11 Feb 2012 22:29:51 +0000]
Windows: reset version to 0.0.0 on master

Master does not track a particular version number.
For Windows builds on master, reset the version to
0.0.0 so that the builds are not confused with the actual
1.5.7600.

Change-Id: I3c84bb117418284de0d65e2a4069b88908b91659
Reviewed-on: http://gerrit.openafs.org/6698
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Jeffrey Altman [Sat, 11 Feb 2012 17:49:33 +0000]
Windows: AFSRemoveFcb() cannot race

Modify AFSRemoveFcb to use InterlockedCompareExchangePointer
to ensure that only one thread can remove and deallocate an
AFSFcb structure.
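
A portable sketch of the technique, using a C11 compare-and-swap in
place of the Windows interlocked call. The types and the function
(demo_fcb, demo_object, demo_remove_fcb) are hypothetical and much
simpler than the redirector's structures:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct demo_fcb { int placeholder; };

/* Hypothetical owner object holding the pointer that threads race on. */
struct demo_object { _Atomic(struct demo_fcb *) fcb; };

/* Returns true only for the one caller that detached and freed the Fcb. */
bool demo_remove_fcb(struct demo_object *obj)
{
    assert(obj != NULL);
    struct demo_fcb *seen = atomic_load(&obj->fcb);
    if (seen == NULL)
        return false;  /* already removed by someone else */

    /* Swap the observed pointer for NULL; only one thread can succeed. */
    if (!atomic_compare_exchange_strong(&obj->fcb, &seen,
                                        (struct demo_fcb *)NULL))
        return false;  /* lost the race */

    free(seen);
    return true;
}
```

Because the free happens only after a successful swap, a second caller
can never see the same non-NULL pointer and free it again.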

Change-Id: I27d27b6a99806bee2fc2cfc04c2ac04d975a553d
Reviewed-on: http://gerrit.openafs.org/6696
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Ken Dreyer [Fri, 10 Feb 2012 00:37:01 +0000]
doc: add section on direct volume access

Provide examples of the direct volume access syntax, using the
fictitious example.com cell.

Change-Id: Ia2ea592531e29f6b744d0bd6993d598d78a799c4
Reviewed-on: http://gerrit.openafs.org/6691
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Jeffrey Altman [Fri, 10 Feb 2012 13:56:12 +0000]
Windows: Perform rename to self check earlier

In AFSSetRenameInfo(), the rename to itself check was performed
after the name collision check.  Move the check earlier in the
routine to ensure that we catch the no-op before any real work
is done.

Change-Id: I580dd9958a259d4a1819c6bd882dae8067d2853d
Reviewed-on: http://gerrit.openafs.org/6692
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Andrew Deason [Tue, 20 Dec 2011 22:44:42 +0000]
viced: Keep H_LOCK while locking host in h_Alloc_r

Currently in h_Alloc_r, we h_Lock_r the host, so we have it locked on
return. However, h_Lock_r drops the host glock, which is bad in this
situation since we have already added the host to the global hash
table, so other threads may see it. This can mean that by the time
h_Alloc_r returns, the returned host may have HOSTDELETED set, and/or
the addresses associated with the host may be completely different.

h_Alloc_r's caller, h_GetHost_r, seems to assume that the host is
still associated with the address of the passed-in connection. When
this is not true, this can result in the host structure getting into a
strange state, such as the primary addr/port may not be hashed. The
host may also have HOSTDELETED set, in which case we're not supposed
to be dealing with it at all.

To avoid these problems, lock host->lock directly in h_Alloc_r,
without going through h_Lock_r and dropping H_LOCK. Also do it as one
of the first things we do to initialize the host, just to make sure
that if anybody else happens to see the host, it is locked by us when
they do.

Change-Id: Ia99cb84ad94f3e143ed0bae33485a88d60ff5b27
Reviewed-on: http://gerrit.openafs.org/6389
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Marc Dionne [Sun, 22 Jan 2012 14:45:22 +0000]
viced: Allow null host for BreakCallBack

For replication writes at the remote site, we will want to call
this without a host structure.

Change-Id: I9cdef18f35229c9ab162cc07f6d60fe443204654
Reviewed-on: http://gerrit.openafs.org/6674
Reviewed-by: Simon Wilkinson <simonxwilkinson@gmail.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Jonathan A. Kollasch [Tue, 7 Feb 2012 21:23:23 +0000]
libafsauthent, tvolser: fix objdir build

Change-Id: I50c3424d61fc440f870207229a9540ebdb9a9632
Reviewed-on: http://gerrit.openafs.org/6689
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Jeffrey Altman [Tue, 7 Feb 2012 20:56:12 +0000]
Windows: Release Notes corrections

Add missing BlockSize registry value

Correct AFSRedirector\NetworkProvider registry key description

Add note that LanAdapter value is ignored if SMB mode is not in use.

Change-Id: I449988f1f6841c1b254d73b08a6ee53ca2dbaeda
Reviewed-on: http://gerrit.openafs.org/6685
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Jeffrey Altman [Mon, 6 Feb 2012 17:00:58 +0000]
Windows: OpenAFS reparse points are surrogates

OpenAFS reparse points represent mount points, symlinks, and DFS
referrals, all of which are file system objects that represent
another named entity in the system.  As a result the reparse tag
field must have the Reparse Tag Surrogate bit (0x20000000) set.

This permits the IsReparseTagNameSurrogate() macro provided in
winnt.h to be used to determine if the reparse point is a surrogate
or not.

See
http://msdn.microsoft.com/en-us/library/windows/desktop/aa365197%28v=vs.85%29.aspx
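
The bit test can be sketched in portable C. The macro and function
names below are illustrative only; on Windows the
IsReparseTagNameSurrogate() macro from winnt.h would be used instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Name-surrogate bit of a reparse tag, as quoted in the commit message. */
#define DEMO_REPARSE_TAG_SURROGATE_BIT UINT32_C(0x20000000)

/* True if the tag marks the object as standing in for another named entity. */
bool demo_is_name_surrogate(uint32_t tag)
{
    return (tag & DEMO_REPARSE_TAG_SURROGATE_BIT) != 0;
}

/* Return the tag with the surrogate bit set, leaving other bits untouched. */
uint32_t demo_mark_surrogate(uint32_t tag)
{
    uint32_t marked = tag | DEMO_REPARSE_TAG_SURROGATE_BIT;
    assert(demo_is_name_surrogate(marked));
    return marked;
}
```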

Change-Id: I2561823e23371c2fdf01941da99fe848ca1fa11d
Reviewed-on: http://gerrit.openafs.org/6668
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoRW Replication: Add basic definitions
Marc Dionne [Wed, 18 Jan 2012 19:04:28 +0000]
RW Replication: Add basic definitions

Add some basic definitions that will be needed to handle RW
replicas.

A new volume type RWREPL is added.  Replicas will share the same
volume ID as the RW volume, so the array of volume IDs by volume
type is unchanged, as is the VLDB entry format.

A new flag bit ITSRWREPL/VLSF_RWREPLICA for serverFlags identifies
RW replica sites in VLDB entries.

Change-Id: I882b238f34e682ebea782e11dc418ae1340d4546
Reviewed-on: http://gerrit.openafs.org/6676
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Simon Wilkinson <simonxwilkinson@gmail.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

9 years agovol: remove OPENAFS_VOL_STATS
Marc Dionne [Tue, 4 Oct 2011 21:47:48 +0000]
vol: remove OPENAFS_VOL_STATS

OPENAFS_VOL_STATS has been unconditionally defined since the IBM days.
Adjust the code to assume it is set.

Change-Id: I3b5ff99a469e6865ff1e10405a7f77d8c3890f59
Reviewed-on: http://gerrit.openafs.org/5551
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

9 years agoDisable kernel opt by default on Solaris 10 and 11
Andrew Deason [Mon, 6 Feb 2012 19:23:41 +0000]
Disable kernel opt by default on Solaris 10 and 11

With newer Solaris Studio (sometime in the 12.* series), cc started
adding SSE instructions to optimized x86 code, which is invalid for
kernel code and can generate panics. There appears to be no way to
turn this off currently (-xvector=%none is non-functional), so default
to not optimizing kernel code.

Change-Id: I5fdedb11219df68e0146b8e0cee9010c2eb4067e
Reviewed-on: http://gerrit.openafs.org/6671
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

9 years agoRx: Add missing rx_packet.h includes
Andrew Deason [Fri, 3 Feb 2012 22:06:16 +0000]
Rx: Add missing rx_packet.h includes

We no longer include rx_packet.h from rx.h, so rx_kcommon.h was not
picking up some packet-related definitions. Some files
(SOLARIS/rx_knet.c, IRIX/rx_knet.c) were using packet-related defines
(e.g. RX_HEADER_SIZE) while just including rx_kcommon.h. Include
rx_packet.h in those files to get the relevant definitions.

Change-Id: Ib012f295d8e324dd8b38eb0b89933eac392a9583
Reviewed-on: http://gerrit.openafs.org/6670
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

9 years agoSOLARIS: Use kcred instead of afs_osi_cred
Andrew Deason [Thu, 2 Feb 2012 23:35:52 +0000]
SOLARIS: Use kcred instead of afs_osi_cred

For many vfs ops to the cache, we currently pass &afs_osi_cred for our
credentials, which is a mostly zeroed-out credential structure. In
some modern versions of Solaris (Solaris 11), at least some parts of
this structure need to not be NULL (cr_zone), or we will panic.

The Solaris kernel provides a 'kcred' credentials structure for the
purpose of using "kernel" credentials for i/o. So just use that
instead, since kcred has existed at least since Solaris 8.

Change-Id: Ia5252580d2de6dd7adfa1a1929148362d1da6360
Reviewed-on: http://gerrit.openafs.org/6669
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

9 years agoWindows: Avoid race during PIOCtl DirNode allocation
Jeffrey Altman [Sat, 4 Feb 2012 22:26:02 +0000]
Windows: Avoid race during PIOCtl DirNode allocation

Use InterlockedCompareExchangePointer to assign the DirNode to
ObjectInfo->Specific.Directory.PIOCtlDirectoryCB.  Otherwise,
one thread could race with another thread when allocating the
pioctl object.
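
The allocate-then-publish pattern can be sketched in portable C11, with
atomic_compare_exchange_strong standing in for
InterlockedCompareExchangePointer.  The struct and function names below
are hypothetical stand-ins, not the redirector's types.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the pioctl directory node. */
struct dir_node {
    int id;
};

/* Hypothetical stand-in for the object that owns the PIOCtl pointer. */
struct object_info {
    _Atomic(struct dir_node *) pioctl_dir;
};

/*
 * Return the published node, allocating one if none exists yet.
 * If another thread published first, free our candidate and use theirs.
 */
struct dir_node *get_or_create_pioctl_dir(struct object_info *obj, int id)
{
    struct dir_node *existing = atomic_load(&obj->pioctl_dir);
    if (existing != NULL)
        return existing;

    struct dir_node *candidate = malloc(sizeof(*candidate));
    if (candidate == NULL)
        return NULL;
    candidate->id = id;

    struct dir_node *expected = NULL;
    if (atomic_compare_exchange_strong(&obj->pioctl_dir, &expected, candidate))
        return candidate;       /* we won the race */

    free(candidate);            /* we lost; expected now holds the winner */
    return expected;
}
```

A plain "if NULL then assign" would let two threads both allocate and
the second assignment would silently overwrite (and leak) the first.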

Change-Id: Ic5b1a0ff2e44f2c4520cc7f5e536bd876bc83a65
Reviewed-on: http://gerrit.openafs.org/6661
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: Hold Fcb references prior to service call
Jeffrey Altman [Sat, 4 Feb 2012 17:48:24 +0000]
Windows: Hold Fcb references prior to service call

If the Fcb reference count hits 0 while the service is called
it is possible that the Fcb can be garbage collected prior to
the completion of the call.

Change-Id: I32c3c5e3debb246fe63ac6f6cc5625b493ee47a9
Reviewed-on: http://gerrit.openafs.org/6660
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: Do not build NSIS by default
Jeffrey Altman [Sun, 5 Feb 2012 17:58:22 +0000]
Windows: Do not build NSIS by default

NSIS installers are no longer up to date and do not support 64-bit
builds.  OpenAFS no longer distributes them for 1.7 and beyond.
Stop building them by default.

Change-Id: I6b8c2b46ccc30654cfb4661c9bde50483bc99785
Reviewed-on: http://gerrit.openafs.org/6664
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: add buf_InvalidateBuffers
Jeffrey Altman [Fri, 3 Feb 2012 16:35:33 +0000]
Windows: add buf_InvalidateBuffers

Add a utility function that invalidates all buffers for a
cm_scache_t object.

Change-Id: Ib10139fb2aefa03d597d5afd494652fade40432e
Reviewed-on: http://gerrit.openafs.org/6651
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: fix cm_DirOpDelBuffer assert
Jeffrey Altman [Fri, 3 Feb 2012 16:21:45 +0000]
Windows: fix cm_DirOpDelBuffer assert

In cm_DirOpDelBuffer() the data version field for a buffer
in cm_dirOp_t.buffers[] can be CM_BUF_VERSION_BAD if the buffer
was added to the buffer list but was never fetched from the file
server.  If the buffer was recycled by buf_Get(), an attempt to
remove an entry from the directory fails instead of fetching the
buffer from the file server and performing the local removal.

Change-Id: Id9af5180f2176c2a90ef9907ae84139e66ffe5d6
Reviewed-on: http://gerrit.openafs.org/6650
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: buffer DV ranges do not work for directories
Jeffrey Altman [Fri, 3 Feb 2012 16:17:40 +0000]
Windows: buffer DV ranges do not work for directories

In cm_MergeStatus, always set cm_scache_t.bufDataVersionLow
to the new data version because the cm_dir package does not
support version ranges.   All modified dir buffers have their
dataVersion field set to the current data version value.

Failure to update the bufDataVersionLow field can result in
B+ Trees being constructed from out of date directory information.

Change-Id: Ic6bb6f78275de9c6c7960f2fc7c06c507b1144c1
Reviewed-on: http://gerrit.openafs.org/6649
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: update btree debugging code
Jeffrey Altman [Fri, 3 Feb 2012 16:16:04 +0000]
Windows: update btree debugging code

B+Tree key strings were changed to wchars for unicode support, but
the debugging printf format patterns were not updated to match.
Do so now.

Change-Id: I70619d2e3fbc007f3f21eaf56cc5d61503203818
Reviewed-on: http://gerrit.openafs.org/6648
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: Do not open file if shutdown in progress
Jeffrey Altman [Fri, 3 Feb 2012 16:14:50 +0000]
Windows: Do not open file if shutdown in progress

Perform the shutdown check earlier in AFSCommonCreate() to prevent
a request from being processed after the service indicates that
a shutdown has begun.

Change-Id: I8959141b5e2161ffe960e93a500b1153d9594a28
Reviewed-on: http://gerrit.openafs.org/6647
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: AFSRedir DebugFlags Turn on BugCheck
Jeffrey Altman [Wed, 1 Feb 2012 03:34:30 +0000]
Windows: AFSRedir DebugFlags Turn on BugCheck

Turn on bug checking by default via the installation.
This permits sites to disable the functionality but will allow
us to capture more meaningful minidump output.

Change-Id: I62b6d0ce5deed2c8798c9afb09565a8846c32a8c
Reviewed-on: http://gerrit.openafs.org/6646
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: Improve AFSNotifyDelete
Jeffrey Altman [Tue, 31 Jan 2012 20:51:34 +0000]
Windows: Improve AFSNotifyDelete

Do not call AFSNotifyDelete after the reference count on the
DirEntry->ObjectInformation is given up.

Log the Parent FID and file name since those are what is passed
to the service to perform a delete.  Log the actual FID of the
object being deleted and not the address of the FID fields.

Change-Id: Ic02e2cec625258356d1b08e03a02a7a9c4eb4ce7
Reviewed-on: http://gerrit.openafs.org/6645
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: do not lower case direct volume references
Jeffrey Altman [Tue, 31 Jan 2012 20:49:22 +0000]
Windows: do not lower case direct volume references

Not all volumes are lower case.  Do not lowercase the string.

Change-Id: Icb5f5ee9865bd856775486dffb1849f17f9b23f7
Reviewed-on: http://gerrit.openafs.org/6644
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agocom_err: correctly deal with lack of libintl
Tom Keiser [Wed, 1 Feb 2012 08:31:23 +0000]
com_err: correctly deal with lack of libintl

On machines lacking a libintl, _intlize() currently fails to initialize
the output error string--leading to tools (e.g., translate_et) returning
a null string; make afs_com_err fall back to returning the en/US canonical
error text when we don't have any i18n support...

Change-Id: I333745fb0a16e5bc9adb0755591d80de010d4d31
Reviewed-on: http://gerrit.openafs.org/6638
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

9 years agolinux: fix probing for noop_fsync
Christof Hanke [Sun, 29 Jan 2012 17:08:57 +0000]
linux: fix probing for noop_fsync

Commit 267934d0e6910c8d8166a6e78f93c1bab40857b8 introduced
probing code to deal with the renaming of simple_fsync
inside the Linux kernel.
This test does not take into account the different parameter
lists of noop_fsync and simple_fsync.
Fix this.

Change-Id: Ib490f0bb7e8098acc83fce001a43c08f478ad582
Reviewed-on: http://gerrit.openafs.org/6628
Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

9 years agoman-pages: add fs_getverify and fs_setverify
Jeffrey Altman [Sun, 29 Jan 2012 21:46:22 +0000]
man-pages: add fs_getverify and fs_setverify

Add man pages for two new Windows only commands

  fs getverify
  fs setverify -verify {on, off}

Change-Id: Id784608fba35147a4e33f22e43c7cd50a2307b9e
Reviewed-on: http://gerrit.openafs.org/6632
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: do not panic if afsredir not ready during shutdown
Jeffrey Altman [Sun, 29 Jan 2012 19:41:06 +0000]
Windows: do not panic if afsredir not ready during shutdown

Change-Id: I0de6ad0f799e2acf1c02c6d53cfd9b1b437328fc
Reviewed-on: http://gerrit.openafs.org/6630
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: Increase size of worker thread pools
Jeffrey Altman [Sun, 29 Jan 2012 15:39:28 +0000]
Windows: Increase size of worker thread pools

The size of the afs redirector worker thread pools should be
made configurable but for now just increase the pool size to
be in parity with the default worker pool created by the
afsd service.

Change-Id: Ib3c9356783162620112041582fa3d9dbaf8fce37
Reviewed-on: http://gerrit.openafs.org/6627
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: Run Workers until empty task queue
Jeffrey Altman [Sun, 29 Jan 2012 15:37:50 +0000]
Windows: Run Workers until empty task queue

Do not allow a worker thread to sleep until the task queue is
empty.  It is better for the running thread to pick up and process
a task than to sleep this thread and wait for another one to wake
up to perform the work.
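
A single-threaded sketch of the drain-before-sleep loop.  The task and
queue types, count_task, and worker_drain are hypothetical; the real
worker also takes a queue lock and blocks on an event between wake-ups,
both of which are omitted here.

```c
#include <stddef.h>

/* Hypothetical task and queue types; locking is omitted for brevity. */
struct task {
    struct task *next;
    void (*run)(void *arg);
    void *arg;
};

struct task_queue {
    struct task *head;
    struct task *tail;
};

void queue_push(struct task_queue *q, struct task *t)
{
    t->next = NULL;
    if (q->tail != NULL)
        q->tail->next = t;
    else
        q->head = t;
    q->tail = t;
}

struct task *queue_pop(struct task_queue *q)
{
    struct task *t = q->head;
    if (t != NULL) {
        q->head = t->next;
        if (q->head == NULL)
            q->tail = NULL;
    }
    return t;
}

/* Sample task body: increments the int that arg points to. */
void count_task(void *arg)
{
    ++*(int *)arg;
}

/*
 * One wake-up of a worker: keep processing until the queue is empty,
 * and only then return to the caller, which would block on its event.
 * Returns the number of tasks processed during this wake-up.
 */
int worker_drain(struct task_queue *q)
{
    int processed = 0;
    struct task *t;

    while ((t = queue_pop(q)) != NULL) {
        t->run(t->arg);
        processed++;
    }
    return processed;
}
```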

Change-Id: I776bb9408ab054b045acb9bc003b88436cc4266b
Reviewed-on: http://gerrit.openafs.org/6626
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: Release Notes for 1.7.5
Jeffrey Altman [Sun, 29 Jan 2012 05:22:03 +0000]
Windows: Release Notes for 1.7.5

Release notes updates for 1.7.5.

Change-Id: Ie44441150fc077cc4ca7924c67322a1aed4cb9af
Reviewed-on: http://gerrit.openafs.org/6624
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: Stop the thundering herd
Jeffrey Altman [Fri, 20 Jan 2012 19:43:06 +0000]
Windows: Stop the thundering herd

The afs redirector used notification events to wake up worker
threads when a task was added to a work queue.  Notification
events when signalled wake up all threads instead of just one.

Instead, use synchronization events to wake up a single thread at
a time and restructure the code to permit workers to wake up
additional workers if there is additional work to be performed
or during library shutdown.

Thanks to Peter Scott for his assistance.

Change-Id: I0fb9d8578035f606f03170622fc9c50a1dbfee3a
Reviewed-on: http://gerrit.openafs.org/6595
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: DriveSubstitution handle too small buffer
Jeffrey Altman [Wed, 25 Jan 2012 16:27:39 +0000]
Windows: DriveSubstitution handle too small buffer

If the buffer passed to DriveSubstitution is too small the
resulting file path will end up being truncated.  At the very
least log the fact that truncation is occurring.  In addition
return the fact that truncation occurred to the caller.

In NPGetUniversalName allocate a 4K buffer on the heap instead
of calculating a buffer based on the local name buffer size.
The local name buffer size has no relationship with the required
buffer size for the expanded unc or device path.
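
A minimal sketch of detecting and reporting truncation, using snprintf
in place of the Windows string routines.  substitute_path and its
arguments are hypothetical; only the "log it and tell the caller" idea
comes from the description above.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write "<prefix><suffix>" into dst.
 * Returns 0 on success, or 1 if the result did not fit and was truncated.
 */
int substitute_path(char *dst, size_t dst_len,
                    const char *prefix, const char *suffix)
{
    /* snprintf returns the length the full result would have had. */
    int needed = snprintf(dst, dst_len, "%s%s", prefix, suffix);

    if (needed < 0 || (size_t)needed >= dst_len) {
        fprintf(stderr, "substitute_path: output truncated (buffer is %zu bytes)\n",
                dst_len);
        return 1;
    }
    return 0;
}
```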

FIXES 130548

Change-Id: I86fbb9db4aa6a438dbb5e793678ec52283d5546b
Reviewed-on: http://gerrit.openafs.org/6618
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: Invalidate all volumes at library init
Jeffrey Altman [Tue, 24 Jan 2012 22:09:01 +0000]
Windows: Invalidate all volumes at library init

The afsredirlib.sys library driver is unloaded when the afsd_service
stops and is reloaded when the afsd_service restarts.  During the
shutdown window any objects known to the kernel are preserved by
afsredir.sys.  When the afsd_service restarts, there are no valid
callbacks on any objects so the afsredirlib.sys must invalidate all
status info to permit the service to request a callback from the
file server on next use.

Change-Id: I3e8fa9513f435ff5cd1a8cfb8daa766aa30dd8c1
Reviewed-on: http://gerrit.openafs.org/6617
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: Refactor and consolidate afsredir invalidation
Jeffrey Altman [Tue, 24 Jan 2012 17:52:12 +0000]
Windows: Refactor and consolidate afsredir invalidation

Invalidation requests were being processed in an inconsistent
manner because different rules were being applied to volume root
directories and other objects and whether or not the invalidation
was a whole volume invalidation or not.

This patchset consolidates all invalidation logic for an object
in the new AFSInvalidateObject function.  AFSInvalidateObject
is then called from AFSInvalidateCache and AFSInvalidateVolume
as necessary.

AFSInvalidateVolume executes AFSInvalidateObject on all objects
in the volume object tree.  As a result, whole volume invalidations
whether triggered by the file server or "fs flushvolume" now work.

Change-Id: I83f110b0987eb153794b6803a1fe48247090277f
Reviewed-on: http://gerrit.openafs.org/6616
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agovlserver: Consolidate VLDB entry server flag definitions
Marc Dionne [Mon, 23 Jan 2012 02:21:51 +0000]
vlserver: Consolidate VLDB entry server flag definitions

Group the definitions of server flags for VLDB entries in one place,
and rename VLSERVER_FLAG_UUID to make its name consistent with the
other flags.
This makes it easier to see the complete set of flags and avoid
conflicts.

Change-Id: I3b326e3d97bc297c0314cfc48f0a066c3ff0415e
Reviewed-on: http://gerrit.openafs.org/6615
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

9 years agoviced: Remove the LWP fileserver
Simon Wilkinson [Mon, 7 Nov 2011 09:48:14 +0000]
viced: Remove the LWP fileserver

*) Remove all LWP specific code from the fileserver, and make pthread
   the default
*) Build the pthreaded fileserver in the 'viced' directory, rather than
   in tviced
*) Move the DAFS specific files from tviced to viced (arguably, these
   should move into dviced, but there are currently no source files in
   that directory)
*) Remove tviced from the build

Change-Id: I6e186c9fad6d9dccd04cf1317a80c087587ef25f
Reviewed-on: http://gerrit.openafs.org/5816
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

9 years agovol: remove SYNC fatal_error processing
Andrew Deason [Fri, 13 Jan 2012 18:43:16 +0000]
vol: remove SYNC fatal_error processing

Currently SYNC clients will "disable" themselves on certain error
patterns. For example, if the server end closes its file descriptor
too many times, or takes too long and then closes the fd, the SYNC
client will return an error and set fatal_error. On any subsequent
SYNC requests, the request will immediately fail without contacting
the server, often making SYNC client programs effectively useless
until they are restarted.

There isn't really any reason to cause future requests to fail.
Transient problems in the fileserver can easily make this situation
possible (e.g. a fileserver can crash but still take several minutes
to close the SYNC fd while the core is written to disk), and so while
we may return an error for a specific problematic request, future
requests may be fine.

So, just remove everything related to fatal_error, so future SYNC
requests can continue to be attempted. Adjust some log messages to
reflect the new behavior.

Change-Id: I4b8bfe53f591a9e8541cd5a98c909208df5bcbac
Reviewed-on: http://gerrit.openafs.org/6548
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

9 years agolibafs: add replicated connection pool
Derrick Brashear [Thu, 12 Jan 2012 21:48:54 +0000]
libafs: add replicated connection pool

keep pool of connections to use for replicated volumes,
so we can have a separate idle time setting

Change-Id: I61ed62c652c924b33fde920fac766c4ca0043826
Reviewed-on: http://gerrit.openafs.org/6546
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

9 years agoWindows: make lock reader history debug only
Jeffrey Altman [Sun, 15 Jan 2012 16:43:40 +0000]
Windows: make lock reader history debug only

The lock reader history on osi_rwlock is proving to be too
expensive.  Only use it for DEBUG builds.  Leave the data
structures the same so that DEBUG builds can be mixed with
a RELEASE build of afsd_service.exe.

Change-Id: If0eeddb63c8f9919cdb5e119f31cde77974447b6
Reviewed-on: http://gerrit.openafs.org/6559
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: store data verification mode
Jeffrey Altman [Sun, 22 Jan 2012 23:42:32 +0000]
Windows: store data verification mode

Over the lifetime of OpenAFS a number of bugs have been discovered
that can result in data corruption.  This new mode (Windows only)
will double check that the data received by the file server does
in fact match the data that was written by the cache manager.

After a successful StoreData and status merge but before the BIOD
is released, a fetchdata is issued to read the data written by the
cache manager.  If the data fails to match, the StoreData operation
is repeated.
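
The store, read back, compare, repeat cycle can be sketched as follows.
The callback types, the fake in-memory server, and the retry limit are
all assumptions made for illustration; the commit does not say whether
the real code bounds the number of repeats.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callbacks standing in for the StoreData and FetchData RPCs. */
typedef int (*store_fn)(void *server, const char *data, size_t len);
typedef int (*fetch_fn)(void *server, char *out, size_t len);

/*
 * Store data, read it back, and repeat the store if the server's copy
 * differs.  Returns 0 once a read-back matches, -1 on RPC failure or
 * when max_attempts stores have all failed verification.
 */
int store_with_verify(void *server, store_fn store, fetch_fn fetch,
                      const char *data, char *scratch, size_t len,
                      int max_attempts)
{
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (store(server, data, len) != 0)
            return -1;
        if (fetch(server, scratch, len) != 0)
            return -1;
        if (memcmp(data, scratch, len) == 0)
            return 0;           /* server holds what we wrote */
    }
    return -1;
}

/* Fake server for demonstration: corrupts the first store (len <= 16). */
struct fake_server {
    char bytes[16];
    int stores;
};

int fake_store(void *server, const char *data, size_t len)
{
    struct fake_server *s = server;
    memcpy(s->bytes, data, len);
    if (s->stores++ == 0)
        s->bytes[0] ^= 0x01;    /* simulate corruption on the first attempt */
    return 0;
}

int fake_fetch(void *server, char *out, size_t len)
{
    struct fake_server *s = server;
    memcpy(out, s->bytes, len);
    return 0;
}
```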

Data verification mode can be queried with "fs getverify" and set
with "fs setverify {on, off}".  The default value can be set with
the TransarcAFSDaemon\Parameters DWORD "VerifyData" registry value.

By default verification is disabled.

Change-Id: Ic99c1692e6e78790e65ae600c3e428a79df59370
Reviewed-on: http://gerrit.openafs.org/6601
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: VIOC_GETUNIXMODE = smb_IoctlGetUnixMode
Jeffrey Altman [Sun, 22 Jan 2012 23:38:49 +0000]
Windows: VIOC_GETUNIXMODE = smb_IoctlGetUnixMode

VIOC_GETUNIXMODE pioctl should execute smb_IoctlGetUnixMode not
smb_IoctlSetUnixMode.

Change-Id: Ia7dc3e1a82d7d14810f743f50ff7666f13ba8afc
Reviewed-on: http://gerrit.openafs.org/6600
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: fix fs setcrypt help message
Jeffrey Altman [Sun, 22 Jan 2012 23:37:14 +0000]
Windows: fix fs setcrypt help message

Options are on, auth, and off.

Change-Id: I671df4233801f39482b8cac096e89fa38955a852
Reviewed-on: http://gerrit.openafs.org/6599
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: release BIOD after status merge
Jeffrey Altman [Sun, 22 Jan 2012 23:33:43 +0000]
Windows: release BIOD after status merge

Releasing the BIOD permits the accumulated buffers to be accessed.
Releasing the BIOD before the cm_MergeStatus() call creates a
window where the buffer data version is larger than the cm_scache
data version.  Release the BIOD after the status merge.

Change-Id: I023413cd41fbbd2d844d79a3b29c087792fffa24
Reviewed-on: http://gerrit.openafs.org/6598
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoviced: disable rx keepalives during disk io
Derrick Brashear [Thu, 5 Jan 2012 22:19:45 +0000]
viced: disable rx keepalives during disk io

when we are going to hit the backend storage, disable keepalives.
the net effect of this is that no idle dead time is needed; instead,
the normal dead time will result in a connection with no activity
simply dying naturally if i/o blocks forever.

it's important that keepalives be enabled during callback breaks,
so that is done.

Change-Id: I1a7bfe0bc62a092ca7dd6dbc4710f1b8254ca9a1
Reviewed-on: http://gerrit.openafs.org/6515
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoRevert "Windows: disable memory extent interface"
Jeffrey Altman [Sat, 21 Jan 2012 04:10:51 +0000]
Revert "Windows: disable memory extent interface"

This reverts commit 503bc56403baf741a4a7056a4077edc43812b9d1

Change-Id: I9e40787ecd0833370a86486fab6644667e03aa3b
Reviewed-on: http://gerrit.openafs.org/6603
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoviced: remove FS_STATS_DETAILED
Marc Dionne [Tue, 4 Oct 2011 21:35:18 +0000]
viced: remove FS_STATS_DETAILED

FS_STATS_DETAILED has been unconditionally defined since the IBM days.
Adjust the code to assume it is set.

Change-Id: If7fb913bbb42dba5d749e7c30b8d9b7d81e4b4f8
Reviewed-on: http://gerrit.openafs.org/5550
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

9 years agoWindows: failover and retry for VBUSY
Jeffrey Altman [Wed, 18 Jan 2012 00:46:30 +0000]
Windows: failover and retry for VBUSY

When a file server returns the VBUSY error for an RPC the
cache manager records the 'srv_busy' state in the cm_serverRef_t
structure binding that file server to the active cm_volume_t
object.  The 'srv_busy' was never cleared which prevents the
volume from being accessed.

Clear the 'srv_busy' flag whenever cm_Analyze() receives a
CM_ERROR_ALLBUSY error which means that all replicas have
been tried or whenever the error is not VBUSY or VRESTARTING.

FIXES 130537

Change-Id: I5020198e4f0ded1df0f64e228e699852f9de7c4d
Reviewed-on: http://gerrit.openafs.org/6563
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: improved idle dead time handling
Jeffrey Altman [Fri, 25 Nov 2011 14:28:18 +0000]
Windows: improved idle dead time handling

RX_CALL_IDLE has been treated the same as RX_CALL_DEAD which is
a fatal error that results in the server being marked down.  This
is not the appropriate behavior for an idle dead timeout error
which should not result in servers being marked down.

Idle dead timeouts are locally generated and are an indication
that the server:

 a. is severely overloaded and cannot process all
    incoming requests in a timely fashion.

 b. has a partition whose underlying disk (or iSCSI, etc) is
    failing and all I/O requests on that device are blocking.

 c. has a large number of threads blocking on a single vnode
    and cannot process requests for other vnodes as a result.

 d. is malicious.

RX_CALL_IDLE is distinct from RX_CALL_DEAD in that idle dead timeout
handling should permit failover to replicas when they exist in a
timely fashion but in the non-replica case should not be triggered
until the hard dead timeout.  If the request cannot be retried, it
should fail with an I/O error.  The client should not retry a request
to the same server as a result of an idle dead timeout.

In addition, RX_CALL_IDLE indicates that the client has abandoned
the call but the server has not.  Therefore, the client cannot determine
whether or not the RPC will eventually succeed and it must discard
any status information it has about the object of the RPC if the
RPC could have altered the object state upon success.

This patchset splits the RX_CALL_DEAD processing in cm_Analyze() to
clarify that only RX_CALL_DEAD errors result in the server being marked
down.  Since Rx idle dead timeout processing is per connection and
idle dead timeouts must differ depending upon whether or not replica
sites exist, cm_ConnBy*() are extended to select a connection based
upon whether or not replica sites exist.  A separate connection object
is used for RPCs to replicated objects as compared to RPCs to non-replicated
objects (volumes or vldb).

For non-replica connections the idle dead timeout is set to the hard
dead timeout.  For replica connections the idle dead timeout is set
to the configured idle dead timeout.
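
The timeout selection rule above, as a small sketch.  The enum, struct,
and function names are hypothetical and the timeout values are whatever
the caller supplies.

```c
/* Hypothetical connection classes; the real code keeps a separate
 * connection object for replicated and non-replicated targets. */
enum conn_kind { CONN_NON_REPLICATED, CONN_REPLICATED };

/* Timeouts in seconds, supplied by the caller. */
struct timeouts {
    unsigned hard_dead;   /* hard dead timeout */
    unsigned idle_dead;   /* configured idle dead timeout */
};

/* Idle dead timeout to apply to a connection of the given kind. */
unsigned idle_dead_timeout_for(enum conn_kind kind, const struct timeouts *t)
{
    return kind == CONN_REPLICATED ? t->idle_dead : t->hard_dead;
}
```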

Idle dead timeout events and whether or not a retry was triggered
are logged to the Windows Event Log.

cm_Analyze() is given a new 'storeOp' parameter which is non-zero
when the executed RPC could modify the data on the file server.

Change-Id: Idef696b15a8161335aa48907c15a4dc37f918bdb
Reviewed-on: http://gerrit.openafs.org/6118
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

9 years agorx: RX_CALL_IDLE and RX_CALL_BUSY
Jeffrey Altman [Mon, 28 Nov 2011 17:58:02 +0000]
rx: RX_CALL_IDLE and RX_CALL_BUSY

Allocate new Rx error codes for Idle and Busy calls but do not
send these errors on the wire.  They are only intended for local
use.

RX_CALL_IDLE is an indication to an application that requests it
that the rx peer is maintaining an open call channel but has not
sent any actual data for the length of the registered idle dead
timeout.

RX_CALL_BUSY is an indication to an application that requests it
that the rx peer believes the selected call channel is in use by
a pre-existing call.

When either RX_CALL_IDLE or RX_CALL_BUSY is assigned as the call
error and an abort must be sent to the rx peer, the errors are
translated to RX_CALL_TIMEOUT.  This is necessary because it is
not possible to add new Rx error values in a method that is safe
for peers that are not expecting them.
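
A sketch of the local-to-wire translation.  The numeric values and the
EX_ prefix are placeholders chosen for illustration only; the real
definitions live in rx.h, and wire_error is a hypothetical name.

```c
/* Placeholder values for illustration; see rx.h for the real ones. */
#define EX_RX_CALL_TIMEOUT (-3)
#define EX_RX_CALL_IDLE    (-100)
#define EX_RX_CALL_BUSY    (-101)

/* Map a locally assigned call error to the code sent in an abort packet. */
int wire_error(int local_error)
{
    switch (local_error) {
    case EX_RX_CALL_IDLE:
    case EX_RX_CALL_BUSY:
        return EX_RX_CALL_TIMEOUT;  /* local-only codes never go on the wire */
    default:
        return local_error;
    }
}
```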

This patchset also documents which Rx errors defined in rx.h are
used on the wire and which are not.

The Unix and Windows cache managers are updated to build with
these new error codes.

Change-Id: Ib236f27b88d503c68134534bb069e12dd83537d8
Reviewed-on: http://gerrit.openafs.org/6128
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: Asynchronous purging of file content after a DV change
Peter Scott [Thu, 19 Jan 2012 01:42:19 +0000]
Windows: Asynchronous purging of file content after a DV change

Purge all regions of the file surrounding the extents which are to be
purged. If a failure occurs on the purge due to an existing mapping, flag
for purge during handle close.

Change-Id: Id8ef81afaa614ea08e03bbd55ec2cdded0d7139f
Reviewed-on: http://gerrit.openafs.org/6573
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: cm_buf refcnt must hold buf_globalLock
Jeffrey Altman [Thu, 19 Jan 2012 20:25:44 +0000]
Windows: cm_buf refcnt must hold buf_globalLock

An assertion in buf_Recycle() was being triggered when a cm_buf_t
object was supposed to be in the free buffer list but wasn't.
buf_Recycle() was racing with another thread.  The test for
refCount == 0 was performed while holding the buf_globalLock
exclusively but the InterlockedDecrement(refCount) in buf_Release()
was performed without holding buf_globalLock at all.  buf_globalLock
must be held at least as a read lock.  Otherwise, the refCount can
reach 0 prior to the thread blocking for exclusive access to the
buf_globalLock.  This provides buf_Recycle() which is holding
buf_globalLock the opportunity to race.

The solution is to make sure that buf_Release() always holds
buf_globalLock as a read lock and then use buf_ReleaseLocked()
to perform the actual decrement and test.

Change-Id: Ieb67548a7e44fa5f06f9346f428b1edadfc80696
Reviewed-on: http://gerrit.openafs.org/6576
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: Redesign daemon thread queue management
Jeffrey Altman [Thu, 19 Jan 2012 06:21:02 +0000]
Windows: Redesign daemon thread queue management

The daemon thread worker pool has some very poor properties.
The threads spend a significant amount of time polling for
ready-to-process tasks because a store/fetch data request is so
frequently accompanied by many other requests for the same FID
that would block.

Let's try a new approach. Create one queue for each worker thread
and assign the tasks to a thread by a hash of the FID.  This ensures
that all tasks for a single FID are serialized and prevents multiple
threads from attempting to perform the same task only to decide that
the thread would be forced to block.
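
The assignment of tasks to per-thread queues could be sketched as
follows.  The FID field names and the hash function are assumptions
made for illustration; the commit does not say which hash is used.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical FID layout; the field names are assumptions. */
typedef struct sketch_fid {
    uint32_t cell;
    uint32_t volume;
    uint32_t vnode;
    uint32_t unique;
} sketch_fid_t;

/* FNV-inspired mix applied per 32-bit word.  Any stable hash would
 * serve the same purpose. */
static uint32_t sketch_fid_hash(const sketch_fid_t *fid)
{
    const uint32_t words[4] = {
        fid->cell, fid->volume, fid->vnode, fid->unique
    };
    uint32_t h = 2166136261u;
    int i;

    for (i = 0; i < 4; i++) {
        h ^= words[i];
        h *= 16777619u;
    }
    return h;
}

/* Map a FID to one of nworkers queues.  Equal FIDs always map to the
 * same queue, so their tasks are processed in order by one thread. */
static unsigned sketch_queue_for_fid(const sketch_fid_t *fid,
                                     unsigned nworkers)
{
    assert(nworkers > 0);
    return sketch_fid_hash(fid) % nworkers;
}
```

The trade-off in such a scheme is that an uneven hash distribution can
leave some workers idle while others have a backlog.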

Change-Id: I1d00ba0df1aa646e05b2cb3cb0796629f2e6d233
Reviewed-on: http://gerrit.openafs.org/6575
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: prevent race assigning Fcb in AFSInitFcb()
Jeffrey Altman [Wed, 18 Jan 2012 00:43:54 +0000]
Windows: prevent race assigning Fcb in AFSInitFcb()

AFSInitFcb() is executed when the ObjectInformation->Fcb pointer
is NULL.  More than one thread can make that determination at the
same time.  Use InterlockedCompareExchangePointer() to detect
a race and permit cleanup to be performed.

Remove the output parameter of AFSInitFcb() to avoid a double
assignment.
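
A sketch of the race detection, using the C11
atomic_compare_exchange_strong() as a portable stand-in for
InterlockedCompareExchangePointer().  The sketch_* types and the
function below are hypothetical and do not reflect the redirector's
structures.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the Fcb and ObjectInformation types. */
typedef struct sketch_fcb {
    int placeholder;
} sketch_fcb_t;

typedef struct sketch_object_info {
    _Atomic(sketch_fcb_t *) Fcb;
} sketch_object_info_t;

/* Allocate an Fcb and publish it only if the pointer is still NULL.
 * Returns the Fcb that ended up published, or NULL if the allocation
 * failed. */
static sketch_fcb_t *sketch_InitFcb(sketch_object_info_t *obj)
{
    sketch_fcb_t *expected = NULL;
    sketch_fcb_t *fresh;

    assert(obj != NULL);
    fresh = calloc(1, sizeof(*fresh));
    if (fresh == NULL)
        return NULL;

    if (atomic_compare_exchange_strong(&obj->Fcb, &expected, fresh))
        return fresh;   /* this thread won the race */

    /* Another thread published first.  Clean up our copy and use
     * theirs; on failure 'expected' holds the current pointer. */
    free(fresh);
    return expected;
}
```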

Change-Id: I3870cccd5cd5e95134446523cce3547a2135d5e3
Reviewed-on: http://gerrit.openafs.org/6562
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: cm_EndCallbackGrantingCall refactoring
Jeffrey Altman [Sat, 14 Jan 2012 15:32:51 +0000]
Windows: cm_EndCallbackGrantingCall refactoring

Refactor cm_EndCallbackGrantingCall to prevent assigning a
callback to the cm_scache object in the case where it is going
to be discarded.  If the race was lost the callback data was
already discarded by cm_RevokeCallback.  By assigning and then
discarding we are forced to issue an additional change notification
to the smb client or afs redirector.  Not only is this extra work
but the afs redirector notification can result in a deadlock with
a kernel thread that is waiting for the current thread to complete.

Modify the function signature to return whether or not a race
was lost with a callback revocation.

Rename 'freeFlag' to 'freeRacingRevokes' since that is what
the flag is meant to indicate.

Create a new 'freeServer' flag to indicate when the server
reference should be released.  There was a leak of server
references when a race occurred.

Modify all calls to cm_EndCallbackGrantingCall() that provide
an AFSCallBack structure on input to check for a lost race.
If a race occurs, cm_MergeStatus() should not be performed.

Change-Id: Ib17091ed51a24826bf84d33235125b3ccbbe47d4
Reviewed-on: http://gerrit.openafs.org/6556
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: deadlock bet. DirEntry lock + DirectoryNodeHdr.TreeLock
Jeffrey Altman [Sun, 15 Jan 2012 16:08:23 +0000]
Windows: deadlock bet. DirEntry lock + DirectoryNodeHdr.TreeLock

The DirectoryNodeHdr.TreeLock must be obtained before the
DirEntry->NonPaged->Lock.  In AFSLocateNameEntry(), the
DirEntry lock is obtained before the TreeLock when processing
a symlink object.  For that case obtain the TreeLock first.
Drop it if it is not required.

Change-Id: I5b73f98b4bc7fcd5c02b8f255fa2423b52eb4a4d
Reviewed-on: http://gerrit.openafs.org/6558
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: Correctly mark extents dirty when using the non-persistent AFS
Peter Scott [Wed, 18 Jan 2012 19:04:29 +0000]
Windows: Correctly mark extents dirty when using the non-persistent AFS
cache

Change-Id: I9e03264bb94fe6494f1ca3721e4d7c7faf469fb5
Reviewed-on: http://gerrit.openafs.org/6571
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: Performing async work after cache invalidation
Peter Scott [Wed, 11 Jan 2012 13:49:23 +0000]
Windows: Performing async work after cache invalidation

The code now queues a work item to perform additional work on extent
processing after a cache invalidation has occurred. This additional work
involves walking the current list of extents and purging/flushing regions of
the system cache based upon the current state of the extent.
Additional changes filter which invalidation events result in a queued
worker performing asynchronous work.

Change-Id: I72e4e0bac2caf69e41a095ce8fc4c2e083702b5c
Reviewed-on: http://gerrit.openafs.org/6528
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoParallel build fixes
Marc Dionne [Wed, 18 Jan 2012 15:06:36 +0000]
Parallel build fixes

Assorted fixes for issues seen with parallel builds:
- bucoord must depend on butm, since it uses libbutm
- for most object files in roken and hcrypto, headers must be installed
  before building
- remove rules with 2 targets in rxkad and ubik
- budb: add dependencies for db_dump.o

Change-Id: Ide05f223c2f1fe53bff33cb03011ca47bf741c80
Reviewed-on: http://gerrit.openafs.org/6568
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

9 years agoLinux 3.3: use umode_t for mkdir and create inode ops
Marc Dionne [Wed, 18 Jan 2012 16:22:35 +0000]
Linux 3.3: use umode_t for mkdir and create inode ops

The mkdir and create inode operations have switched to using
umode_t instead of int for the file mode.

Change-Id: Ib8bbf6eaa6e87d6a9692c45b1a3fe93fcc3eff7a
Reviewed-on: http://gerrit.openafs.org/6567
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

9 years agoLinux: use standard macro for set_nlink configure test
Marc Dionne [Wed, 18 Jan 2012 15:25:03 +0000]
Linux: use standard macro for set_nlink configure test

A generic macro exists to test for functions in the kernel; use
it for set_nlink.

Change-Id: Iaec2b29e48f500bcf7a1ef80a3f2a1305e5dbb8f
Reviewed-on: http://gerrit.openafs.org/6566
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

9 years agovolinfo: fix formatting of placeholder printfs
Derrick Brashear [Tue, 17 Jan 2012 21:08:56 +0000]
volinfo: fix formatting of placeholder printfs

needed to placate gcc-llvm on lion

Change-Id: Ie15e4768d2e3feb7ad80dfef05395f2c4a227c0f
Reviewed-on: http://gerrit.openafs.org/6565
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

9 years agorx: Correctly test for end of call queue
Marc Dionne [Wed, 18 Jan 2012 01:19:54 +0000]
rx: Correctly test for end of call queue

The intention of this condition is to check if the current call
being considered is the last one on the queue, but the test is
incorrect.  A null next pointer indicates a removed item, not
the end of the queue.

Use the queue_IsLast macro instead to correctly determine that
this is the last item in the queue and that a call has to be
selected, either the current one or a previously seen good choice.

The incorrect test can cause calls to get permanently stuck in the
call queue and never get assigned to a thread, even when all threads
are idle.
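
The distinction between the two tests can be illustrated with a small
circular queue.  This is a sketch under the assumption of a
sentinel-headed, doubly linked queue; the sketch_* names are
hypothetical and are not the rx queue macros.

```c
#include <assert.h>
#include <stddef.h>

/* Circular doubly linked queue with a sentinel head. */
typedef struct sketch_queue {
    struct sketch_queue *prev;
    struct sketch_queue *next;
} sketch_queue_t;

static void sketch_queue_init(sketch_queue_t *q)
{
    q->prev = q->next = q;
}

static void sketch_queue_append(sketch_queue_t *q, sketch_queue_t *item)
{
    item->prev = q->prev;
    item->next = q;
    q->prev->next = item;
    q->prev = item;
}

/* Unlink an item.  A NULL next pointer marks it as removed. */
static void sketch_queue_remove(sketch_queue_t *item)
{
    assert(item->next != NULL);
    item->prev->next = item->next;
    item->next->prev = item->prev;
    item->next = NULL;
    item->prev = NULL;
}

/* The last item is the one whose next pointer is the head itself.
 * An item still on the queue never has a NULL next pointer. */
static int sketch_queue_is_last(const sketch_queue_t *q,
                                const sketch_queue_t *item)
{
    return item->next == q;
}
```

In such a queue a test of the form "next == NULL" is never true for an
item that is still queued, so a loop relying on it would never
recognize the end of the queue.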

Change-Id: Ie44a45734ab25bd3d2be3635c2e8f05857ca935e
Reviewed-on: http://gerrit.openafs.org/6564
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

9 years agoWindows: disable memory extent interface
Jeffrey Altman [Sat, 14 Jan 2012 15:44:56 +0000]
Windows: disable memory extent interface

There have been reports that the memory extent interface which
is used when NonPersistentCache is active can lead to data corruption.

Change-Id: I3a8acae0648a67534e46c73ef1dcbf7f939a558d
Reviewed-on: http://gerrit.openafs.org/6557
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: restrict service to 2 cpus by default
Jeffrey Altman [Sat, 14 Jan 2012 15:31:01 +0000]
Windows: restrict service to 2 cpus by default

Performance drops off considerably when the number of processors
increases due to lock contention and the cm_SyncOp wait processing.
If the MaxCPUs registry value is not set, limit ourselves to two.
Setting MaxCPUs to zero permits use of all CPUs.
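
The selection rule above could be sketched as follows.  The function
name and parameters are hypothetical, and clamping the result to the
number of installed processors is an assumption not stated in the
commit.

```c
#include <assert.h>

/* maxcpus_set: nonzero if the MaxCPUs registry value exists.
 * maxcpus:     the registry value (ignored when maxcpus_set is zero).
 * ncpus:       number of processors in the machine. */
static unsigned sketch_cpu_limit(int maxcpus_set, unsigned maxcpus,
                                 unsigned ncpus)
{
    unsigned limit;

    assert(ncpus > 0);
    if (!maxcpus_set)
        limit = 2;          /* default when the value is not set */
    else if (maxcpus == 0)
        limit = ncpus;      /* zero permits use of all CPUs */
    else
        limit = maxcpus;

    /* Assumption: never report more CPUs than are installed. */
    return limit < ncpus ? limit : ncpus;
}
```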

Change-Id: I4bae328ed589811b0ea2a514501a0c1aa74e8015
Reviewed-on: http://gerrit.openafs.org/6555
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: AFS_SERVER_FLUSH_DELAY AFS_SERVER_PURGE_DELAY
Jeffrey Altman [Sat, 14 Jan 2012 04:58:50 +0000]
Windows: AFS_SERVER_FLUSH_DELAY AFS_SERVER_PURGE_DELAY

Alter the flush delay to 5 seconds from 30 seconds

Alter the purge delay to 300 seconds from 5 seconds

Change-Id: I3f8e79d84582c4015e35d58cf1bedc9a023c0d73
Reviewed-on: http://gerrit.openafs.org/6554
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: AFSParseName edge cases
Jeffrey Altman [Sat, 14 Jan 2012 04:57:10 +0000]
Windows: AFSParseName edge cases

If the input path is \afs\ behave as if the path is \afs.

If the input path is \afs\*\ detect the wildcard and return
STATUS_OBJECT_NAME_INVALID.
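
A rough sketch of the two edge cases, operating on a narrow C string
rather than a UNICODE_STRING.  The function, the status constants and
the exact wildcard rule (a '*' in a component that is followed by a
separator) are assumptions for illustration and are not taken from
AFSParseName().

```c
#include <assert.h>
#include <string.h>

/* Stand-in status codes; these are not the NTSTATUS values. */
#define SKETCH_STATUS_SUCCESS             0
#define SKETCH_STATUS_OBJECT_NAME_INVALID 1

/* Normalize 'path' in place.  A '*' inside a component that is
 * followed by a separator is rejected; otherwise a single trailing
 * backslash is stripped so that "\afs\" is treated as "\afs". */
static int sketch_check_path(char *path)
{
    size_t len;
    char *star;

    assert(path != NULL);

    for (star = strchr(path, '*'); star != NULL;
         star = strchr(star + 1, '*')) {
        if (strchr(star + 1, '\\') != NULL)
            return SKETCH_STATUS_OBJECT_NAME_INVALID;
    }

    len = strlen(path);
    if (len > 1 && path[len - 1] == '\\')
        path[len - 1] = '\0';

    return SKETCH_STATUS_SUCCESS;
}
```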

Change-Id: I0ef4f30fb3b6245a52160b5e7f9233bc5f599485
Reviewed-on: http://gerrit.openafs.org/6553
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

9 years agoWindows: afs root is always a directory
Jeffrey Altman [Sat, 14 Jan 2012 00:32:16 +0000]
Windows: afs root is always a directory

If the root is opened with the FILE_NON_DIRECTORY_FILE option,
fail the request with STATUS_FILE_IS_A_DIRECTORY.

Change-Id: Ic7d29f9032c2a19617276138833938fcf304838e
Reviewed-on: http://gerrit.openafs.org/6552
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>