git.openafs.org Git - openafs.git/log

rx: RX_CALL_IDLE and RX_CALL_BUSY

Allocate new Rx error codes for Idle and Busy calls but do not
send these errors on the wire. They are only intended for local
use.

RX_CALL_IDLE is an indication to an application that requests it
that the rx peer is maintaining an open call channel but has not
sent any actual data for the length of the registered idle dead
timeout.

RX_CALL_BUSY is an indication to an application that requests it
that the rx peer believes the selected call channel is in use by
a pre-existing call.

When either RX_CALL_IDLE or RX_CALL_BUSY is assigned as the call
error and an abort must be sent to the rx peer, the error is
translated to RX_CALL_TIMEOUT. This is necessary because it is
not possible to add new Rx error values in a way that is safe
for peers that are not expecting them.
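
The translation described above can be sketched as a small mapping
function. This is an illustration only: wire_error_for() is a
hypothetical helper name, and the numeric values below are placeholders
rather than the definitions in rx.h.

```c
#include <assert.h>

/* Placeholder values for illustration only; the real definitions live in rx.h. */
#define RX_CALL_TIMEOUT  (-3)
#define RX_CALL_IDLE     (-19)
#define RX_CALL_BUSY     (-100)

/* Hypothetical helper: choose the error code that may be placed in an
 * abort packet. Local-only codes are mapped to RX_CALL_TIMEOUT; every
 * other code passes through unchanged. */
static int wire_error_for(int call_error)
{
    switch (call_error) {
    case RX_CALL_IDLE:
    case RX_CALL_BUSY:
        return RX_CALL_TIMEOUT;
    default:
        return call_error;
    }
}
```

The local caller still sees the specific IDLE or BUSY code; only the
value sent to the peer is rewritten.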

This patchset also documents which Rx errors defined in rx.h are
used on the wire and which are not.

The Unix and Windows cache managers are updated to build with
these new error codes.

Change-Id: Ib236f27b88d503c68134534bb069e12dd83537d8
Reviewed-on: http://gerrit.openafs.org/6128
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: Asynchronous purging of file content after a DV change

Purge all regions of the file surrounding the extents which are to be
purged. If the purge fails due to an existing mapping, flag for purge
during handle close.

Change-Id: Id8ef81afaa614ea08e03bbd55ec2cdded0d7139f
Reviewed-on: http://gerrit.openafs.org/6573
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: cm_buf refcnt must hold buf_globalLock

An assertion in buf_Recycle() was being triggered when a cm_buf_t
object was supposed to be in the free buffer list but wasn't.
buf_Recycle() was racing with another thread.  The test for
refCount == 0 was performed while holding the buf_globalLock
exclusively but the InterlockedDecrement(refCount) in buf_Release()
was performed without holding buf_globalLock at all.  buf_globalLock
must be held at least as a read lock.  Otherwise, the refCount can
reach 0 prior to the thread blocking for exclusive access to the
buf_globalLock.  This gives buf_Recycle(), which is holding
buf_globalLock, the opportunity to race.

The solution is to make sure that buf_Release() always holds
buf_globalLock as a read lock and then use buf_ReleaseLocked()
to perform the actual decrement and test.

Change-Id: Ieb67548a7e44fa5f06f9346f428b1edadfc80696
Reviewed-on: http://gerrit.openafs.org/6576
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: Redesign daemon thread queue management

The daemon thread worker pool has some very poor properties.
The threads spend a significant amount of time polling for
ready-to-process tasks because a store/fetch data request is so
frequently accompanied by many other requests for the same FID
that would block.

Let's try a new approach. Create one queue for each worker thread
and assign the tasks to a thread by a hash of the FID. This ensures
that all tasks for a single FID are serialized and prevents multiple
threads from attempting to perform the same task only to decide that
the thread would be forced to block.
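
A minimal sketch of the queue selection, assuming a simplified FID
layout. The struct fields, the multiplier and worker_for_fid() are
illustrative choices, not the hash used by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for a file identifier; field names are illustrative. */
struct fid {
    uint32_t cell;
    uint32_t volume;
    uint32_t vnode;
    uint32_t unique;
};

/* Map a FID to one of nworkers queues (nworkers must be non-zero).
 * Equal FIDs always produce the same index, so their tasks are
 * serialized on one worker thread. */
static unsigned worker_for_fid(const struct fid *f, unsigned nworkers)
{
    uint32_t h = f->cell;

    h = h * 31u + f->volume;
    h = h * 31u + f->vnode;
    h = h * 31u + f->unique;
    return h % nworkers;
}
```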

Change-Id: I1d00ba0df1aa646e05b2cb3cb0796629f2e6d233
Reviewed-on: http://gerrit.openafs.org/6575
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: prevent race assigning Fcb in AFSInitFcb()

AFSInitFcb() is executed when the ObjectInformation->Fcb pointer
is NULL. More than one thread can make that determination at the
same time. Use InterlockedCompareExchangePointer() to detect
a race and permit cleanup to be performed.

Remove the output parameter of AFSInitFcb() to avoid a double
assignment.
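
The race detection can be sketched in portable C11, using
atomic_compare_exchange_strong() as a stand-in for
InterlockedCompareExchangePointer(). The types and init_fcb() below are
simplified assumptions, not the redirector's own definitions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct fcb { int placeholder; };                    /* stand-in for the real Fcb */
struct object_info { _Atomic(struct fcb *) fcb; };  /* stand-in for ObjectInformation */

/* Returns 1 if this caller installed its Fcb, 0 if another thread had
 * already installed one, and -1 if allocation failed. */
static int init_fcb(struct object_info *oi)
{
    struct fcb *fresh = calloc(1, sizeof(*fresh));
    struct fcb *expected = NULL;

    if (fresh == NULL)
        return -1;
    /* Install only if the pointer is still NULL. */
    if (atomic_compare_exchange_strong(&oi->fcb, &expected, fresh))
        return 1;
    free(fresh);    /* lost the race: clean up and keep the winner's Fcb */
    return 0;
}
```

Because the pointer is published by the compare-and-exchange itself, no
separate output parameter is needed to hand the Fcb back to the caller.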

Change-Id: I3870cccd5cd5e95134446523cce3547a2135d5e3
Reviewed-on: http://gerrit.openafs.org/6562
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: cm_EndCallbackGrantingCall refactoring

Refactor cm_EndCallbackGrantingCall to prevent assigning a
callback to the cm_scache object in the case where it is going
to be discarded.  If the race was lost the callback data was
already discarded by cm_RevokeCallback.  By assigning and then
discarding we are forced to issue an additional change notification
to the smb client or afs redirector.  Not only is this extra work
but the afs redirector notification can result in a deadlock with
a kernel thread that is waiting for the current thread to complete.

Modify the function signature to return whether or not a race
was lost with a callback revocation.

Rename 'freeFlag' to 'freeRacingRevokes' since that is what
the flag is meant to indicate.

Create a new 'freeServer' flag to indicate when the server
reference should be released.  There was a leak of server
references when a race occurred.

Modify all calls to cm_EndCallbackGrantingCall() that provide
an AFSCallBack structure on input to check for a lost race.
If a race occurs, cm_MergeStatus() should not be performed.

Change-Id: Ib17091ed51a24826bf84d33235125b3ccbbe47d4
Reviewed-on: http://gerrit.openafs.org/6556
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: deadlock bet. DirEntry lock + DirectoryNodeHdr.TreeLock

The DirectoryNodeHdr.TreeLock must be obtained before the
DirEntry->NonPaged->Lock. In AFSLocateNameEntry(), the
DirEntry lock is obtained before the TreeLock when processing
a symlink object. For that case obtain the TreeLock first.
Drop it if it is not required.

Change-Id: I5b73f98b4bc7fcd5c02b8f255fa2423b52eb4a4d
Reviewed-on: http://gerrit.openafs.org/6558
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: Correctly mark extents dirty when using the non-persistent AFS
cache

Change-Id: I9e03264bb94fe6494f1ca3721e4d7c7faf469fb5
Reviewed-on: http://gerrit.openafs.org/6571
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: Performing async work after cache invalidation

The code now queues a work item to perform additional work on extent
processing after a cache invalidation has occurred. This additional work
involves walking the current list of extents and purging/flushing regions of
the system cache based upon the current state of the extent.
Additional changes filter which invalidation events result in a queued
worker to perform asynchronous work.

Change-Id: I72e4e0bac2caf69e41a095ce8fc4c2e083702b5c
Reviewed-on: http://gerrit.openafs.org/6528
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Parallel build fixes

Assorted fixes for issues seen with parallel builds:
- bucoord must depend on butm, since it uses libbutm
- for most object files in roken and hcrypto, headers must be installed
before building
- remove rules with 2 targets in rxkad and ubik
- budb: add dependencies for db_dump.o

Change-Id: Ide05f223c2f1fe53bff33cb03011ca47bf741c80
Reviewed-on: http://gerrit.openafs.org/6568
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

Linux 3.3: use umode_t for mkdir and create inode ops

The mkdir and create inode operations have switched to using
umode_t instead of int for the file mode.

Change-Id: Ib8bbf6eaa6e87d6a9692c45b1a3fe93fcc3eff7a
Reviewed-on: http://gerrit.openafs.org/6567
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

Linux: use standard macro for set_nlink configure test

A generic macro exists to test for functions in the kernel, use
it for set_nlink.

Change-Id: Iaec2b29e48f500bcf7a1ef80a3f2a1305e5dbb8f
Reviewed-on: http://gerrit.openafs.org/6566
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

volinfo: fix formatting of placeholder printfs

Needed to placate gcc-llvm on Lion.

Change-Id: Ie15e4768d2e3feb7ad80dfef05395f2c4a227c0f
Reviewed-on: http://gerrit.openafs.org/6565
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

rx: Correctly test for end of call queue

The intention of this condition is to check if the current call
being considered is the last one on the queue, but the test is
incorrect. A null next pointer indicates a removed item, not
the end of the queue.

Use the queue_IsLast macro instead to correctly determine that
this is the last item in the queue and that a call has to be
selected, either the current one or a previously seen good choice.

The incorrect test can cause calls to get permanently stuck in the
call queue and never get assigned to a thread, even when all threads
are idle.
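
The difference between the two tests can be shown on a simplified
circular queue. This sketch does not reproduce the rx queue macros; the
struct and the q_* helpers are illustrative, assuming only the
convention stated above that removal sets the next pointer to NULL.

```c
#include <assert.h>
#include <stddef.h>

struct q { struct q *prev, *next; };

static void q_init(struct q *head) { head->prev = head->next = head; }

static void q_append(struct q *head, struct q *item)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

static void q_remove(struct q *item)
{
    item->prev->next = item->next;
    item->next->prev = item->prev;
    item->prev = item->next = NULL;    /* NULL marks "not on any queue" */
}

/* In a circular queue the last item points back at the head, so the
 * end-of-queue test compares against the head, never against NULL. */
static int q_is_last(const struct q *head, const struct q *item)
{
    return item->next == head;
}
```

While an item is still queued its next pointer is never NULL, so a test
for NULL cannot detect the end of the queue.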

Change-Id: Ie44a45734ab25bd3d2be3635c2e8f05857ca935e
Reviewed-on: http://gerrit.openafs.org/6564
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Windows: disable memory extent interface

There have been reports that the memory extent interface which
is used when NonPersistentCache is active can lead to data corruption.

Change-Id: I3a8acae0648a67534e46c73ef1dcbf7f939a558d
Reviewed-on: http://gerrit.openafs.org/6557
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: restrict service to 2 cpus by default

Performance drops off considerably when the number of processors
increases due to lock contention and the cm_SyncOp wait processing.
If the MaxCPUs registry value is not set, limit ourselves to two.
Setting MaxCPUs to zero permits use of all CPUs.
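
The resulting policy might be sketched as follows. cpus_to_use() is a
hypothetical helper, and clamping to the number of available CPUs is an
assumption made for the sketch; reading the registry is left out.

```c
#include <assert.h>

#define DEFAULT_MAX_CPUS 2    /* default applied when MaxCPUs is not set */

/* value_set: non-zero if the MaxCPUs registry value exists.
 * max_cpus:  the configured value (ignored when value_set is zero).
 * Returns how many CPUs the service may use. */
static unsigned cpus_to_use(int value_set, unsigned max_cpus, unsigned available)
{
    unsigned limit = value_set ? max_cpus : DEFAULT_MAX_CPUS;

    if (limit == 0 || limit > available)
        return available;    /* zero means "no restriction" */
    return limit;
}
```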

Change-Id: I4bae328ed589811b0ea2a514501a0c1aa74e8015
Reviewed-on: http://gerrit.openafs.org/6555
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: AFS_SERVER_FLUSH_DELAY AFS_SERVER_PURGE_DELAY

Alter the flush delay to 5 seconds from 30 seconds

Alter the purge delay to 300 seconds from 5 seconds

Change-Id: I3f8e79d84582c4015e35d58cf1bedc9a023c0d73
Reviewed-on: http://gerrit.openafs.org/6554
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: AFSParseName edge cases

If the input path is \afs\ behave as if the path is \afs.

If the input path is \afs\*\ detect the wildcard and return
STATUS_OBJECT_NAME_INVALID.

Change-Id: I0ef4f30fb3b6245a52160b5e7f9233bc5f599485
Reviewed-on: http://gerrit.openafs.org/6553
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: afs root is always a directory

If the root is opened with the FILE_NON_DIRECTORY_FILE option,
fail the request with STATUS_FILE_IS_A_DIRECTORY.

Change-Id: Ic7d29f9032c2a19617276138833938fcf304838e
Reviewed-on: http://gerrit.openafs.org/6552
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

fix spelling in comments

Change-Id: I4b4558833825295bbd16134cbd403a87b1c7b14c
Reviewed-on: http://gerrit.openafs.org/6561
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

DAFS: Fix SYNC_FAILED VScheduleSalvage_r log

SYNC_FAILED is not an unknown protocol code, so stop saying it is.

Change-Id: I87ce896fe061e6b5bfd3efdbb442281682a3e652
Reviewed-on: http://gerrit.openafs.org/6530
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

vol: Fix VCreateVolume special inode cleanup

In order to dec the relevant special inodes, we need to know the
parent vol id in addition to the vol id itself. Use the appropriate
volume IDs when IH_DEC'ing special inodes after we fail to create the
volume, so we don't leave behind special inodes.

Change-Id: I77cfafac80c49debf46c86faefadd2a586d6f06b
Reviewed-on: http://gerrit.openafs.org/6529
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Windows: dir buffers out of date - mark them as such

If cm_CheckForSingleDirChange() fails, mark the cm_scache_t
bufDataVersionLow as the current data version so that old directory
buffers are discarded.

Change-Id: I8d587a024027e74e66190fdc993564b640993b4c
Reviewed-on: http://gerrit.openafs.org/6498
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: Avoid file server rpcs on deleted files

If a file has been deleted, do not attempt to issue RPCs
to the file server in response to AFS redirector extent processing.
All RPCs will fail with VNOVNODE which will in turn trigger invalidation
requests to the AFS redirector which can deadlock.

Change-Id: I85b6b4a0ce619e54df648163392be93761f709f0
Reviewed-on: http://gerrit.openafs.org/6514
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: use local var for interlocked result

Save the result of the interlocked operations for use in
debug logging. Do not reference the incremented or decremented
object in the log messages, it may have changed.

Local assignment is provided even in functions that are currently
not logging to assist with debugging and as a reminder to use
the result variable in future log messages.
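
A portable sketch of the pattern, using C11 atomics in place of the
Windows Interlocked functions. Note that atomic_fetch_add() returns the
previous value, whereas InterlockedIncrement() returns the new one. The
struct and object_ref() are illustrative names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

struct object { atomic_long open_refs; };

static long object_ref(struct object *o)
{
    /* atomic_fetch_add returns the old value, so add 1 for the new count. */
    long count = atomic_fetch_add(&o->open_refs, 1) + 1;

    /* Log the saved result, not o->open_refs, which another thread may
     * already have changed again. */
    fprintf(stderr, "object %p ref count now %ld\n", (void *)o, count);
    return count;
}
```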

Change-Id: Ia7ed8bf14b204b265e1db7713b96864634a731d7
Reviewed-on: http://gerrit.openafs.org/6508
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: AFSParseMountPointTarget buffer overrun

When parsing the AFS mount point string do not overrun
the buffer if the colon cell/volume separator is not
found.
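
A bounded search for the separator might look like the sketch below. It
uses a narrow-character buffer and a hypothetical helper name for
brevity; the actual parsing code and its string types are not shown
here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the offset of the ':' separator within the first len bytes of
 * target, or -1 if the separator is absent. Never reads past len bytes. */
static long find_cell_separator(const char *target, size_t len)
{
    const char *colon = memchr(target, ':', len);

    return colon ? (long)(colon - target) : -1;
}
```

The caller must check for -1 before splitting the string into cell and
volume parts.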

Change-Id: Id7275cc8815223730f7c39bd11a6f495beb117c4
Reviewed-on: http://gerrit.openafs.org/6507
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: Directory Enumeration, DVs, and TreeLocks

Hold the TreeLock exclusively across all operations that
enumerate, validate, or otherwise manipulate directory tree
lists or data versions.

Take the data version into account when deciding what to do
with directory data.  If a directory enumeration takes more
than one request to service and the DV has changed between the
time the directory snapshot was taken by the service and the
enumeration completion, merge in the changes and then mark
the directory as requiring verification.

If a directory change operation completes (create, rename, remove)
and the directory DV has changed by more than one, force a full
directory verification.
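
The rule for directory change operations can be sketched as a simple
comparison. needs_full_verify() is a hypothetical name, and the data
versions are assumed here to be unsigned 64-bit counters.

```c
#include <assert.h>
#include <stdint.h>

/* A single local change is expected to advance the DV by exactly one.
 * A larger jump suggests other changes were made concurrently. Unsigned
 * arithmetic means a DV that appears to go backwards also reports true. */
static int needs_full_verify(uint64_t dv_before, uint64_t dv_after)
{
    return dv_after - dv_before > 1;
}
```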

Set the directory data version to -1 whenever a directory
verification is required.  Otherwise, the check to clear the
VERIFY flag will only update the metadata for the directory.

During a directory verification, if a new entry has been discovered
it is added to the directory.  Make sure the VALID flag is set so
that the entry will not immediately be removed as invalid.

Change-Id: I6be8d00126fccf88bde8ae5f97e850dfb9a2f60f
Reviewed-on: http://gerrit.openafs.org/6460
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: correct log messages in AFSCleanup

Change-Id: I1e202547d82195f85e6de20e72f6b07c6c7475ba
Reviewed-on: http://gerrit.openafs.org/6506
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: Return Dir Data Version from AFSCleanup

This patchset returns the directory data version from AFSCleanup().
It does not do anything with it.

Change-Id: I86ac37f9e237bfec3ea612b896bec4ed7d43d068
Reviewed-on: http://gerrit.openafs.org/6505
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: reorg open handle counts and Fcb->NPFcb->Resource

Reorganize when open handle counts are decremented in order
to avoid a race with worker threads performing garbage collection.

Change-Id: I07c1c5e80fad48cd3439dbc9c85bd6dff9b9bf44
Reviewed-on: http://gerrit.openafs.org/6504
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: Permit renames of open files

AFS does not impose a restriction on renames of open files.
Failure to permit the rename can cause problems if an anti-malware
service opens the file immediately after the application performing
the rename does so.

Change-Id: Ib23a6a893c5c575e89b8a817faec4c11300a04b7
Reviewed-on: http://gerrit.openafs.org/6503
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: Do not prime the service directory cache

Performing a directory enumeration is an expensive operation
that we should be attempting to avoid. The current directory
enumeration and evaluate target requests will use inline bulk
status RPCs to the file server which obtain status for 49 items
at a time from a single directory.

Change-Id: I78e08680fec9715c3c446d0c4c5226cd79db80bd
Reviewed-on: http://gerrit.openafs.org/6502
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: More specific error values

When a mount point, symlink, or dfslink cannot be resolved
return STATUS_REPARSE_POINT_NOT_RESOLVED.

When an operation fails because the volume is readonly, return
STATUS_MEDIA_WRITE_PROTECTED.

Change-Id: Ib35f0d7851c087bf8aa25d4b0138ee72fb6f3c68
Reviewed-on: http://gerrit.openafs.org/6501
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: do not flush dirty extents without permission

When closing file handles, do not permit dirty extents to be
released back to the service if the current handle (Ccb) does
not have write permission. The cleanup operation will fail with
STATUS_ACCESS_DENIED, the extents will be released and all of the
dirty data will be discarded.

Change-Id: Iceacf5319147d1bd6277ea160bc67d91f1a49d5b
Reviewed-on: http://gerrit.openafs.org/6500
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

opr: fix generated target

We need opr for comerr, but we don't want it afterwards. Build it,
then clean up.

Change-Id: I621f36bc5f6db85720b73b33578975d0dd126a18
Reviewed-on: http://gerrit.openafs.org/6525
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

libuafs: only rebuild h directory when needed

A few changes to allow a "make all ; sudo make install ; make all..."
workflow to work without manually removing files in between.

Make the rebuilding of the h directory dependent on the source
files scanned to build it. This prevents it from being rebuilt
for every "make install".

While we're here, use -f when removing linktest for the clean target.
This allows "make clean" to remove it without prompting when the user
doesn't have write access to the file, as is the case when make install
rebuilds it as root.

Change-Id: I45b34ad41560ef8c905e6be4201fa438a3cc7bc3
Reviewed-on: http://gerrit.openafs.org/6519
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

opr: add buildtools target

opr needs a buildtools target for "make generated". Make it install
the headers, which are needed by the other generated targets.

Change-Id: I34faa81fa84407c5e6e1460dc765d0c2ce1ef3e8
Reviewed-on: http://gerrit.openafs.org/6523
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Make libjafs buildable again

libjafs is surprisingly close to being buildable. Fix a few misc
things which have bitrotted over the years so it is possible to
actually build:

- Add -I$SRC/config to the cflags, so we can include afsconfig.h

- Remove references to the nonexistent rxkstats.o

- Do not link with UAFS' AFS_component_version_number.o, since this
gives us duplicate version number symbols

- Include afs_vosAdmin.h in Group.c, to satisfy some missing symbols

Change-Id: Ie8da88872288073d080a58ed7fe8c8b52052488e
Reviewed-on: http://gerrit.openafs.org/6524
Reviewed-by: Steven Jenkins <steven@synaptian.com>
Tested-by: Steven Jenkins <steven@synaptian.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afs: discard cached state when we are unsure of validity

In the event we got a network error, we don't know if the server
completed (or will complete) our operation. We can assume nothing.
A more complicated version of this could attempt to verify that the
state is what we expect it to be, but in an extended callbacks universe
this is potentially easier to solve anyway. For now, return the
error to the caller, and mark the vcache unstat'd.

Change-Id: Iafb67f24b89d78b8236660d047da12fce1dd6061
Reviewed-on: http://gerrit.openafs.org/6510
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

afs: put back conn if not using in checkserver loop

In the checkserver loop we get a conn, check it for eligibility, and
if it is not eligible, just abandon it. Put it back instead.

Change-Id: Ie3841c19b05a87fb225c3e8124cd485cba3c3648
Reviewed-on: http://gerrit.openafs.org/6516
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Derrick Brashear <shadow@dementix.org>

rx: add and export a public keepalive toggle

Make enabling and disabling keepalives a public function and
export it.

Change-Id: Ia553d91488511edc0b483d95326f14ac0e315332
Reviewed-on: http://gerrit.openafs.org/6517
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Make src/opr objdir safe

Update the Makefile for src/opr to use $? to reference headers, so objdir
builds work correctly

Change-Id: I3d8e0d885715a1d1bc1578f4e8ce69fe4239bb56
Reviewed-on: http://gerrit.openafs.org/6444
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

afs: increase idledead time

It's actually important that this be more than the rx call dead time,
so that server callbacks to clients timing out don't result in us
idle-deading a call to the server when callbacks need to be broken.

FIXES 130327

Change-Id: Ibe2468edb61f307da9174d2c51cb0ea61c118c56
Reviewed-on: http://gerrit.openafs.org/6497
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

ukernel: enable nat ping again

If we're not root, there is no NAT ping at all. Fix that.

Change-Id: I7ea4db77b30ba639921b11c4ccad35a2e14133b4
Reviewed-on: http://gerrit.openafs.org/6509
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afs: Negate codes using a clear, standard method

The bulk of our code uses 'code = -code' to negate an error code.
Use this, rather than 'code *= -1', as the latter form makes my
head hurt.

Change-Id: I578fbd7c123c37d89ceb1a6373409feb8b619d86
Reviewed-on: http://gerrit.openafs.org/6511
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Use offsetof() in set_header_word to get field offset

Use offsetof() to replace a few instances where the same logic is
open coded in set_header_word and inc_header_word macros. In cases
where the field name involves a variable as an index to an array,
newer gcc gives a sequence point warning.
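
As an illustration of the technique only, the sketch below writes a
word at an offset obtained with offsetof(). The header layout,
put_word() and count_offset() are hypothetical and are not the
set_header_word or inc_header_word macros themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk header layout, for illustration. */
struct header {
    uint32_t magic;
    uint32_t version;
    uint32_t counts[4];
};

/* Store a 32-bit value at a byte offset within a raw header buffer. */
static void put_word(unsigned char *buf, size_t off, uint32_t value)
{
    memcpy(buf + off, &value, sizeof(value));
}

/* Offset of counts[i], computed from the array's own offset so that the
 * index is an ordinary function argument. */
static size_t count_offset(size_t i)
{
    return offsetof(struct header, counts) + i * sizeof(uint32_t);
}
```

A call such as put_word(buf, offsetof(struct header, version), 3)
replaces hand-written pointer arithmetic on a null struct pointer.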

Change-Id: I43e3d6ef6a63b51003496a1beb72c445a9109615
Reviewed-on: http://gerrit.openafs.org/6513
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

vol: initialize readmeinode

Newer gcc complains about readmeinode being potentially used
uninitialized. Doesn't look possible in the code, but initialize
it to quiet the warning.

Change-Id: I7172475a64a3bfb90a76c0266d7812d5d42a2c4c
Reviewed-on: http://gerrit.openafs.org/6512
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Unix CM: reset blacklist on hard-mount retry

Reset black-listed servers on a request when retrying due to a
hard-mount retry. When hard-mounts are in effect, a request may
retry indefinitely. If all the servers have been black-listed
due to a transient error, the request may never complete.
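
A minimal sketch of the idea, assuming a per-request array of skip
flags. The struct, its size and the helper names are illustrative and
do not mirror the cache manager's request structure.

```c
#include <assert.h>
#include <string.h>

#define MAX_SERVERS 4    /* illustrative size */

struct request {
    int hard_mount;                     /* non-zero: retry indefinitely */
    unsigned char skip[MAX_SERVERS];    /* non-zero: server is blacklisted */
};

/* Count the servers this request is still willing to try. */
static int usable_servers(const struct request *r)
{
    int i, n = 0;

    for (i = 0; i < MAX_SERVERS; i++)
        if (!r->skip[i])
            n++;
    return n;
}

/* Called before a retry; clears the per-request blacklist for hard
 * mounts so that every server becomes eligible again. */
static void prepare_retry(struct request *r)
{
    if (r->hard_mount)
        memset(r->skip, 0, sizeof(r->skip));
}
```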

Change-Id: I2510f729cbbb21836b139c94e25867118a6ad873
Reviewed-on: http://gerrit.openafs.org/6330
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

DAFS: Atomically re-hash vnode in VGetFreeVnode_r

VGetFreeVnode_r pulls a vnode off of the vnode LRU, and removes the
vnode from the vnode hash table. In DAFS, we may drop the volume glock
immediately afterwards in order to close the ihandle for the old vnode
structure.

While we have the glock dropped, another thread may try to
VLookupVnode for the new vnode we are creating, find that it is not
hashed, and call VGetFreeVnode_r itself. This can result in two
threads having two separate copies of the same vnode, which bypasses
any mutual exclusion ensured by per-vnode locks, since they will lock
their own version of the vnode. This can result in a variety of
different problems where two threads try to write to the same vnode at
the same time. One example is calling CopyOnWrite on the same file in
parallel, which can cause link undercounts, writes to the wrong vnode
tag, and other CoW-related errors.

To prevent all this, make VGetFreeVnode_r atomically remove the old
vnode structure from the relevant hashes, and add it to the new hashes
before dropping the glock. This ensures that any other thread trying
to load the same vnode will see the new vnode in the hash table,
though it will not yet be valid until the vnode is loaded.

Note that this only solves this race for DAFS. For non-DAFS, the vol
glock is held over the ihandle close, so this race does not exist.
The comments around the callers of VGetFreeVnode_r indicate that
similar extant races exist here for non-DAFS, but they are unsolvable
without significant DAFS-like changes to the vnode package.

Change-Id: I84c5d1bdd29f9e7140e905388b4b65629932c951
Reviewed-on: http://gerrit.openafs.org/6385
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afs: Grab a reference to setp in afs_icl_Event4

We can drop GLOCK in several places in afs_icl_Event4 and the
afs_icl_AppendRecord callee. To ensure that the given afs_icl_set does
not get freed while we have GLOCK dropped, grab a reference to the
set.

Thanks to Ryan C. Underwood for reporting an issue triggered by this.

Change-Id: Ifeda229b444abd75b0f22c7acf18a7553d833964
Reviewed-on: http://gerrit.openafs.org/6431
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

linux: fsync on a directory should return 0, not EINVAL

Directory writes are synchronous, so this is fine. There's a
mostly-convenient function in fs/libfs.c that returns 0 that we can use
to do what we want ("mostly" because it was renamed in 2.6.35).

FIXES 130425

Change-Id: I9a2af60ed3152be036f0145c94152d8cff2e1242
Reviewed-on: http://gerrit.openafs.org/6491
Reviewed-by: Simon Wilkinson <sxw@inf.ed.ac.uk>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

rpm: Don't attempt to restart on upgrade when using systemd

systemd is actually rather capable of leaving the OpenAFS client in an
incredibly broken state, thanks to its willingness to track services and
kill their processes. We should not attempt to restart the client on
upgrade, whether a normal upgrade or a migration from SysV initscripts.
In the former case, it's fine (and correct) for the old AFS to keep
running; in the latter case, the unit file is capable of correctly
shutting down an initscript-launched client. The same is true for the
OpenAFS server.

This brings the packaging in line with the SysV initscript code in the
specfile, which does not attempt to restart the service, as well as with
e.g. Debian's packaging, which uses --no-restart-on-upgrade.

While we're here, clean up a redundant BuildRequires on systemd-units.

Change-Id: I3b1771a7246f04be0e82765976664c50e0adae47
Reviewed-on: http://gerrit.openafs.org/6247
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Windows: Support correct status codes from service

When performing object verification, check for status failures corresponding
to parent object issues which require a validation of the parent

Change-Id: I4a73b55961eda62079c933f9e85888ea24b39f1f
Reviewed-on: http://gerrit.openafs.org/6447
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: Handle invalid node types

In the case where the direntry data is invalid, construct an Fcb
of type INVALID so that the direntry can be displayed and the object
deleted even if it cannot be evaluated.

Change-Id: I37da154b7429929fe833874c7cd048a3a804a96f
Reviewed-on: http://gerrit.openafs.org/6445
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: AFSFileUpdateResultCB ParentDataVersion

Add the parent directory data version to the AFSFileUpdateResultCB
structure.

Change-Id: Ia1b1345c410ff216b35f3d42912ac921b978a299
Reviewed-on: http://gerrit.openafs.org/6459
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: renames that overwrite existing target

The Windows client up to this point has never correctly implemented
directory renames. For the longest time it assumed that the file
server would not replace a pre-existing target. As a result, when
the target name was already in use the contents of the directory
would end up with the target name existing but its previous file id
associated with it.

A second problem was that lookups for the source and target names
were not performed while the directory (or directories) were exclusively
held to ensure that competing changes could not occur.

This patchset corrects both issues in cm_Rename() and adjusts the
redirector interface to match the new behavior.

Change-Id: I4f5cff7debcf9925947ac3fc6931565acb57ebd9
Reviewed-on: http://gerrit.openafs.org/6457
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: AFSDirEnumResp and AFSDirEnumEntry changes

A directory enumeration is not an atomic operation.  The redirector
reads an enumeration a chunk at a time.  During the entire enumeration
it is possible that the data version of the directory object has
changed due to entries being added or removed.  This patchset adds
two data version values to the AFSDirEnumResp structure.

The first is the snapshot data version which is the dv of the
directory object at the time the entry list snapshot was taken.
The second is the current data version number of the directory
object.

If an object has been removed from the directory after the snapshot
was taken, attempts to fetch status information for the object will
fail with a VNOVNODE (aka CM_ERROR_BADFD aka STATUS_INVALID_HANDLE).
The NTStatus field has been added to the AFSDirEnumEntry structure
to permit notifying the redirector of such failures.

RDR_PopulateCurrentEntry() has been extended with an additional
cm_Error parameter that accepts the errorCode field provided by
the cm_direnum_entry_t object constructed during the enumeration.

Change-Id: Iee8f6bf9919780ce4dd6c2b184810c0d6afc39cc
Reviewed-on: http://gerrit.openafs.org/6455
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: Add AFSFileEvalResultCB

In response to AFS_REQUEST_TYPE_EVAL_TARGET_BY_ID and
AFS_REQUEST_TYPE_EVAL_TARGET_BY_NAME, return the new AFSFileEvalResultCB
instead of a raw AFSDirEnumEntry. AFSFileEvalResultCB includes
the data version number of the parent directory at the time the
node was evaluated.

Change-Id: Ida25790688f8ab193c234c9b3fadf4f594edd740
Reviewed-on: http://gerrit.openafs.org/6454
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: Add AFSFileCleanupResultCB

Add AFSFileCleanupResultCB which includes the parent directory
data version number. This is necessary because object deletion occurs
during the Cleanup processing and the redirector needs to know the
resulting data version of the affected directory.

Change-Id: Iac07ddaaa3e3373f1690c85d247313e070450169
Reviewed-on: http://gerrit.openafs.org/6453
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: STATUS_OBJECT_PATH_INVALID == invalid parent directory

Modify evaluation of nodes by name and id to consistently return
STATUS_OBJECT_PATH_INVALID if the parent FID no longer exists.

Change-Id: I94f56e5b525a35279152f6f7848654a56bbfa235
Reviewed-on: http://gerrit.openafs.org/6446
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: Request extents readability

Two minor code modifications to make the code easier to read.

Change-Id: I1cf72911ace4eff17c857cd000cb24fbe0f28c2b
Reviewed-on: http://gerrit.openafs.org/6433
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: RequestExtents avoid bufWrite if rdr held

If the cm_buf_t is held by the redirector the buffer cannot
be written back to the file server even if dirty. Therefore,
do not check whether or not the cm_buf_t is dirty until after
it is known that the buffer is not redirector owned.

Change-Id: I10dc8f74915c2267dc44138284eba273eb708e0a
Reviewed-on: http://gerrit.openafs.org/6432
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: avoid race during Fcb cleanup

The worker thread can race with an AFSCleanup() operation and
tear down the Fcb before the AFSCleanup() drops the Fcb->NPFcb->Resource.
Avoid this race by requiring the worker thread to obtain the resource
once before deleting the resource.

Change-Id: Iafad8260c5dfc4187a62c04b14d55ac0bf0e4aeb
Reviewed-on: http://gerrit.openafs.org/6462
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: avoid deadlock if bulk error during enum

If the cache manager has a valid callback at the start of a
directory enumeration, the service can begin a bulk status rpc
which can fail. The error code from the rpc is never propagated
to the caller, therefore the caller loops forever attempting to
complete the enumeration with status info.

Fix it by returning the error.

Change-Id: I53892ddf338152d53c533ef31c3b1047c96bfbf2
Reviewed-on: http://gerrit.openafs.org/6461
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: AFSInsertHashEntry can fail

If AFSInsertHashEntry() fails, the object information structure
that was being inserted is not in the btree. Therefore, ensure
that the object does not have the AFS_OBJECT_INSERTED_HASH_TREE
or AFS_VOLUME_INSERTED_HASH_TREE flag set (as appropriate).
This permits the unreferenced object to be garbage collected.

Change-Id: I023f765571a7ba014556d9505ab2d46ec930f1a2
Reviewed-on: http://gerrit.openafs.org/6458
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: additional AFSValidateEntry logging

Change-Id: Iecfbaff197b83de1c31c51d18f819c9d1be54f60
Reviewed-on: http://gerrit.openafs.org/6456
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: add DV and error status to dir enumerations

The cm_BPlusDirEnum family of functions are atomic when generating
the directory enumeration but are not atomic with respect to the
rest of the system as the enumeration is accessed.  Therefore, the
data version of the directory at the time the enumeration is created
may not be the same as the directory version when the enumeration
is fully processed.  We therefore store the initial data version in the
cm_direnum_t object.

When the enumeration is fetching status information for each of the
directory entries, it is possible that the fetch status will fail.
We therefore store the fetch status error code in the cm_direnum_entry_t
object.   By doing so, the consumer of the enumeration can make a
reasonable decision about the lack of status info.  For example,
if the resulting error is CM_ERROR_BADFD it is known that the entry
has been removed from the directory since the initial enumeration.
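
As a rough illustration of how a consumer might use the two new pieces
of information, here is a minimal sketch. The struct and function names
and the SKETCH_ERROR_BADFD value are hypothetical placeholders, not the
real cm_direnum_t / cm_direnum_entry_t layout or the real CM_ERROR_BADFD
constant.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder value; the real CM_ERROR_BADFD constant is defined elsewhere. */
#define SKETCH_ERROR_BADFD 1L

/* Hypothetical per-entry record: carries the fetch status result. */
struct sketch_direnum_entry {
    const char *name;
    long error_code;                  /* 0 if the status fetch succeeded */
};

/* Hypothetical enumeration object: remembers the DV at snapshot time. */
struct sketch_direnum {
    unsigned long long snapshot_dv;   /* directory DV when the list was built */
    const struct sketch_direnum_entry *entries;
    size_t count;
};

/* Has the directory changed since the enumeration was created? */
static int sketch_dir_changed(const struct sketch_direnum *e,
                              unsigned long long current_dv)
{
    assert(e != NULL);
    return e->snapshot_dv != current_dv;
}

/* Count entries whose status fetch says the object is gone. */
static size_t sketch_count_removed(const struct sketch_direnum *e)
{
    size_t i, n = 0;

    assert(e != NULL);
    for (i = 0; i < e->count; i++) {
        if (e->entries[i].error_code == SKETCH_ERROR_BADFD)
            n++;
    }
    return n;
}
```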

Change-Id: I289881e2c59525a9f998559b00769d3ac3f335c0
Reviewed-on: http://gerrit.openafs.org/6452
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: protect merge status against dscp == scp

If the directory status object is the same as the object for which
status info is being merged, the object will refer to itself as its
own parent. Do not permit that.

Change-Id: I6f7b6416f4c875a30dd5b85ba679389484523b12
Reviewed-on: http://gerrit.openafs.org/6451
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: protect dir ops by CM_SCACHESYNC_STOREDATA

CM_SCACHESYNC_STOREDATA is used to ensure that only one directory
modifying rpc can be issued to the file server at a time on a
single cm_scache_t. However, the local directory modifications
were being made after cm_MergeStatus() and cm_SyncOpDone()
were called. As a result, serialization of changes against the
local directory buffers and b+tree was lost.

Change-Id: I1e99685767b6b9b51e546be0946b189386e8dbd2
Reviewed-on: http://gerrit.openafs.org/6450
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: init scache DV=CM_SCACHE_VERSION_BAD

zero is a valid DV. CM_SCACHE_VERSION_BAD is not.

Change-Id: I65c10153059bae6dbd4da344958db4a6be309633
Reviewed-on: http://gerrit.openafs.org/6449
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: afsredirlib log messages

Improve or correct a number of log messages. Report the correct
FID or NT Status value, etc.

Change-Id: I434b47e1350767f868170323280298f77e1a840a
Reviewed-on: http://gerrit.openafs.org/6442
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: Symlink resolve failure error

If a symlink cannot be resolved, return STATUS_REPARSE_POINT_NOT_RESOLVED
instead of STATUS_ACCESS_DENIED. The symlink is after all a reparse
point. This results in a more meaningful error being delivered to
the end user.

Change-Id: I30713dac7b916efaf3cf7a5d7717cb0bc971a31a
Reviewed-on: http://gerrit.openafs.org/6441
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: Make idle dead timeout very long

The idle dead timeout processing must eventually be removed
from Rx for initiators. In the meantime, make the timeout period
ten times longer than the hard dead timeout. This permits eventual
failure when the server doesn't respond in ten minutes but avoids
more transient issues.

Change-Id: Ia673666dd55b33c4375ee8fdcbb89c82e8b01185
Reviewed-on: http://gerrit.openafs.org/6440
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: replace strdup with xdr_alloc in callback processing

The CRT allocator cannot be used for memory that will be freed
by afsrpc.dll. Use xdr_alloc() instead.

Change-Id: Idd33710c225d58b4e6eba0bfdb2f8b3282996258
Reviewed-on: http://gerrit.openafs.org/6439
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

windows: osi_TSignalForMLs simplify

Simplify logic for readability and efficiency.

Change-Id: I3c78b23b6fcf8478fe20a803755923108995d532
Reviewed-on: http://gerrit.openafs.org/6438
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: osisleep do not tamper with queues

There is no need to manually remove an entry from a queue before
executing osi_QRemoveHT(). osi_QRemoveHT() removes the item
from the queue and fixes up the pointers correctly. Manual
intervention is a waste of cpu and can be harmful.

Change-Id: Iaea4ceac2cb5f61e5bb73fd181bd934e06ddf0a6
Reviewed-on: http://gerrit.openafs.org/6437
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: osi_sleepInfo tid type

The thread id type is DWORD not size_t for consistency
with the rest of the client_osi package.

Change-Id: I2e2d31d8738d9de82d99f346f5109de133f3e25e
Reviewed-on: http://gerrit.openafs.org/6436
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: add osi_TWaitExt(), fix osi_TWait()

osi_TWait() was adding new locks to the turnstile at the tail,
which is the same end of the queue that locks are removed from.
This implemented LIFO instead of FIFO, when FIFO is the "fair"
order in which to service lock requests.

osi_TWaitExt() is added to permit the Reader to Writer upgrade
request to use LIFO when more than one reader is present.
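
The difference between the two orderings can be sketched with a small
doubly linked queue. This is an illustration only: the struct and
function names are hypothetical and the real osi turnstile code is
organized differently. The only property taken from the description
above is that waiters are removed from the tail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical waiter and turnstile types; the real osi structures differ. */
struct waiter {
    int id;
    struct waiter *next;   /* toward the tail */
    struct waiter *prev;   /* toward the head */
};

struct turnstile {
    struct waiter *head;
    struct waiter *tail;   /* waiters are always removed from here */
};

/* Fair (FIFO) enqueue: insert at the head, the end opposite to removal. */
static void ts_add_head(struct turnstile *t, struct waiter *w)
{
    assert(t != NULL && w != NULL);
    w->prev = NULL;
    w->next = t->head;
    if (t->head)
        t->head->prev = w;
    else
        t->tail = w;
    t->head = w;
}

/* LIFO enqueue: insert at the tail so this waiter is served next. */
static void ts_add_tail(struct turnstile *t, struct waiter *w)
{
    assert(t != NULL && w != NULL);
    w->next = NULL;
    w->prev = t->tail;
    if (t->tail)
        t->tail->next = w;
    else
        t->head = w;
    t->tail = w;
}

/* Dequeue from the tail; returns NULL when nobody is waiting. */
static struct waiter *ts_remove_tail(struct turnstile *t)
{
    struct waiter *w;

    assert(t != NULL);
    w = t->tail;
    if (w == NULL)
        return NULL;
    t->tail = w->prev;
    if (t->tail)
        t->tail->next = NULL;
    else
        t->head = NULL;
    w->next = w->prev = NULL;
    return w;
}
```

With two waiters added via ts_add_head() and a third via ts_add_tail(),
the third is dequeued first and the other two follow in arrival order.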

Change-Id: Ib6435a3edc2cb8519939cfad93e0db4b0604da2d
Reviewed-on: http://gerrit.openafs.org/6435
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: use waiters counter instead of osi_TEmpty

The osi_TEmpty() macro examines the values of the turnstile
pointers. Instead use the lock's 'waiters' counter to determine
if there are waiting threads to signal.

Change-Id: I8e14a03a30adcf1e67b07fc020104c2ada3b5c6a
Reviewed-on: http://gerrit.openafs.org/6434
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: kauth search for kerberos iv port first

Modify src/kauth/user_nt.c to match the service name search
order of the Unix code:

kerberos4
kerberos-iv
kerberos

The standard Windows SERVICES file includes "kerberos-iv" as
port 750.
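
The search order can be sketched as a loop over candidate names. The
lookup callback and all function names below are hypothetical; in real
code the lookup would presumably be a services database query such as
getservbyname(), which is not called here so that the sketch does not
depend on the local SERVICES file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup callback: returns the port for a name, or -1. */
typedef int (*sketch_lookup_fn)(const char *name);

/* Try each name in the documented order; fall back if none is known. */
static int sketch_find_kerberos4_port(sketch_lookup_fn lookup, int fallback)
{
    static const char *const names[] = {
        "kerberos4", "kerberos-iv", "kerberos"
    };
    size_t i;

    assert(lookup != NULL);
    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        int port = lookup(names[i]);
        if (port >= 0)
            return port;
    }
    return fallback;
}

/* Sample table modelled on a services file that lists "kerberos-iv"
 * (750) and "kerberos" (88) but has no "kerberos4" entry. */
static int sketch_services_lookup(const char *name)
{
    if (strcmp(name, "kerberos-iv") == 0)
        return 750;
    if (strcmp(name, "kerberos") == 0)
        return 88;
    return -1;
}

/* Sample table with no matching entries at all. */
static int sketch_empty_lookup(const char *name)
{
    (void)name;
    return -1;
}
```

Because "kerberos-iv" is tried before "kerberos", the sample table
resolves to 750 rather than 88.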

FIXES 127907

Change-Id: I518a812cc2d465334e8ef6929f8988c51b33749b
Reviewed-on: http://gerrit.openafs.org/6430
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

afs: Panic on afs_conn refcount imbalance

An undercounted afs_conn can easily cause a panic and/or memory
corruption later on, since we put an rx_connection reference with each
afs_conn reference. Panic as soon as we detect this, as this indicates
a serious bug.

Change-Id: I251fd3303114d0822b8cf70805a8a447986a7762
Reviewed-on: http://gerrit.openafs.org/6413
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afs: Add afs_WriteDCache sanity checks

Writing a non-free non-discarded dcache entry with a zero volume id
can easily cause hash table corruption later on, so make sure we don't
do that. Also log something if the write itself fails, as this usually
indicates an unusual situation involving I/O errors or something.

Change-Id: Ib9602227e8cee324cb63a4a3dee28e53af69b446
Reviewed-on: http://gerrit.openafs.org/6419
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afs: Cope with afs_GetValidDSlot errors

Make callers of afs_GetValidDSlot deal with getting a NULL dcache,
which can occur if an error is encountered. Some of these just panic
at least for now, since a code path for recovery is complex, but this
is at least better than dereferencing a NULL pointer.

Change-Id: I4022a914bbaa0e1f3f4daadfdc32d165a6e2febd
Reviewed-on: http://gerrit.openafs.org/6418
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afs: Do not always ignore errors in afs_GetDSlot

Currently afs_UFSGetDSlot will silently swallow any error in reading
the specified dslot from disk, and will return a "blank" dcache to the
caller. However, many callers of afs_GetDSlot will be asking for a
dcache that we know exists, and more importantly, we know is on the
global hash table. If a disk error is encountered and we're given a
"blank" dcache, we will erroneously believe the dcache entry is not on
the hash table, causing corruption of the hash table later on.

So instead, modify all callers of afs_GetDSlot to use either
afs_GetValidDSlot or afs_GetNewDSlot. Calling afs_GetValidDSlot
indicates that the given dentry index is known to be valid, and any
error encountered while reading the entry from disk should result in
an error (for disk I/O errors we have no control over, this results in
a NULL dentry returned; for internal consistency errors we panic).
Calling afs_GetNewDSlot indicates that the specified index may not
exist or may not be valid, and so returning a "blank" dentry in that
case is fine.
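
The contract of the two entry points can be sketched as follows. All
names here are hypothetical and the dcache type is reduced to two
fields; the panic on internal consistency errors is omitted. Only the
"NULL on read error" versus "blank entry on read error" distinction is
taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced dcache entry. */
struct sketch_dcache {
    int index;
    int valid;
};

/* Hypothetical backing-store read: 0 on success, -1 on I/O error. */
typedef int (*sketch_read_fn)(int index, struct sketch_dcache *out);

/* Caller knows the slot exists: a read error is reported as NULL. */
static struct sketch_dcache *sketch_get_valid(sketch_read_fn rd, int index,
                                              struct sketch_dcache *buf)
{
    assert(rd != NULL && buf != NULL);
    if (rd(index, buf) != 0)
        return NULL;
    return buf;
}

/* Slot may not exist yet: a read error yields a blank entry instead. */
static struct sketch_dcache *sketch_get_new(sketch_read_fn rd, int index,
                                            struct sketch_dcache *buf)
{
    assert(rd != NULL && buf != NULL);
    if (rd(index, buf) != 0) {
        memset(buf, 0, sizeof(*buf));
        buf->index = index;
    }
    return buf;
}

/* Sample readers used to exercise both paths. */
static int sketch_read_ok(int index, struct sketch_dcache *out)
{
    out->index = index;
    out->valid = 1;
    return 0;
}

static int sketch_read_fail(int index, struct sketch_dcache *out)
{
    (void)index;
    (void)out;
    return -1;
}
```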

For memcache, the situation is the same, except any time we go to
"disk" it is an (internal) error, since there is no disk.

Change-Id: I53ea6e99649e4d6d5cbde58929dfcee1d45a3e7b
Reviewed-on: http://gerrit.openafs.org/6417
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afs: Remove second argument to afs_GetDSlot

All callers of afs_GetDSlot were passing NULL as the second argument
to afs_GetDSlot. So, remove the argument, and behave as if tmpdc was
NULL unconditionally.

Change-Id: I138fe917d739c3020c35c20da48ffdf38f682fd6
Reviewed-on: http://gerrit.openafs.org/6416
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afs: Indicate error from afs_osi_Read/Write better

Currently afs_osi_Read and afs_osi_Write just return -1 on any I/O
error, even though they know the error code given from the OS VFS.
Just return that code instead so the caller can see what the error
was; but negate it, so it's clear that it is an error.

Change-Id: I3d8350da18d075713356137a1cacf182a749fe3e
Reviewed-on: http://gerrit.openafs.org/6412
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afs: afs_osi_Read/Write returns negative on error

afs_osi_Read and afs_osi_Write need to return negative values on
error. EIO is not negative; return -EIO so we don't accidentally
return "success" if someone requested to read or write EIO bytes.
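
A minimal sketch of the ambiguity being avoided. sketch_read() and its
"fail" parameter are hypothetical and do not reflect the real
afs_osi_Read() signature.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an I/O routine that returns a byte count
 * on success. Not the real afs_osi_Read() signature. */
static int sketch_read(size_t want, int fail)
{
    assert(want <= 0x7fffffff);   /* byte counts must fit in an int */
    if (fail)
        return -EIO;              /* negative: never mistaken for a count */
    return (int)want;             /* non-negative: bytes transferred */
}

/* Callers can use a single sign test to tell the two cases apart. */
static int sketch_read_failed(int rc)
{
    return rc < 0;
}
```

A successful read of exactly EIO bytes returns the positive value EIO,
which the sign test still classifies as success.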

Change-Id: Id0693776737fdf7086de16a935ad3942f5026e55
Reviewed-on: http://gerrit.openafs.org/6411
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

klog.krb5: cast get_cred_keylen to unsigned

get_cred_keylen can yield a type besides an unsigned int (such as a
size_t on heimdal). But we are printing it with %u, which causes a
warning, so cast it to an unsigned int.
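
The pattern looks roughly like this. sketch_format_keylen() is a
hypothetical helper, not the klog.krb5 code, and the size_t parameter
stands in for whatever type get_cred_keylen yields on a given Kerberos
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format a key length whose native type may be
 * wider than unsigned int. The explicit cast makes the argument match
 * the %u conversion. */
static int sketch_format_keylen(char *buf, size_t buflen, size_t keylen)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "%u", (unsigned int)keylen);
}
```

The cast would truncate values larger than UINT_MAX, which should not
matter for key lengths.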

Change-Id: I7b89de5b0b163b9532ac347e9c56e865cb58f266
Reviewed-on: http://gerrit.openafs.org/6410
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

fuse: Autodetect Solaris 11 FUSE

FUSE exists in Solaris 11, but it does not come with a fuse.pc
pkg-config configuration. Autodetect the presence of FUSE anyway.

Change-Id: Ia052ba0a1bfe511dd051f3cfbee10395dc9d2c60
Reviewed-on: http://gerrit.openafs.org/6422
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afsd.fuse: Solaris 11 support

The FUSE in Solaris 11 has a couple of quirks; work around them.

Change-Id: I29b8a8858467d1c6ebacb4926a15165feae64f2c
Reviewed-on: http://gerrit.openafs.org/6421
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afsd: Parse cacheinfo during argument parsing

Currently we parse cacheinfo in afsd_run, when the client is
initialized and started. Parsing cacheinfo can change
afsd_cacheMountDir, however, which may be of interest to afsd.o users;
in particular, libuafs exposes this via uafs_MountDir(). This means
that if a mount dir is not explicitly specified in the libcmd
arguments to afsd, a libuafs-using program will see the mountpoint as
the empty string if it is queried after afsd_parse but before
afsd_run. For afsd.fuse, this causes the cryptic error message:

fuse: bad mount point `': No such file or directory

since the mountpoint is the empty string if it is not specified
explicitly on the command line.

To fix this, move cacheinfo parsing to near the end of afsd_parse(),
so the mountpoint is calculated in afsd_parse().

Change-Id: I058f2c7c2f0cc21db21c4b1d38ff63b9e9ed1562
Reviewed-on: http://gerrit.openafs.org/6400
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

fuse: Add -oallow_other by default where possible

By default, fuse mountpoints are only accessible by the same uid as
that which mounted the fuse filesystem. When we're running as root,
specify -oallow_other so by default anyone can access the afs
mountpoint.

Change-Id: Idc732a22136fbe6bc585b76ac6291d8518f1f9de
Reviewed-on: http://gerrit.openafs.org/6390
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Windows: Avoid bottleneck on VolumeLock

The VolumeLock resource was obtained during each AFSParseName()
and held across a wide range of operations including volume
info queries, renames, and extent requests.  These operations can
take a long time to complete and as long as the VolumeLock was
held exclusively there could only be one operation in flight at
a time on a given volume.  This significantly reduced the parallelism
of operations.

The VolumeLock was not required in almost all cases.  This patchset
adjusts the use of the VolumeLock and avoids the bottleneck.

Change-Id: I9d60fe41d157b9e315aeaa15feee8d1e0d4ded4c
Reviewed-on: http://gerrit.openafs.org/6420
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: avoid race in cm_GetNewSCache

The cm_scacheLock is dropped while walking the scache LRU queue.
As a result it is possible for the cm_scache_t that is being
considered for recycling to be accessed and moved to the head
of the queue.

Track the prev and next pointers so it is possible to detect if
the cm_scache_t that is about to be recycled has been moved. If
so, restart the search from the tail.
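
The detection step can be sketched on a generic LRU list. The types and
function names below are hypothetical and locking is left out; the
sketch only shows that saving the neighbor pointers before dropping the
lock is enough to notice that the candidate was moved to the head.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical LRU node and queue; the real cm_scache_t queue differs. */
struct lru_node {
    struct lru_node *prev;
    struct lru_node *next;
};

struct lru {
    struct lru_node *head;   /* most recently used */
    struct lru_node *tail;   /* recycling candidates come from here */
};

static void lru_unlink(struct lru *q, struct lru_node *n)
{
    if (n->prev)
        n->prev->next = n->next;
    else
        q->head = n->next;
    if (n->next)
        n->next->prev = n->prev;
    else
        q->tail = n->prev;
    n->prev = n->next = NULL;
}

static void lru_push_head(struct lru *q, struct lru_node *n)
{
    n->prev = NULL;
    n->next = q->head;
    if (q->head)
        q->head->prev = n;
    else
        q->tail = n;
    q->head = n;
}

/* What a competing thread does when it accesses an entry. */
static void lru_touch(struct lru *q, struct lru_node *n)
{
    assert(q != NULL && n != NULL);
    lru_unlink(q, n);
    lru_push_head(q, n);
}

/* Compare against the neighbors saved before the lock was dropped. */
static int lru_node_moved(const struct lru_node *n,
                          const struct lru_node *saved_prev,
                          const struct lru_node *saved_next)
{
    assert(n != NULL);
    return n->prev != saved_prev || n->next != saved_next;
}
```

If lru_node_moved() reports a change, the walker would abandon the
candidate and start again from the current tail.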

Change-Id: I6c3b645b85aa60197b9b6d60cffdcb818eb6f4b2
Reviewed-on: http://gerrit.openafs.org/6424
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: cm_BufWrite() must wait in cm_SyncOp()

Now that it is permissible for more than one store data operation
to construct BIOD lists in parallel, cm_BufWrite() must be willing
to wait in cm_SyncOp(). Otherwise, the daemon threads will spin.

Change-Id: I77ee2005025de9255b4c9cdb8bed8efc44b9518a
Reviewed-on: http://gerrit.openafs.org/6423
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

rx: Don't adjust non-existent events

If we notice that time has gone backwards (that is, the current
time is older than the time of the last event we fired), then we
reschedule all pending events.

On Windows, immediately after we have resumed from a suspend, this
code path can be executed with an empty event tree, causing an
exception:

FAULTING_IP:
afsrpc!adjustTimes+cf [c:\src\openafs\openafs.git\repo\src\rx\rx_event.c @ 213]
00000000`61041847 4c8b4030        mov     r8,qword ptr [rax+30h]

EXCEPTION_RECORD:  ffffffffffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 0000000061041847 (afsrpc!adjustTimes+0x00000000000000cf)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000000
   Parameter[1]: 0000000000000030
Attempt to read from address 0000000000000030

Resolve this by checking for an empty tree before we attempt to adjust
event times. If the tree is empty, we just zero the last event time
(so we don't keep running the adjustTimes routine), and continue as
normal.
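
The shape of the guard can be sketched as below. This is not the
rx_event.c code: the event container is reduced to a hypothetical
linked list, times are plain integers, and how the non-empty case
updates the last event time is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event node; the real code keeps events in a tree. */
struct ev {
    long when;
    struct ev *next;
};

/* Hypothetical scheduler state. */
struct ev_state {
    struct ev *first;   /* NULL when no events are pending */
    long last;          /* time of the last event fired; 0 means unset */
};

/* Shift pending events back by delta after the clock has jumped
 * backwards. Returns the number of events adjusted. */
static int sketch_adjust_times(struct ev_state *s, long delta)
{
    struct ev *e;
    int n = 0;

    assert(s != NULL);
    if (s->first == NULL) {
        /* Nothing to reschedule: zero the marker so the backwards-time
         * check stops firing, and dereference no event nodes. */
        s->last = 0;
        return 0;
    }
    for (e = s->first; e != NULL; e = e->next) {
        e->when -= delta;
        n++;
    }
    return n;
}
```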

Change-Id: I42a42ff1bd53a9d5c4733efc7ac5f629426b3aa1
Reviewed-on: http://gerrit.openafs.org/6425
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: AFSCleanup extent processing

1. Perform a CcFlushCache() any time the file is cached
   and the Context Control Block indicates that the handle
   has FILE_WRITE_DATA permission.

2. Perform an AFSFlushExtents() whenever there are dirty
   extents and the handle has FILE_WRITE_DATA permission.
   No point flushing the extents if the AuthGroup does not
   have write permission.  Another Ccb must exist that does
   have write permission.

Change-Id: I3ece011b484c12e7dc936b81c272ba6a42f6c7d6
Reviewed-on: http://gerrit.openafs.org/6399
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: AFSRetrieveValidAuthGroup FILE_READ_DATA

Only an AuthGroup belonging to a Context Control Block that was
granted the FILE_READ_DATA permission is capable of reading
data from the file server.

Change-Id: I93a7d8e65a6bc87b44399a30da5c0dd7d4e07685
Reviewed-on: http://gerrit.openafs.org/6398
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: AFSRequestExtentsAsync retry with alt authgroup

If AFSRequestExtentsAsync() fails to obtain requested extents
due to STATUS_ACCESS_DENIED using the AuthGroup associated with
the Context Control Block, try to find an alternate AuthGroup
to use to perform the extent request. We have already told
Windows what permissions the application has when the file was
opened. Windows will perform its own validation checks prior
to permitting the data to be accessed or altered.

Change-Id: I430657e8c8e30c9f636a5ec81065af4122c926d7
Reviewed-on: http://gerrit.openafs.org/6397
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>

Windows: Use AuthGroups for extent request error reporting

The afs redirector currently tracks the most recent extent error
in the File Control Block.  Prior to this patchset the error
was returned to the requesting thread when the process Id matched
the most recent Process to issue a request.  This approach resulted
in a couple of problems.

1. There are multiple threads that can issue an extent request
    on the same file at the same time representing different processes.
    Resetting the process Id with each new request could clear the
    error prior to its receipt.

2. The failure may be due to inappropriate permissions.  Permissions
    are not associated with processes but with Authentication Groups.

This patchset makes several changes:

1. It enables the afsd_service to track the active authgroup as
    part of the cm_user_t structure and associates that object with
    the BIOD object to ensure that the active authgroup can be
    reported to the afs redirector.

2. It modifies the AFSExtentFailureCB structure to include the
    AuthGroup GUID.

3. It tracks the AuthGroup GUID associated with the extent
    failure in the non-paged file control block.

4. It converts all tests on Process Id to use AuthGroup instead.

5. It alters the behavior of error delivery such that the reported
    error is only cleared after it has been reported once to a
    thread using the matching AuthGroup.

These changes make the situation better but not perfect as error
states can still be lost.  However, it avoids the case most often
seen in production where two processes (an end user process and an
anti-malware process) are fighting over a file and the anti-malware
process has no permission to access the file under its own credentials.
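
The delivery rule in item 5 can be sketched as follows. The struct
layout and names are hypothetical and do not match the real file
control block or AFSExtentFailureCB; the sketch only shows that a
pending error is consumed by a caller in the matching AuthGroup and
left untouched for everyone else.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical 16-byte AuthGroup identifier. */
struct sketch_guid {
    unsigned char bytes[16];
};

/* Hypothetical slice of a file control block. */
struct sketch_fcb {
    long extent_error;               /* 0 means no pending error */
    struct sketch_guid error_group;  /* AuthGroup the error belongs to */
};

/* Return and clear the pending error if it belongs to the caller's
 * AuthGroup; otherwise return 0 and leave the error pending. */
static long sketch_take_extent_error(struct sketch_fcb *f,
                                     const struct sketch_guid *caller)
{
    long err;

    assert(f != NULL && caller != NULL);
    if (f->extent_error == 0)
        return 0;
    if (memcmp(&f->error_group, caller, sizeof(*caller)) != 0)
        return 0;
    err = f->extent_error;
    f->extent_error = 0;
    return err;
}
```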

Change-Id: Ia5c3877b8d46de695c86884c4166dc812885a72c
Reviewed-on: http://gerrit.openafs.org/6396
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>