From: Mark Vitale Date: Fri, 17 Aug 2018 22:48:08 +0000 (-0400) Subject: volser: ensure GCTrans transaction walk remains valid X-Git-Tag: openafs-devel-1_9_0~435 X-Git-Url: https://git.openafs.org/?p=openafs.git;a=commitdiff_plain;h=930d8ee638112ca8bf27a9528c0a527cfab54c7d volser: ensure GCTrans transaction walk remains valid Commit bc56f5cc97a982ee29219e6f258b372dbfe1a020 ("volser: Delete timed-out temporary volumes") introduced new logic to GCTrans(). Unfortunately, part of this logic temporarily drops VTRANS_LOCK in order to call VPurgeVolume(). While this lock is dropped, other volser_trans may be added or deleted from the allTrans list. Therefore, GCTrans should not trust the next pointer (nt = tt->next) which was obtained before the lock was dropped. One symptom observed in the field was a segfault while examining tt->volume. Neither tt nor volume were valid any longer, since tt had been set from a stale nt at the top of the loop. To repair, improve, and clarify this logic: - Refactor so nt is assigned correctly and as late as possible. - Add comments to explain the placement of the assigns to future maintainers. Change-Id: Ibd3a504bddd3622730aa349576341e20f2f27836 Reviewed-on: https://gerrit.openafs.org/13286 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot --- diff --git a/src/volser/voltrans.c b/src/volser/voltrans.c index 158fec6..7a7af3e 100644 --- a/src/volser/voltrans.c +++ b/src/volser/voltrans.c @@ -180,7 +180,6 @@ GCTrans(void) VTRANS_LOCK; for (tt = allTrans; tt; tt = nt) { - nt = tt->next; /* remember in case we zap it */ if (tt->time + OLDTRANSWARN < now) { Log("trans %u on volume %" AFS_VOLID_FMT " %s than %d seconds\n", tt->tid, afs_printable_VolumeId_lu(tt->volid), @@ -206,12 +205,27 @@ GCTrans(void) VTRANS_UNLOCK; VPurgeVolume(&error, tt->volume); + /* + * While the lock was dropped, tt->next may have changed. + * Therefore, defer reading tt->next until _after_ we regain the lock. + */ VTRANS_LOCK; } - - DeleteTrans(tt, 0); /* drops refCount or deletes it */ + nt = tt->next; + /* + * DeleteTrans() will decrement tt->refCount; if it falls to 0, it will + * also delete tt itself. Therefore, we must read tt->next _before_ + * calling DeleteTrans(). + */ + DeleteTrans(tt, 0); GCDeletes++; + continue; } + /* + * This path never dropped VTRANS_LOCK or modified the allTrans list. + * Therefore, no special care is required to determine the next trans in the chain. + */ + nt = tt->next; } VTRANS_UNLOCK; return 0;