From 3e531db9ce50dd41f0c64a11ab3bfcf0239ba0cd Mon Sep 17 00:00:00 2001
From: Andrew Deason
Date: Thu, 12 May 2016 21:34:31 -0500
Subject: [PATCH] vlserver: rx_SetRxDeadTime before ubik init

Currently, vlserver calls rx_SetRxDeadTime to set the default rx
deadtime to 50 seconds, but it does so after calling
ubik_ServerInitByInfo. ubik_ServerInitByInfo creates several rx
connections before it returns, and so these connections get the
default rx deadtime (12 seconds), instead of the 50 seconds vlserver
tries to set.

When ubik detects that a remote site is down, ubik recreates the rx
connections for that site, and these new connections get the new
deadtime of 50 seconds. This means that ubik behavior can have
different timings in the vlserver, depending on whether any remote
sites have ever been detected as being 'down' or not.

This can result in seemingly-inconsistent or confusing behavior, since
some sequences of operations that appear identical can produce
different results, depending on whether the 12-second timeout or the
50-second timeout is being used.

This behavior is not directly to blame for any problems, but it can be
very confusing, especially when trying to diagnose or reproduce bugs.
So to make things more consistent, just call rx_SetRxDeadTime earlier,
so all conns always get the 50-second timeout.

In order to do this, though, we must also ensure that rx_Init is
called before rx_SetRxDeadTime (otherwise, rx_Init will overwrite our
configured deadtime). So also call rx_Init earlier; rx_Init is
idempotent, so it's okay that it may be called again after or before
this.

Note that vlserver is currently the only ubik server that sets a
deadtime of 50 seconds, and it's not clear why. Another way to solve
this is to just remove the call to rx_SetRxDeadTime, to make vlserver
behave more like ptserver. But this commit takes a conservative
approach, resulting in the deadtime that is probably the most common
in current use.
This is because most long-running vlservers will probably eventually
lose contact with remote sites at one time or another, and so will
eventually use a deadtime of 50 seconds.

Change-Id: I49430144d9a62eb8cad1509c1aeafc9fcc927f8e
Reviewed-on: https://gerrit.openafs.org/12285
Tested-by: Andrew Deason
Tested-by: BuildBot
Reviewed-by: Benjamin Kaduk
---
 src/vlserver/vlserver.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/vlserver/vlserver.c b/src/vlserver/vlserver.c
index 22b29f3..08ecab9 100644
--- a/src/vlserver/vlserver.c
+++ b/src/vlserver/vlserver.c
@@ -477,6 +477,13 @@ main(int argc, char **argv)
         }
     }
 
+    code = rx_Init(htons(AFSCONF_VLDBPORT));
+    if (code < 0) {
+        VLog(0, ("vlserver: Rx init failed: %d\n", code));
+        exit(1);
+    }
+    rx_SetRxDeadTime(50);
+
     ubik_nBuffers = 512;
     ubik_SetClientSecurityProcs(afsconf_ClientAuth, afsconf_UpToDate, tdir);
     ubik_SetServerSecurityProcs(afsconf_BuildServerSecurityObjects,
@@ -490,7 +497,6 @@ main(int argc, char **argv)
         VLog(0, ("vlserver: Ubik init failed: %s\n", afs_error_message(code)));
         exit(2);
     }
-    rx_SetRxDeadTime(50);
 
     memset(rd_HostAddress, 0, sizeof(rd_HostAddress));
     memset(wr_HostAddress, 0, sizeof(wr_HostAddress));
-- 
1.9.4