viced: Set h_GetHost_r probefail if MPAA_r fails
author Andrew Deason <adeason@sinenomine.net>
Fri, 17 Feb 2012 21:46:50 +0000 (15:46 -0600)
committer Derrick Brashear <shadow@dementix.org>
Mon, 20 Feb 2012 21:15:05 +0000 (13:15 -0800)
Currently, in h_GetHost_r, if we get a connection whose address does
not match an extant host, but the reported uuid does, we ProbeUuid the
old host. If it fails, we call MultiProbeAlternateAddress_r and set
'probefail'. Later on, if 'probefail' is set, we always add the
connection address to the host, and remove the host->host, host->port
address from the host.

However, this is not always correct. Consider the following situation.

We have an existing host A that has primary address 1.1.1.1, and also
has addresses 1.1.1.2 and 1.1.1.3 on the interface list but not on the
hash table. Say that host A stops responding on 1.1.1.1, and a
connection comes in from 1.1.1.2. We ProbeUuid 1.1.1.1 and get a
failure, so we call MultiProbeAlternateAddress_r.
MultiProbeAlternateAddress_r probes via rx_Multi the addresses 1.1.1.2
and 1.1.1.3. Say that 1.1.1.3 responds first, and responds
successfully, so MultiProbeAlternateAddress_r sets 1.1.1.3 to be the
primary address for the host.

After MultiProbeAlternateAddress_r returns, 'probefail' is set. A few
lines down, we see that oldHost->host does not match haddr, and
'probefail' is set, so we add 1.1.1.2 to the interface list, and
remove 1.1.1.3, and set 1.1.1.2 to be the primary address, even though
1.1.1.3 is the address we most recently 'know' is correct.

To fix this, only set 'probefail' if MultiProbeAlternateAddress_r also
fails after the failed ProbeUuid call. Conceptually this makes sense,
since if MultiProbeAlternateAddress_r succeeds, it found an address
that responds successfully to ProbeUuid, and it sets that address to
be the primary address. Therefore, after MultiProbeAlternateAddress_r
returns success, the situation is the same as if the 'good' address
was already the primary address, and the ProbeUuid call succeeded, so
'probefail' should not be set.
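
The scenario above can be traced with a small stand-alone model. This
is only a sketch, not the fileserver code: struct toy_host,
toy_multi_probe, toy_get_host and run_scenario are hypothetical names,
the interface list is simplified to an array of strings that also
holds the primary address, and the probe is replaced by a 'responder'
argument naming the alternate address that answers first (NULL if none
does). As in the patch, the probe stand-in returns nonzero on failure.

```c
#include <assert.h>
#include <string.h>

#define MAX_ADDRS 4

/* Toy stand-in for a host record; NOT the real struct host. */
struct toy_host {
    const char *primary;          /* models host->host */
    const char *iface[MAX_ADDRS]; /* models the interface list */
    int n_iface;
};

static int has_addr(const struct toy_host *h, const char *addr)
{
    for (int i = 0; i < h->n_iface; i++)
        if (strcmp(h->iface[i], addr) == 0)
            return 1;
    return 0;
}

static void add_addr(struct toy_host *h, const char *addr)
{
    if (has_addr(h, addr))
        return;
    assert(h->n_iface < MAX_ADDRS);
    h->iface[h->n_iface++] = addr;
}

static void remove_addr(struct toy_host *h, const char *addr)
{
    for (int i = 0; i < h->n_iface; i++) {
        if (strcmp(h->iface[i], addr) == 0) {
            /* overwrite the match with the last entry, shrink the list */
            h->iface[i] = h->iface[--h->n_iface];
            return;
        }
    }
}

/* Models MultiProbeAlternateAddress_r: on success, makes the responding
 * alternate the primary address and returns 0; otherwise returns 1. */
static int toy_multi_probe(struct toy_host *h, const char *responder)
{
    if (responder == NULL || !has_addr(h, responder))
        return 1;
    h->primary = responder;
    return 0;
}

/* Models the part of h_GetHost_r that runs after ProbeUuid on the old
 * primary address has failed. 'fixed' selects the patched behavior. */
static const char *toy_get_host(struct toy_host *h, const char *conn_addr,
                                const char *responder, int fixed)
{
    int code = toy_multi_probe(h, responder);
    int probefail = fixed ? (code != 0) : 1;

    if (probefail && strcmp(h->primary, conn_addr) != 0) {
        /* replace the current primary with the connection address */
        remove_addr(h, h->primary);
        add_addr(h, conn_addr);
        h->primary = conn_addr;
    }
    return h->primary;
}

/* Host at 1.1.1.1 with alternates 1.1.1.2 and 1.1.1.3; a connection
 * arrives from 1.1.1.2. Returns the resulting primary address. */
static const char *run_scenario(int fixed, const char *responder)
{
    struct toy_host h = { "1.1.1.1", { "1.1.1.1", "1.1.1.2", "1.1.1.3" }, 3 };
    return toy_get_host(&h, "1.1.1.2", responder, fixed);
}
```

In this model, run_scenario(0, "1.1.1.3") ends with 1.1.1.2 as the
primary address (the old behavior), while run_scenario(1, "1.1.1.3")
keeps 1.1.1.3. When no alternate responds, run_scenario(1, NULL) still
falls back to the connection address.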

Change-Id: Id32817916a8a42db567ad099aae00745b79598c5
Reviewed-on: http://gerrit.openafs.org/6728
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

src/viced/host.c

index 9b946d7..f8e66b6 100644
@@ -2038,8 +2038,17 @@ h_GetHost_r(struct rx_connection *tcon)
                                         oldHost,
                                          afs_inet_ntoa_r(oldHost->host, hoststr),
                                         ntohs(oldHost->port),code2));
-                           MultiProbeAlternateAddress_r(oldHost);
-                            probefail = 1;
+
+                           if (MultiProbeAlternateAddress_r(oldHost)) {
+                               /* If MultiProbeAlternateAddress_r succeeded,
+                                * it updated oldHost->host and oldHost->port
+                                * to an address that responded successfully to
+                                * a ProbeUuid, so it is as if the ProbeUuid
+                                * call above returned success. So, only set
+                                * 'probefail' if MultiProbeAlternateAddress_r
+                                * fails. */
+                               probefail = 1;
+                           }
                         }
                     } else {
                         probefail = 1;