Path: vixie!Pa.dec.com!bind-redist-request From: hubert@cac.washington.edu (Steve Hubert) Newsgroups: local.mail.dns.bind Subject: Negative caching problem Date: 28 Jan 1994 11:10:29 -0800 Organization: A blearily-installed InterNetNews site Lines: 45 Sender: daemon@vix.com Distribution: local Message-ID: NNTP-Posting-Host: gw.home.vix.com X-Received: by gw.home.vix.com id AA01707; Fri, 28 Jan 94 11:10:20 -0800 X-Received: by inet-gw-2.pa.dec.com (5.65/13Jan94) id AA12608; Fri, 28 Jan 94 11:09:05 -0800 X-Received: from relay1.UU.NET by inet-gw-2.pa.dec.com (5.65/13Jan94) id AA11016; Fri, 28 Jan 94 10:33:56 -0800 X-Received: by relay1.UU.NET (5.61/UUNET-internet-primary) id AAwavg00600; Fri, 28 Jan 94 13:11:19 -0500 X-Received: from shiva2.cac.washington.edu by relay1 with SMTP (5.61/UUNET-internet-primary) id AAwavg00573; Fri, 28 Jan 94 13:11:12 -0500 X-Received: by shiva2.cac.washington.edu (5.65/UW-NDC Revision: 2.29 ) id AA18978; Fri, 28 Jan 94 10:09:31 -0800 X-To: BIND list X-Cc: Namedroppers list X-Mime-Version: 1.0 X-Content-Type: TEXT/PLAIN; charset=US-ASCII X-Status: seems like an ultrix bug to me We've been experimenting a little with negative caching and have run into a problem. The problem is with the Ultrix4.2a gethostbyname() resolver algorithm. We wonder how widespread this problem is, since I believe this is based on some old version of BIND. We have a host carson.u.washington.edu. Suppose I am on a host called host.cac.washington.edu. I have searching turned on in the resolver options and a search list of "cac.washington.edu" "washington.edu". If I try something like "telnet carson.u" here's what can happen. The first question my resolver asks is for the A record for carson.u.cac.washington.edu. If it isn't in the negative cache, the answer may be fetched from an auth. server and an auth. NXDOMAIN returned. If the answer is in the negative cache the same NXDOMAIN answer will be returned but it will be non-authoritative. The non-authoritative NXDOMAIN answer causes the Ultrix search algorithm to terminate the search and not try the next element which would be carson.u.washington.edu. So "telnet carson.u" fails with a no such host error. Typically, it works the first time and then fails the second (when it has been cached). The offending code is in res_query(). There they have something like: switch (hp->rcode) { case NXDOMAIN: if (hp->aa) h_errno = HOST_NOT_FOUND; else h_errno = TRY_AGAIN; ... The TRY_AGAIN answer causes the search to terminate early in res_search(). The 4.9.2 version of this same piece of code is simply: switch (hp->rcode) { case NXDOMAIN: h_errno = HOST_NOT_FOUND; ... So, the question is, is this just an Ultrix bug in gethostent.c, or did it originate with some old BIND and might it therefore be much more widespread? Thanks, Steve Hubert Networks and Distributed Computing, Univ. of Washington, Seattle