Monday, May 31, 2010

chain_id actually will not work

Thinking about it some more, chain_id will not work. The problem is this: Let us suppose we have the following process for resolving
  • The root servers at says the .com servers are at
  • The .com server at ( says the server is
  • The root server at says the .net servers are at
  • The .net server at says the server ( is at
  • The server at says the server is at
  • The server at says that is at
Let us suppose that we, at the same time, ask for both and

We will now have in the process of being resolved; this process continues when we get the glueless NS referral in the process of resolving

So, at this point, the resolution process has its own chain_id (let's make it “1”) while the process to resolve has another chain_id (let's make it “2”). For our “use chain_id to stop resolution loops” idea to work, both resolutions now need the same chain_id. Since these chains can be fairly long, and since multiple resolutions (such as resolving “”, “”, and “”) can use the same glueless resolution, we might have to change a number of different chain_id values.

This is getting hairy.

It is far simpler to not use chain_id. We can stop loops by simply using recurse_depth: if recurse_depth exceeds 32, we give up on resolving a given query. Every time we follow a glueless NS referral or an incomplete CNAME chain, we either:
  • Create a new resolution process whose recurse_depth is the “parent” process's recurse_depth + 1
  • Connect to an already existing resolution process; when this happens, we increment both the parent’s and child’s recurse_depth.
Doing things this way is far simpler than trying to use chain_id.
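
To make the two cases above concrete, here is a minimal sketch in C. It is not Deadwood's actual code: the resolution struct and the function names depth_ok, spawn_child, and attach_existing are hypothetical, made up for illustration. Only recurse_depth and the limit of 32 come from the description above.

```c
/* Sketch only: "resolution" and these function names are hypothetical
 * stand-ins, not Deadwood's real data structures or API. */

#define MAX_RECURSE_DEPTH 32 /* The limit named in the text above */

typedef struct resolution {
    int recurse_depth; /* How deep in glueless NS/CNAME chasing we are */
} resolution;

/* Returns 1 if this resolution may continue, 0 if we must give up */
int depth_ok(const resolution *r) {
    return r->recurse_depth <= MAX_RECURSE_DEPTH;
}

/* Case 1: create a new resolution process for a glueless NS referral
 * or incomplete CNAME chain; its depth is the parent's depth + 1.
 * Returns 0 on success, -1 if the depth limit has been exceeded. */
int spawn_child(const resolution *parent, resolution *child) {
    child->recurse_depth = parent->recurse_depth + 1;
    return depth_ok(child) ? 0 : -1;
}

/* Case 2: connect to an already existing resolution process;
 * both the parent's and the child's depth are incremented.
 * Returns 0 on success, -1 if either one has exceeded the limit. */
int attach_existing(resolution *parent, resolution *child) {
    parent->recurse_depth++;
    child->recurse_depth++;
    return (depth_ok(parent) && depth_ok(child)) ? 0 : -1;
}
```

The point of incrementing both sides in the second case is that two resolutions which keep waiting on each other cannot loop forever: every attachment pushes both counters toward the limit, so a loop runs out of depth after a bounded number of steps, with no need to keep chain_id values in sync.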

Another thing: We need to add an “ns” dw_str to the resolution process; this will store the particular glueless NS record we are in the process of resolving.

(As an aside, the latest Deadwood snapshot can resolve glueless NS referrals if the A record in question is already in the cache.)