Monday, May 31, 2010

chain_id actually will not work

Thinking about it some more, chain_id will not work. The problem is this: Let us suppose we have the following process for resolving example.com:
  • The root server at 127.0.1.1 says the .com servers are at 127.0.1.2
  • The .com server at 127.0.1.2 (ns.com) says the example.com server is ns1.example.net
  • The root server at 127.0.1.1 says the .net servers are at 127.0.1.3
  • The .net server at 127.0.1.3 says the example.net server (ns2.example.net) is at 127.0.1.4
  • The ns2.example.net server at 127.0.1.4 says the ns1.example.net server is at 127.0.1.5
  • The ns1.example.net server at 127.0.1.5 says that example.com is at 127.0.1.6
Let us suppose that we, at the same time, ask for both example.com and ns1.example.net.

We will now have ns1.example.net in the process of being resolved. That resolution is still running when, in the course of resolving example.com, we get the glueless NS referral to ns1.example.net.

So, at this point, the example.com resolution process has its own chain_id (let’s make it “1”) while the process to resolve ns1.example.net has another chain_id (let’s make it “2”). For our “use chain_id to stop resolution loops” idea to work, both resolutions now need the same chain_id. Since these chains can be fairly long and since multiple resolutions (such as resolving “www.example.com”, “blog.example.com”, and “ftp.example.com”) can use the same glueless resolution, we might have to change a number of different chain_id values.
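
To make the cost concrete, here is a rough sketch in C of what merging two chains would involve. The post does not say how chain_id would be stored, so the flat table, the "pending" struct, and the merge_chains() function are all hypothetical names made up for illustration; this is not Deadwood code.

```c
#include <stddef.h>

/* Hypothetical record for one in-progress resolution (not Deadwood's
 * real structure). */
typedef struct pending {
    const char *name;  /* name being resolved */
    int chain_id;      /* chain this resolution currently belongs to */
} pending;

/* Merge chain "from" into chain "to": every in-progress resolution
 * labelled "from" has to be found and relabelled. Returns how many
 * entries had to be changed. */
size_t merge_chains(pending *table, size_t count, int from, int to) {
    size_t changed = 0;
    for (size_t i = 0; i < count; i++) {
        if (table[i].chain_id == from) {
            table[i].chain_id = to;
            changed++;
        }
    }
    return changed;
}
```

Under this assumed layout, every time one resolution joins another we have to walk all pending resolutions, and the number of entries touched grows with the length of the chains involved.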

This is getting hairy.

It is far simpler to not use chain_id. We can stop loops by simply using recurse_depth; if recurse_depth exceeds 32, we give up on resolving a given query. Every time we follow a glueless NS referral or an incomplete CNAME chain, we either:
  • Create a new resolution process with its recurse_depth being the “parent”’s recurse_depth + 1
  • Connect to an already existing resolution process; when this happens, we increment both the parent’s and child’s recurse_depth.
Doing things this way is far simpler than trying to use chain_id.
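
The two cases above might look something like the following in C. This is only a sketch: the "resolution" struct, the gave_up flag, and the function names are assumptions for illustration, and the only things taken from the text are the recurse_depth counter, the limit of 32, and the two rules for updating the counter.

```c
#define MAX_RECURSE_DEPTH 32  /* limit named in the text */

/* Hypothetical stand-in for one resolution process. */
typedef struct resolution {
    const char *name;   /* name being resolved */
    int recurse_depth;  /* glueless NS / CNAME hops charged so far */
    int gave_up;        /* set once recurse_depth exceeds the limit */
} resolution;

/* Returns 1 if the resolution may continue, 0 if we give up on it. */
static int check_depth(resolution *r) {
    if (r->recurse_depth > MAX_RECURSE_DEPTH) {
        r->gave_up = 1;
        return 0;
    }
    return 1;
}

/* Case 1: create a new resolution process whose depth is the
 * parent's depth + 1. */
int spawn_child(const resolution *parent, resolution *child,
                const char *name) {
    child->name = name;
    child->recurse_depth = parent->recurse_depth + 1;
    child->gave_up = 0;
    return check_depth(child);
}

/* Case 2: connect to an already existing resolution process;
 * both the parent's and the child's depth are incremented. */
int attach_existing(resolution *parent, resolution *child) {
    int parent_ok, child_ok;
    parent->recurse_depth++;
    child->recurse_depth++;
    parent_ok = check_depth(parent);
    child_ok = check_depth(child);
    return parent_ok && child_ok;
}
```

In this sketch, two resolutions that keep referring to each other in a loop both gain depth on every pass, so both cross the limit and are abandoned, with no shared identifier to keep in sync.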

Another thing: We need to add a “ns” dw_str to the resolution process; this will store the particular glueless NS record we are in the process of resolving.
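
As a rough illustration of what that field is for, here is a sketch in C. It deliberately does not use Deadwood's dw_str type; a fixed-size char buffer stands in for it, and the struct, the buffer size, and both function names are made up for this example.

```c
#include <stddef.h>
#include <string.h>

#define NS_NAME_MAX 256  /* assumption: enough for a text-form DNS name */

/* Hypothetical resolution state; "ns" is a plain buffer standing in
 * for the "ns" dw_str mentioned in the text. */
typedef struct glueless_wait {
    char query[NS_NAME_MAX];  /* name the client asked for */
    char ns[NS_NAME_MAX];     /* glueless NS name being resolved, or "" */
} glueless_wait;

/* Remember which glueless NS name this resolution is waiting on.
 * Returns 0 on success, -1 if the name does not fit. */
int set_glueless_ns(glueless_wait *r, const char *ns_name) {
    size_t len = strlen(ns_name);
    if (len >= sizeof r->ns) {
        return -1;
    }
    memcpy(r->ns, ns_name, len + 1);  /* copy includes the NUL */
    return 0;
}

/* Is this resolution waiting on the given NS name? Exact byte
 * comparison is a simplification; real DNS names compare
 * case-insensitively. */
int waiting_on(const glueless_wait *r, const char *ns_name) {
    return r->ns[0] != '\0' && strcmp(r->ns, ns_name) == 0;
}
```

With the name stored this way, a resolution for example.com that hits the referral to ns1.example.net could record it, and the answer for ns1.example.net could later be matched back to the resolution that needs it.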

(As an aside, the latest Deadwood snapshot can resolve glueless NS referrals if the A record in question is already in the cache.)