Friday, June 5, 2009

More on DNS compression

A DNS record looks like this in binary, as per RFC 1035 section 3.2.1:
  1. The "name"; this is a variable-length domain name made up of one or more labels (and is often compressed)
  2. Type, a 16-bit big-endian value, which says whether it is an "A" record, an "NS" record, an "MX" record, or whatever
  3. Class, a 16-bit big-endian value which is effectively no longer used and must be 1 ("IN", for Internet) on the modern internet
  4. TTL, how long we want to remember this record, which is a 32-bit big-endian number
  5. rdlength, a 16-bit big-endian value which tells us how many bytes long the actual record data is
  6. The actual record data (what RFC 1035 calls RDATA)
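All of the fields before the record data are fixed length, except for the annoying variable-length "name" field (the record data itself also varies in length, but rdlength tells us how long it is). So, for each DNS record, we will want, in the decompressed DNS string, two 16-bit big-endian unsigned numbers: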
  1. How far from the beginning of the string the name part of the given record is
  2. How far from the beginning of the string the "type" part of the given record is
Because of how I wrote the string library, these 16-bit values should be after the actual record, just like we put the number of AN, NS, and AR records at the end of the string storing the DNS packet.
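
To make the idea concrete, here is a rough sketch in Python (not the actual code, and not using my string library) of how the two offsets for one record could be found and tacked on to the end of a decompressed string. The function names name_end and append_offsets are made up for this example, and it assumes the name has already been decompressed, so it contains no compression pointers.

```python
import struct

def name_end(buf: bytes, offset: int) -> int:
    """Return the offset just past the uncompressed name starting at offset.

    An uncompressed name is a run of labels, each a length byte followed by
    that many bytes, ending with a zero-length label.
    """
    while True:
        length = buf[offset]
        if length & 0xC0:
            # The top two bits being set marks a compression pointer, which
            # should not appear in an already decompressed name.
            raise ValueError("unexpected compression pointer")
        offset += 1 + length
        if length == 0:
            return offset

def append_offsets(buf: bytes, name_offset: int) -> bytes:
    """Append the name offset and type offset as 16-bit big-endian numbers.

    Hypothetical layout: the two offsets simply follow the existing bytes.
    """
    type_offset = name_end(buf, name_offset)
    return buf + struct.pack(">HH", name_offset, type_offset)

# An A record for example.com: name, then type, class, TTL, rdlength, rdata
record = (b"\x07example\x03com\x00"
          + struct.pack(">HHIH", 1, 1, 3600, 4)
          + bytes([192, 0, 2, 1]))
with_offsets = append_offsets(record, 0)
```

Once the "type" offset is known, the four fixed-length fields can be read in one go with struct.unpack_from(">HHIH", record, type_offset), which is the whole point of remembering where the variable-length name ends.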

OK, design is done. Time to start coding again.