A simple P2P DNS proposal

Extending DNS operation for peer-to-peer name resolution may be much simpler than many expect. DNS resolvers know how to maintain caches of previously resolved names. These caches can be used for progressive peer-to-peer name resolution without relying on centralized servers.

Let’s assume first that the names in the peer-to-peer domain can be sorted, for example using alphabetic order – case insensitive, of course, since this is DNS. A node that receives a query for a peer-to-peer name will examine the requested name. If the name is known locally, the node will provide the answer. If the name is not known, the node will forward the request to one of the peers known from the cache, whose name is closest to the requested name according to the alphabetic order. The process will repeat until the name is found, or, if the name is not reachable, no progress can be made. The response will then be returned, just like any regular DNS query.

This may be better explained with an example. Suppose that a P2P name resolver, “rose.p2p,” wants to resolve the name “flower.p2p.” It will first look for the “closest” name in its own cache, maybe “earth.p2p” and send it the request. “Earth.p2p” will look for the closest name in its cache, maybe “fire.p2p,” and forward the request. “Fire.p2p” may forward the request to “Flare.p2p”, which forwards it to “flour.p2p,” which in turn could have the data for “flower.p2p.”

What I just described is basically the PNRP name resolution algorithm, adapted to DNS names instead of PNRP’s binary identifiers. The resolution process has the side effect of propagating knowledge in the network, as all intermediate nodes can see the response and cache it. In our example, rose, earth, fire and flare all learned about “flower.” The interesting point is that the density of cached names increases closer to the name itself. I won’t go in a lengthy mathematical demonstration, but this property has a very good effect on the network performance. As the network learns, queries get resolved ever faster, allowing the system to manage very large number of names.

Of course, there is a catch. The propagation of knowledge works, but it only works because there was an initial “chain of knowledge.” In our example, “flare.p2p” knew about “flour.p2p,” which knew about “flower.p2p.” This initial knowledge cannot be discovered by the side effect of queries, and has to be somehow “seeded” in the network. This has been studied a lot in the context of distributed hash tables, and also during the design of PNRP. The solution is to add to the heuristic described above a “leaf-set management” procedure.

In the context of DHT, the leaf-set designates the other nodes in the network whose indices are closest to a particular node. We can prove that if all nodes in the network know their leaf-set, every name in the network can be resolved. Of course, in the worst case the resolution will require visiting half the nodes in the network, but in practice the caches will allow for efficient short cuts. Efficient resolution requires thus a combination of leaf-set management and efficient caching.

It turns out that the leaf-set can be managed by a simple process. At regular intervals, a node can look at its own cache and find in the cache the two closest names to its own – specifically the one just before its name in alphabetic order and the one just below. It can then send to each of these nodes a query for made up name located in alphabetical order between the sender and the receiver, in the hope of learning whether some names exist that are even closer than the ones in the cache. If new names are found, the process can be repeated until the node is convinced of knowing the two names closest to its own in the network. To each of these, the node can send an “unsolicited DNS response” advertising its own name and address. This will be an extension to the regular DNS protocol, a specific extension for peer-to-peer domains. This simple extension ensures that the network is seeded with all the relevant names, and that queries will succeed, even if they require initially many relays.

We can see that P2P DNS could be implemented with very minimal extensions to the basic DNS protocol: allow for relays of requests to the closest name in the cache instead of relying solely on NS records; perform “leaf-set discovery” by sending fake queries at regular intervals; and, perform “leaf-set maintenance” by pushing unsolicited DNS responses to the leaf-set. Of course, this is just the sketch of the solution. The full specification will need to deal with cache management issues, such as what replacement algorithm to implement if the cache is full. And it will also have to deal with security issues. But that will be the matter for another post. For now, let’s just notice that developing P2P DNS does not need to be hard!

Advertisements

About Christian Huitema

I have been developing Internet protocols and applications for about 30 years. I love to see how the Internet has grown and the applications it enabled. Let's keep it open!
This entry was posted in Uncategorized. Bookmark the permalink.

5 Responses to A simple P2P DNS proposal

    • huitema says:

      Did you actually run simulations of the PNRP like approach? I can assure you that PNRP does indeed scale.

      The goal of the design is indeed to avoid the underlying DHT used in CoDoNS. My main concern with DHT is that they are extremely sensitive to an “insider” attack, i.e. a DHT node starting to actively try to fail the system. There are other concerns, mostly due to the insertion of a hashing layer, which creates requirement for data migration when the set of nodes changes.

  1. My understanding of PNRP is limited, but… I do not believe PNRP routing algorithms and caching strategies will scale to a DNS-sized cloud.

    Consider a PNRP cloud of 100,000,000 nodes and 100,000,000,000 entries. Or consider a PNRP cloud of 100,000,000 nodes and 100,000 entries per node.

    Do you really think a PNRP cloud will scale to satisfy that?

    • I left out, without sacrificing the performance people expect of DNS? Within even one order of magnitude?

      • huitema says:

        The key difference between PNRP and DHT is that PNRP was specifically designed as a name service, not a general database service. As far as PNRP is concerned, a node publishes exactly one entry, its own. To simplify, in the DNS application, that would mean one entry per domain, to find the name server for the domain. After that, NS & server addresses are in the cache, and the records are served per standard DNS.

        The general claim of PNRP is that the number of resolution steps scales as the logarithm of the number of nodes. We verified that with simulating up to 2M nodes.

        The key is the cache management and the structure of the knowledge. Nodes keep “denser” knowledge of nodes closer to them in the name space. This yields maths very similar to the progressive layers in Pastry, except with a more statistical behavior.

        Now, of course, all these will be easier to explain if we had actually published the PNRP design.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s