Self-keying domain names

The concept of “self-keying” domain names emerged as a parallel thread in the discussion of P2P DNS names. I have investigated this kind of name for some time, and I believe that I can make a fairly simple proposal. By nature, the “self-keying” idea implies that the name will be the base32 encoding of a binary hash. My priority is to enable some form of “hash compression” so the base32 string is not overly long and unwieldy.

There is a limit to how short a binary name can be. Suppose that we forgot all notion of security and just wanted to pick random numbers. The random numbers will have to be large enough to avoid collisions, and that probably means dealing with the birthday problem. If we expect a top-level domain to support 1 million users, the random number must be at least 2×20 = 40 bits. If we want to scale to 100 million, we need at least 54 bits. That number might be reduced if the creator of a name could test in real time whether the name is already “taken,” but we should not bank on that too much, as we cannot predict that P2P names will be reachable online all the time. In practice, it might be reasonable to expect a name size of 60 bits, which can be encoded in 12 base32 characters and would enable scaling of a P2P domain to 1 billion names.
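The birthday-bound arithmetic behind these numbers is simple enough to check. A minimal sketch, using the 2×log2(n) rule of thumb rounded up:

    import math

    def birthday_bits(n):
        # approximate number of random bits needed to keep collisions unlikely among n names
        return math.ceil(2 * math.log2(n))

    print(birthday_bits(10**6))   # 40 bits for 1 million names
    print(birthday_bits(10**8))   # 54 bits for 100 million names
    print(birthday_bits(10**9))   # 60 bits for 1 billion names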

We probably don’t want names to be much longer than 12 characters, as in “A2F3G5HXT97E.KEY.” We may want to throw in some hyphens for “punctuation,” as in “A2F3-G5HX-T97E.KEY,” to help reading, making the name just slightly harder to read than, for example, a social security number. Not exactly a marketing delight, but it could work. The problem of course is that a 60-bit hash code is rather easy to crack. We need some additional protection. We can reuse here the “call sign” idea, requiring that the hash code have some strong properties, such as a certain number Z of trailing zeroes. The effective strength of the hash code would then be 60+Z bits. Given enough effort, names can be generated with proper strength.

If we want to implement these concepts, we can extend the DNSSEC architecture to support “self-keying” names. The key used to verify the name can be encoded in the DNSKEY record specified by DNSSEC. The main part of the proposal is the definition of a new “DNSSALT” record that would contain the information necessary for that verification, including a pointer to the verification algorithm.

The verification of a “self-keying” name would proceed as follows:

  • Retrieve the DNSKEY and DNSSALT records for the DNS name;
  • Using the algorithm specified in the DNSSALT record, verify that the DNS name is consistent with the values of the DNSKEY and DNSSALT;
  • If the name is consistent, assume that the validating resolver has a “trust anchor” and continue processing as specified in RFC 4033, including verifying that record sets are properly signed.

Specifying a DNSSALT record is fairly straightforward. The first octet or two would contain the algorithm identifier. The following octets would contain the data used for the verification, and their format would be specified together with the algorithm. A base algorithm could be implemented around the SHA-256 standard. In that base algorithm, the DNSSALT record would contain five fields (a sketch of the layout follows the list):

  • The algorithm identifier, maybe set to 1 as this is the “basic” value;
  • The number L of base32 characters in the name;
  • The number Z of trailing zeroes required in the hash code;
  • The key tag value of the DNSKEY RR that will be used for the validation, as specified in RFC 4034;
  • A variable number of octets, used as the “salt” value.
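To make the layout concrete, here is a hypothetical sketch of that “basic” DNSSALT record. The field widths (one octet each for the algorithm, L and Z, two octets for the key tag) are my own illustrative assumptions, not part of any specification:

    import struct
    from dataclasses import dataclass

    @dataclass
    class DnsSalt:
        algorithm: int   # 1 for the "basic" SHA-256 algorithm
        length: int      # L, number of base32 characters in the name
        zeroes: int      # Z, number of trailing zero bits required in the hash
        key_tag: int     # key tag of the DNSKEY RR used for validation (RFC 4034)
        salt: bytes      # variable-length salt

        def to_wire(self) -> bytes:
            # assumed packing: algorithm, L and Z as single octets, key tag as two octets
            return struct.pack("!BBBH", self.algorithm, self.length,
                               self.zeroes, self.key_tag) + self.salt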

To verify the name, a resolver would first verify that the sum 5×L+Z (the effective strength in bits) is large enough to be secure, and that the name contains L base32 characters. If that is true, the resolver will build an octet string by concatenating the parent domain name, the content of the DNSKEY record designated by the tag, and the content of the DNSSALT record. It will then compute the SHA-256 hash of that string, and verify that the least significant Z bits of the hash are all zeroes, and that the 5×L most significant bits match the L base32 characters of the name. If all that is true, the name is considered a valid “trust anchor.”
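A minimal sketch of that verification, assuming the standard RFC 4648 base32 alphabet, a plain byte concatenation for the “parent || DNSKEY || DNSSALT” string, and an arbitrary 80-bit strength floor; a real specification would pin down each of these choices:

    import hashlib

    B32_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # RFC 4648 alphabet, an assumption
    MIN_STRENGTH = 80                                   # assumed floor for 5*L + Z

    def b32_prefix(digest: bytes, length: int) -> str:
        # most significant 5*length bits of the digest, rendered as base32 characters
        bits = int.from_bytes(digest, "big") >> (len(digest) * 8 - 5 * length)
        return "".join(B32_ALPHABET[(bits >> (5 * (length - 1 - i))) & 31]
                       for i in range(length))

    def verify_name(name: str, parent: bytes, dnskey: bytes, dnssalt: bytes,
                    L: int, Z: int) -> bool:
        # check that a self-keying name is consistent with its DNSKEY and DNSSALT records
        label = name.split(".")[0].replace("-", "").upper()   # drop suffix and hyphens
        if len(label) != L or 5 * L + Z < MIN_STRENGTH:
            return False
        digest = hashlib.sha256(parent + dnskey + dnssalt).digest()
        if int.from_bytes(digest, "big") & ((1 << Z) - 1):     # trailing Z bits must be zero
            return False
        return b32_prefix(digest, L) == label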

The name itself is generated using a reverse process. First select the key and the desired values Z and L. Select the parent domain, pick an initial salt, format the DNSKEY and DNSSALT records, construct the binary string and compute the SHA-256 hash. If the last Z bits are not all zeroes, pick a different salt and repeat. Once a proper salt has been found, create the name from the base32 encoding of the most significant 5×L bits of the hash code. Voilà, a self-keying name.
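The corresponding generation loop, reusing the b32_prefix helper from the verification sketch above. The 8-octet random salt and the split between a fixed DNSSALT prefix and the variable salt field are illustrative assumptions; the expected work is roughly 2^Z hash computations:

    import hashlib
    import os

    def make_name(parent: bytes, dnskey: bytes, dnssalt_fixed: bytes, L: int, Z: int):
        # search for a salt that gives Z trailing zero bits, then derive the name label
        while True:
            salt = os.urandom(8)   # candidate value for the variable salt field
            digest = hashlib.sha256(parent + dnskey + dnssalt_fixed + salt).digest()
            if int.from_bytes(digest, "big") & ((1 << Z) - 1) == 0:
                return b32_prefix(digest, L), salt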

Of course, if we want to standardize self-keying domain names, we will need something a bit more elaborate than this one-page summary. We will need to specify the format in detail, explain the security issues, and of course deal with various patent considerations, including the “call sign” patent. I am curious to see whether there is enough interest to go forward with the proposal.


Privacy, the cloud and P2P

For privacy advocates, the move of computing to the cloud looks scary. Web email servers routinely serve copies of email and conversations to lawyers and prosecutors. Of course, responsible companies only do so in response to legal warrants or subpoenas. However, these warrants appear much easier to obtain than, for example, the permission to search a suspect’s home. Moreover, the Wikileaks affair showed us that democratic governments are not above pressuring companies like eBay to cut off a PayPal account. Can we protect our privacy as we move more and more services and data to “the cloud?”

I can think of three answers to the question of “privacy and the cloud.” We could develop a new legal framework to protect data in the cloud. We could develop technical means to encrypt the data stored in the cloud. We could also develop peer-to-peer services that parallel the cloud and can be used as private channels. Each of these approaches has its issues and its promises. We probably need all three.

Changing the legal framework seems hard, but is aligned with the commercial interest of companies like Google, Microsoft or Amazon. Their cloud-based services compete with “in house” solutions. Data privacy creates a potential roadblock for cloud solutions. An enterprise will be reluctant to “move to the cloud” if it believes that it will be easier for adversaries or competitors to subpoena its data on a cloud server. These companies would like to see a framework where data stored in the cloud gets the same protection as data stored in your enterprise or in your house. They may very well get their wish, at least in democratic countries. That would be very good news for everybody, when it happens.

If the data stored in the cloud is encrypted, the cloud providers cannot read it. Even if they have to execute a warrant, they can only provide a meaningless bunch of bits. This would provide some robust privacy protection if it worked. But that is a big if. It is very easy to store encrypted data in the cloud, but processing or sharing becomes very complicated. And then, there is always the issue of “metadata,” such as the logs of who accesses what and who communicates with whom. These logs can reveal lots of information, even if the data is encrypted. Encrypted storage is certainly useful, but can only be a component of the solution.

Changes in the legal framework will take some time and will only cover some countries. Encrypted storage will only solve part of the problem. I believe that if we really want private exchange of data, we have to rely on peer-to-peer solutions. But we have to reinvent peer-to-peer! The peer-to-peer solutions of the last decade relied essentially on “always-on” PCs. This is not acceptable anymore. To save energy, we must turn off our PCs when not in use. This turning on-and-off does not play well with common P2P algorithms, as it forces constant updates of P2P routing tables. Besides, P2P solutions need to play well on smart phones and other mobile devices, which probably means inventing new algorithms that exploit proximity and mobility. Mix the cloud into all that, and we have an interesting research problem. But it is a problem that is very much worth solving!


Nothing good for the Internet can come from the UN

Here we go again. The UN is restarting a committee to study changes in Internet Governance. This is not new. The UN had been running an Internet Governance Forum for some time. The old forum achieved little more than publishing a few studies, after organizing a worldwide tour of conferences. Clearly, some governments are frustrated by the lack of progress, and want to restart the process. We should all be worried, because nothing good can come out of that.

Each time I look at this process, I can only find three motives: ensuring worldwide censorship, taxing international connections, and providing cushy positions for international bureaucrats.

In telecommunications, the main trove of cushy positions for international bureaucrats lies in the ITU. Because the ITU is a treaty organization, nations can lobby and get nice jobs for preferred cousins or well-connected politicians. Thirty years ago, when the Internet started to develop, the ITU dominated the world of telecommunication standards. Today, the ITU still plays an important role in allocating radio frequencies, but for Internet standards it is mostly a sideshow. The real work is happening in voluntary organizations like the W3C, the IETF and the IEEE. Lack of relevance combined with budget pressure threatens a number of bureaucrats’ jobs. They are thus natural allies to any effort to move the “governance” of the Internet to the UN, to which the ITU is attached.

In the old days of telephony, the ITU was regulating the tariffs of international phone calls. Not a big deal in theory, because the settlements only apply to the difference: if there were more calls from Country A to Country B than the other way around, Country A paid for the difference at the specified tariff. For most countries, that would be a wash. But many developing countries kept their local tariffs very high, so there were many more emigrants calling their families from Europe or the US than the other way around. These telephone tariffs became essentially a tax on emigrants, and a significant source of revenue for many developing countries. With the Internet, this source of revenue disappears. Peering fees are based on business negotiations, and more or less reflect the cost of providing the service. We understand why many African nations would like to “change Internet Governance,” so they could go back to taxing communications.

If it were only for bureaucrat jobs and taxes, the problem would be rather easy to solve. International bureaucrats appear content to go to lots of meetings worldwide even if there is no practical result, and I suppose the UN could continue paying them to do that. Developing countries already receive lots of grants for development aid, and some of that could probably be earmarked to finance “Internet development.” The main problem is really censorship.

Authoritarian regimes around the world have a hard time accepting the freedom of communication that comes with the Internet. In the name of “social harmony,” the Chinese Communist Party has implemented firewalls and regulations to ensure that Chinese people don’t access content critical of the regime. In the name of religion, the Saudi Kingdom has implemented its own firewall to ensure that the kingdom’s subjects don’t read something pernicious. These are two prominent examples, but there are many more. Outside of the Western world, censorship is the rule, not the exception. But local censorship, implemented in a single nation, is costly and not very efficient.

I can’t help thinking that the true goal of “Internet Governance” is “Internet Censorship.” Instead of blocking content in their own country only, authoritarian regimes dream of blocking it worldwide. There are examples of that already. The Chinese government on occasion protests when the press in Europe, the US or Japan publishes views that the regime dislikes. The Saudis have campaigned at the UN to establish “anti-blasphemy” regulation that would prevent anyone from criticizing their views on religion. There are many more examples. And what better way to implement that than Internet Governance by the UN?

This is why I just signed ISOC’s petition to not let the UN grant an Internet Management Monopoly to governments.


What would make P2P DNS successful?

At some level, P2P DNS is a kneejerk reaction against arbitrary revocation of names by DNS registries. If registries cannot be trusted, we should replace them with a distributed system, which presumably would be impervious to interference. A laudable goal, but that does not necessarily mean that people will rush to register P2P DNS names as soon as something like that is designed and the software is available. Not all designs succeed!

Dave Thaler and Bernard Aboba discuss many aspects of protocol design in RFC 5218 “What Makes for a Successful Protocol?” They compare many successful and not-so-successful designs, and look at what characterizes wild success, mild success and failure. This is a very good read, and it can certainly inform the designers of a P2P DNS service.

The first rule of success is obviously to fulfill a need. Ideally, the design enables new applications or new services. In the case of the P2P DNS, the need is not entirely obvious. Web sites today use “classic” DNS, and are quite satisfied. Activists and architects may be concerned by the centralized nature of registration authorities, but few users actually experience the failures. The registration fees are low enough that most users don’t feel an urgent need to protest. This does not bode very well for the future P2P DNS, unless the service somehow provides extra value over regular DNS.

Other rules of success include ease of deployment, good scaling properties, and absence of security issues.

The classic DNS is rather easy to deploy. To be successful, P2P DNS will need to be as easy to deploy or maybe even easier. A web site should be able to create or register a P2P DNS name without hassle. Whatever software is required should be easy to install and configure. Ideally, the design should be such that installation requires little more than typing the desired name. If users have to craft arcane configuration files, the service will never become popular.

Ease of deployment also means compatibility with existing applications. Internet users should be able to click on a “P2P URL” from their regular web browser, and the name should resolve to an address just as fast and reliably as any “.COM” name. If it did not, web site managers would have a strong reason not to use the P2P domain. In practice, that means interconnection between P2P and regular DNS through a set of publicly available gateways. This is a difficult issue to solve, as deploying gateways is often a thankless task.

The DNS design has scaled to millions of names. P2P designs can certainly support the same scale. They may in fact scale better than DNS if one considers latency. DNS achieves scaling by resorting to hierarchies and caches. P2P systems can use distributed algorithms to spread the load, and may not need to rely quite as much on caching. This may be an advantage for P2P deployments, but it will probably be a rather minor one, not enough to convince site administrators to use “.P2P” rather than “.COM.”

The DNS does have some “interesting” security issues. A P2P DNS could probably be made more secure than regular DNS, because it is less concerned with compatibility with the installed base. P2P DNS can remain compatible with regular DNS if it supports the same name format and record types, but it could do that with a different query protocol and solve many of the current DNS issues. Good security may be a sufficient advantage to attract at least some initial deployment.

Classic DNS is one of the “wildly successful” protocol designs quoted in RFC 5218. Competing with it seems hard. P2P DNS will provide essentially the same service, so it needs to differentiate in other areas. Security, scalability, ease of deployment, but mostly security, may make P2P DNS successful.


Distributed Hash Tables are really hard to secure

One of the designs proposed for a P2P name service is to “use a Distributed Hash Table” to distribute the resolution of the names in a P2P zone. I don’t believe that’s a very good idea, because DHTs are really hard to secure, and conversely really easy to attack.

A DHT works by assigning a random index number to each node, another random index number to each record being stored, and by storing a copy of the record on the node with the closest index. The node index is typically the hash of the node’s name, and the record index the hash of the record identifier – for the DNS application that would be name and record type. Using hashing functions ensures that the numbers are randomly distributed, and that property can be used to design lookup protocols that guarantee access times on the order of O(log N), where N is the number of nodes. That makes for very fast responses in large scale databases, but it basically supposes that all nodes are equal.
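As a minimal illustration of that placement rule, here is a sketch using SHA-256 indices and a Kademlia-style XOR distance. The node names and the choice of distance metric are illustrative assumptions; real DHTs add the O(log N) routing structures on top of this:

    import hashlib

    def index(s: str) -> int:
        # node and record indices are hashes, so they are uniformly distributed
        return int.from_bytes(hashlib.sha256(s.encode()).digest(), "big")

    def closest_node(record_key: str, node_names: list[str]) -> str:
        # the record is stored on the node whose index is "closest" to the record index
        k = index(record_key)
        return min(node_names, key=lambda name: index(name) ^ k)

    nodes = ["node-a.example", "node-b.example", "node-c.example"]
    print(closest_node("www.example.p2p/AAAA", nodes))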

A “rogue” node in a classic DHT can mount many interesting attacks. It can lie about the presence or absence of records, it can reply with fake copies, and it can mount various kinds of attacks against the DHT routing mechanisms. Rogue participants can also attempt to game the hashing function to target placement of their records at specific nodes, effectively flooding these nodes. They can create fake nodes with strategically chosen indices to cut the DHT “cloud” into little pieces. It does not take many such attackers to bring the reliability of the system close to zero.

We built some strong defenses against such attacks in PNRP. These included replication, randomization, and a routing system that used “distributed caches” rather than “indexed stores.” We removed the use of node indices from the protocol, so that queries are only routed based on a record’s key. Each node can store cached copies of records, but the master copy is always accessible at the publishing node. The master copy is signed by its publisher, which allows for verification. In fact, we spent much more time devising these protections than designing the actual algorithm!

Bottom line, building a reliable P2P name server requires a lot more work than just taking a zone file and distributing it with a standard DHT.


Can we protect P2P names from spoofing?

There is an obvious security issue with P2P names: spoofing. In the absence of a reference server for the P2P top-level domain, pretty much anyone can claim to own the name “Microsoft.P2P.” There are many ways around that. One is to ask for name resolution only from peers that you trust, so if you ask them “what is the address of Microsoft.P2P” they give you an answer that you can trust. Another way is to rely on a certificate authority to validate the name. Yet another way is to only use self-validating names, such as “ZA32-5FT-RT0.P2P,” so any answer can be signed with the public key tied to the name.

Of course, these anti-spoofing protections have their own problems. Relying on trusted friends does not scale unless one allows for “chains of friends,” but it takes only one crook or one fool in the chain to break the trust. Certificate authorities can provide a reasonable security level, but this defeats the pure P2P nature of the system by relying effectively on a central authority. Self-validating names are hard to remember, which can enable a different kind of spoofing attack, e.g. convincing a hapless party that Christian’s peer address is “SA33-5FD-VT0.P2P” instead of the actual “ZA32-5FT-RT0.P2P.” Self-validating names also have a limited lifetime, which cannot be longer than the lifetime of the public key from which they are derived.

My colleagues at Microsoft and I studied that a great deal when developing PNRP. In PNRP, we defined two categories of names, “secure” and “insecure.” The insecure names are just picked by the user, something like “myName.0” in the PNRP syntax. The secure names are derived from a public key that is generated locally, something like “8188bbb7107c5ee6dd3387375ff55d009862f937.participant”. The long string of hexadecimal digits is the secure hash of the public key of the name publisher. Secure names work well in some applications, but are too unwieldy for a DNS replacement. People could not remember them, and most likely would not be able to copy them without typos.

The name form that I used in the previous example, “SA33-5FD-VT0.P2P,” is derived from work that I pursued at Microsoft after PNRP. With Kim Cameron and Josh Benaloh, we conceived a way to securely compress the hash code, which we dubbed “Call Signs.” To compress the hash, we search for a string that, when hashed with the key, results in a hash code ending with “lots of zeroes.” This compression requires a lot of computing power when the code is first generated, but results in “call signs” that are no longer than most domain names. The technology is not used in PNRP, but could certainly be used in a P2P naming scheme. Whether that’s friendly enough is indeed a matter of debate.
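As a rough illustration of the trade-off: each trailing zero bit demanded of the hash adds one bit of effective strength, but doubles the expected work at name-creation time. The 80-bit strength target below is an illustrative assumption of mine, not a number from the Call Signs design:

    # sketch of the call-sign length/work trade-off, assuming a target of 80 effective bits
    target_bits = 80
    for chars in (10, 12, 14):
        zeroes = target_bits - 5 * chars          # zero bits that must be "mined"
        print(f"{chars} base32 chars + {zeroes} zero bits -> about 2**{zeroes} hashes to create")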


Unique Names in P2P domains?

I am looking at the web page of the P2P “Open NIC” project: http://wiki.opennicproject.org/dotP2PTLD. They intend to create a DNS registry for P2P names. I am sure that they are well intentioned, but I can’t help thinking that there is a contradiction between the goal of “independence from authority” and the means of “let’s create a virtuous central registry.” The central registry is a single point of failure, interference by governments being one of many failure modes.

In a true P2P system, we want participants to be able to create names “at will.” But there is a problem. What if two participants decide to register the same name? How do we decide who gets to use that name? How do we even know that two parties are trying to use the same name?

We could of course try to use the name system itself to “resolve the collisions.” A participant chooses a name, but before using it checks whether that name is already in use. That is certainly good practice, but it is not failure-proof. For example, two participants could create the same name at the same time, and not find out until later. A partial failure of the system at the time of the check would also prevent the participant from finding out that the name already exists.

In PNRP, we only solved that problem by allowing multiple nodes to publish the same name. We introduced the concept of a “name instance,” i.e. a combination of a name and an instance number. There can be multiple nodes publishing the same name, but each name and instance number pair is unique. Our initial idea was to use IPv6 addresses as the instance number, which guarantees that the instances are unique.

Of course, instance numbers are not sufficient to guarantee that there is only one “owner” for a name. We only solved that in PNRP for the “secure” names, derived from a public key. Only the owner of the public key, or a node authorized by that owner, can publish the name. In PNRP, we tied names to keys by making the name a hash of the key. These names are thus very large random numbers, so large that having a collision is extremely unlikely. But it also limits their use to specific applications.

I am pretty sure that we could rework the “secure name” idea to work in a P2P DNS environment. The big question is, can we do that without maintaining a central registry?
