Reading summary: CoDNS
I’ll begin this blog with a first article about DNS performance.
DNS performance is a well-documented subject (see [1], [2] and [3], to quote only three studies, and let’s throw in a [4]th).
I’ll focus here on [1] and on CoDNS, the solution it proposes to reduce DNS response time.
There are a lot of papers documenting the performance of the DNS protocol. The most common result is that the server’s response time is good most of the time (around 90% of requests are handled within a few hundred milliseconds), but researchers also point out that DNS sometimes suffers from latency spikes, with response delays that can reach ten seconds.
The most significant cause of latency identified by the researchers is DNS packet loss on the server side due to server overload. This issue is frequent, but it does not cause trouble in most cases thanks to the mechanisms in place to recover when a lookup encounters an error. Nevertheless, the authors looked for a way to improve performance by lowering the response delay and so avoid these problems.
CoDNS was developed by researchers at Princeton University and deployed on the PlanetLab testbed. It addresses the DNS latency issue with a hybrid system.
In most cases, requests are resolved using standard DNS resolution (add image here). When the answer takes too long to arrive (after a reasonable delay), the server asks its peers for the information, in case it is cached on their side.
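To make the hybrid idea more concrete, here is a minimal sketch of how I picture it. This is not the authors' code: `cooperative_lookup`, `make_resolver`, the 0.2 second default delay and the stub resolvers are all names and values I made up for illustration. The local resolver gets a head start, and the peers are only asked once that delay has expired.

```python
import asyncio


def make_resolver(latency, answer):
    """Hypothetical stub: a fake resolver that answers after `latency` seconds."""
    async def resolve(name):
        await asyncio.sleep(latency)
        return answer
    return resolve


async def cooperative_lookup(name, local_resolve, peer_resolves, delay=0.2):
    """Ask the local resolver first; if it is still silent after `delay`
    seconds, also ask the peers and keep whichever answer arrives first.
    Returns a (source, answer) tuple where source is "local" or "peer".
    The `delay` default is an assumption, not a value from the paper."""
    local = asyncio.ensure_future(local_resolve(name))
    done, _ = await asyncio.wait([local], timeout=delay)
    if local in done:
        return "local", local.result()

    # The local resolver is slow: race it against the peers.
    tasks = {local: "local"}
    for peer in peer_resolves:
        tasks[asyncio.ensure_future(peer(name))] = "peer"
    done, pending = await asyncio.wait(
        list(tasks), return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # the remaining lookups are no longer needed
    winner = next(iter(done))
    # Note: if the first task to finish failed, its exception is raised here.
    return tasks[winner], winner.result()


if __name__ == "__main__":
    slow_local = make_resolver(1.0, "192.0.2.1")
    peer = make_resolver(0.05, "192.0.2.1")
    print(asyncio.run(cooperative_lookup("example.org", slow_local, [peer])))
```

The point of the delay is that the common case (a fast local answer) costs no extra traffic at all; the peers only see a query when the local resolver is already late.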
The authors use the specific structure of Content Delivery Networks (CDNs) to easily forward a DNS request to a node's nearest peers (nearest in a network sense; they are not necessarily the closest geographically). Closeness between servers is determined by maintaining a heartbeat between them (around a few bytes per second).
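One way to picture peer selection from heartbeats is the small sketch below. It is only my assumption of how such a table could work, not the mechanism described in the paper: the `PeerTable` class, the smoothing factor `alpha` and the `max_age` cut-off are all hypothetical. Each heartbeat updates a smoothed round-trip time, and peers that have been silent for too long are ignored.

```python
import time


class PeerTable:
    """Hypothetical table of peers, ranked by heartbeat round-trip time."""

    def __init__(self, max_age=5.0, alpha=0.2):
        self.max_age = max_age  # seconds without a heartbeat before a peer is ignored
        self.alpha = alpha      # weight of the newest sample in the moving average
        self._rtt = {}
        self._last_seen = {}

    def record_heartbeat(self, peer, rtt, now=None):
        """Store a heartbeat reply from `peer` that took `rtt` seconds."""
        now = time.monotonic() if now is None else now
        old = self._rtt.get(peer)
        if old is None:
            self._rtt[peer] = rtt
        else:
            # Exponential moving average, so one odd sample does not reorder peers.
            self._rtt[peer] = (1 - self.alpha) * old + self.alpha * rtt
        self._last_seen[peer] = now

    def nearest(self, k, now=None):
        """Return up to `k` live peers, lowest smoothed RTT first."""
        now = time.monotonic() if now is None else now
        alive = [p for p, seen in self._last_seen.items()
                 if now - seen <= self.max_age]
        return sorted(alive, key=lambda p: self._rtt[p])[:k]
```

"Nearest" here means lowest measured latency, which matches the network sense of closeness rather than geography.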
Maintaining the heartbeat adds traffic, but the study explains that it is too small to cause a significant increase in network traffic (32 bytes each second, 7.5 MB per day).
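As a quick sanity check on the order of magnitude, here is the arithmetic for a single 32-byte heartbeat sent once per second. The constant names are mine, and I use decimal megabytes.

```python
HEARTBEAT_BYTES = 32              # size of one heartbeat, as quoted above
SECONDS_PER_DAY = 24 * 60 * 60    # one heartbeat per second

bytes_per_day = HEARTBEAT_BYTES * SECONDS_PER_DAY
mb_per_day = bytes_per_day / 1_000_000  # decimal megabytes

print(f"{bytes_per_day} bytes per day, about {mb_per_day:.2f} MB")
```

A single stream comes to a little under 3 MB per day, so the 7.5 MB per day figure presumably counts more than one heartbeat stream or some extra overhead. That is my guess, I did not check which. Either way, it stays in the range of a few megabytes per day.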
The PoC shows that it can solve the issue introduced by the timeout. The authors also compared their solution to the classic approach: it shows similar (and sometimes better) mean lookup times and solves the timeout troubles. The study is not precise about the consequences of server overload in such a context, but the solution works and shows remarkable promise.
The only remaining issue is that the solution is not really easy to deploy, since it relies on an infrastructure that few entities are able to run (this was even more true in 2004, when the system was introduced).
It would be interesting to compare this solution to the performance of the public DNS resolvers introduced over the past years by Google, Cloudflare or Quad9, but few studies document the performance of these resolvers (yet).
The CoDNS paper is still cited today by people working in the CDN field, such as [5] or [6], even though the CoDNS “project” is now discontinued (it ran on PlanetLab between 2004 and 2012).
---
Sources:
[1]: CoDNS: Improving DNS Performance and Reliability via Cooperative Lookups; KyoungSoo Park, Vivek S. Pai, Larry Peterson, Zhe Wang
[2]: Proactive caching of DNS records: addressing a performance bottleneck; Edith Cohen, Haim Kaplan
[3]: DNS performance and the effectiveness of caching; Jaeyeon Jung, Emil Sit, Hari Balakrishnan, Robert Morris
[4]: Statistics About DNS Root Name Service: Some Recommendations
[5]: OnionDNS: a seizure-resistant top-level domain; Nolen Scaife, Henry Carter, Lyrissa Lidsky, Rachael L. Jones, Patrick Traynor
[6]: LDplayer: DNS Experimentation at Scale; Liang Zhu, John Heidemann
I will probably add a few documents in the comments (or at the end of this article) should I receive more interesting stuff on the subject.