Today I presented at USENIX WOOT ‘13 a new vulnerability that we had found in BIND, the most popular DNS server. Exploiting this vulnerability allows to reduce the amount of effort required for an off-path (blind) DNS cache poisoning attack.

The whitepaper is now publicly available, together with the presentation and ISC’s (the organization behind BIND) notification.

In this blog post I will describe the vulnerability in a less formal fashion.

What is Off-path DNS Cache Poisoning?

An off-path DNS cache poisoning is an attack at which we assert that the adversary does not see the traffic between the DNS resolver and some name server (NS), but manages to poison the cache of the DNS resolver nevertheless. In order to do so, the attacker usually induces a request between the DNS resolver and the NS, and then races against the genuine DNS answer. DNS protects against this type of attack by having some random data inside each DNS query. The random data consists of 3 elements:

  1. The DNS TXID (16 bits)
  2. The UDP Source port (~16 bits)
  3. The destination IP address.

The destination IP address can be random if the target zone has multiple authoritative name servers. We will see next that it indeed has some random properties.  The resolver accepts the answer if and only if the DNS answer contains matching data, otherwise the DNS answer is simply dropped. This means that the attacker must first win the race against the genuine answer and then guess the random data correctly.

Motivation

Since the TXID and UDP Source port randomness are well-studied, we decided to focus on the randomness of the name server (NS) selection. We wanted to see how much entropy can be added if a resolver simply chooses the target NS randomly out of the authoritative name servers. In order to do so we gathered some statistics. We downloaded the root zone file  and created the following CDF graph:

From the graph we can see that almost half of the TLDs have more than 8 NSs. This means a truly random NS selector can bring some entropy but obviously not as much as by randomizing the DNS TXID and UDP source port. However, derandomizing such algorithm can make existing attacks run faster or be more efficient. In addition it can make an off-path attack become on-path (Man-in-the-Middle), if the attacker is on a path between the resolver and some authoritative NS, but not on a path between the resolver and other authoritative NSs.

BIND’s NS selection: The Smoothed Round-Trip Time (SRTT) algorithm

The goal of the resolver when facing multiple authoritative NSs is to choose the fastest one, by Round-Trip Time (RTT). Since the RTT changes frequently, BIND maintains a moving average for each NS IP. This value is first initialized with a very low value between 1 and 32 microseconds. Each time there is a valid DNS response available, the SRTT is updated with a weighted sum which takes into account the old SRTT value and the newly available RTT. The weights are 0.7 for the old SRTT and 0.3 for the RTT. In order to avoid starvation, any non-queried NS out of authoritative NSs thatt can be queried, is multiplied with a decay factory of 0.98. In error situations, such as a network timeout, the SRTT is punished with 200ms with an upper bound of 1s.

The authoritative name servers are simply queried in the order of their SRTT values (NS with lowest SRTT is queried first).

All of the SRTT values are stored inside a cache which is accessed by the NS IP.

Attacks on the NS selection algorithm and our contribution

Now that we understand the SRTT algorithm, we can safely claim that If the attacker can somehow influence the SRTT algorithm, he can derandomize the NS selection. He can achieve this in two different approaches. The first one is to increase the SRTT of all name servers but one. The other strategy would be to lower the SRTT of the target NS.

There are two previous works which try to tackle the NS selection. The first one which adheres to the first approach is [Herzberg & Shulman, 2012]. It abuses IP fragmentation in order to cause the error operation of the SRTT algorithm.  The second, [Petr, 2009], adheres to the second strategy. It uses fast spoofed responses in order to exploit the update operation of the SRTT algorithm.

Both attacks are probabilistic (i.e. there is some guess work behind them so they don’t always work). Our attack is deterministic  (i.e. it always succeeds), and in addition the NS which we lower its SRTT does not even see it.

The Attack

In this post I will describe a simple variant of the attack, the more sophisticated ones (but not more effective) can be found on the whitepaper.

Our attack abuses non-open resolvers, i.e. NSs that do not answer on queries that they are not authoritative of. Most of the resolvers around the Internet are non-open. The attacker does not need to control them, but simply needs to know their IP address. We will see how the attacker can take leverage of them in order lower the SRTT value of an arbitrary NS to an arbitrary value on some target resolver. The attacker also hosts a malicious NS. We assume that this malicious NS is the authority of some domain, i.e. malicious.foo. Let’s also assume that we want to lower the SRTT value of one of the authoritative NSs of ibm.com. It’s important to mention that our attack does not depend on the amount of authoritative NSs the target zone has, but for the sake of simplicity, in this example, the zone has two nameservers: NS1 and NS2. We will lower the SRTT of NS2 on the target resolver, so it will be queried before NS1.

The attack begins by first querying the resolver for some domain the malicious NS is authoritative of, e.g. 666.malicious.foo. The resolver will eventually approach the malicious NS, which replies with a DNS delegation that contains:

  1. A large list of non-open resolvers
  2. The NS that we try to lower the SRTT of, NS2

We assume that NS2 was already on the SRTT cache of the resolver and was contacted prior to the attack, so its SRTT is based on some real RTT value. Since the non-open resolvers were not on the cache, they will be added with very low values between 1 and 32 microseconds according to the SRTT algorithm. This means that they will be queried before NS2 by the NS selection. The resolver will query each of them and they will refuse since they are non-open resolvers. After each query the SRTT value of NS2 will be decayed, i.e. multiplied by 0.98. The resolver will time out after 30 seconds and we can force this timeout if the amount of non-open resolvers is sufficiently large. This means that the resolver will not even get to querying NS2 and at the end of the attack the SRTT of NS2 will be 0.98^n of its original value, where n is the amount of non-open resolvers which were queried before the timeout.

Fix recommendation

The root cause of this vulnerability is that the resolver maintains a shared cache of SRTT values. The cache is shared in the sense that a malicious NS can affect SRTT values of NSs which are not related to it. We suggest to split the cache. For example, it can be split according to the currently queried zone.

More from Software Vulnerabilities

X-Force releases detection & response framework for managed file transfer software

5 min read - How AI can help defenders scale detection guidance for enterprise software tools If we look back at mass exploitation events that shook the security industry like Log4j, Atlassian, and Microsoft Exchange when these solutions were actively being exploited by attackers, the exploits may have been associated with a different CVE, but the detection and response guidance being released by the various security vendors had many similarities (e.g., Log4shell vs. Log4j2 vs. MOVEit vs. Spring4Shell vs. Microsoft Exchange vs. ProxyShell vs.…

MSMQ QueueJumper (RCE Vulnerability): An in-depth technical analysis

13 min read - The security updates released by Microsoft on April 11, 2023, addressed over 90 individual vulnerabilities. Of particular note was CVE-2023-21554, dubbed QueueJumper, a remote code execution vulnerability affecting the Microsoft Message Queueing (MSMQ) service. MSMQ is an optional Windows component that enables applications to exchange messages via message queues that are reachable both locally and remotely. This analysis was performed in collaboration with the Randori and X-Force Adversary Services teams, by Valentina Palmiotti, Fabius Watson, and Aaron Portnoy. Research motivations…

X-Force prevents zero day from going anywhere

8 min read - This blog was made possible through contributions from Fred Chidsey and Joseph Lozowski. The 2023 X-Force Threat Intelligence Index shows that vulnerability discovery has rapidly increased year-over-year and according to X-Force’s cumulative vulnerability and exploit database, only 3% of vulnerabilities are associated with a zero day. X-Force often observes zero-day exploitation on Internet-facing systems as a vector for initial access however, X-Force has also observed zero-day attacks leveraged by attackers to accomplish their goals and objectives after initial access was…

Topic updates

Get email updates and stay ahead of the latest threats to the security landscape, thought leadership and research.
Subscribe today