Hello DANE users,
One of our customers reported that, after enabling Opportunistic DANE, email delivery fails to some domains with a very broken DNS setup.
The impacted domains have different name servers at the zone level and at the TLD delegation level, and these name servers serve different zone content for the domain. For example, for a domain "domain.tld":
- some of these serve valid zone content such as (A):
    domain.tld.       IN MX 1 mail.domain.tld.
    mail.domain.tld.  IN A 192.168.1.1
- others serve a "parking"-like zone (B):
    *.domain.tld.     IN CNAME domain.tld.
    domain.tld.       IN A 192.168.1.2
None of these zones are signed.
When trying to check for DANE support for the domain, the following scenario can happen if data from both zones is intermixed:
1) MX lookup for the domain: returns "1 mail.domain.tld." (from zone A)
2) A/AAAA lookup for "mail.domain.tld.": returns "A 192.168.1.2" through "CNAME domain.tld." (from zone B)
3) Since the A reply is insecure and comes from a CNAME alias, as per §2.2.2 of the RFC we issue an explicit CNAME request for "mail.domain.tld." to check whether it is secure: this returns a NODATA answer (from zone A)
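To make the interleaving concrete, here is a toy Python model of the three lookups above. It is only a sketch: the dictionaries mirror zones (A) and (B), while `find`, `query` and `scenario` are hypothetical helpers written for this illustration, not code from our MTA or from any DNS library. Wildcard handling is reduced to the single case needed here.

```python
# Toy model of the two unsigned zones; not a real resolver.
ZONE_A = {
    ("domain.tld.", "MX"): ["1 mail.domain.tld."],
    ("mail.domain.tld.", "A"): ["192.168.1.1"],
}
ZONE_B = {
    ("*.domain.tld.", "CNAME"): ["domain.tld."],
    ("domain.tld.", "A"): ["192.168.1.2"],
}


def find(zone, name, rtype):
    """Exact owner name first; fall back to a one-label wildcard otherwise."""
    if any(owner == name for owner, _ in zone):
        return zone.get((name, rtype))
    wildcard = "*." + name.partition(".")[2]
    return zone.get((wildcard, rtype))


def query(zone, name, rtype):
    """Return (cname_chain, rdata).

    An empty rdata list stands for NODATA (the model does not
    distinguish it from NXDOMAIN).
    """
    chain = []
    while len(chain) < 8:  # arbitrary guard against CNAME loops
        rdata = find(zone, name, rtype)
        if rdata is not None:
            return chain, rdata
        # An explicit CNAME query is never chased any further.
        alias = None if rtype == "CNAME" else find(zone, name, "CNAME")
        if alias is None:
            return chain, []
        chain.append(alias[0])
        name = alias[0]
    return chain, []


def scenario():
    """Replay steps 1-3, each answered by the zone named in the text."""
    _, mx = query(ZONE_A, "domain.tld.", "MX")       # step 1, zone A
    host = mx[0].split()[1]
    chain, addrs = query(ZONE_B, host, "A")          # step 2, zone B
    _, cname = query(ZONE_A, host, "CNAME")          # step 3, zone A
    # A CNAME was seen in step 2, yet the explicit CNAME query finds none.
    inconsistent = bool(chain) and not cname
    return host, chain, addrs, cname, inconsistent
```

Calling `scenario()` yields a non-empty CNAME chain in step 2 and an empty CNAME answer in step 3, which is exactly the mismatch described above.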
In that case, our implementation treats a NODATA answer for a CNAME that it has just received in an earlier response as suspicious, and temporarily fails the delivery attempt as if it had hit a DNS error. (Subsequent retries may see the replies in a different order as DNS caches refresh, but since only the address in zone A is a valid next hop, the odds of a successful delivery are low.)
AFAIK, this particular case is not covered by the RFC, and we're trying to assess whether degrading to Opportunistic TLS in that case, instead of temporarily failing, could actually introduce a security flaw.
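The choice being discussed can be sketched as a small decision function. This is a simplified illustration under stated assumptions, not our actual code: `Action`, `decide` and the `degrade_on_nodata` switch are hypothetical names, and the secure/insecure branches reflect my reading of the RFC's CNAME check rather than a complete implementation of it.

```python
from enum import Enum


class Action(Enum):
    DANE = "look up TLSA records"
    OPPORTUNISTIC_TLS = "skip DANE, use opportunistic TLS"
    TEMPFAIL = "defer the delivery attempt"


def decide(a_secure, via_cname, cname_status=None, degrade_on_nodata=False):
    """Pick an action after the A/AAAA lookup for the next hop.

    cname_status is the result of the explicit CNAME query and is only
    consulted when the address answer was insecure and came via a CNAME:
    "secure", "insecure", "nodata", or anything else for a lookup error.
    """
    if a_secure:
        return Action.DANE
    if not via_cname:
        return Action.OPPORTUNISTIC_TLS
    if cname_status == "secure":
        return Action.DANE
    if cname_status == "insecure":
        return Action.OPPORTUNISTIC_TLS
    # The case in question: the CNAME seen earlier has vanished.
    if cname_status == "nodata" and degrade_on_nodata:
        return Action.OPPORTUNISTIC_TLS
    # Current behaviour for NODATA, and for genuine DNS errors.
    return Action.TEMPFAIL
```

With `degrade_on_nodata=False` the function reproduces what we do today (temporary failure); setting it to `True` is the change whose safety I am asking about.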
It seems to me that, to purposely return such inconsistent answers, an attacker would need either to spoof answers when the zones are not signed (as here), or to control the zone's DNS key or the MTA's resolver, both of which would allow other ways to bypass DANE.
I'd like to get your opinions on this. Does that sound safe? What do other implementations do in that particular case?
Thanks,
Gaël.