Archives – RIPE 74

DNS Working Group

11 May 2017

At 4 p.m.:

JAAP AKKERHUIS: Here we are for the second part of the DNS session. As always we are short of time, so, anyway, the first talk is about DNS and privacy, by Benno Overeinder, NLnet Labs.

BENNO OVEREINDER: Thank you. DNS privacy. So, first of all, the work I present is kind of a summary of work done by many others: people here in the room, people this weekend in Madrid at DNS‑OARC, and people at the IETF, so it's a community effort, so to say. You might remember that last year there was a presentation by Sara Dickinson about DNS privacy. That gave an excellent overview of all the activities in the DNS privacy enhancement Working Group at the IETF and all the related activities going ahead. Today I am focusing more on, as I mention here, implementation and deployment: what can we, you, do today in installing, running and deploying DNS privacy‑enhanced services.

Good.

Why do we talk about DNS privacy? So, one step back. In July 2013 an Internet Architecture Board RFC was published about privacy considerations for Internet protocols. As you might know, an RFC takes about one‑and‑a‑half, two, three years to be published, so this work had already started in 2011 but the publication was in 2013, and what a coincidence for us as a community: it also became urgent because of the Snowden revelations in June of the same year. So there was this RFC, and all the discussions in the plenary at the IETF got lots of attention about what the IETF can do about this. As a follow‑up on this work there was another RFC a year later by two IAB members. It's a position, a statement, but it's an RFC published by two IETF ‑‑ how do you say this? ‑‑ participants, I think, is the correct word. It says pervasive monitoring is an attack. The IETF works to mitigate pervasive monitoring, that is the main mission. You can read it, it's a good read. Later on there was a threat model RFC, on confidentiality in the face of pervasive surveillance; it makes a threat model of all the attacks on privacy on the Internet.

Having said this, what does it actually matter for DNS? DNS and privacy: is DNS not public data? I want to have visitors on my website. So ‑‑ but indeed. There was a very well written and interesting draft RFC, written by Stéphane Bortzmeyer from AFNIC, on DNS privacy considerations. It gives an excellent overview of all kinds of attack vectors, not attacks on the DNS but attacks on your privacy, so it gives a good overview of which parts can leak information or where information can be intercepted. One of the takeaways of this RFC is: the data might be public, but the DNS transaction shouldn't be, and that is in the back of our minds. If we go further and talk about attacks, it's not DDoS; it's the attack on the privacy of the users here in the room and everywhere else on the Internet.

So, let's first do a kind of ‑‑ how do you say it ‑‑ categorisation of attacks, but it's far from complete. For completeness I will give you some references to other presentations or documents you can read on this. One of the attack vectors is on the first or last mile, depending on your perspective; as a client, as a user, it's the first mile. It's from my computer to the resolver. Everything there, all the DNS queries, is in the open, not encrypted, so every on‑path listener can monitor what I am trying to resolve. Every request from my computer to the resolver can be intercepted, and if you collect a sufficient amount of data you can draw some conclusions.

So that is one part I want to discuss later. I will come back to that later, it's for setting your mind.

The other thing is DNS information leakage, going from the resolver to the authoritative servers. It has been discussed already with QNAME minimisation: you are leaking information unnecessarily. The full domain name ends up at the root, at the top‑level domain and at the second‑level domain. For the last one that might be okay, but the first two shouldn't get it ‑‑ well, you shouldn't expose all the data to the first two authoritative name servers in the iteration. So we are leaking information here also.

So, what can we do against this? Well, there are many more scenarios and background studies about this. There is an excellent IETF tutorial by Sara and by DKG ‑‑ Daniel Kahn Gillmor. If you are interested in the subject it's a very good overview, with video and other material at the IETF tutorial website. For other information, presentations, etc., you can go to the dnsprivacy.org website; there is a collection of what we are doing and what others are doing, and all kinds of presentations.

That is for all the information and the background. Now I want to go to the implementation part, because of course we want you to use this software sooner or later. Implementation ‑‑ again I reiterate ‑‑ is implementation by many people here in the room and at the IETF. These are pictures from the hackathons at the IETF, a good opportunity for us to interact with other people and work on some items. Sometimes we win prizes: we get a Raspberry Pi and a USB stick and notepads, all kinds of great prizes. But again, implementation. Given the drafts and the RFCs being proposed in the different Working Groups, mostly DNSOP and the DNS privacy enhancement Working Group, DPRIVE, these are the things you can consider using to protect the first mile. How do I make sure that my query to the resolver is not listened to by others, how can I encrypt it? Of course we can do opportunistic STARTTLS; we have more interest in strict TLS. There were DTLS and the confidential DNS draft, but that wasn't pursued. And also existing and deployed out there are DNSCurve and DNSCrypt, but these are not IETF standards, DNSCrypt being ‑‑ resolvers working with that talk DNSCrypt to authoritative name servers.

Good. So these are the approaches to protect the first mile. But again, DNS loves UDP; actually, I love UDP also for that. For DNS over authenticated TLS connections we have to tune DNS to work well over TCP and TLS, and there are a number of things we have to look into in more detail. Just sending a UDP packet, well, I don't have to explain that: it's out there, you don't look at it, and some time later you get an answer or not. With TCP it's different, there is some overhead involved, and for TLS you have sessions, so you have to do even more work than for TCP. Luckily there is IETF work on this, already published, and there are implementations. For TCP we have TCP Fast Open, where you exchange some data with the set‑up, and with TLS session resumption you don't have to do the whole key exchange again. That was done. The other thing, which was not done in the DNS community, is out‑of‑order processing: pipelining, so TCP streaming with out‑of‑order answering, was not available by default, at least for Unbound and for Knot Resolver. I think it was already implemented for BIND, can somebody affirm that? I think so. Pretty sure. So we had to make some changes in our resolver as well; it was not ready for prime‑time use. And of course if you want to scale from thousands of customers to tens of thousands or more, you need robust TCP management, and probably we can learn a lot from HTTP servers and proxies, because they do run large servers with many, many clients. But maybe HTTP traffic is a bit different from DNS traffic. We have heard these claims; we still think we can scale the machines for sure, but where is the bottleneck? Is it the TCP stack? The application? We think it's more the TCP stack than the application. Do you move it into user space and make optimisations there? All of that is part of ongoing research, testing and prototyping. For now we are good, at least. From the customer to the resolver, on the order of 10,000 customers can be dealt with by modern, up‑to‑date operating system kernels.

Out‑of‑order processing: I want to say a bit about this. What is it? With UDP everything is fine: A, B and C are requested and arrive at the resolver, and if B and C are resolved earlier, they can be answered, and later you send A, because it needs some time to be resolved. With in‑order streaming, B and C have to wait for A to be answered, so you get a traffic jam, so to say. Out‑of‑order processing has to be done at the resolver, so it's not something you can use from a library per se. You need some measurements here. A, B and C can be requested, B and C can be answered first, and A can be answered later over the same TCP stream. That needs some administration and management. It is an important enhancement for resolvers to perform well with TCP and be production‑ready. So it's with OOOP ‑‑ nowadays young people shouting that, I think it's a cry of joy. There we go.
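
The traffic jam described here can be illustrated with a small Python sketch. It is a toy model and not Unbound or Knot Resolver code: the function names and the resolve times are invented for illustration. The one protocol fact it relies on is that every DNS message carries an ID, so a client can match answers to questions in whatever order they arrive on the stream.

```python
def in_order(queries):
    """Answer strictly in arrival order: an answer cannot leave before earlier ones."""
    done_at, latest = [], 0.0
    for name, resolve_time in queries:
        latest = max(latest, resolve_time)  # blocked behind any slower earlier query
        done_at.append((name, latest))
    return done_at


def out_of_order(queries):
    """Answer each query as soon as it is resolved; the client matches on message ID."""
    return sorted(queries, key=lambda q: q[1])


# Hypothetical resolve times in seconds; all three queries arrive at t=0.
# A needs a slow upstream lookup, B and C are already cached.
QUERIES = [("A", 2.0), ("B", 0.1), ("C", 0.2)]

print("in order:    ", in_order(QUERIES))
print("out of order:", out_of_order(QUERIES))
```

In the in‑order case B and C are held until A is done; in the out‑of‑order case they leave first and A follows.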

Reducing DNS leakage. I am pretty sure QNAME minimisation has been presented here earlier. This is the other part: the first one was going from the stub to the resolver, now it's from the resolver to the authoritative name servers, the rest of the path. It is just breaking up your query into different parts and only sharing the necessary labels with the name server from which you want an answer. This all works, also with different kinds of resolvers; it has been implemented and you can use it. There have been some problems, because some name servers didn't reply correctly. The well‑known one was the empty non‑terminals: some authoritative name servers answered NXDOMAIN where we expect NODATA, for example, so we had to work around that. Another thing: when Unbound was extended with QNAME minimisation it worked well, and then after some time people started sending us e‑mail: we cannot resolve this domain, we don't get any answers back. What happens in QNAME minimisation is that you ask not for an A record but for the delegation: you ask the root for the delegation to net, and then you ask for the delegation to RIPE. So we asked for NS records. And apparently some operators drop NS queries. Maybe it's the name server, it might also be the firewalls in front of it. They were really dropping, so we never got an answer, never. What we did was very pragmatic. Okay, we just query again for .net, but we say we want the A record for that, and you get the delegation; good enough, and it works fine, and maybe even better, because nobody now sees at query time that we are trying to do something like QNAME minimisation. So maybe we are hiding better now. That is the status, but it's all working.
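
As a rough illustration of the idea, the following Python sketch lists which name and type a minimising resolver would send to each zone on the way down. It is not taken from any resolver: the function name is hypothetical, and the list of zone cuts is passed in as an assumption, whereas a real resolver discovers the cuts from referrals.

```python
def minimised_steps(qname, zones, intermediate_type="NS", final_type="A"):
    """Return (zone asked, name sent, type sent), exposing one extra label per step.

    zones: zone cuts from the root down, assumed known for this sketch.
    intermediate_type: "NS" is the plain approach; "A" mirrors the pragmatic
    fallback described in the talk for servers that drop NS queries.
    """
    labels = qname.rstrip(".").split(".")
    steps = []
    for zone in zones:
        depth = 0 if zone == "." else len(zone.rstrip(".").split("."))
        name = ".".join(labels[-(depth + 1):]) + "."   # zone plus one more label
        is_final = depth + 1 >= len(labels)
        steps.append((zone, name, final_type if is_final else intermediate_type))
    return steps


for step in minimised_steps("www.ripe.net.", [".", "net.", "ripe.net."]):
    print(step)
```

Without minimisation every one of those servers would have seen the full `www.ripe.net.` query.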

So, having said that, deployment. What can you do today? Can you deploy it, can you test it, etc.? We hope so. Again, this is what I have been talking about for the past 15 minutes: DNS over TLS from the stub to the resolver, and QNAME minimisation from the resolver onwards. For deployment of DNS over TLS I now take as an example what we are working on ‑‑ as far as I know it's the only initiative, but please correct me if I am wrong ‑‑ encrypting your data from your stub to the authoritative. We collaborate in a project called getDNS. getDNS is not per se about TLS; it's a versatile stub resolver that also does DNS over TLS. getDNS is primarily a library, so you can link your application with the getDNS library and your application has all the goodies of DNS, DNSSEC, etc. It's very handy and useful, and I will stop pitching it. But an important thing is that you can also run a stub based on getDNS as an application. This is just a stub linked with the getDNS library. It can also fall back to full recursion if necessary, and that is quite handy: you can use the getDNS stub for DNSSEC, it does roadblock avoidance, so if there is some troublesome resolver in the middle it can fall back to full recursion and do the whole DNSSEC validation. You get your DNSSEC crypto and data locally, so you can do all kinds of interesting things like DANE. DNS64, so IPv4 synthesis for IPv6‑only networks, works also; all kinds of good things. And the last item is DNS privacy. The stub can be configured so that it enforces TLS for every query. So the getDNS stub can be configured for DNS privacy ‑‑ how do you say that ‑‑ not just enabled, because it is able, but enforced. This configuration, with some other things, we call Stubby: Stubby is the getDNS stub resolver with all privacy enhancement options enabled and enforced. So it's part of a larger project. It comes with getDNS 1.1.0, and will be actively developed and presented. Later this week, this weekend, Sara will give a presentation on the Stubby work and the privacy enhancements involved at the DNS‑OARC workshop.

Good. At the resolver side, which resolvers can I use if I have this privacy‑enhanced stub resolver? You have a number of options here: you can use Knot Resolver; there are BIND plus TLS proxy set‑ups, with NGINX or HAProxy, running and available at this very moment; and there are Unbound resolvers running that you can connect to and use as a resolver.

For an overview of these TLS resolvers you can go to the dnsprivacy.net website; that will give a list of a number of TLS‑enhanced, or DNS privacy‑enhanced, recursives. I just name them here: OARC is running one, Yeti is running one, SURFnet, and DKG is running one with Knot Resolver, and on the website you can find the names of the resolvers.

Good. The thing is, it's still not ready to be used everywhere, because it still needs some configuration: how do we establish trust? There is SPKI pinning, but not all the pieces are in place yet, so it needs some manual configuration and a jump in trust: I do trust this resolver, etc. That is not solved yet; maybe secure DHCP can help, but how do you get that? I don't know yet. These are things we are working on.
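
To make the SPKI pinning step concrete, here is a minimal Python sketch of how a pin is commonly computed and checked: the base64 encoding of the SHA‑256 digest of the server's DER‑encoded SubjectPublicKeyInfo. The function names are hypothetical, and extracting the SubjectPublicKeyInfo bytes from the certificate seen in the TLS handshake is deliberately left out; the bytes used below are a placeholder, not a real key.

```python
import base64
import hashlib


def spki_pin(spki_der: bytes) -> str:
    """Base64 of SHA-256 over the DER-encoded SubjectPublicKeyInfo."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def pin_matches(spki_der: bytes, configured_pins) -> bool:
    """True if the key presented by the resolver is in the manually configured pin set."""
    return spki_pin(spki_der) in configured_pins


# Placeholder bytes standing in for a real SubjectPublicKeyInfo structure.
presented_key = b"placeholder-spki-bytes"
trusted = {spki_pin(presented_key)}        # the manual "I do trust this resolver" step
print(pin_matches(presented_key, trusted))
print(pin_matches(b"some-other-key", trusted))
```

The manual part the talk refers to is getting the right pin into `trusted` in the first place.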

QNAME minimisation‑enabled resolvers: currently Unbound and Knot Resolver, and for BIND it is on the road map, so there is also a choice here. So you can already do your part to protect the privacy of the users, of the clients using your resolvers, by either testing or reading up on all this work, thinking about how you can roll it out, or getting in contact with Sara or me or Alison to work on this. I want to conclude here. I already mentioned the resources: you can find more information about this project in the results of the DNS privacy enhancement Working Group. There are actually two websites, but currently they are the same one, dnsprivacy.org and dnsprivacy.net; the .net should be for infrastructure, for operators, and the .org more for the larger community, people who want to use it locally, the average user, the average client. And you can have a look at the getDNS project.

Acknowledgements: I already mentioned a number of names here. Sara Dickinson, Alison and Willem are the main contributors to the whole thing, coordinating everything, and a lot of people you see here also helped us with testing or wrote small parts of code for the getDNS project or for privacy, so thank you all. And I am ready for questions, I guess. Still five minutes to go. Good. Thank you.

AUDIENCE SPEAKER: Philip, not speaking for the RIPE NCC. I think, this is certainly a really great work, there is a lot of stuff, I mean QNAME minimisation is great, being able to use TLS to connect DNS is great but then ‑‑ and I Googled my question so, it was only one not really serious answer. The thing in these days after doing a DNS lookup it's very likely that you set up a TLS connection.

BENNO OVEREINDER: Yes.

AUDIENCE SPEAKER: And the first thing that is going to do is send the name you just looked up in clear text over the network. So I wonder what your opinion is on the whole system picture, when we spend all this time hiding it in DNS and then the client immediately leaks it.

BENNO OVEREINDER: Yes, set up TLS, yes. I know the problem but I don't have the solution here. And I am not that much involved in how the TLS Working Group is ‑‑

AUDIENCE SPEAKER: That was one thing, one presentation ‑‑

BENNO OVEREINDER: ‑‑ is able to work on that.

AUDIENCE SPEAKER: I have no idea what the solution to this is. Until now they said it doesn't matter that it goes in plain text over the TLS connection because it was in plain text in DNS too, so why should we fix it, let them fix it first. At least this part is fixed now, so now it's their problem.

BENNO OVEREINDER: Thank you

ONDREJ SURY: Are you going to integrate Stubby with DNSSEC trigger? Because you need more teeth to punch a hole through the DNS‑hijacking solutions at hotels than just a cute puppy, so do you plan to integrate Stubby with DNSSEC trigger?

BENNO OVEREINDER: Yes, yes, thank you for the question. Actually, Willem is planning to do that. So, just one step back: Ondrej mentioned DNSSEC trigger, which is kind of a script plus an Unbound resolver. What it does: when you open your laptop it probes; from DHCP you get a resolver, and if that resolver is DNSSEC‑capable, it will use the resolver and its cache. If it's not DNSSEC‑capable, it will fall back to full recursion and try to do the full recursion and DNSSEC validation locally on your laptop. If that is not possible, because it's a captive portal ‑‑ well, we messed around a bit ‑‑ then we forward our DNS queries over port 80 to a number of resolvers, one run by us and one by Red Hat, so all your queries go over port 80 ‑‑ I think it was four ‑‑ HTTPS ‑‑ probably 443. And then you could resolve even behind a captive portal. That worked quite well. Stubby is even more useful here, so we are thinking of doing the same for Stubby and replacing all the DNSSEC trigger stuff with a Stubby trigger, yeah. It could even be a drop‑in replacement for systemd, but these are our own evil plans.
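
The fallback order described in this answer can be written down as a small decision function. This Python sketch is only a simplified reading of the description above, not the actual DNSSEC trigger logic; the function name, the mode labels and the final "no secure path" outcome are assumptions made for illustration.

```python
def choose_mode(dhcp_resolver_does_dnssec, full_recursion_works, tunnel_works):
    """Pick the first working option, in the order described in the talk."""
    if dhcp_resolver_does_dnssec:
        return "use DHCP-provided resolver and its cache"
    if full_recursion_works:
        return "full recursion with local DNSSEC validation"
    if tunnel_works:
        return "forward queries to known resolvers over port 80/443"
    return "no secure path found"   # assumed outcome, not stated in the talk


# A hotel network with a broken resolver and blocked port 53, but open web ports.
print(choose_mode(False, False, True))
```

Each input would come from an active probe of the network the laptop has just joined.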

JAAP AKKERHUIS: Trying to push this into standard Red Hat ‑‑

BENNO OVEREINDER: ‑‑ has helped us out with the Red Hat set‑up and is trying to push this into their distributions.

SHANE KERR: What is the idea? Jaap ‑‑ that you always get DNSSEC and DNSSEC behaviour built in, so the user doesn't need to know anything about it?

BENNO OVEREINDER: Any other questions?

JAAP AKKERHUIS: Any other questions for Benno?

BENNO OVEREINDER: Thank you again.
(Applause)

JAAP AKKERHUIS: Some details will be at DNS‑OARC, this weekend in Madrid. Ondrej, about people doing weird stuff in DNS, I mean like 90% of humankind.

ONDREJ SURY: Hi, I am Ondrej Sury from CZ.NIC. Who of you is going to DNS‑OARC? I am sorry, you will see this twice. But it was less than half of the room, so I think that is okay. This is actually a project we started when we were finishing Knot Resolver and found that the DNS is complex and hard. There are a lot of DNS protocol violations out there, and DNS resolvers have to cope with them, because users don't want to know that their favourite service is just messing with DNS; they want the service. So resolvers basically have to have a lot of workarounds in the code so that the service works in the end. We started keeping an internal list of those, I would say, DNS violations, and then I thought: we are not the only ones writing DNS software, so let's make this public and make it a community effort, so more people can join when we find something weird in the DNS, and maybe we can talk to the DNS operators of the service to fix the issue. That is how the DNS violations project was born, basically. The purpose is not to shame the operators or the writers of DNS software; it's to better understand the common violations and breakages of the DNS protocol, to make the DNS ecosystem better, and to share the knowledge, so that other people who decide that writing a DNS resolver is a good idea ‑‑ please don't ‑‑ don't fall into the same traps as we did. It's also for implementers, to avoid common pitfalls, because they tend to make the same mistakes or ignore the same bits, and to decide whether they want to have a workaround or not. There is nothing sensitive in there, only public DNS information that you can get by running dig or whatever your favourite DNS tool is, and this can't be used to attack the servers or anything.

And let's come to the common violations. Basically, the worst I have seen are the CDNs, unfortunately, because they all have this beautiful idea that they need to write their own DNS server. Then there is weird stuff like garbage at the end of the packet and case‑sensitive DNS servers, and breakages related to QNAME minimisation. EDNS is a special category: it has been with us for 18 years now, and we still see servers not supporting EDNS. And then there are other breakages. Let's see some of those we have found.

So this is the garbage at the end of the packet: some K MP media name servers return the answer and then some garbage at the end of the packet. Probably they allocate a static buffer to have enough space, don't fill it completely, and send the whole thing. This really is a violation of the DNS protocol. It's easy to cope with, but it is a violation, and it makes the data bigger on the wire, while we are all trying to make DNS packets smaller to prevent DDoS attacks. So this is not really helping.
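
One way to spot this kind of trailing garbage is to walk the message according to the record counts in the header and see how many bytes are left over. The Python sketch below does that using only the standard wire format (12‑byte header, questions, then resource records with a 2‑byte RDLENGTH). The function names are hypothetical and there is no bounds checking, so truncated or hostile input will simply raise an exception.

```python
import struct


def skip_name(msg, pos):
    """Return the offset just past the (possibly compressed) name starting at pos."""
    while True:
        length = msg[pos]
        if length == 0:                 # root label ends the name
            return pos + 1
        if length & 0xC0 == 0xC0:       # compression pointer: two bytes, ends the name
            return pos + 2
        pos += 1 + length


def trailing_bytes(msg):
    """Count bytes left after the last record announced in the header."""
    _id, _flags, qd, an, ns, ar = struct.unpack("!6H", msg[:12])
    pos = 12
    for _ in range(qd):
        pos = skip_name(msg, pos) + 4                     # QTYPE + QCLASS
    for _ in range(an + ns + ar):
        pos = skip_name(msg, pos)
        (rdlength,) = struct.unpack("!H", msg[pos + 8:pos + 10])
        pos += 10 + rdlength                              # TYPE, CLASS, TTL, RDLENGTH, RDATA
    return len(msg) - pos


# Hand-built response: one question, one A record answer using a compression pointer.
header = struct.pack("!6H", 0x1234, 0x8180, 1, 1, 0, 0)
question = b"\x07example\x03com\x00" + struct.pack("!2H", 1, 1)
answer = b"\xc0\x0c" + struct.pack("!2HIH", 1, 1, 60, 4) + bytes([192, 0, 2, 1])
clean = header + question + answer

print(trailing_bytes(clean))
print(trailing_bytes(clean + b"\x00" * 7))
```

A resolver that tolerates the garbage simply ignores whatever this count reports; a stricter one could log or reject it.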

And the same company: they have a generated ‑‑ whatever you throw at them, they just append some prefixes and send it back. I am not aware that this breaks anything, but it certainly doesn't feel right either.

This is interesting, it's case sensitive. As you might know, the DNS is not case sensitive, so a name server that is case sensitive breaks techniques like 0x20, which randomise the case. Somebody pointed out that this is actually a database query encoded into the DNS query. The interesting thing is that it only breaks when you change the case in the part I highlighted in red; if you mess with upper case and lower case in other parts it works just fine, but this one part in particular breaks. So, if you know somebody from ‑‑ let them know. This is very hard to catch because it returns NXDOMAIN instead of NOERROR. It would be easy to have a workaround if it returned SERVFAIL or some other kind of error, but this just states that nothing exists there, so the resolver cannot really tell that this is an error, which it really is.
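
For readers unfamiliar with 0x20, here is a minimal Python sketch of the technique: the resolver randomises the case of the letters in the query name, the server is expected to treat the name case‑insensitively, and the response is expected to echo the question with exactly the case that was sent. The function names are hypothetical and the sketch only handles ASCII names.

```python
import random


def randomise_case(qname, rng=random):
    """Flip each letter to upper or lower case at random (the 0x20 bit in ASCII)."""
    return "".join(c.upper() if rng.random() < 0.5 else c.lower() for c in qname)


def same_name(a, b):
    """DNS names compare case-insensitively, so a correct server sees these as equal."""
    return a.lower() == b.lower()


def echo_ok(sent, echoed):
    """The resolver's check: the question section must come back with the exact case sent."""
    return sent == echoed


sent = randomise_case("www.example.com")
print(sent)
print(same_name(sent, "www.example.com"))   # what a correct authoritative server does
print(echo_ok(sent, sent))                  # what the resolver verifies on the response
```

A case‑sensitive back end fails the `same_name` step: it looks up the mixed‑case string literally, finds nothing, and answers as if the name did not exist.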

Another example of this, but this one is actually OK‑ish because the answer is REFUSED, so the resolver can have logic that says: if I see REFUSED then I am not going to randomise the case, I am going to send it all lower case or something like that, and then it would work. So this is slightly better; not correct, but slightly better. Then malformed packets on EDNS queries. This is something I really think we should not support and not have workarounds for, because if you mess up the DNS so hard you should not be on the Internet. If you send EDNS to the name servers handling this domain name, they will just return a malformed packet, and that is so sad. And another example; it's slightly better because at least it returns something we can work with, but still, EDNS is a standard, it was standardised in '99, I think. So I can't understand why it's not implemented, because this is a new service. Also, CDNs: as I said, they really don't care about the standards, they just care about getting the little bit of DNS they need to make their service work, and I think that is completely the wrong mindset, because they think that the DNS is just something they don't have to care about. I think it's quite the opposite: the DNS is at the core of the service. So, let's see some examples. I think ‑‑ I will get back to that. Just a quick recap of what empty non‑terminals are: a domain name that has children but is itself empty, it doesn't have any record at that point in the DNS tree. So, for example, these are the Czech communications CDN name servers serving some video content for the CDN; they return NOTIMP (not implemented) if you send them NS queries, and that breaks query minimisation. Sad. Well, the Akamai problem, which has been with us since we started talking about QNAME minimisation: David Lawrence said they fixed it, but they had to revert the fix because some customers complained that it broke something they expected. So they are now doing case‑by‑case fixes, reaching out to the customers; they are aware of the fact and they are working on it, but they haven't been able to deploy it Akamai‑wide. At least something. Then there are DNSSEC‑related breakages. I really don't know which DNSSEC signer produced that, but there is an NSEC record that points to itself, and ‑‑ well, this is even more interesting ‑‑ this is a wildcard pointing to a wildcard. It should at least have the owner of the zone. This was reported by Victor, because he cares deeply about TLSA, and here ‑‑ here for a TLSA record, so no error ‑‑ it breaks lookups in Postfix.
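
The empty non‑terminal case that trips up QNAME minimisation can be shown with a tiny Python model of a zone. This is a sketch under simplifying assumptions: the function name is hypothetical, the zone is just a set of owner names, and wildcards and delegations are ignored. The point is only that a name with nothing at it but with something below it must be answered NOERROR with no data, not NXDOMAIN.

```python
def rcode_for(qname, owners):
    """Response code a correct authoritative server gives for qname.

    owners: set of fully qualified owner names that actually hold records.
    """
    if qname in owners:
        return "NOERROR"                      # records exist at this name
    if any(owner.endswith("." + qname) for owner in owners):
        return "NOERROR"                      # empty non-terminal: NODATA, not NXDOMAIN
    return "NXDOMAIN"                         # nothing here and nothing below


# Hypothetical zone: records at the apex and at www.cdn.example., nothing at cdn.example.
zone = {"example.", "www.cdn.example."}

print(rcode_for("cdn.example.", zone))        # the intermediate name a minimising resolver asks for
print(rcode_for("missing.example.", zone))
```

A server that answers NXDOMAIN for `cdn.example.` tells the minimising resolver that nothing exists below that name, so `www.cdn.example.` is never asked for.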

Then this was interesting. I reached out to those GitBook guys and they fixed some of the breakage, because they used to return an A record to just any query. Now they return the A record only for an A query but, surprisingly, also for an AAAA query. So it's like I am asking whether you have an IPv6 address and they reply: here is an IPv4 address for you. I haven't been able to convince them that this is completely wrong, but they fixed the ‑‑ the part that broke the DNS. So yeah, whatever. I don't know, maybe somebody has better arguments to send them that this is wrong. It doesn't help anything, but it doesn't break anything either: you just ignore the records and you are done with it.

Then this is a new one, reported by Robert: Google DNS returns duplicate A records, which again is not serious, but again it inflates the packet and it's useless. Google is aware of this, so they will fix it, but I guess it's not a high‑priority issue for them.

So, this one is fresh, just reported to me like two days ago. Raiffeisen Bank: they have some smart load balancers that return NXDOMAIN to any query that is not A or AAAA. Fortunately for them, they don't return an SOA record with it, so it's not cached. But there was ZUNO Bank doing the same, so when somebody asked for something that is not AAAA or A, the resolver would cache the NXDOMAIN, and because that means there is really nothing at this point in the DNS tree, it would then refuse all queries for that name. That one was fixed. This one is not fixed, and it is the same problem, but as I said, fortunately they don't have an SOA record there, so it doesn't affect the clients; it does affect the resolvers, because it prevents negative caching. And as an added bonus they also drop packets with EDNS version 1, the future version: they just don't respond, there is no response packet, which again might open the domain to various attacks in the future. And it's not that hard to implement. On the positive side, there were some fixed issues. This was fixed very quickly, the day after it was reported: filtered TLSA queries; it dropped the packet and now it works okay. Then Google had a slight bug with EDNS version 1, again the future one: they correctly returned the right RCODE, BADVERS, but they forgot to include the query in the question section, and this has now been fixed. Hooray. And this was really weird, because, again the Czech communications load balancers, they were returning the AD bit in authoritative answers. I really don't know why. But this has been fixed as well, because it didn't make any sense and nobody would interpret that. And this has also been fixed: CloudFlare omitting ‑‑ when the ‑‑ zone was signed and the delegation answer was wrong, and again this was fixed, so we have some progress inside the project.

Just to point out other related work: Mark Andrews from ISC is doing the compliance project, here is the URL. He scans the Internet for the domains he has and tests them for EDNS compliance. Very interesting. And if you want to join the effort and you know about some DNS violations, you can submit them. We have a standardised format: we have a number like DVE, DNS violations, like CVE; we hope not to reach that high a number. There is a description and evidence, for which we standardised on a textual representation and DNS files, and a proposed fix or workaround, and it's advised, if you can, to report to the violating party, so we also have a report date and a fix date.
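
Why an NXDOMAIN answer to, say, an MX query can take a whole name offline is easier to see with a toy negative cache. The Python sketch below is an assumption‑laden model, not resolver code: the class and method names are hypothetical, TTL handling is left out, and the only rule modelled is the one mentioned in the talk, namely that a negative answer is cached only when it comes with an SOA record.

```python
class NegativeCache:
    """Toy negative cache: remembers names for which a cacheable NXDOMAIN was seen."""

    def __init__(self):
        self._nonexistent = set()

    def store(self, qname, rcode, has_soa):
        # NXDOMAIN is about the name, not the type, so the query type is not recorded.
        if rcode == "NXDOMAIN" and has_soa:
            self._nonexistent.add(qname.lower())

    def lookup(self, qname):
        """Return "NXDOMAIN" if the name is cached as nonexistent, else None."""
        return "NXDOMAIN" if qname.lower() in self._nonexistent else None


cache = NegativeCache()

# Load balancer answers an MX query with NXDOMAIN plus SOA: the name is now "gone".
cache.store("bank.example.", "NXDOMAIN", has_soa=True)
print(cache.lookup("bank.example."))     # a later A query is answered from this entry

# Same wrong answer without SOA: nothing is cached, so A queries still reach the server.
cache.store("other.example.", "NXDOMAIN", has_soa=False)
print(cache.lookup("other.example."))
```

The second case is the "fortunately for them" situation: the answer is still wrong, but it does not poison later lookups for the same name.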

And here are some thoughts on what we can do about the state of DNS. For DNS resolver vendors, that would be me, ISC, net ‑‑ power ‑‑ we should be really patient, even though I can be passionate about the DNS, not only patient, and we should educate the parties that violate it, because the DNS is a protocol with a lot of RFCs, and sometimes they contradict each other and stuff like that. I think it is a good thing if we coordinate our next steps while doing that. I think we should make a plan to remove some of the workarounds and announce that prominently ahead of time, and maybe stop resolving the most horrible stuff there is. There was an idea from the audience at the IETF, where I first gave this presentation, that for some of the violations we could add a small time penalty: if you break DNS, we will just make it more unpleasant for you. And for the DNS community, that is all of us: we should be more inclusive and invite people who now think that they don't care about DNS; maybe they do, they just don't know it. It's our job to convince them that they should care about the DNS, like the CDN people, because even the CDNs depend on DNS ‑‑ it's what they run on. One of the ideas would be to promote the use of existing solutions, so people don't write their own DNS servers just because they can, and to try to convince them that it's not an easy task to write a DNS server correctly, so maybe they could use something that already exists. And maybe DNS‑OARC, which is a really great place for DNS people, could be a neutral platform for this kind of discussion with all the involved parties.

And more particularly, how can you help? We have the DNS violations repository on GitHub and we have a mailing list. If you like coding websites: we have a domain but no website, so if you want to write a static generator that turns our DVEs into a blog‑style website, that would be great. You can join the team of reviewers. You can report violations: if you find something weird in the DNS, it would be great if you came to our project and submitted a new DVE. And, that would be great, you should also report the violation to the DNS operator. If you know any DNS people who are not in our DNS community, invite them and tell them how great it is to collaborate with other DNS people and talk about DNS. I really sound like a DNS nerd now, don't I? You can also write blog posts about the correct behaviour of DNS and make this easier for non‑DNS people, or people on the edge between non‑DNS and DNS people. And I would like to hear your ideas about what we can do about the state of the DNS and technical decay.

So, thanks for listening and I am ready for questions.

AUDIENCE SPEAKER: Yell at that, violating DNS for over ten years now. It's an old joke in the DNS development community that you can write a resolver in about three days and then you spend the rest of your life making it work. So, thank you for this, for starters. I do have a question or two about the scope. Is this about what you find running in the wild, or is it specifically about authoritative name server software?

ONDREJ SURY: I would say both, but it's much easier to get the software fixed because you can talk directly to the people writing it and it's much harder to find an operator of something that is running in the wild.

AUDIENCE SPEAKER: Right.

ONDREJ SURY: So right now it's, I think, exclusively the things we found in the wild.

AUDIENCE SPEAKER: So does that include not direct name server software but, say, firewalls and load balancers that ‑‑

ONDREJ SURY: It may be but we don't know that because we don't know what is on the path.

AUDIENCE SPEAKER: Thomas, a question for my own interest. Do you run ‑‑ do you have some tests you run against particular domains, which others can benefit from, or is it just random reports from random people which you use?

ONDREJ SURY: It's mostly dig; it's very hard to write tests for this. It has been suggested several times and it would be nice if somebody did that, but I am afraid it won't be me, I already have too much. But another output might be some kind of test that you can run against your domains, or integrating this into Zonemaster or something like that.

AUDIENCE SPEAKER: Because you mentioned the website, so it would be possible to put like a CGI or something there and,

ONDREJ SURY: Mostly those are very subtle things and some are not. For EDNS, Mark Andrews has a test for it. For other things it's ‑‑ it may be possible.

AUDIENCE SPEAKER: Okay. Thanks.

AUDIENCE SPEAKER: Hi, maybe I can quote Steve Crocker: I am Benno Overeinder, NLnet Labs, I am here to help you. So, of course, you mentioned the DNS resolver implementers; we can discuss how we can coordinate some of these, well, things you proposed, and how we can educate the community. That is indeed reaching out, and again I am here also a little bit as BCOP co-chair: you could write up some of your findings or advice in a RIPE document, to share with the operational community. But that is maybe more a question also to the room.

ONDREJ SURY: Yes, I am personally terrible at writing documents. But ‑‑ yeah. If there is more help for pushing me forward and contributing to the document, that might be a good idea.

BENNO OVEREINDER: Yes. There is a lot of work. I understand you were in contact with many other ‑‑ well, with the violators, so to say, and some of them resolved that and solved that problem and others won't, so ‑‑ how can we go forward and make it scale and have some effect in a couple of years. Thank you.

PETER KOCH: DENIC. Thank you for this. I really like this. This has ‑‑

ONDREJ SURY: So nice ‑‑

PETER KOCH: Should have been taken up like 20 years ago or so. Really great. The good questions have gone already, I guess. I have one: You mentioned the workarounds, so that raises some concerns of course, because as soon as the vendor, and I am not sure you reached out to vendors and operators as far as you could, which is a great effort, but as soon as a vendor has a registered violation of the standard and you have a workaround, they could claim that the robustness principle would suggest you fix their problems. Could you elaborate on what your idea is with regards to the Knot Resolver at least?

ONDREJ SURY: ‑‑ actually, this is the first time I see ‑‑ I hear this thought and I think you are right, and maybe we should just remove the workarounds but then ‑‑

PETER KOCH: Everybody else can resolve it but you can't.

ONDREJ SURY: Should be a place for DNS resolver implementers. I think a workaround is helpful for the DNS resolvers, but it doesn't mean they are not violating the standard. So, I think we just deal with that if some vendor claims such a thing, like, there is a workaround so I will not worry about that. Then we might also say to him, okay, the workaround ‑‑ we have a plan to remove that in 2021. Something like that. And that is what I said, that the DNS resolver vendors should cooperate and make a plan to maybe remove some of the workarounds and make this less complex, because the workarounds make the code more complex and vulnerable to security issues because of the complexity.

PETER KOCH: One remark. When you said things break QNAME minimisation: that is a feature, not a bug. QNAME minimisation is an RFC that is a deviation from the standard. You get the guys on other grounds, but breaking QNAME minimisation is not the offence. Thank you.

ONDREJ SURY: I understand but the things I have listed also violate the DNS standard as it is.

ANAND BUDDHDEV: From the RIPE NCC. You mentioned that you have been mostly using dig to find these violations and then putting them up on GitHub. One thing that might be useful is if vendors of resolver software would log specifically that they were using a workaround when resolving a domain name, because it would help anybody scanning the logs to find such domain names without having to use ‑‑ do it by hand. So, you know, it's easy to parse logs with scripts and say here are a few domains I found from yesterday's log, and we could still analyse them by hand but you would have a starting point.

ONDREJ SURY: Thank you for raising that. I intended to do that in Knot Resolver but I forgot about it, so thank you for reminding me again, and it is a great idea to log it. But, well, the problem is that the code is already in the resolver, so you need to go and find all the workarounds you put in there. We might probably start so that when we add a new workaround we will do that, and slowly go back and find all the stuff we had to do. Thanks.

AUDIENCE SPEAKER: Small comment ‑‑

PIETER LEXIS: So the biggest problem with logging these workarounds would be that some of these violations are very subtle and can appear as just a packet drop, so you won't even see it, and if you would log every time you do not get a response from an authoritative server, you will have more work actually parsing the logs and finding out if it is really broken or not.

Martin Swissy: You asked for suggestions and I have a small one. It is well understood that in no case publishing these violations should be interpreted as a name and shame. Right. So, you might promote those who are the most responsive when they are ‑‑ they know there are some violation reports, I think this is very important, you said it earlier, some actors ‑‑ players are more responsive and they fix it. So maybe by distributing monthly or quarterly level say hey those are the best. So the comparison is not in number of violations, everybody confesses they violate DNS but in terms of responsiveness it might be that differentiation.

ONDREJ SURY: Good idea and once somebody writes a website for us it would be.

JAAP AKKERHUIS: Thanks for this.
(Applause)

And now Jerry Lundstrom is going to talk about Drool.

JERRY LUNDSTROM: Hello, I work for DNS‑OARC and I am going to talk about a tool I wrote called Drool, DNS replay tool. And also a disclaimer: this is a software‑based replay tool, so the numbers are not going to be as sexy as the ones we saw earlier. So, the tool uses PCAPs, reads and parses the DNS from the PCAPs, and you can send it to a specific target. It started as a Comcast-sponsored project from their innovation fund; they needed a tool to be able to test whatever appliance boxes they buy, to see they are up to standards. We have a beta version out, I think since about a month ago. And I would happily receive feedback if anyone is using it, if it's working or not working. We also have a suggestion within DNS‑OARC to start a repository of PCAPs with interesting information in it, like maybe violations or something else like that. We will see how that goes. So, some of the features are of course trying to utilise all the resources you have; it can manipulate the timings between packets so you can speed up and slow down or just ignore the timing. You can loop the PCAPs and you can say ‑‑ find all the UDP or TCP and replay it however you want. It also implements all the filtering that PCAP has and things like that. So here is a small example, and if you look at the first line, you see there is ‑‑ the 'c' for configure and then it's test configure, so we are ignoring timing, and then we are setting the target to the local host port 53. There is nothing actually answering there, but then you are skipping the replay ‑‑ reply, so it doesn't wait for anything to come back. You send everything you found as UDP and you make three of these pools so you spin up a lot of ‑‑ this was done on my local workstation. So it saw 1.7 million packets and sent almost 1.07 million packets per second.
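The three timing behaviours mentioned here (keep the recorded gaps, speed them up or slow them down, or ignore them) can be illustrated with a minimal Python sketch. This is not Drool's code or configuration syntax; the function name `replay_delays` and the mode names are hypothetical and chosen only for illustration.

```python
from typing import List


def replay_delays(timestamps: List[float], mode: str = "keep",
                  factor: float = 1.0) -> List[float]:
    """Return how long to wait (in seconds) before sending each packet.

    Hypothetical helper, not part of Drool. `timestamps` are the capture
    times of the packets, in order. Modes:
      keep     - reproduce the recorded gaps
      multiply - scale each gap by `factor` (0.5 replays twice as fast)
      ignore   - send everything back to back
    """
    if mode not in ("keep", "multiply", "ignore"):
        raise ValueError(f"unknown timing mode: {mode}")
    if not timestamps:
        return []

    delays = [0.0]  # the first packet is sent immediately
    for previous, current in zip(timestamps, timestamps[1:]):
        gap = max(current - previous, 0.0)  # guard against out-of-order captures
        if mode == "ignore":
            delays.append(0.0)
        elif mode == "multiply":
            delays.append(gap * factor)
        else:
            delays.append(gap)
    return delays
```

With capture times 0.0, 0.5 and 2.0 seconds, `multiply` with a factor of 0.5 halves every gap, while `ignore` removes the gaps entirely, which corresponds to the "ignoring timing" run described in the talk.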

Future implementations: what we are looking at is parsing the responses, matching the packets, getting a lot of statistics around that, like are the replies more or less the same and stuff like that. Of course, increasing the performance and looking at the various kits I saw earlier to actually speed it up even more. More statistics, and I have an idea about a control channel to the daemons so we can build a GUI or stuff like that. And massive client IP simulation is something I have been thinking about, so you can use the client IP addresses within the PCAPs and replay large quantities of normal traffic, but this will of course require a very specific network set‑up to get the responses back correctly. That is also something we are looking at.

And it's on GitHub. Any questions?

AUDIENCE SPEAKER: Phil, RIPE NCC. So this is not really a question directed at your specific thing but more for the community using tools like this. Sometimes we see people playing with replaying traffic and it can either create insane amounts of load or it can be insanely confusing, so I was wondering whether in your documentation you can put a prominent note pointing out to people, if they start using this on packet traces they captured, say at some big resolver or something like that, to be extremely careful in how they deploy it and carefully balance what they think they can learn from replaying captures against the damage they are doing to other parties. Because if I have to explain to my colleagues, like, yes, this must be replayed traffic because there is no way we could have generated it, we have already spent way too much time trying to deal with it.

JERRY LUNDSTROM: So there are a lot of tools out there that can be used for harm also, so ‑‑

AUDIENCE SPEAKER: I don't see it as don't do it but include the education to the people who use it, that at least they are aware of the damage they are doing and not think okay this is a cool nifty tool I am going to try it out and leave others with the damage.

JERRY LUNDSTROM: Hopefully the people with the right, with the really big network capacity doesn't do that for fun.

AUDIENCE SPEAKER: Well, I don't know. From what we have seen, there have been people who apparently had access to big networks, so I don't know if they did it for fun or not.

SHANE KERR: Dash dash DoS RIPE option ‑‑ Shane Kerr, Oracle. This is cool. I have a question about the behaviour of the tool if it is unable to replay at the rate that the PCAP was recorded at. So trying to replay it with the same intervals, that is an option, right?

JERRY LUNDSTROM: Yes.

SHANE KERR: If you fall behind is it hurry up and try to catch up or does it skip behind a little bit?

JERRY LUNDSTROM: No, if you have a high speed PCAP let's say and it doesn't ‑‑ it's not able to play that it's going to give you a lot of warnings and as soon as it starts catching up it's not going to give you warnings. So there is really no buffering or nothing like that, so if you can't play it at the same speed you are always going to get behind.

SHANE KERR: Okay. Cool.

AUDIENCE SPEAKER: Tony marks, also Oracle. When you add the replies facility, could you ensure you have machine-readable output, because that would make the tool very useful.

JERRY LUNDSTROM: Okay.

JAAP AKKERHUIS: It looks like that ends it, two questions in one. Thank you.
(Applause)

And now it's up to Pieter Lexis who is going to talk about DNS‑DIST.

PIETER LEXIS: Good afternoon, I know I am the last one before you can all go to the bar so I will try to keep it as short as possible. But don't hesitate to ask any questions. So, DNS‑DIST, it started out a couple of years ago as a really small concentrating load balancer, because especially when you are talking about recursors, one that is busy is happy: the cache is hot and you get quick answers, whereas normal load balancers will split the data over many of the back‑end servers. Then we managed to figure out that people would be very interested in having something in front of name servers where you can do some traffic inspection and can do load balancing and maybe have a nice Swiss army knife in there. So, most of the time when you have a problem in your DNS, or you are under attack, or you see some weird stuff on the network, this is usually how you do this: tcpdump, you do some grep, awk and ‑‑ iptables stuff or something on your firewall in front of the name server, and then you realise it's incomplete or something is missing or you are blocking too much, so your clients cannot resolve some names any more and you have to go back and everything is terrible. So, sometimes when you are a resolver operator you get attacked, but it's not really an attack but more that a device somewhere in your customer network does weird things. For instance, we have seen at some point a smart TV that would do a query for its ‑‑ firmware update website, get an answer, decide it did not like it and immediately request it again and again ‑‑ so you would end up with thousands of queries a second that would just get rejected every time you send an answer back. And DNS‑DIST is one of these things that can help you here. And it's fall back stuff that can hurt performance.
DNS‑DIST has a ring buffer of the last 10,000 queries it received and you can do inspection on it. If you see something weird or want to know what is going on, we have a command, it's called top queries, which will give you the top five queries of the last 10,000 entries that it has. You can also filter based on responses: you are seeing a lot of ServFails, which queries are these? We also do some histogram stuff, so we take back‑end replies into account so you can see what is going on, at what time and how fast things are. And we have a function called grepq, which stands for grep query, where you have the possibility to quickly go through your DNS data which is in the buffer, which is live, so there is no tcpdump that you have to read at that point to figure out how you have to filter this even further; this is just a way of immediately seeing what is going on in the live traffic. So what you see here is actual output, well, a little bit anonymized, from a test I was running. What you can see is the time, which tells you how many seconds ago this query came in, the client, the back‑end server if this is a reply, the name, the QTYPE and the latency in milliseconds, and the response and some of the flags that were in there. Second example, for instance: Apple.com and 100 milliseconds for grepq, which means give me all the answers we have for anything inside Apple.com which was slower than 100 milliseconds, so it helps you to identify slow back ends or slow upstream authoritative servers.
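The inspection idea described above (a fixed-size ring of recent queries, a "top queries" summary, and a grep-style filter by domain and latency) can be sketched in Python as follows. This is an illustration of the concept only, not DNS‑DIST's implementation or its console API; the class `QueryRing`, the `Entry` fields and the method names are assumptions made for this example.

```python
from collections import Counter, deque
from dataclasses import dataclass
from typing import Deque, List, Optional, Tuple


@dataclass
class Entry:
    """One recorded query or response (hypothetical layout)."""
    client: str
    qname: str
    qtype: str
    latency_ms: Optional[float] = None  # None when no response was recorded


def _normalise(name: str) -> str:
    return name.lower().rstrip(".")


class QueryRing:
    """Keeps only the most recent `size` entries, like a ring buffer."""

    def __init__(self, size: int = 10000) -> None:
        self._entries: Deque[Entry] = deque(maxlen=size)

    def record(self, entry: Entry) -> None:
        # deque(maxlen=...) silently drops the oldest entry when full.
        self._entries.append(entry)

    def top_queries(self, n: int = 5) -> List[Tuple[str, int]]:
        """Most frequently seen names among the buffered entries."""
        counts = Counter(_normalise(e.qname) for e in self._entries)
        return counts.most_common(n)

    def grep(self, domain: str, slower_than_ms: float = 0.0) -> List[Entry]:
        """Entries at or below `domain` whose latency exceeded the threshold."""
        suffix = _normalise(domain)
        hits = []
        for e in self._entries:
            name = _normalise(e.qname)
            in_domain = name == suffix or name.endswith("." + suffix)
            if in_domain and e.latency_ms is not None and e.latency_ms > slower_than_ms:
                hits.append(e)
        return hits
```

Calling `grep("apple.com", 100)` on such a ring mirrors the second example from the slide: everything inside that domain that took longer than 100 milliseconds.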

And so for the ways of limiting traffic on this: pretending it's a firewall, you can drop everything you want. If you are afraid of some domain you can just add a domain block; fill this in in the console. DNS‑DIST has a console, which you can connect to over TCP, and it just presents you with a console like a normal network device, where you can live configure anything that you want for blocking or reading data. We have implemented several, what I find, very interesting ways of dealing with large amounts of traffic. For instance, you have an add action here which adds a query per second IP rule; everything is documented on dnsdist.org. What this rule does: it will limit IPv4 /24 subnets and IPv6 /64 subnets to five queries per second. So if a sixth query is received within the second, the query is dropped and there is no answer going out to the client, limiting any potential response flooding.
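The per-subnet limit described here can be sketched in Python to show the logic: group clients by /24 (IPv4) or /64 (IPv6) and drop anything beyond five queries in a second. DNS‑DIST itself is configured in Lua and its actual rate-limiting algorithm is not described in the talk; the class `SubnetQpsLimiter` and the simple one-second fixed window used below are assumptions for illustration.

```python
import ipaddress
from typing import Dict, Tuple


class SubnetQpsLimiter:
    """Allows at most `limit` queries per second per client subnet."""

    def __init__(self, limit: int = 5, v4_prefix: int = 24, v6_prefix: int = 64) -> None:
        self.limit = limit
        self.v4_prefix = v4_prefix
        self.v6_prefix = v6_prefix
        # subnet -> (second the window started in, queries seen in that second)
        self._windows: Dict[str, Tuple[int, int]] = {}

    def _subnet(self, client: str) -> str:
        addr = ipaddress.ip_address(client)
        prefix = self.v4_prefix if addr.version == 4 else self.v6_prefix
        # strict=False masks the host bits instead of raising an error.
        return str(ipaddress.ip_network(f"{client}/{prefix}", strict=False))

    def allow(self, client: str, now: float) -> bool:
        """Return True to answer the query, False to drop it silently."""
        key = self._subnet(client)
        second = int(now)
        window, count = self._windows.get(key, (second, 0))
        if window != second:  # a new second started: reset the counter
            window, count = second, 0
        count += 1
        self._windows[key] = (window, count)
        return count <= self.limit
```

Two clients in the same /24 share one budget, so a sixth query from either of them within the same second is dropped, while a client in a different subnet is unaffected.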

There is also some more advanced stuff. For instance, here we create a rule that gets all the queries of QTYPE AAAA that have somewhere in the name the word PowerDNS, and we mix it up with a netmask group which has some IP addresses in there, and then if it matches we delay the packet by 900 milliseconds. This is one of the ways where you can, in the example I told you with the firmware update query, just delay the query for 900 milliseconds and send it back to the client; it doesn't like it and will send in a new request, but then you don't get 1,000 queries a second, you just get one or one point something.
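The rule described above combines three conditions (query type, a pattern in the name, and a set of client networks) and attaches a delay when all of them match. A minimal Python sketch of that composition follows; it is not DNS‑DIST's Lua rule syntax, and every name in it (`Query`, `qtype_is`, `name_matches`, `from_networks`, `all_of`, `delay_for`) is hypothetical.

```python
import ipaddress
import re
from typing import Callable, Iterable, NamedTuple, Optional


class Query(NamedTuple):
    client: str
    qname: str
    qtype: str


Selector = Callable[[Query], bool]


def qtype_is(qtype: str) -> Selector:
    return lambda q: q.qtype.upper() == qtype.upper()


def name_matches(pattern: str) -> Selector:
    rx = re.compile(pattern, re.IGNORECASE)
    return lambda q: rx.search(q.qname) is not None


def from_networks(cidrs: Iterable[str]) -> Selector:
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return lambda q: any(ipaddress.ip_address(q.client) in n for n in nets)


def all_of(*selectors: Selector) -> Selector:
    """Logical AND over selectors; OR and NOT can be built the same way."""
    return lambda q: all(s(q) for s in selectors)


def delay_for(query: Query, rule: Selector, delay_ms: int = 900) -> Optional[int]:
    """Return the delay to apply in milliseconds, or None if the rule does not match."""
    return delay_ms if rule(query) else None


# The rule from the talk, expressed with the hypothetical helpers above.
# The network 192.0.2.0/24 is a documentation prefix used as a placeholder.
slow_down = all_of(
    qtype_is("AAAA"),
    name_matches("powerdns"),
    from_networks(["192.0.2.0/24"]),
)
```

A misbehaving client that retries immediately after every answer is then throttled by its own behaviour: with a 900 millisecond delay per answer it can complete only about one query per second.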

There is also a way to automatically block traffic based on patterns. There is a maintenance function that you can put into the configuration file, which is Lua by the way. This one just grabs all the IP addresses that had more than 100 NXDOMAIN responses in less than 10 seconds and adds a dynamic block with a short log message, where all these are blocked for 60 seconds from querying, just thrown to the floor.
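The maintenance logic described here (find clients with more than 100 NXDOMAIN responses in 10 seconds and block them for 60 seconds) can be sketched in Python. In DNS‑DIST this lives in the Lua configuration; the class `DynamicBlocker` and its methods below are hypothetical and only illustrate the bookkeeping, using a sliding window of response timestamps per client.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List


class DynamicBlocker:
    """Temporarily blocks clients that trigger too many NXDOMAIN responses."""

    def __init__(self, threshold: int = 100, window: float = 10.0,
                 block_for: float = 60.0) -> None:
        self.threshold = threshold
        self.window = window
        self.block_for = block_for
        self._nx_times: Dict[str, Deque[float]] = defaultdict(deque)
        self._blocked_until: Dict[str, float] = {}

    def record_response(self, client: str, rcode: str, now: float) -> None:
        if rcode == "NXDOMAIN":
            self._nx_times[client].append(now)

    def is_blocked(self, client: str, now: float) -> bool:
        return self._blocked_until.get(client, 0.0) > now

    def maintenance(self, now: float) -> List[str]:
        """Run periodically; returns the clients that were newly blocked."""
        newly_blocked = []
        for client, times in self._nx_times.items():
            # Forget responses that fell out of the sliding window.
            while times and times[0] <= now - self.window:
                times.popleft()
            if len(times) > self.threshold and not self.is_blocked(client, now):
                self._blocked_until[client] = now + self.block_for
                newly_blocked.append(client)
        return newly_blocked
```

Queries from a blocked client would simply be dropped until the block expires, which matches the "thrown to the floor" behaviour described in the talk.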

We have implemented a lot of traffic selectors, so you can filter the DNS packets based on source addresses, QNAME, QTYPE, any of the flags; in responses, for instance, look in the authority or answer section and see how many entries are there, and if there are too many or too few you can drop the packet or do some other funky stuff with it. The number of labels in the name. We have implemented regular expressions, which in one of the previous slides, you saw PowerDNS there, but it's a full ‑‑ engine so you can do all kinds of things with pattern matching for random sub domain attacks and all these kinds of things. And you can combine all these selectors with and, or, not, so you have an interesting dynamic way of blocking traffic or doing some other actions with it. So I mostly talked about dropping; you can also do other things like routing into other pools. For instance, you have some legacy platform, you put DNS‑DIST in front of it and want to do some DNSSEC validation as well: you can send every query that comes in with the DNSSEC OK bit set to that pool, so you don't get any DNSSEC stuff on your current platform. You can send a truncated response back. It's very easy to implement ANY-to-TC if you have an authoritative platform behind it which does not support that. You can return any kind of status response; for A and AAAA we support adding custom answers, so you can just pop in an answer without any time going to the back end; delay responses; remove some flags before sending it to the back end if it doesn't support several things. You can add the originating IP address as an EDNS client subnet option and pass it off to the back‑end so it knows the original query IP. We are planning to add as well some IP transparency support so you can see the original source address, but that is future work.
The interesting feature is where you can log the query and answer through a TCP connection over protobuf, which allows you to set up a log of all the answers you have received, and it contains the response time from the back‑end etc. So you can do some research on it later on. Or keep it for lawful intercept or just anything. You can do some statistics based on DNS name, strip client subnet if ‑‑ so if you want to play with this, it's open source, it's ‑‑ we have packages, you can download the repository which is, it's normal PowerDNS but not specific just a DNS ‑‑ just man ‑‑ documentation. If you need any help or support or just want to play around with it and see what it can do, we can talk to you, and I will be happy to take any questions.
(Applause)

SHANE KERR: This is Shane Kerr from Oracle. So many questions. I will ask one and go to the back of the line. When you talk about logging through protobuf, is this dnstap you are doing?

PIETER LEXIS: It is not, it is very similar. We are planning on doing full dnstap support, but this was implemented before that was a thing.

AUDIENCE SPEAKER: You don't have to. Phil. I like what you have been doing and I have been having a number of my teams ‑‑ we are about to submit upstream patches to you to do full dnstap and hopefully maybe you can help us make it a bit better. That is one point. The second is, I think you ‑‑ I would like you to consider doing destination filtering,

PIETER LEXIS: Destination address of the packet? Of the original query? That is in there already you can match on destination address.

AUDIENCE SPEAKER: I will find it separately.

PIETER LEXIS: Maybe in master unreleased version. I will find you later.

AUDIENCE SPEAKER: From Netnod, just a clarification question, maybe I missed it. If I remember correctly, DNS‑DIST was like a very old thing that was used for load balancing testing, so is this new development or has this been in the code for a long time?

PIETER LEXIS: The original DNS‑DIST I think was written in 2013ish, where it was a really simple programme that would just round‑robin DNS packets, and at some point we were talking to some people debugging a PowerDNS recursor problem, and they had no problem switching out back ends because they had a load balancer in front of it, and we asked them, are you happy, and they were like, yeah, we are really happy with it, and we started asking around other people and they were like, no, totally sucks. So we figured we could do a nice DNS-aware load balancer that would only do DNS and do it well. Some old code is still in there but it is mostly a rewrite.

SHANE KERR: Shane again. My next question is, in PowerDNS, is this the tool that you use for RRL-like support?

PIETER LEXIS: You can implement RRL in this tool, yes.

SHANE KERR: That leads to my next question. Is there a good standard set of out-of-the-box things? One of the nice things about RRL is you turn it on and you are done. The capabilities you are talking about are really cool for a lot of the people in the room who are very heavily, actively managing their DNS, but there are a lot of people who aren't. So what is the story there?

PIETER LEXIS: I think we are planning to have some default configuration shipped. Right now it's not an experimental tool or just a plaything; actually many people are using it in production, either to alleviate problems in the back end, or we had one, I think Ondrej talked about back ends or auths not doing casing properly, actually fixing up the casing in DNS‑DIST before sending it out to the world, or I think to the back‑end, because the back end doesn't support it.

SHANE KERR: Nice.

AUDIENCE SPEAKER: I saw in slide 9 at your ‑‑ two back, when you have the action condition action thing, so do you have ‑‑ have you considered using a scripting language with packaging and stuff, so people can keep plugging things in and there will be a repository and best practices for that?

PIETER LEXIS: We don't have it yet. It might be a good thing to have some nice standard scripts around to do things, is that the question?

AUDIENCE SPEAKER: Yes.

PIETER LEXIS: Everything is Lua, all these are functions you can call from inside the Lua.

SHANE KERR: Shane again. So there is no Python-to-Lua transpiler for this or something?

PIETER LEXIS: No, sorry.

SHANE KERR: My next question, I think my last one, is: is anyone taking an RPZ feed and feeding it into DNS‑DIST?

PIETER LEXIS: No, there is no support for that.

SHANE KERR: You could write the scripts.

PIETER LEXIS: You could, but that would mean having XFR ‑‑ we have a recursor that does this, so that would not be something interesting to implement. If somebody wants to pick it up, sure.

SHANE KERR: Not in Lua, though. Okay.

AUDIENCE SPEAKER: I was wondering about the comment from Shane on RRL, because if I remember, DNS‑DIST was mainly a load balancer for the resolver, but RRL was actually a mitigation against ‑‑ reflection for the name server. So I guess this is about DNS‑DIST: can you also use that in front of the authoritative name server? And the follow‑up question would be, because I am then thinking how does that impact performance, and if you have any numbers on that?

PIETER LEXIS: So we did some measurements on this, and performance‑wise it's maybe even faster because it has a small cache built in, so we can even cache from the back ends as well. I think there are a couple of large hosters that use it in front of some of their authoritative servers without any issues, so it should just work. And we don't have ‑‑ it does not care if you are in front of an auth or a recursor; most of the limiting actions that you can perform will make sense in one of the cases for the back end, so either auth or recursor.

AUDIENCE SPEAKER: Martin Swissy: You talked in your first slide about what people do now. And you offered something much better and I agree with you. So, based on your experience, it was the first one where there is a loop ‑‑ that is it ‑‑ so I understand that DNS‑DIST administrators will do it better, I believe. So, based on your experience, do you have any hints to help before taking actions, for example to differentiate between legitimate traffic, once you are overloaded, and the traffic you should send to another pool or drop? It might be a little bit disturbing to see how many actions one can take, but is there any way to learn, with those commands, how to differentiate the legitimate traffic from the other ‑‑ based on your experience of course.

PIETER LEXIS: Based on experience it's usually when you are under attack or see some weird things it's very specific so it's not easy to say for in a general case this is legitimate traffic and this is illegitimate traffic because anything can be classified as either legitimate or illegitimate depending how you run your stuff. But using the intra, the inspection tooling you can figure out what is going on and then you have to decide not automatically, that is unfortunate because ‑‑ you are going to have to implement machine learning in Lua. It's cool.

AUDIENCE SPEAKER: And coffee also.

AUDIENCE SPEAKER: I would suggest if you want to do stuff like that that you do some kernel modules and some inspection at that level and then from your learning and sampling that to you there and anomaly detection then generate the Lua.

JAAP AKKERHUIS: This brings us to the end. Thanks.
(Applause)

And we would ‑‑ we are going to see all of you in Dubai I assume, and tomorrow the diversity subject starts at 9:30 so you can sleep in a little bit. Instead of 9:00. See you at dinner. And thanks for...
(Applause).