Will it fail over, or just fail? Some empirical DNS testing

While out to dinner the other night we got to talking about name resolution in Windows, as one does while out enjoying a fine meal with friends… During the discussion we found that one of us had experienced some strange behavior with regard to the prioritization of DNS servers and failover. Specifically, the issue was that the DNS resolver in Windows would not fail over to the next DNS server on the list if the server currently in use became unavailable. This sounded like a major bug, and we were amazed: if this was indeed the case, how come we had not heard about it before? Out came the smartphones, the conversation stopped, the food went cold and the beer warm. After quite some time (at this point the waitress had started sending us long gazes, wondering if we were all stricken by some strange debilitating disease) we still had nothing to back up this case. So there was nothing for it: I went back to the hotel to do some testing. This is what I found…

Resolving names on Windows

It is important to understand that we are talking about the DNS name resolution behavior here when the machine is acting as a DNS client, not a DNS Server. Windows uses two components to resolve, register and cache DNS names. These are the DNS Client and the DNS Resolver. The DNS Resolver, aka the Windows resolver, is part of the TCP/IP protocol and cannot be disabled without disabling the protocol itself. The DNS Client is implemented as a service with the friendly name DNS Client, and a service name of Dnscache. Its description reads as follows:

The DNS Client service (dnscache) caches Domain Name System (DNS) names and registers the full computer name for this computer. If the service is stopped, DNS names will continue to be resolved. However, the results of DNS name queries will not be cached and the computer’s name will not be registered. If the service is disabled, any services that explicitly depend on it will fail to start.

So the DNS Client is responsible for caching the results from the DNS Resolver and for registering the computer’s FQDN in Dynamic DNS. If you stop the DNS Client you can still resolve names, but they will not be cached and a DNS server will be queried each time a name needs to be resolved to an IP address. Needless to say, this may impact performance.

Name resolution works the same for Windows clients (XP, Vista, 7, 8 etc.) and Windows servers (2003, 2008, 2008 R2, 2012). Both servers and clients need to be able to resolve, cache and register network names, and they all do it the same way.

The DNS Server list

Each network adapter in a Windows machine that is bound to either the TCP/IPv4 or TCP/IPv6 protocol has a prioritized list of zero or more DNS servers to which queries to resolve names can be sent. The adapters themselves are also prioritized. A name is resolved using this process:

The DNS Resolver queries the DNS servers in the following order:

  1. The DNS Resolver sends the name query to the first DNS server on the preferred adapter’s list of DNS servers and waits one second for a response.
  2. If the DNS Resolver does not receive a response from the first DNS server within one second, it sends the name query to the first DNS servers on all adapters that are still under consideration and waits two seconds for a response.
  3. If the DNS Resolver does not receive a response from any DNS server within two seconds, the DNS Client service sends the query to all DNS servers on all adapters that are still under consideration and waits another two seconds for a response.
  4. If the DNS Resolver still does not receive a response from any DNS server, it sends the name query to all DNS servers on all adapters that are still under consideration and waits four seconds for a response.
  5. If the DNS Resolver does not receive a response from any DNS server, the DNS client sends the query to all DNS servers on all adapters that are still under consideration and waits eight seconds for a response.

If the DNS Resolver receives a positive response, it stops querying for the name, adds the response to the cache (via the DNS Client service) and returns the response to the client.

If the DNS Resolver has not received a response from any server within eight seconds, the DNS Resolver responds with a timeout.
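The schedule above can be sketched as a small simulation. This is only a rough model of the documented steps, not of the real resolver: the function names are made up for illustration, answers are assumed to arrive instantly, and the "still under consideration" rule for adapters is ignored.

```python
# Wait time in seconds after each of the five documented attempts.
ATTEMPT_TIMEOUTS = [1, 2, 2, 4, 8]


def servers_for_attempt(attempt, adapters):
    """Return the servers queried on a given attempt (0-based).

    adapters is a list of DNS server lists, preferred adapter first.
    """
    if attempt == 0:
        # Step 1: only the first server on the preferred adapter.
        return adapters[0][:1] if adapters else []
    if attempt == 1:
        # Step 2: the first server on every adapter.
        return [servers[0] for servers in adapters if servers]
    # Steps 3 to 5: every server on every adapter.
    return [server for servers in adapters for server in servers]


def simulate(adapters, responsive):
    """Return (answering_server, seconds_elapsed), or (None, total) on timeout."""
    elapsed = 0
    for attempt, timeout in enumerate(ATTEMPT_TIMEOUTS):
        for server in servers_for_attempt(attempt, adapters):
            if server in responsive:
                return server, elapsed
        elapsed += timeout
    return None, elapsed


if __name__ == "__main__":
    one_adapter = [["192.168.131.10", "192.168.131.11"]]
    print(simulate(one_adapter, {"192.168.131.10", "192.168.131.11"}))
    print(simulate(one_adapter, {"192.168.131.11"}))
    print(simulate(one_adapter, set()))
```

Adding up the waits gives 1 + 2 + 2 + 4 + 8 = 17 seconds before a timeout is reported. Note that a literal reading of the steps on a single adapter would not reach the second server until the third attempt, so treat this as an illustration of the documented schedule only; what a real machine does is best checked with a network capture.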

The Case

The case we are exploring here is as follows:

On a Windows machine with one network adapter, bound to the TCP/IPv4 protocol and with two or more DNS servers specified in its TCP/IP properties, if the primary DNS server does not respond, the machine will not fail over to the secondary DNS server and will be unable to resolve names.

I used this setup to test.

  • 2 Windows Server 2012 DC/DNS servers named MAYA1 and MAYA2 with the addresses 192.168.131.10 and 192.168.131.11 respectively. Both have the same DNS zones and configuration, i.e. they will both answer with the same information when queried.
  • A Windows 8 Professional client, named MAYA-CLIENT1, with a dynamically assigned IP address from MAYA1, acting as a DHCP server, and 2 dynamically configured DNS servers (MAYA1 and MAYA2) in that order.
  • All machines on the same subnet.
  • No Internet access
  • Resolving names from a zone the two DNS servers were authoritative for
  • Network Monitor on the Windows 8 client used to capture network traffic
  • All servers and client were VMs running on Hyper-V
  • 2 DNS names for testing; nothere1 (1.1.1.1) and nothere2 (1.1.1.2)


Testing

The test I did was very simple. With Network Monitor running on the client I first pinged the name nothere1. In Network Monitor I verified that the response had come from MAYA1 (the configured primary DNS server). I emptied the DNS cache on the client and disconnected MAYA1 from the network. I then pinged nothere1 again, using Network Monitor to see which server answered.

It should come as no surprise that this worked exactly as expected. Under normal conditions Windows will use its configured primary DNS server. If that server fails, Windows will use the next configured server on the list after a short delay.

Here is the process in Network Monitor


Frame  Operation
15     Query from client to primary DNS server for the name nothere1
16     Answer from primary DNS server to client with the IP address of nothere1 (1.1.1.1)
       <Primary DNS server MAYA1 disconnected from network and the DNS Cache is emptied on the client>
31     New query from client to primary DNS server for the name nothere1
34     Since no reply has been received from the primary DNS server for 1 second a new query is sent to the secondary DNS server MAYA2 (192.168.131.11) for the name nothere1
37     Answer from secondary DNS server to client with the IP address of nothere1 (1.1.1.1)
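If you do not have Network Monitor at hand, a rough way to see which server answers is to send the query yourself. The sketch below builds a bare-bones DNS query for an A record and tries each server in order with a one-second timeout. It is a simplified sequential failover, not a reimplementation of the Windows resolver, and the function names are made up for illustration. The bare name nothere1 is sent as-is here; a real client would normally append its DNS suffix, which you would have to add yourself.

```python
import socket
import struct


def build_query(name, txid=0x1234):
    """Build a minimal DNS query for an A record with recursion desired."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    # Terminating zero-length label, then QTYPE=A (1) and QCLASS=IN (1).
    question = labels + b"\x00" + struct.pack("!HH", 1, 1)
    return header + question


def query_with_failover(name, servers, timeout=1.0, port=53):
    """Try each server in order; return the first one that replies, else None."""
    packet = build_query(name)
    for server in servers:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            try:
                sock.sendto(packet, (server, port))
                reply, _ = sock.recvfrom(512)
            except OSError:
                # Timeout or unreachable server: move on to the next one.
                continue
            if reply[:2] == packet[:2]:  # transaction ID matches our query
                return server
    return None


if __name__ == "__main__":
    # MAYA1 and MAYA2 from the test setup, in configured order.
    print(query_with_failover("nothere1", ["192.168.131.10", "192.168.131.11"]))
```

With MAYA1 disconnected, the script should print the address of MAYA2 after roughly one second, which is the same pattern the capture shows.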

 

Notes

  • These tests were performed with Windows 8 as the resolving client. It is quite possible that an earlier version of Windows behaves differently, but nothing I have found suggests so. Testing of that will have to wait for the time being, though.
  • I did come across people who reported variations of the problem originally stated in this post, but none with the exact same result. Some people were able to log on to their domain and resolve names successfully when the primary DNS server was offline, but were baffled that nslookup did not work. I guess they didn’t know that nslookup always queries the primary DNS server.
  • Some documentation claims that it is the DHCP Client service (service name Dhcp) that registers the computer’s FQDN with DNS. I have tested this on the setup used in this article and have not been able to reproduce that behavior. The article “How to configure DNS dynamic updates in Windows Server 2003” claims that the DHCP Client service registers the name even for statically configured addresses. A funny thing here is that the service description for the DHCP Client service also claims that it registers addresses: “Registers and updates IP addresses and DNS records for this computer. If this service is stopped, this computer will not receive dynamic IP addresses and DNS updates. If this service is disabled, any services that explicitly depend on it will fail to start.” I also remember reading about this in the Windows 2000 days, so maybe this was the way it worked before?

Conclusion

So far it looks like Windows behaves exactly as you would expect! That is always nice.


Fun exercise

To see a demonstration of how the DNS Resolver and the DNS Client work together, try this:

  1. Stop and disable the DNS Client service on a machine
    Just stopping it will not work because Windows will restart it as soon as you start resolving names.
  2. Ping your favorite DNS name and marvel at Windows’ ability to resolve the name without the benefit of the cache.
  3. Run ipconfig /displaydns to display the DNS cache
    Instead of the usual list of cached DNS names you will see this error: Could not display the DNS Resolver Cache.
  4. Run ipconfig /flushdns to clear the DNS cache:
    No dice; this error pops up: Could not flush the DNS Resolver Cache: Function failed during execution.
  5. Run ipconfig /registerdns to register the current computer’s name with a DNS server:
    Fail: Registration of DNS records failed: The binding handle is invalid.
  6. Sit back and enjoy your deep understanding of DNS on Windows.
  7. Re-enable and restart your DNS Client service.

MVP Award!

I just got word that I have been awarded the Microsoft Most Valuable Professional (MVP) Award for 2012, in the Directory Services discipline. This is a great honor, and I accept it humbly.

First off, thanks to those who nominated me and gave me advice on how to become an MVP, and also Microsoft for finding me worthy. I am really looking forward to connecting with the MVP community, and with the Directory Services group particularly. Also, thanks to everyone who has congratulated me on the award.

I have a lot of new ideas for new community content, so keep watching this space!

TechEd 2012: Day 4

Last day!

It has been a fantastic conference! A lot of interesting sessions for every timeslot.

I started out with Marcus Murray’s session about Advanced Persistent Threats (SIA303). I have been disappointed with Marcus’ sessions at earlier TechEds but this time I was positively surprised. Marcus, among other things, gave a good rundown on the RSA attack. Good session.

Being in a geeky mood, I next went to see Aaron Margosis and his Sysinternals Primer: Gems session (SIA311). I have read a lot of Aaron’s stuff before and it was great to finally see him in person. Aaron has also written a new book about the Sysinternals tools which I’m planning to get. If you’re interested, it’s called Windows Sysinternals Administrator’s Reference.

In keeping with TechEd tradition the guys responsible for the agenda had placed Mark Russinovich’s last session at the very end of the conference. No doubt to make as many people as possible stay for as long as possible. I never leave before the conference ends, so this was no problem for me. The session was the 2012 edition of the Case of the Unexplained series. Mark had all new cases and it was a very fun session.

The last session of the conference was Andy Malone’s Cryptographic Chronicles, part 2 (SIA401). This was a continuation of an earlier session. This time Andy promised beer to whoever could break his cipher challenge. I did my best, but was unable to break it this time. (No one else did either, so I wasn’t too sad.) The session itself was very interesting and a nice conclusion to the conference.

We walked back to the city center, had some dinner, and went home. All in all a great week, both at the conference, and in Amsterdam. Already looking forward to next year!

TechEd 2012: Day 3

Day 3 and still going strong. Great conference so far!

The day started with another good session from Samuel Devasahayam. This time it was SIA312: What’s New in Active Directory in Windows Server 2012. A lot of new cool features in AD for Server 2012, first among which must be Dynamic Access Control.

After that Mark Russinovich gave a very interesting session about Windows Azure internals (AZR302). Azure is really a great service and it was very cool to hear a little about how it is built and run. Mark even had a new Dave Cutler story.

John Craddock was also at TechEd and before lunch I attended his “Windows Server 2012: A Techie’s Insight into the Hot New Features”. This was an OK session where John went through what he felt were the hottest new features in Windows Server 2012. His selection was DirectAccess, Compound Tokens and Dynamic Access Control. You have to see John at least once every time you’re at TechEd.

During lunch we participated in Andy Malone’s interactive session about BYOD: SIA04-LNC Adventures in BYOD Land. This would have worked much better in a smaller room and with fewer people, as the atmosphere in the big theater room at the RAI didn’t encourage the audience to speak up. Especially since Andy is known to speak his mind afterwards. But it was still interesting.

The last session of the day for me was Mark Russinovich’s “Malware Hunting with the Sysinternals Tools”. This was the best one so far in the conference. Mark gave us an update on the tools, and went on to dissect some of the latest malware, e.g. Stuxnet and Flame. We even managed to get a picture with the man! (I’m to his right in the picture, with the white shirt and glasses!)

TechEd 2012: Day 2

Lovely weather in Amsterdam on our second day at Microsoft TechEd 2012.

Unfortunately I missed the second keynote (will have to catch that on video later).

Since I really like IPv6 I chose Edward Horley’s talk WCL324: IPv6 Bootcamp: Get Up to Speed Quickly as my first item of business. Ed, whom I have not seen before, delivered a great talk and had some very good advice on IPv6 deployments. I fully agree with his recommendation of disabling the IPv6 transition technologies, but leaving IPv6 itself enabled. You will have much more predictable network behavior then.

Next up I got a chance to meet an old hero of mine: Dr. Tom Shinder. Tom was/is the leading expert on ISA Server/TMG products. I have read his book on ISA 2000 and probably all of his blog posts on the subject. Tom has lately begun spending more time as an architect and gave a very interesting session on the foundations of cloud computing: AAP304: Private Cloud, Principles, Concepts and Patterns. The session basically explained why the System Center suite does what it does. I got to shake the good man’s hand later and have a quick chat. Unfortunately I didn’t have a chance to get a picture with him. The camera on my HTC HD7 is just too poor.

In my first session after lunch, SIA200: Cyber Security Defenses: What Works Today, I also got a chance to meet another speaker that I have heard a lot about: Roger Grimes. Roger wrote the Protect Your Windows Network book together with Jesper Johansson. This is a great book that I highly recommend to anyone interested in security. Roger, and his co-presenter Mark Simos, gave us a fantastic 75 minutes of security information. Loved this session!

The day’s last session was Andy Malone’s SIA400 The Cryptography Chronicles: Explaining the Unexplained, Part 1. This is one of the very few level 400 sessions at TechEd and it was great. I am really fascinated by cryptography and Andy gave a really good introduction. He also had a little competition for the audience with a Caesar substitution cipher (which I was able to solve!). Unfortunately it had some mistakes, so the plaintext read WE ARE AKK GEEGS instead of the intended WE ARE ALL GEEKS. I talked to Andy later and he admitted that it had been pretty late when he put that one together. Part 2 will be on Friday.
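For the curious, a Caesar cipher is easy to attack because there are only 26 possible shifts, so you can simply try them all and pick the one that reads as English. Here is a minimal sketch. The shift of 3 used in the example is an arbitrary choice of mine, not the one Andy used, and the function names are made up.

```python
import string


def caesar_shift(text, shift):
    """Shift every uppercase letter by the given amount, leaving the rest alone."""
    shifted = []
    for ch in text:
        if ch in string.ascii_uppercase:
            shifted.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            shifted.append(ch)
    return "".join(shifted)


def brute_force(ciphertext):
    """Return every (shift, candidate plaintext) pair for a human to scan."""
    return [(shift, caesar_shift(ciphertext, -shift)) for shift in range(26)]


if __name__ == "__main__":
    secret = caesar_shift("WE ARE ALL GEEKS", 3)  # arbitrary example shift
    for shift, candidate in brute_force(secret):
        print(shift, candidate)
```

One of the 26 printed lines will be the readable plaintext, which is how a cipher with a couple of wrongly encoded letters can still be solved.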

We finished the day with a nice cruise on the Amsterdam canals.

TechEd 2012: Day 1

First day of the conference proper and we are, to use a Microsoft term, superexcited.

First item of business was the keynote delivered by Brad Anderson and a few others. Much attention was given to the upcoming launch of Windows Server 2012 and the new System Center 2012 Suite. Microsoft has a great story here and I’m having trouble seeing other companies with the same capabilities.

Next for me was Mark Russinovich’s Introduction to Windows Azure Virtual Machines (AZR208). This is a really interesting new capability of Azure, which also puts Microsoft in the IaaS market (previously Azure was only PaaS). Mark delivered a great talk as always. The man’s knowledge is impressive.

During lunch we caught Mark Russinovich and Scott Guthrie’s live chat about Windows Azure. Questions were put to the panel via Twitter. That was really cool. I even got some of my own questions answered!

Hacking sessions are always fun, so after lunch I went to see Paula Januszkiewicz’ talk entitled “Crouching Admin, Hidden Hacker; Technologies for Hiding and Detecting Traces (SIA301)”. This was a huge letdown as Paula just kept repeating old, well-known security axioms like “Have a dedicated admin account” and “Don’t log on to workstations with your admin user”. Very boring and disappointing. As far as I could tell she didn’t cover any of the stuff the title of her talk indicated. Paula’s demo of Stuxnet (if you can call letting your VM become infected a demo) failed, but that was due to the recently discovered deactivation feature of Stuxnet, although we did not know this at the time.

The last session of the day was SIA205 Running Active Directory on Windows Azure Virtual Machines. The talk was delivered by Samuel Devasahayam from the Directory Services team. Sam was a great guy and covered all the stuff you need to know before you deploy AD on Azure VMs. We met Sam later outside the conference center and had a nice chat.

Looking forward to tomorrow!

TechEd Europe 2012: Day 0

Today was pre-conf day. A whole day of focusing on a single subject. John Craddock delivered the goods in my selected session: Building Federated External Access for Microsoft SharePoint 2010. The TechExpo and the Hands On Labs areas were not open yet, so the conference hasn’t really started.

Very much looking forward to tomorrow, which will start off with the main keynote. That’s always fun. Later in the day I plan to catch some sessions on the new features in Windows Azure, and Mark Russinovich will deliver one of them!

TechEd Europe 2012: T -1

Hello again

1 day until @teched_europe starts!

Another nice day in Amsterdam, although it rained a lot today. Went down to the @AmsterdamRAI conference center to register for the event. New this year is self service registration, Microsoft taking lessons from the airline industry I guess. The RAI seems nice, but everything was pretty much closed, will see more tomorrow.

A few friends arrived today also. We’re going to have dinner and see England vs. Italy in the European Cup.

TechEd Europe 2012: T -2

Hello all!

So it’s time for the long awaited return of the @teched_europe Conference here in Europe. This time in the wonderful city of Amsterdam, the Netherlands. It has been a long wait for us techno geeks since Microsoft decided to push the conference back to the summer timeframe instead of autumn. 18 months to wait is a long time!

Arrived in Amsterdam today, nice hotel, terrible Internet (writing this at the McDonald’s next door that has free WiFi). Taking in the sights today. More to follow!

If you want to know more about the Conference, have a look at: http://europe.msteched.com/

2 days until TechEd starts!

What’s special about the builtin Administrator account?

Every installation of Windows based on the Windows NT code base has a builtin admin account called Administrator. Every installation of Active Directory Domain Services also has a builtin admin account called Administrator. (If you are running a version of Windows other than English, your accounts may be named something else.) This account provides complete access to files, directories, services, and other facilities. But are there other things that make these accounts special?

  • The Relative Identifier (RID) is always 500
    In Windows each Security Principal is identified with a Security Identifier or SID. The SIDs have two parts: the machine or domain component and the Relative Identifier (RID). The RID is simply a whole number incremented by one each time a new Security Principal, typically a group or user, is created. The builtin Administrator accounts, whether they are in a local SAM database or in Active Directory, always have the RID 500. This means that if you know the domain or machine component of the SID, you also know the full SID of the builtin Administrator. From there it is easy to do a “reverse lookup” and find the actual username of the builtin Administrator, and then to start trying to break into it. (Some older code even lets you authenticate with the SID directly, as opposed to a username.) See next bullet.
  • The account cannot be locked out
    The builtin Administrator account cannot be locked out of the system no matter how many failed logon attempts it accumulates. This makes it a prime target for brute force attacks. Auditing can help you find out if someone is trying to do a brute force attack using the builtin Administrator account. Other, manually created, administrator accounts can be locked out, and therefore do not present a similar threat. Renaming your builtin Administrator account will afford you some protection, but be aware of the limitations of this; see previous bullet.
  • The account cannot be deleted
    At least not using the default Windows tools.
  • The account is disabled by default on client OSs as of Windows Vista
    In Windows Vista and onwards, the builtin Administrator account does not have a password and is disabled. During setup you are required to create at least one new account, and this account becomes an administrator.
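To make the RID 500 point concrete, here is a small sketch of how the builtin Administrator’s SID follows from the machine or domain part of any other SID. The helper names and the example SID are made up for illustration; only the RID value 500 and the S-1-5-21 prefix used for machine and domain SIDs are fixed by Windows.

```python
BUILTIN_ADMIN_RID = 500


def split_sid(sid):
    """Split a SID string into (machine_or_domain_part, rid)."""
    prefix, _, rid = sid.rpartition("-")
    return prefix, int(rid)


def builtin_admin_sid(domain_sid):
    """Derive the builtin Administrator SID from a machine or domain SID."""
    return f"{domain_sid}-{BUILTIN_ADMIN_RID}"


def is_builtin_admin(sid):
    """True if the SID is a machine or domain SID ending in RID 500."""
    prefix, rid = split_sid(sid)
    return prefix.startswith("S-1-5-21-") and rid == BUILTIN_ADMIN_RID


if __name__ == "__main__":
    # Made-up SID of some ordinary user in a made-up domain.
    some_user = "S-1-5-21-1111111111-2222222222-3333333333-1104"
    domain_part, rid = split_sid(some_user)
    print(builtin_admin_sid(domain_part))
```

So knowing the SID of any account in the domain is enough to work out the SID of the builtin Administrator, whatever the account has been renamed to.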