The Gratuitous ARP…some more…

The current list of potential fixes for this issue is very short, and the fixes do not produce entirely consistent results.

Of note in the Server Fault thread on this issue is that Microsoft created a hotfix. In our testing, this hotfix didn’t correct the issue, though the thread does report successful implementations of it.

From the Cisco and Microsoft forums, the following fix can also be attempted: “craft an NDIS driver and push out our own GARP with the SPA set whenever we completed asserting an IP address.” We haven’t gone down this path yet, but it seems several companies have implemented this fix with success.
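To make “GARP with the SPA set” a little more concrete, here is a rough sketch of what such a frame looks like on the wire. This is not the NDIS driver approach from the forums, just an illustration of the packet layout in Python. The function name, the MAC address and the IP address are all made up for the example, and the code only builds the bytes; actually sending them would need a raw socket or a driver.

```python
import socket
import struct


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request.

    Hypothetical helper for illustration only. In a gratuitous ARP the
    sender protocol address (SPA) and target protocol address (TPA) are
    both the address being announced.
    """
    sha = bytes.fromhex(mac.replace(":", ""))  # sender hardware address
    spa = socket.inet_aton(ip)                 # sender protocol address
    broadcast = b"\xff" * 6                    # Ethernet broadcast destination

    # Ethernet header: destination, source, EtherType 0x0806 (ARP)
    eth_header = broadcast + sha + struct.pack("!H", 0x0806)

    # ARP payload: Ethernet (1), IPv4 (0x0800), address lengths 6 and 4,
    # opcode 1 (request), then SHA, SPA, THA (zeroed), TPA (same as SPA)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1, 0x0800, 6, 4, 1,
        sha, spa, b"\x00" * 6, spa,
    )
    return eth_header + arp_payload


# Example values only (192.0.2.0/24 is a documentation range)
frame = build_gratuitous_arp("00:11:22:33:44:55", "192.0.2.10")
print(len(frame), frame.hex())
```

The point of the sketch is simply that the SPA carries the real IP address being claimed, which is what the forum fix relies on.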

We’ve created an in-house solution that has had inconsistent results. I’m hoping further testing and development will allow us to come up with a much more robust solution for this issue.

The Gratuitous ARP cont…

Continuing on with working out what is involved with the Gratuitous ARP.

Secondly, it appears that the Windows neighbor cache (ARP cache) is only updated if the machine can no longer talk to the machine that is currently in its cache. It does not send out occasional ARP requests to make sure the cache is not stale.

This is the real issue in our current example. With the other systems not updating their caches in a timely manner (in this case, a couple of seconds is the most it can be down), we’ve effectively black-holed the new server, IP-wise.

Apparently, this is a known change in how Windows Server 2008 handles ARP caching. It does not seem to be a highly publicized change, but there is a very good TechNet article on it. The fundamental change is in how the SPA (sender protocol address) field is set, so that ARP or neighbor caches are not updated with incorrect information. A more thorough analysis of a real example can be found here.

The next part will look at the current ways to work around this issue.

The Gratuitous ARP

We currently face an issue where everyone is throwing around the term ‘Gratuitous ARP.’ The problem is causing us trouble when rolling database instances to new hardware; another example would be moving IP addresses in a cluster scenario.

The real issue is how Windows Server 2008 R2 behaves.

First, a Windows Vista or Windows Server 2008 machine will not update its neighbor cache when an ARP broadcast is received unless it is part of a broadcast ARP request for the receiver. What this means is that when a gratuitous ARP is sent on a network with Windows Vista and Windows Server 2008, these systems will not update their caches with incorrect information if there is an IP address conflict.
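A toy model helped me reason about that rule. The sketch below is only my reading of the behavior described above, not how the Windows stack is actually implemented; the class, the function names and all addresses are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class ArpPacket:
    oper: int  # 1 = request, 2 = reply
    sha: str   # sender hardware (MAC) address
    spa: str   # sender protocol (IP) address
    tpa: str   # target protocol (IP) address


def is_gratuitous(pkt: ArpPacket) -> bool:
    # A gratuitous ARP announces the sender's own address: SPA == TPA
    return pkt.spa == pkt.tpa


def apply_broadcast_arp(cache: dict, my_ip: str, pkt: ArpPacket) -> bool:
    """Simplified model (an assumption): a broadcast ARP updates the
    cache only when it is a request whose target is the receiver."""
    if pkt.oper == 1 and pkt.tpa == my_ip:
        cache[pkt.spa] = pkt.sha
        return True
    return False


# A client that still holds the old server's MAC for the moved IP
cache = {"192.0.2.10": "00:aa:aa:aa:aa:aa"}
my_ip = "192.0.2.50"

# The new server announces itself with a gratuitous ARP
garp = ArpPacket(oper=1, sha="00:bb:bb:bb:bb:bb",
                 spa="192.0.2.10", tpa="192.0.2.10")
garp_updated = apply_broadcast_arp(cache, my_ip, garp)

# The new server sends an ARP request aimed at this client instead
direct = ArpPacket(oper=1, sha="00:bb:bb:bb:bb:bb",
                   spa="192.0.2.10", tpa=my_ip)
stale_after_garp = dict(cache)
direct_updated = apply_broadcast_arp(cache, my_ip, direct)
```

In this model the gratuitous ARP leaves the stale entry in place, while a request aimed directly at the client replaces it, which matches the symptom of the new server being unreachable after a move.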

…more to come after I sleep. An extra long day in which many things were broken.

The Chain

I read an interesting blog post yesterday about writing every day on topics you are learning about. The post was from Chris Strom and it can be found here.

I like the idea of writing every day to help understand what I’m learning. I am doing this entirely for myself, and if I happen to help someone else along the way, that will be awesome!

I’m still working out the early topics, and I certainly want a list going ahead of time before I commit myself to a long-term project.

For starters, I’m going to use the vast number of topics we come up with every day at work. A small group of software engineers with very strong opinions about a wide range of topics…I should always have a large amount of material to work with.

