Group Policy Fails For One User

Recently I started noticing a very strange problem: user Group Policy was not applying for a single user. All other users got their GPOs just fine, but for this one user they failed to apply. This manifested itself in a couple of interesting ways:

  • When you run gpresult under this user's account, you get an error saying that the user does not have RSOP data
  • If you use folder redirection via GPO, as I was, this user suddenly does not get folder redirection applied. The user's "My Documents" folder is all of a sudden a local copy under %userprofile%, and it has nothing in it. If the user goes directly to the network-based documents folder, the original documents are of course all there.

As always, the first step in troubleshooting any Group Policy problem is to turn on UserEnvDebugLevel. I set it to 0x00030002 to get as much information as possible. I then ran gpupdate /force /target:User. Upon examining the userenv.log file I found the following information:

USERENV(c74.bb0) 11:54:24:953 LibMain: Process Name:  C:\WINDOWS\system32\gpupdate.exe
USERENV(c74.e70) 11:54:24:985 RefreshPolicyEx: Entering with force refresh 0
...
USERENV(4bc.ed4) 11:54:29:811 ProcessGPOs: GetGPOInfo failed.
USERENV(4bc.ed4) 11:54:29:811 ProcessGPOs: No WMI logging done in this policy cycle.
USERENV(4bc.ed4) 11:54:29:811 ProcessGPOs: Processing failed with error 997.
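
If the log is long, picking out the failures by hand gets old quickly. Below is a minimal Python sketch, not something used in the original troubleshooting, that pulls the failing lines and any numeric error codes out of a userenv.log excerpt. The regular expressions are assumptions based only on the lines shown above, and the function names (parse_userenv, failures, error_codes) are made up for illustration. For reference, 997 is the Win32 code ERROR_IO_PENDING ("Overlapped I/O operation is in progress"), which by itself does not explain why GetGPOInfo failed.

```python
import re

# Pattern inferred from the excerpt above (an assumption, not a documented format):
# USERENV(<pid>.<tid>) <hh:mm:ss:ms> <Function>: <message>
LOG_LINE = re.compile(
    r"^USERENV\((?P<pid>[0-9a-f]+)\.(?P<tid>[0-9a-f]+)\)\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}:\d{3})\s+"
    r"(?P<func>\w+):\s+(?P<msg>.*)$"
)
ERROR_CODE = re.compile(r"failed with error (\d+)")

# Only the one code seen in this post; extend as needed.
KNOWN_ERRORS = {997: "ERROR_IO_PENDING"}

SAMPLE = r"""
USERENV(c74.bb0) 11:54:24:953 LibMain: Process Name:  C:\WINDOWS\system32\gpupdate.exe
USERENV(c74.e70) 11:54:24:985 RefreshPolicyEx: Entering with force refresh 0
...
USERENV(4bc.ed4) 11:54:29:811 ProcessGPOs: GetGPOInfo failed.
USERENV(4bc.ed4) 11:54:29:811 ProcessGPOs: No WMI logging done in this policy cycle.
USERENV(4bc.ed4) 11:54:29:811 ProcessGPOs: Processing failed with error 997.
"""


def parse_userenv(lines):
    """Return a dict per recognizable log line; anything else is skipped."""
    entries = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries


def failures(entries):
    """Entries whose message mentions a failure."""
    return [e for e in entries if "failed" in e["msg"]]


def error_codes(entries):
    """Numeric codes from 'failed with error N' messages, in log order."""
    codes = []
    for entry in entries:
        found = ERROR_CODE.search(entry["msg"])
        if found:
            codes.append(int(found.group(1)))
    return codes


if __name__ == "__main__":
    parsed = parse_userenv(SAMPLE.splitlines())
    for entry in failures(parsed):
        print(entry["time"], entry["func"], entry["msg"])
    for code in error_codes(parsed):
        print(code, KNOWN_ERRORS.get(code, "unknown"))
```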

Error messages are good. Do a web search for error 997 and you find, well, not much. There is a newsgroup post where some guy has the same problem, and fixes it by deleting the user account and creating a new one. Thanks but no thanks. I'm not much for solutions to a small cut that involve amputating the limb the cut happens to be on.

Instead I went to the event logs on the DC, where I noticed relatively quickly, by filtering for event ID 540, that this user was not logging on with Kerberos. For some reason, when this user logged on, a quick Kerberos logon happened, followed by an almost immediate logoff, and then an NTLM logon. This is why Group Policy was not being applied: the user was not logging on with Kerberos, and without Kerberos you do not get Group Policy. Unfortunately, the event log does not tell us why this is happening.

At this point, I looped in my friend Jimmy, who has forgotten far more about Active Directory than I will ever know. Jimmy immediately suggested turning on some Kerberos debugging, which sounded like a really good idea. We used the following parameters:

Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\Kerberos\Parameters
Value: LogLevel
Type: REG_DWORD
Data: 1

Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Kdc
Value: KdcExtraLogLevel
Type: REG_DWORD
Data: 4

Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Kdc
Value: KdcDebugLevel
Type: REG_DWORD
Data: 1

These values need to be in the specific keys listed here. I know that sounds obvious, but it must not have been so obvious to the authors of this KB article:
http://support.microsoft.com/kb/887993/en-us
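
Since getting the keys right matters, it can help to keep the three values above in one place. The following is a minimal Python sketch that renders them as .reg file text; it only builds a string and touches no registry. The render_reg function and the SETTINGS list are illustrative names, and the output should be reviewed before it is imported anywhere.

```python
# The three debugging values listed above, as (key, value name, DWORD data).
SETTINGS = [
    (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\Kerberos\Parameters",
     "LogLevel", 1),
    (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Kdc",
     "KdcExtraLogLevel", 4),
    (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Kdc",
     "KdcDebugLevel", 1),
]


def render_reg(settings):
    """Render (key, name, dword) tuples as .reg text, grouping consecutive keys."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    current_key = None
    for key, name, value in settings:
        if key != current_key:
            if current_key is not None:
                lines.append("")  # blank line between key sections
            lines.append(f"[{key}]")
            current_key = key
        # DWORD data is written as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(render_reg(SETTINGS))
```

Keeping the key path next to each value name makes it harder to put a value under the wrong key.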

Once we did that we got this error in the event log:

Event Type: Error
Event Source: Kerberos
Event Category: None
Event ID: 3
Date:  11/25/2006
Time:  10:15:44
User:  N/A
Computer: <server>
Description:
A Kerberos Error Message was received:
         on logon session
 Client Time:
 Server Time: 18:15:44.0000 11/25/2006 Z
 Error Code: 0xd KDC_ERR_BADOPTION
 Extended Error: 0xc00000bb KLIN(0)
 Client Realm:
 Client Name:
 Server Realm: <domain>
 Server Name: host/<dc.domain>

 Target Name: host/<dc.domain>@<domain>
 Error Text:
 File: 9
 Line: ae0
 Error Data is in record data.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 30 15 a1 03 02 01 03 a2   0.¡....¢
0008: 0e 04 0c bb 00 00 c0 00   ...»..À.
0010: 00 00 00 03 00 00 00      .......
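
The record data is worth a closer look. The bytes appear to be a small DER-encoded structure: a SEQUENCE holding an INTEGER tagged [1] and an OCTET STRING tagged [2]. The sketch below is a minimal Python decoder that handles only the short-form lengths present in this dump. Reading the 12-byte payload as three little-endian 32-bit words is an assumption on my part, though the first word does match the Extended Error line in the event. 0xC00000BB is the NTSTATUS value STATUS_NOT_SUPPORTED.

```python
import struct

# The 23 bytes from the "Data:" section of the event above.
RECORD_HEX = """
30 15 a1 03 02 01 03 a2
0e 04 0c bb 00 00 c0 00
00 00 00 03 00 00 00
"""


def read_tlv(buf, pos):
    """Read one DER tag-length-value at pos; short-form lengths only."""
    tag = buf[pos]
    length = buf[pos + 1]
    if length & 0x80:
        raise ValueError("long-form lengths are not handled in this sketch")
    start = pos + 2
    end = start + length
    if end > len(buf):
        raise ValueError("truncated data")
    return tag, buf[start:end], end


def decode_record(buf):
    """Return (data_type, (word0, word1, word2)) from the event record data."""
    tag, sequence, _ = read_tlv(buf, 0)
    if tag != 0x30:
        raise ValueError("expected a SEQUENCE")
    tag, field1, pos = read_tlv(sequence, 0)   # context tag [1]
    if tag != 0xA1:
        raise ValueError("expected context tag [1]")
    _, int_bytes, _ = read_tlv(field1, 0)      # INTEGER
    data_type = int.from_bytes(int_bytes, "big", signed=True)
    tag, field2, _ = read_tlv(sequence, pos)   # context tag [2]
    if tag != 0xA2:
        raise ValueError("expected context tag [2]")
    _, octets, _ = read_tlv(field2, 0)         # OCTET STRING, 12 bytes here
    # Assumption: the payload is three little-endian unsigned 32-bit words.
    words = struct.unpack("<3I", octets)
    return data_type, words


if __name__ == "__main__":
    raw = bytes.fromhex("".join(RECORD_HEX.split()))
    data_type, words = decode_record(raw)
    print("data type:", data_type)
    print("words:", [hex(w) for w in words])
```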

More error messages. As we mentioned before, the only thing better than error messages is not having the problem in the first place. Unfortunately, we got nothing else here. There is virtually no information available on the KDC_ERR_BADOPTION error, just like we had no information on the error 997 that we saw earlier.

This is where we got stuck. We have two very interesting error messages, but no information on what caused them. However, one thing seems really clear: the error is client-based and happens only for a single user. In addition, we are seeing it on two different client computers for the same user, who has a roaming profile. This allows us to discard several possible sources of the problem:

  • It is almost certainly not related to the client computer since it happens on two different computers
  • It is almost certainly not related to the DC since no other users are affected by this

The only remaining variable is the user. With a roaming profile we should expect any corruption in the user's profile to manifest itself on more than one computer. To test this hypothesis we logged the user off, deleted the locally cached copy of the profile on the client, and renamed the server copy to something else. When the user next logs on, the client creates a new profile for the user based on the local Default profile and uploads it to the file server where the profile should be stored.

This procedure verified that the cause was indeed profile corruption. When the user next logged on, a complete Kerberos logon was negotiated and Group Policy was applied.
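
The file-level part of that procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name reset_roaming_profile and the ".old" suffix are made up, both paths are supplied by the caller, the user must already be logged off, and the sketch does nothing beyond renaming one folder and deleting another.

```python
import shutil
from pathlib import Path


def reset_roaming_profile(local_cache, server_copy, suffix=".old"):
    """Rename the server copy aside, then delete the locally cached profile.

    Returns the path the server copy was renamed to. The server copy is
    renamed rather than deleted so the old profile data stays recoverable.
    """
    local_cache = Path(local_cache)
    server_copy = Path(server_copy)
    backup = server_copy.with_name(server_copy.name + suffix)
    if backup.exists():
        raise FileExistsError(f"refusing to overwrite {backup}")
    # Rename first: if this step fails, nothing has been deleted yet.
    server_copy.rename(backup)
    if local_cache.exists():
        shutil.rmtree(local_cache)
    return backup
```

Renaming before deleting is a deliberate choice: the old profile remains available in case something needs to be copied back out of it later.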

We still do not know the exact nature of the profile corruption, nor what caused it. We are still investigating though. For now, what we do know is that if you have this happen you can fix the problem with a slightly less painful procedure than complete user amputation - namely by forcing the user's profile to be re-created. This is obviously not a particularly graceful solution, and is a bit like amputating a significant piece of the user. However, in a domain environment, if encryption keys are stored in Active Directory, it is considerably less painful than deleting the user account and starting over, as the newsgroup post we pointed to earlier suggested.

If you have any information to give us on this error please let us know. In the meantime, we will keep searching.

Published 25 November 2006 03:25 PM by jesper

Comments

# Eric S. said on 26 November, 2006 05:07 PM

Doesn't seem to be an easy issue; I'm eager to see the solution.

By the way, for the registry keys it's always useful to have a reminder of those. (Now I know why I keep checking your blog ;).

# Mick said on 26 November, 2006 06:02 PM

An interesting problem that I have also seen in the past.  Generally speaking I've always removed the profile (tried same account / another pc) or simply renamed the profile and it's been resolved.  I guess this is just another example of a corrupt user profile causing errors

# P Bryant said on 27 November, 2006 05:15 AM

Seen this too - and fixed the same way (rename the profile folder on the server); however my suspicions were raised by redirection of appdata - was this being done too?

# jesper said on 27 November, 2006 09:57 AM

Yes, I was redirecting the entire profile. I have a hunch that to find out what is going on I would have to selectively remove pieces of the profile. I tried with the Run key in the registry already, but that did not do it. The problem is that you may make your profile invalid when you do that, so you need to be careful.

# art said on 03 December, 2006 11:45 PM

got the same issue just this day.....

we're approaching the phase of merging the existing forests (4 actually) into a single forest a few days from now and this happened. i love my job!

# JEA said on 09 February, 2007 09:39 AM

We had a similar problem a few years ago. Only for two users out of approx 1000 (W2003 Citrix Terminal Servers). Roaming profile + folder redirection. In our case the problem came when the attributes of the user object were synchronized from a MIIS. Recreation of the user object resolved the issue but the MIIS sync somehow corrupted the user object.

  We investigated all user attributes but did not find anything. GPO simulation failed with the same error. I do not remember but believe that we eliminated the profile as the cause of the problem (we recreated it).

  A case was opened with MS but we did not find the problem. They should have most of the data (My reference testuser was named "Sandra Bullock"). Permissions looked OK but we probably did not go into enough detail here.

Luckily both guys were consultants who left a short while later and we have not seen the problem since.

# LeGrande said on 02 May, 2007 08:33 AM

I have encountered similar problems in the past. I have found that you can remove all the ntuser.* files and have the user log in again and the problem will be resolved close to 90% of the time.

# Hadar said on 06 August, 2008 04:40 PM

Had the same issue at a site and it turned out to be a slow connection and group policy threshold.

Here is the fix technet.microsoft.com/.../cc759191.aspx

Putting in the two reg keys fixed the issue.
