In my experience, when the SMS Executive service stops, the crash folder inside
the logs folder usually shows that the SMS Executive service has crashed again.
In most cases this happens because of an unhandled exception.
What is an unhandled exception?
Almost every Configuration Manager crash involves an exception. An exception
occurs when an instruction is attempted but fails for some reason (e.g. an
Access Violation). To diagnose the crash, we need information about that
exception and about what was in memory when it occurred.
Most applications have their own exception handling code, and
Configuration Manager is no different. Its exception handler is designed to
collect certain predefined data, such as thread stack information, when an
exception occurs. Note that it is sometimes also necessary to do live debugging
or post-mortem debugging with the Windows debugging tools when an application
or the OS crashes.
Components that could cause an unhandled exception
SMS Executive: SMSEXEC.EXE is the main service and runs many threads.
An exception in any running thread terminates the SMS_EXECUTIVE service, and
the Configuration Manager site server exception handler then collects the
required data.
Data collected when the Configuration Manager site server encounters
an exception
- A log file (CRASH.LOG) that details the thread stacks and very
basic information.
- All current .LOG files from the \LOGS folder. These are saved in the
\LOGS\CRASHDUMPS\YYYYMMDD_000XX folder, where YYYYMMDD is the date when the
crash occurred and XX represents the number of crashes on that day.
- An individual thread log for every component at the time of
the failure. These files have no extension but can be viewed in any text editor,
SMS Trace, or CMTrace.
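The folder naming scheme above lends itself to a small script for finding the most recent crash. The following is a minimal sketch in Python, not something that ships with Configuration Manager. It assumes folder names consist of eight date digits, an underscore, and a numeric counter; the function and class names are hypothetical.

```python
import re
from datetime import date, datetime
from pathlib import Path
from typing import Iterable, NamedTuple, Optional

# Assumption: eight date digits, an underscore, then a numeric crash counter,
# following the YYYYMMDD_000XX scheme described above.
CRASH_DIR_PATTERN = re.compile(r"^(\d{8})_(\d+)$")


class CrashFolder(NamedTuple):
    crash_date: date
    sequence: int
    name: str


def parse_crash_folder(name: str) -> Optional[CrashFolder]:
    """Return the date and counter encoded in a folder name, or None."""
    match = CRASH_DIR_PATTERN.match(name)
    if match is None:
        return None
    try:
        crash_date = datetime.strptime(match.group(1), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a valid calendar date.
        return None
    return CrashFolder(crash_date, int(match.group(2)), name)


def latest_crash_folder(names: Iterable[str]) -> Optional[CrashFolder]:
    """Pick the newest crash folder, ordered by date and then by counter."""
    parsed = [p for p in map(parse_crash_folder, names) if p is not None]
    return max(parsed, default=None)


def latest_crash_folder_in(crashdumps_dir: str) -> Optional[CrashFolder]:
    """Scan a CRASHDUMPS directory on disk for the newest crash folder."""
    root = Path(crashdumps_dir)
    return latest_crash_folder(p.name for p in root.iterdir() if p.is_dir())
```

Calling latest_crash_folder_in() with the path to the site server's CRASHDUMPS folder returns the folder to inspect first, or None if no folder matches the assumed pattern.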
Depending on the nature of the crash and current memory conditions,
not all of the above information may be captured.
With this in mind, here are some steps you can take if you
experience one of these crashes:
1. Check the LOGS\CRASHDUMPS\CRASH.LOG file and make a note of
the failing component and thread ID.
2. Locate the <component>_thread_<thread number> file in
\Logs and open it in a text editor such as Notepad.
3. Look at the bottom of the log to identify the last thing the
component was doing when the crash occurred.
4. Take corrective action based on what was occurring. Often
there will be a reference in the log to a specific file or object that is causing
the crash.
NOTE If nothing useful is found in the log file, a memory dump can be used to
analyze the issue in more depth.
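Steps 1 and 2 can be sketched in Python as follows. This is an illustrative sketch only: it assumes CRASH.LOG consists of "Key = Value" lines like the excerpt in this article, that a repeated key marks the start of a new record, and that the thread number in the thread log file name is the decimal thread ID. The function names are hypothetical.

```python
from typing import Dict, List

# Abbreviated sample taken from the CRASH.LOG excerpt in this article.
SAMPLE = """EXCEPTION INFORMATION
Time = 08/29/2012 17:28:47.406
Service name = SMS_EXECUTIVE
Thread name = SMS_AD_SYSTEM_DISCOVERY_AGENT
Thread ID = 13565 (0x33FD)
Exception = c0000005 (EXCEPTION_ACCESS_VIOLATION)
"""


def parse_crash_log(text: str) -> List[Dict[str, str]]:
    """Split CRASH.LOG text into one dictionary per exception record."""
    records: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            # Lines without "=" (e.g. section titles) carry no field.
            continue
        key = key.strip()
        # Assumption: seeing a key again means a new record has started.
        if key in current:
            records.append(current)
            current = {}
        current[key] = value.strip()
    if current:
        records.append(current)
    return records


def thread_log_name(record: Dict[str, str]) -> str:
    """Build the <component>_thread_<thread number> file name for a record."""
    # "13565 (0x33FD)" -> "13565"; using the decimal form is an assumption.
    thread_id = record["Thread ID"].split()[0]
    return f"{record['Thread name']}_thread_{thread_id}"
```

For example, thread_log_name(parse_crash_log(SAMPLE)[0]) gives the name of the thread log to look for under the stated assumptions. If no file with that name exists, searching for files that start with the component name is a reasonable fallback.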
In our example, examining the CRASH.LOG shows the following:
EXCEPTION INFORMATION
Time = 08/29/2012 17:28:47.406
Service name = SMS_EXECUTIVE
Thread name = SMS_AD_SYSTEM_DISCOVERY_AGENT
Executable = D: \Microsoft Configuration Manager\bin\i386\smsexec.exe
Process ID = 11789 (0x2E0D)
Thread ID = 13565 (0x33FD)
Instruction address = 77bd8efa
Exception = c0000005 (EXCEPTION_ACCESS_VIOLATION)
Description = "The thread tried to read from the virtual address 00000000 for which it does not have the appropriate access."
Raised inside CService mutex = No
Examining the corresponding <component>_thread_<thread number> file,
we can see the following:
Starting the data discovery. SMS_AD_SYSTEM_DISCOVERY_AGENT
INFO: Processing search path: 'LDAP://OU=xxx ,OU=xx,DC=GLOBAL,DC=xx,DC=xx'. SMS_AD_SYSTEM_DISCOVERY_AGENT
INFO: Full synchronization requested SMS_AD_SYSTEM_DISCOVERY_AGENT
INFO: DC DNS name = 'FQDN' SMS_AD_SYSTEM_DISCOVERY_AGENT
From this it is apparent that the Active Directory System Discovery method
is triggering the exception. From this point you could continue
troubleshooting the cause of the issue with Active Directory System Discovery
or, if this is a secondary site and you do not need Active Directory System
Discovery there, you could disable it.
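Step 3 of the procedure, looking at the bottom of the thread log, can also be scripted when the log is large. The sketch below is a generic "last N lines" helper in Python. The function name is hypothetical, and the UTF-8 encoding is an assumption (undecodable bytes are replaced rather than raising an error).

```python
from collections import deque
from typing import List


def tail_lines(path: str, count: int = 20, encoding: str = "utf-8") -> List[str]:
    """Return the last `count` lines of a text file without line endings."""
    with open(path, "r", encoding=encoding, errors="replace") as handle:
        # deque with maxlen keeps only the most recent lines while reading.
        last = deque(handle, maxlen=count)
    return [line.rstrip("\r\n") for line in last]
```

Printing the result of tail_lines() for the thread log identified from CRASH.LOG shows the last operations the component logged before the crash.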