I'm Robert Marshall, director and principal consultant at SMSMarshall Ltd. My specialism is the Microsoft System Center 2012 Configuration Manager product and all of its dependent products, covering all aspects from Architecture, Implementation and Migration to Break-Fix.
I've been in front of computers for over 30 years now. With my roots in programming 8-bit computers, I've taken an often exciting journey which has led to becoming an expert in an enterprise product. I consider my career as starting 17 years ago, when I began my first serious role as a deployment engineer. I've seen 8-bit through to 64-bit, the rise and refinement of the interface we take for granted now, and the rise of the Internet from land-line modem access for the few to the powerful broadband connections the masses have today. I saw mobile phones come into existence, and I've seen Microsoft evolve from little more than a handful of employees to the company it is now, while tinkering with pretty much every OS they have released, as well as seeing the industry that has evolved around those humble beginnings become what we have today. I'm a keen technical puzzle solver, and I love to solve gnarly problems around my area of specialism. And I love to share when I have time. I hope you enjoy the blog.
Had a problem pop up on a down-level ConfigMgr Primary site server yesterday ... the backup was failing to complete successfully.
To drill into this problem, I first opened SMSBKUP.LOG in Trace and tracked through the most recent backup until I could see the failure point:
Error: Failed to backup \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy21\SMS\BackupTemp\SMSbkSiteRegSMS.dat up to Z:\SMSBCK\AAABackup\SiteServer: \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy21\SMS\BackupTemp\SMSbkSiteRegSMS.dat is not readable.
This is one of two registry tasks that take place in the backup process. The first exports a copy of the NAL key; the second, which is the one failing, exports a copy of the SMS key. Both take place at the same time, as defined in the backup control file.
You can find the backup control file in <DRIVE>:\SMS\INBOXES\SMSBKUP.BOX, and it's called SMSBKUP.CTL. Open it in Notepad and explore. Please keep in mind that it's handy to know what's in something, but for those that need to be told: don't make changes in here unless you REALLY know what you are doing, and don't do it on a production server first!
Anyway, here are the registry export tasks that are defined in the SMSBKUP.CTL file:
# DO NOT MODIFY - Default Registry backup tasks - DO NOT MODIFY:
#
#----------------------------------------------------------------------
# Site Server
reg \\%SITE_SERVER%\HKEY_LOCAL_MACHINE\Software\Microsoft\NAL %SITE_SERVER_DEST%\SMSbkSiteRegNAL.dat
reg \\%SITE_SERVER%\HKEY_LOCAL_MACHINE\Software\Microsoft\SMS %SITE_SERVER_DEST%\SMSbkSiteRegSMS.dat
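If you'd rather list these tasks without opening the control file by hand, something like the following would do it. This is only a rough read-only sketch: the function name `list_registry_tasks` is my own invention, and it assumes each task is a single line of the form `reg <key> <destination>` with no spaces inside the key or the destination, which is true of the two lines above but which I haven't checked against every SMSBKUP.CTL out there.

```python
# Sample text copied from the SMSBKUP.CTL excerpt above.
CTL_EXCERPT = r"""
# Site Server
reg \\%SITE_SERVER%\HKEY_LOCAL_MACHINE\Software\Microsoft\NAL %SITE_SERVER_DEST%\SMSbkSiteRegNAL.dat
reg \\%SITE_SERVER%\HKEY_LOCAL_MACHINE\Software\Microsoft\SMS %SITE_SERVER_DEST%\SMSbkSiteRegSMS.dat
"""


def list_registry_tasks(ctl_text):
    """Return (registry_key, destination_file) pairs for each 'reg' task.

    Assumption: a task is one line with exactly three whitespace-separated
    tokens, the first being 'reg'. Comment lines start with '#'.
    """
    tasks = []
    for line in ctl_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        parts = line.split()
        if len(parts) == 3 and parts[0].lower() == "reg":
            tasks.append((parts[1], parts[2]))
    return tasks


if __name__ == "__main__":
    for key, dest in list_registry_tasks(CTL_EXCERPT):
        print(key, "->", dest)
```

It only reads text, so it's safe to point at a copy of the control file. It never writes anything back.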
Oki, so we know what is being exported and the filenames that will be used to store the data.
Back to the SMSBKUP.LOG!
I know what time the failure was reported, so the next log to pop open is called SMSWRITER.LOG.
SMSWRITER will tell me some more detail about what happened, and here you are:
Failed to export \\ANYSERVER\HKEY_LOCAL_MACHINE\Software\Microsoft\SMS up to AAABackup\SiteServer\SMSbkSiteRegSMS.dat; unable to save key to \\ANYSERVER\D$\_SMA83E.tmp: error 1,450 - Insufficient system resources exist to complete the requested service.
"Insufficient system resources exist to complete the requested service.", nice error to encounter. As fuzzy as the Hubble telescope before it got a servicing. Was it the VSS service, or a process it spawned off to do something, or the OS? Hmm, eventlog doesn't show us anything happening. All seems well on the server.
There is a change you can make to the PagedPoolSize for the OS's Memory Manager. Sometimes the default value for PagedPoolSize can cause a problem like this, so just to see if this was the fix I applied the maximum value (all F's) and rebooted the server. Same problem, so I backed out that registry change (always best), as we knew it wasn't making a difference. One reboot later and I'm still without a solution.
Curious thing is that the _SMA83E.tmp file referenced in the SMSWRITER log is actually created on the root of D:, it's zero bytes in size. Very fresh. Checked the permissions on the registry and file system, all seems fine. I can even perform an export of the SMS key from the registry without any problems as a USER or SYSTEM.
This one had me stalled for a while. I work with another SMS admin, and we bounced some ideas around and he tried to do a manual export of the SMS registry key using the REG utility.
REG \\ANYSERVER\HKEY_LOCAL_MACHINE\Software\Microsoft\SMS c:\test.data
Surprise, surprise, it failed, generating exactly the same message that was noted in SMSWRITER.LOG. Now we are on to something, as we've reproduced the error at a lower level. So it's not VSS causing the problem but the actual export of the SMS registry key.
I had already exported the registry key using REGEDIT without trouble, so it's odd that REG.EXE would fail. So we began taking a really close look at the SMS key and its contents, and stumbled across an oddity in HKLM\SOFTWARE\MICROSOFT\SMS\COMPONENTS\SMS_SCHEDULER for the "Routing Packages" value. It had an excessive number of RPGs listed. Excessive as in 30K of references. We took the list into Excel, did a DIR /B of the inbox, compared the two lists, and noted that half the RPGs listed in the "Routing Packages" value were not present as files in the schedule.box inbox.
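The Excel comparison boils down to a set difference, so it can be scripted if you have to repeat it. A minimal sketch, assuming the entries in the "Routing Packages" value can be matched directly against the file names that DIR /B prints (I haven't verified the exact format of the entries, and the sample names below are made up for illustration). The function name `find_zombie_rpgs` is hypothetical too.

```python
def find_zombie_rpgs(registry_entries, inbox_files):
    """Return registry entries that have no matching file in the inbox.

    Assumption: a registry entry and a DIR /B file name refer to the same
    RPG when they are equal ignoring case and surrounding whitespace.
    """
    on_disk = {name.strip().lower() for name in inbox_files if name.strip()}
    return [
        entry
        for entry in registry_entries
        if entry.strip() and entry.strip().lower() not in on_disk
    ]


if __name__ == "__main__":
    # Made-up sample data: three references in the registry value,
    # two files actually present in the inbox.
    in_registry = ["A1.RPG", "A2.RPG", "A3.RPG"]
    in_inbox = ["a2.rpg", "A3.RPG"]
    print(find_zombie_rpgs(in_registry, in_inbox))
```

The order of the registry list is preserved in the result, which makes it easier to spot the gap in serial numbers between the zombies and the live RPGs.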
We urgently needed to get a backup taken of this server, as the worst always happens when you are not ready for it ... so we essentially removed all the zombied RPGs from the registry and fired off a backup, which completed successfully. The RPGs we removed were very old in terms of the serial numbers they used (no files exist, so there is no creation date to go on), and there is a very large gap between the top (highest) value of the zombie RPGs and the RPGs that have a file associated with them. Now, RPG files should expire after 25hrs, so we're not confident the problem has been resolved: this site server has been complaining of failed backups for over two days, and this batch of RPGs should have been deleted by the scheduler but obviously was not. We have to monitor the build-up in this key over the next few days, and especially keep an eye on the backup status messages from this site server for when it falls over again.
What we've found out here is that if a MULTI_SZ value in the registry is too large then REG.EXE will fail to back it up! That's definitely worth noting.
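To keep an eye on a value like this, it helps to know roughly how many bytes it occupies. A MULTI_SZ is stored as a run of null-terminated UTF-16 strings with one extra null terminator at the end, so the size can be estimated from the list of strings alone. A rough sketch follows; note that the threshold is entirely my own placeholder, as I don't know the actual size at which REG.EXE gives up, and the function names are hypothetical.

```python
def multi_sz_size_bytes(strings):
    """Estimate the stored size of a REG_MULTI_SZ value in bytes.

    Each string is UTF-16LE plus a 2-byte null terminator, and the whole
    list ends with one more 2-byte terminator.
    """
    return sum(len(s.encode("utf-16-le")) + 2 for s in strings) + 2


def flag_large_values(values, threshold_bytes):
    """Return {name: size} for values whose estimated size exceeds the threshold.

    `values` maps a value name to its list of strings. `threshold_bytes`
    is an assumption supplied by the caller, not a documented REG.EXE limit.
    """
    sizes = {name: multi_sz_size_bytes(strings) for name, strings in values.items()}
    return {name: size for name, size in sizes.items() if size > threshold_bytes}


if __name__ == "__main__":
    # Made-up data standing in for the contents of the SMS_SCHEDULER key.
    sample = {
        "Routing Packages": ["A%07d.RPG" % i for i in range(30000)],
        "Some Other Value": ["one", "two"],
    }
    print(flag_large_values(sample, threshold_bytes=64 * 1024))
```

On the site server itself the list of strings could be read with Python's standard `winreg` module, where `winreg.QueryValueEx` returns a MULTI_SZ as a list of strings. I've left that out so the sketch runs anywhere.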
If you do have to mess around with VSS, take a look at VSSADMIN, the command line tool, and the GUI interface (bring up Computer Management, right click "Disk Management" and select "All Tasks" then "Configure Shadow Copies ...").
This can also occur if the DB runs out of disk space: the RPGs can build up, causing the maximum size of the key that REG supports to be exceeded, resulting in a failed backup.
Most often, if you see this issue you have "another" issue that needs dealing with!