Zombie Processes are Eating your Memory

Zombies probably won’t consume 32 GB of your memory like they did to me, but zombie processes do exist, and I can help you find them and make sure that developers fix them. Tool source link is at the bottom.

Is it just me, or do Windows machines that have been up for a while seem to lose memory? After a few weeks of use (or a long weekend of building Chrome over 300 times) I kept noticing that Task Manager showed me running low on memory, but it didn’t show the memory being used by anything. In the example below Task Manager shows 49.8 GB of RAM in use, plus 4.4 GB of compressed memory, and yet only 5.8 GB of paged/non-paged pool, few processes running, and no process using anywhere near enough to explain where the memory had gone:

image

My machine has 96 GB of RAM – lucky me – and when I don’t have any programs running I think it’s reasonable to hope that I’d have at least half of it available.

Sometimes I have dealt with this by rebooting but that should never be necessary. The Windows kernel is robust and well implemented so this memory disappearing shouldn’t happen, and yet…

The first clue came when I remembered that a coworker of mine had complained of zombie processes being left behind – processes that had shut down but not been cleaned up by the kernel. He’d even written a tool that would dump a list of zombie processes – their names and counts. His original complaint was of hundreds of zombies. I ran his tool and it showed 506,000 zombie processes!

Update, November 2020: the original FindZombieHandles.exe tool only tracked process handles that had not been closed. After hitting a thread handle leak in an internal tool it was updated to detect and report on thread handle leaks as well.

It occurred to me that one cause of zombie processes could be one process failing to close the handles to other processes. And the great thing about having a huge number of zombies is that they are harder to hide. So, I went to Task Manager’s Details tab, added the Handles column, and sorted by it. Voila. I immediately saw that CcmExec.exe (part of Microsoft’s System Center Configuration Manager, formerly Systems Management Server) had 508,000 handles open, which is both a lot and also amazingly close to my zombie count.

image

I held my breath and killed CcmExec.exe, unsure of what would happen:

Performance Tab after cropped

The results were as dramatic as I could imagine. As I said earlier, the Windows kernel is well written and when a process is killed then all of its resources are freed. So, those 508,000 handles that were owned by CcmExec.exe were abruptly closed and my available memory went up by 32 GB! Mystery solved!

What is a zombie process?

Until this point we weren’t entirely sure what was causing these processes to hang around. In hindsight it’s obvious that these zombies were caused by a trivial user-space coding bug. The rule is that when you create a process you need to call CloseHandle on its process handle and its thread handle. If you don’t care about the process then you should close the handles immediately. If you do care – if you want to wait for the process to quit – WaitForSingleObject(hProcess, INFINITE); – or query its exit code – GetExitCodeProcess(hProcess, &exitCode); – then you need to remember to close the handles after that. Similarly, if you open an existing process with OpenProcess you need to close that handle when you are done.

If the process that holds on to the handles is a system process then it will even continue holding those handles after you log out and log back in – another source of confusion during our investigation last year.

So, a zombie process is a process that has shut down but is kept around because some other still-running process holds a handle to it. It’s okay for a process to do this briefly, but it is bad form to leave a handle unclosed for long.
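One way to make the close-your-handles rule harder to forget is to tie the close call to a scope, so that it runs on every exit path. Here is a minimal Python sketch of that idea. It is not code from CcmExec or FindZombieHandles: scoped_handle and exit_code_of are hypothetical names of my own, and the Windows-only part assumes ctypes access to the real OpenProcess, GetExitCodeProcess, and CloseHandle functions in kernel32.

```python
import ctypes
from contextlib import contextmanager


@contextmanager
def scoped_handle(handle, close):
    """Yield `handle`, then call close(handle) on every exit path."""
    try:
        yield handle
    finally:
        # Runs on normal exit, early return, and exceptions alike.
        close(handle)


def exit_code_of(pid):
    """Windows only: query a process's exit code without leaking the handle."""
    from ctypes import wintypes  # imported here so the module loads anywhere

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.OpenProcess.restype = wintypes.HANDLE
    kernel32.OpenProcess.argtypes = [wintypes.DWORD, wintypes.BOOL, wintypes.DWORD]
    kernel32.GetExitCodeProcess.restype = wintypes.BOOL
    kernel32.GetExitCodeProcess.argtypes = [
        wintypes.HANDLE,
        ctypes.POINTER(wintypes.DWORD),
    ]
    kernel32.CloseHandle.restype = wintypes.BOOL
    kernel32.CloseHandle.argtypes = [wintypes.HANDLE]

    PROCESS_QUERY_LIMITED_INFORMATION = 0x1000
    handle = kernel32.OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, False, pid)
    if not handle:
        raise ctypes.WinError(ctypes.get_last_error())

    # The handle is closed even if GetExitCodeProcess fails and we raise.
    with scoped_handle(handle, kernel32.CloseHandle):
        code = wintypes.DWORD()
        if not kernel32.GetExitCodeProcess(handle, ctypes.byref(code)):
            raise ctypes.WinError(ctypes.get_last_error())
        return code.value
```

The same shape applies to the process and thread handles returned by CreateProcess: whichever code path you take, the close has to happen, and a scope-based wrapper means you only have to get that right once.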

Where is that memory going?

Another thing I’d done during the investigation was to run RamMap. This tool attempts to account for every page of memory in use. Its Process Memory tab had shown hundreds of thousands of processes that were each using 32 KB of RAM and presumably those were the zombies. But ~500,000 times 32 KB only equals ~16 GB – where did the rest of the freed up memory come from? Comparing the before and after Use Counts pages in RamMap explained it:

image

We can plainly see the ~16 GB drop in Process Private memory. We can also see a 16 GB drop in Page Table memory. Apparently a zombie process consumes ~32 KB of page tables, in addition to its ~32 KB of process private memory, for a total cost of ~64 KB. I don’t know why zombie processes consume that much RAM, but it’s probably because there should never be enough of them for that to matter.
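The arithmetic is easy to check. Here is a small sketch using the per-zombie figures from RamMap (~32 KB private plus ~32 KB of page tables) and the zombie count reported by the tool; the function name is just for illustration.

```python
KB = 1024
GB = 1024 ** 3


def zombie_cost_bytes(zombies, private_kb=32, page_table_kb=32):
    """Estimated RAM pinned by zombies, using the per-zombie figures above."""
    return zombies * (private_kb + page_table_kb) * KB


zombies = 506_000  # count reported by the zombie-listing tool

private_only = zombie_cost_bytes(zombies, page_table_kb=0)
total = zombie_cost_bytes(zombies)

print(f"process private only:  {private_only / GB:.1f} GB")
print(f"private + page tables: {total / GB:.1f} GB")
```

In binary units this comes out to roughly 15.4 GB and 30.9 GB, which lines up with the ~16 GB drops in each category and the ~32 GB that came back when CcmExec.exe was killed.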

A few types of memory actually increased after killing CcmExec.exe, mostly Mapped File and Metafile. I don’t know what that means but my guess would be that that indicates more data being cached, which would be a good thing. I don’t necessarily want memory to be unused, but I do want it to be available.

Trivia: RamMap opens all processes, including zombies, so it needs to be closed before the zombies will go away.

I tweeted about my discovery, and the investigation was picked up by another software developer who reproduced the bug using my ProcessCreateTests tool. They also passed the information to a developer at Microsoft who said it was a known issue that “happens when many processes are opened and closed very quickly”.

Windows has a reputation for not handling process creation as well as Linux and this investigation, and one of my previous ones, suggest that that reputation is well earned. I hope that Microsoft fixes this bug – it’s sloppy.

Why do I hit so many crazy problems?

I work on the Windows version of Chrome, and one of my tasks is optimizing its build system, which requires doing a lot of test builds. Building Chrome involves creating between 28,000 and 37,000 processes, depending on build settings. When using our distributed build system (goma) these processes are created and destroyed very quickly – my fastest full build ever took about 200 seconds. This aggressive process creation has revealed a number of interesting bugs, mostly in Windows or its components.

What now?

If you aren’t on a corporate managed machine then you probably don’t run CcmExec.exe and you will avoid this particular bug. And if you don’t build Chrome or something equivalent then you will probably avoid this bug. But!

CcmExec is not the only program that leaks process handles. I have found many others leaking modest numbers of handles and there are certainly more.

The bitter reality, as all experienced programmers know, is that any mistake that is not explicitly prevented will be made. Simply writing “This handle must be closed” in the documentation is insufficient. So, here is my contribution towards making this something detectable, and therefore preventable. FindZombieHandles is a tool, based on NtApiDotNet and sample code from @tiraniddo, that prints a list of zombies and who is keeping them alive. Here is sample output from my home laptop:

274 total zombie processes.
249 zombies held by IntelCpHeciSvc.exe(9428)
    249 zombies of Video.UI.exe
14 zombies held by RuntimeBroker.exe(10784)
    11 zombies of MicrosoftEdgeCP.exe
    3 zombies of MicrosoftEdge.exe
8 zombies held by svchost.exe(8012)
    4 zombies of ServiceHub.IdentityHost.exe
    2 zombies of cmd.exe
    2 zombies of vs_installerservice.exe
3 zombies held by explorer.exe(7908)
    3 zombies of MicrosoftEdge.exe
1 zombie held by devenv.exe(24284)
    1 zombie of MSBuild.exe
1 zombie held by SynTPEnh.exe(10220)
    1 zombie of SynTPEnh.exe
1 zombie held by tphkload.exe(5068)
    1 zombie of tpnumlkd.exe
1 zombie held by svchost.exe(1872)
    1 zombie of userinit.exe

274 zombies isn’t too bad, but it represents some bugs that should be fixed. The IntelCpHeciSvc.exe one is the worst, as it seems to leak a process handle every time I play a video from Windows Explorer.
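If you collect this output from many machines, ranking the holders is the quickest way to see who to file bugs against. Here is a rough Python sketch that does that. It assumes the non-verbose text format shown in the sample output (the "held by name(pid)" lines); the regular expression and function name are mine, not part of FindZombieHandles, and the format could change.

```python
import re
from collections import Counter

# Excerpt of the sample output, used here as test input.
SAMPLE = """\
274 total zombie processes.
249 zombies held by IntelCpHeciSvc.exe(9428)
249 zombies of Video.UI.exe
14 zombies held by RuntimeBroker.exe(10784)
11 zombies of MicrosoftEdgeCP.exe
3 zombies of MicrosoftEdge.exe
1 zombie held by devenv.exe(24284)
1 zombie of MSBuild.exe
"""

# Matches "<count> zombie(s) held by <name>(<pid>)", indented or not.
HELD_BY = re.compile(r"^\s*(\d+) zombies? held by (.+)\((\d+)\)\s*$")


def zombies_by_holder(report):
    """Map 'name(pid)' of each holder to the number of zombies it keeps alive."""
    counts = Counter()
    for line in report.splitlines():
        match = HELD_BY.match(line)
        if match:
            count, name, pid = match.groups()
            counts[f"{name}({pid})"] += int(count)
    return counts


# Print holders from worst offender to least.
for holder, count in zombies_by_holder(SAMPLE).most_common():
    print(f"{count:6d}  {holder}")
```

Only the "held by" lines are counted; the "zombies of" lines that follow each holder are ignored here, though they are what you would quote in a bug report.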

Visual Studio leaks handles to at least two processes and one of these is easy to reproduce. Just fire off a build and wait ~15 minutes for MSBuild.exe to go away. Or, if you “set MSBUILDDISABLENODEREUSE=1” then MSBuild.exe goes away immediately and every build leaks a process handle. Unfortunately some jerk at Microsoft fixed this bug the moment I reported it, and the fix may ship in VS 15.6, so you’ll have to act quickly to see this (and no, I don’t really think he’s a jerk).

You can also see leaked processes using Process Explorer, by configuring the lower pane to show handles, as shown here (note that both the process and thread handles are leaked in this case):

image

Just a few of the bugs found, not all reported

Process handles aren’t the only kind that can be leaked. For instance, the “Intel(R) Online Connect Access service” (IntelTechnologyAccessService.exe) only uses 4 MB of RAM, but after 30 days of uptime had created 27,504 (!!!) handles. I diagnosed this leak using just Task Manager and reported it here. I also used the awesome !htrace command in windbg to get stacks for the CreateEventW calls from Intel’s code. Think they’ll fix this?

image

Using Process Explorer I could see that NVDisplay.Container.exe from NVIDIA has ~5,000 handles to the \BaseNamedObjects\NvXDSyncStop-61F8EBFF-D414-46A7-90AE-98DD58E4BC99 event, creating a new one about every two minutes? I guess they want to be really sure that they can stop NvXDSync? Reported, and a fix has been checked in.

image

Apparently Corsair Link Service leaks ~15 token handles per second. Reported here.

Apparently Adobe’s Creative Cloud leaks tens of thousands of handles – ~6,500 a day? Reported here.

Apparently Razer Chroma SDK Service leaks a lot of handles – 150,000 per hour? Reported here.

Apparently ETDCtrl.exe (11.x), some app associated with ELANTech/Synaptics trackpads, leaks handles to shared memory. The process accumulated about 16,000 handles and when the process was killed about 3 GB of missing RAM was returned to the system – quite noticeable on an 8 GB laptop with no swap.

VSCode had a process handle leak which I noticed on a coworker’s machine when doing pair programming – almost 1,000,000 process handles had been leaked, totalling 35 GB of RAM. I reported that on Twitter, and a fix landed just a couple of days later. I wrote up a post-mortem on Twitter later on.

Apparently the netprofm service leaks process handles, leading to over 1,110 zombie processes on my work machine – issue filed here.

Apparently HP’s HPPrintScanDoctorService.exe (version 6.0.0.0, signed 2:06 AM 6/16/2022) leaks process and thread handles to HPSUPD-Win32Exe.exe. Reported on Twitter, but no human response so far.

Apparently nobody has been paying attention to this for a while – hey Microsoft, maybe start watching for handle leaks so that Windows runs better? And Intel and NVIDIA? Take a look at your code. I’ll be watching you.

So, grab FindZombieHandles, run it on your machine, and report or fix what you find, and use Task Manager and Process Explorer as well.

Twitter announcement is here, Hacker News discussion is here, Reddit discussion is here.

Updates: Microsoft recommended disabling the feature that leaks handles and doing so has resolved the issue for me (and they are fixing the leaks). It’s an expensive feature and it turns out we were ignoring the data anyway! Also, all Windows 10 PIDs are multiples of four which explains why ~500,000 zombies led to PIDs in the 2,000,000+ range.
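The PID observation is simple counting. As a sketch, assuming each zombie keeps its PID reserved for as long as it exists (my assumption) and that PIDs are non-zero multiples of four, the highest PID in use can never be lower than four times the number of reserved PIDs:

```python
PID_STRIDE = 4  # per the update above: Windows 10 PIDs are multiples of four


def min_highest_pid(reserved_pids, stride=PID_STRIDE):
    """Smallest possible value of the highest PID when `reserved_pids`
    distinct, non-zero multiples of `stride` are all in use at once."""
    return reserved_pids * stride


print(min_highest_pid(506_000))  # lower bound with ~506,000 zombies
```

With ~506,000 zombies that lower bound is a little over two million, which matches the PIDs I was seeing.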

Another take on zombie processes can be found here.

About brucedawson

I'm a programmer, working for Google, focusing on optimization and reliability. Nothing's more fun than making code run 10x as fast. Unless it's eliminating large numbers of bugs. I also unicycle. And play (ice) hockey. And sled hockey. And juggle. And worry about whether this blog should have been called randomutf-8. 2010s in review tells more: https://twitter.com/BruceDawson0xB/status/1212101533015298048

74 Responses to Zombie Processes are Eating your Memory

  1. Very timely Bruce thanks, found a couple issues lately that tied back to ccmexec and zp. Mex.dll has been a lifesaver for me on this.

  2. thesombrerokid says:

    RAII is so important that it’s one of the rare cases where it’s worth dropping everything you’re doing and implementing it around important resources like process handles. The fact that Microsoft can ship code in 2018 that leaks process handles is disgraceful.

    The API is to blame, of course. It should be reimplemented as an RAII API immediately, but barring that you could probably fix it with minimal impact by wrapping the handle in a std::unique_ptr with CloseHandle as the deleter function.

    • Arioch The says:

      RAII is impossible in managed (GC-based) languages.

      Only manual control using IDisposable is available there.
      Granted, the “using” syntactic sugar helps a lot, but that is still a manual (read: voluntary opt-in) option.

      So, while you name a C++ class, keep in mind that more and more code (Intel’s too) is written in C#.

      One beginner’s app I saw could not even read a file twice, because its menu item handler did not explicitly close the file (it was a local variable, so from a non-managed C++ point of view it should have just worked).

      • John Payson says:

        If one subdivides objects into those that own non-fungible resources, those that have mutable state but own no resources, and those which have neither mutable state nor fungible resources, RAII is much better than GC for the first, both are good for the second, and GC is better than RAII for the third. Consequently, a good language really should provide for both RAII and GC. Unfortunately, languages that support GC don’t really accommodate RAII, and vice versa.

        • Arioch says:

          That was the reason I think MS COM was built over ARC model (perhaps SOM and CORBA predating it were too). Apple managed to bring GC into non-managed Objective C but after few years moved back to ARC because of it.

          Frankly, at this point both RAII and IDisposable/using seem about the same.
          In both cases you have to manually implement the boilerplate, and your resource-owning class would free resources either in a destructor or in some finalizer method, which frankly looks like an implementation detail. Also, while merely creating an object is surely much easier than using{} or try{}finally{} control constructs, in the “bigger picture” it is a comparable hassle.

          I guess the reasons are mostly psychological. GC languages push programmers to never think about owning and managing things, because the smart runtime will do it better. And modern computers have an abundance of all generic resources like CPU power and memory. So when those “non-fungible resources” require switching your mindset to good ol’ manual lifetime control, they just don’t even have the idea.
          The Rust language has a point with its separation of ownership and leasing concepts. At the expense of yet more boilerplate 😀

  3. Diego says:

    Stopped reading when you said Windows 10… As far as I can see, most zombie processes are from Microsoft.

  4. Arioch says:

    0 total zombie processes.
    No zombies found. Maybe all software is working correctly, but I doubt it. More
    likely the zombie counting process failed for some reason. Please try again.

    —-

    why stay in doubt? why not just add a self-test, to create a zombie then try to find it, and see if the search worked or not?

  5. e-Van says:

    @Arioch : Did you run the tool in “Run as administrator” mode ?

    • Arioch says:

      Yes, both usual and elevated.
      OTOH I think that is another miss for the tool, it could self-elevate.

      The tool does not know to wait for the user to read its output, so it should be run from some other console application, like cmd.exe.

      Then the tool does not know how to elevate itself, which means I should spawn one more CMD window, this time elevated. Then manually traverse into download folder (where I just downloaded four files one after another) and run it.
      Tedious, quite.

      I understand this is only proof-of-concept so can only hope SysInternals would take the bucket and make somewhat user-friendly tool, something with UI like their RootKit Revealer

      • brucedawson says:

        Having an *option* to self elevate would be nice, but I wouldn’t want it to require elevation since then some people would not be able to run it. The elevation doesn’t change the counts reported, it just affects whether all process names are retrieved.

        • Arioch says:

          But can a tool without elevation reliably scan other processes’ handles, especially those of elevated processes? See Process Explorer and Process Monitor: the latter installs a custom driver so it requires elevation, while the former only shows a subset of data when not elevated.

          What does this tool look for: the current interactive user’s applications, or a total count of zombie processes on the machine? If the latter, then elevation is required.

          So in the end, the question is whether it would be a reasonable use case to only count zombies among Current User processes, ignoring all other zombies.

          I’d say “no”, but I do not have to work on terminal servers and similar environments, so YMMV

          • brucedawson says:

            You are correct that running FindZombieHandles.exe non-elevated misses some zombies, and misses the names for other processes. On the other hand, it does find a lot of zombies, so ???

          • Arioch says:

            So we are back at square one: whether “find some zombies, not all” is a reasonable task in the general situation, especially for a non-informed user (one that did not read the entire manual, and all blogs/forums, before running the tool).

            So perhaps the ideal behavior would be for the tool to elevate itself by default, but with a bail-out option to keep itself non-elevated.

            That said, I still think this tool just asks to be included in the SysInternals suite or NirSoft suite, etc. And since they implement their tools in classic old C++, closer to the API, I guess if they can be persuaded they are better suited to polish both UI and compatibility. You found the problem and made the concept demonstrator, and that is great. You obviously don’t plan to extend it into a full-featured toolkit of utilities anyway.

      • brucedawson says:

        I accepted a pull request from the author of NtApiDotNet so FindZombieHandles.exe now works better as non-admin – it gets process names. It also works now on earlier versions of Windows.

        It will still miss some handle leaks when run as non-admin.

    • Arioch says:

      And yeah, it was a win8.1 box about 4 hours after boot-up, so quite possible it did not undergo zombie infestation yet 😀

    • Arioch says:

      I ran the tool (with elevation) on another box, Win 7 rus x64.
      It reported zero zombies too.

      That box is out of domain (the first box was in-domain).
      Both have Kaspersky installed, which should not affect it, but who knows for sure.

      Being local admin on both I have enough grants though to run ProcExp and ProcMon elevated.
      So, perhaps, I am just lucky to not have any zombie incubators installed and running.

  6. Steve Hackathorn says:

    Well said and done! More devs need to worry about this stuff!

  7. @BruceDawson, the specific issue you are referring to is an issue with Ccmexec.exe and its associated driver, prepdrv.sys. The issue is not with the OS. The driver registers for process create and delete notifications (PsSetCreateProcessNotifyRoutine). It then queues a notification to its user-mode component (Ccmexec) to let it know about each process create or exit. The issue is that the queue that the driver uses has a limit of 1024 entries. This means that if you are on a machine that has a very high rate of process creation and deletion in a very short duration (e.g. a build machine), notifications from the driver are lost. The System Center team is working on it.

  8. Dodutils says:

    Funny thing is that to run FindZombieHandles I had to install VS 2017, which I did not have on my current machine. Then I compiled and ran it and, surprise, it found zombies from … VS 2017 😦

    1 zombie of \Device\HarddiskVolume7\Program Files (x86)\Microsoft Visual Studio\Installer\resources\app\ServiceHub\Hosts\Microsoft.ServiceHub.Host.CLR\vs_installerservice.exe
    2 zombies held by \Device\HarddiskVolume3\Microsoft Visual Studio\2017\Community\Common7\IDE\devenv.exe(2140)
    1 zombie of \Device\HarddiskVolume3\Microsoft Visual Studio\2017\Community\MSBuild\15.0\Bin\MSBuild.exe
    1 zombie of \Device\HarddiskVolume3\Microsoft Visual Studio\2017\Community\Common7\IDE\PerfWatson2.exe
    1 zombie held by Unknown(1792)

    And after I quit VS 2017 other zombies appeared:

    1 zombie of \Device\HarddiskVolume7\Program Files (x86)\Microsoft Visual Studio\Installer\resources\app\ServiceHub\Hosts\Microsoft.ServiceHub.Host.CLR\vs_installerservice.exe
    1 zombie of \Device\HarddiskVolume3\Microsoft Visual Studio\2017\Community\Common7\ServiceHub\Hosts\ServiceHub.Host.CLR.x86\ServiceHub.VSDetouredHost.exe
    1 zombie of \Device\HarddiskVolume3\Microsoft Visual Studio\2017\Community\Common7\ServiceHub\Hosts\ServiceHub.Host.CLR.x86\ServiceHub.SettingsHost.exe
    1 zombie of \Device\HarddiskVolume3\Microsoft Visual Studio\2017\Community\Common7\ServiceHub\Hosts\ServiceHub.Host.CLR.x86\ServiceHub.RoslynCodeAnalysisService32.exe
    1 zombie of \Device\HarddiskVolume3\Microsoft Visual Studio\2017\Community\Common7\ServiceHub\Hosts\ServiceHub.Host.CLR.x86\ServiceHub.IdentityHost.exe
    1 zombie of \Device\HarddiskVolume3\Microsoft Visual Studio\2017\Community\Common7\ServiceHub\Hosts\ServiceHub.Host.CLR.x86\ServiceHub.Host.CLR.x86.exe
    1 zombie of \Device\HarddiskVolume3\Microsoft Visual Studio\2017\Community\Common7\ServiceHub\Hosts\ServiceHub.Host.CLR.AnyCPU\ServiceHub.DataWarehouseHost.exe

    • brucedawson says:

      Man, the verbose output is *really* hard to read. Maybe post the non-verbose output as well? And be wary of trimming the output – the owner of the first zombie is missing, for instance.

      Anyway, cool find. Consider filing a bug. From the VS IDE go Help-> Send Feedback-> Report a Problem

  9. Interesting, I will test. I have had strange problems with Edge after it runs for a long time, roughly a week.
    A convenience for the program regarding UAC:
    in “app.manifest”, the line

    <requestedExecutionLevel level="requireAdministrator" uiAccess="false" />
    </requestedPrivileges>

    and, at end of program
    “Console.ReadKey();”
    🙂

    • brucedawson says:

      Yes, I could make it require admin, but I have mixed feelings about that.

      Similarly with the Console.ReadKey() – it makes it worse for people who run it from a command prompt. Sigh… decisions, decisions.

      BTW, I copied your XML fix and deleted the comment that just had the fix.

  10. Rob says:

    I’ve been having memory issues for weeks, and was able to track it down to zombie processes thanks to this post. Unfortunately, it seems that “System” is the process holding open the handles… Any idea what that could mean, or where I should start looking next? It seems every process that executes turns into a zombie.

    • brucedawson says:

      How many zombie processes do you have? What version of Windows are you running?

      My best guess is that you have some malware on your system that is making Windows misbehave, or a poorly written driver, but I really don’t know.

      • Rob says:

        I’d hit over 50k after only a few hours, and eventually run out of main memory after 8-10 hours or so. I’m running Windows 10.

        The issue seemed to go away when in safe mode, so I tried removing/updating some drivers and it didn’t seem to fix it. I think I’m just going to reformat at this point. Thanks anyways, this was a very helpful post regardless!

        • brucedawson says:

          Wow – that’s crazy. 50k zombie processes will consume about 3.2 GB of RAM which on most “normal” machines is a huge percentage. I think you have two separate mysteries:
          1) Who is creating all of those processes? 10k+ per hour is extremely high for anyone not building Chrome.
          2) Why are they not closing the process handles?

          The problems are almost certainly related. An ETW trace would reveal the process creation – maybe grab one before you reinstall? 30-60 s and I can take a brief look and see if anything interesting is apparent. You have my curiosity piqued! https://randomascii.wordpress.com/2015/09/01/xperf-basics-recording-a-trace-the-ultimate-easy-way/

          • Rob says:

            This is just from normal usage. Short-lived processes such as Chrome threads made up the bulk of what I was seeing. When literally every thread and process leaves behind a zombie, I guess it fills up pretty quickly.

            I unfortunately already nuked the PC, but if for some reason it happens again I’ll definitely get you a trace!

            • brucedawson says:

              Zombie threads should be much more lightweight than zombie processes – if zombie threads are even a thing.

              50,000 zombies in a few hours suggests several zombie processes *per second*. That is not normal. Chrome, for instance, creates a new process on most navigations, but most people don’t navigate to several new sites per second.

          • Dodutils says:

            From what I understand, @Rob mentioned about 50,000 handles and you said that represents 3.2 GB, meaning 64 KB per handle? I am curious: why would a handle use that much memory?

            • Rob says:

              I don’t believe it’s the handles themselves taking up most of that space. The handles are just keeping the zombies alive, and each zombie is keeping its page table allocated.

              • brucedawson says:

                That is correct. The handle seems to cause ~32 KB of private working set to stay in memory, along with ~32 KB of page tables, per process. This is described in the article.

        • Aaron LaFave says:

          @Rob @brucedawson
          We’re experiencing a similar issue of the “System” process creating and holding open quite a few zombie processes (although a lot less than you’re talking about). Still, after several days of uptime our developer workstations run out of memory and must be restarted to clear the issue.

          I’ve been investigating with the Sysinternals tools and the FindZombieHandles tool from this post, but I’m no closer to finding the root cause of the issue. We’ve previously gone through support cases with many of our software vendors trying to determine the cause (we had suspicions it was related to our security software, our time-tracking software, our file-sharing software, etc.).

          Did you ever find a cause for your issue? Or any other guidance on troubleshooting exactly why the System process is spawning zombies?

          By the way, this post and the comments were excellent and very timely for us. Thank you both for your contributions!

          • brucedawson says:

            In my case it wasn’t the “System” process holding the handles, it was a SYSTEM process. That is, there is a process called “System” and there are several processes that run as the SYSTEM user. In my case the CcmExec.exe process (a SYSTEM process) was holding the handles, so I knew which team at Microsoft to contact. If the “System” process (with that name) is leaking handles then I don’t know what is going on. Driver bug?

            Try in safe mode, maybe try using App Verifier or similar to grab call stacks for handle creation. Or, just try disabling one piece of software at a time until the problem stops happening.

            It is quite possible that the process that is leaking the process handles is not actually creating the processes, but is merely getting handles to processes created by others – that is quite common.

            Good luck!

          • Caleb Champlin says:

            I’m also having the same issue where PID 4 (System) is the owner of the zombies, leading me to believe it’s some kernel driver misbehaving. Did you ever find a way to track down your root cause?

            • Aaron LaFave says:

              Hi Caleb –

              I’ve never solved this issue, and in fact, we have some really ugly “hacks” in place to mitigate it – such as scheduled reboots of certain machines during off-hours. Running Bruce’s tool indicates the biggest perpetrator is by far System/PID-4, and running RAMMap shows lots of small memory items held open.

              I tried Bruce’s suggestions, but grabbing call stacks to trace handle creation is beyond my skillset. I spent too much time trying to learn it, and finally gave up.

              FYI, the issue in my environment has gone away in a couple cases; my personal workstation no longer seems to be affected, but other workstations and servers are still affected.

              My best guess is that in my case it’s related to security software; anti-virus, DLP, vulnerability protection, etc. There are a bunch of drivers related to those applications, specifically network drivers that intercept/read/filter traffic. I opened a ticket with one of the vendors whom I suspected was the issue, and after a nightmare back and forth over a couple weeks, they finally believed the issue was not their fault. (They un-installed all their software, and the issue persisted. However, for a fresh OS install, the issue initially presented itself upon installation of their software. So, I disagree with them, but who knows. This vendor was Trend Micro, just FYI.) We can’t just dump this software, obviously, so I’m stuck with the issue. If I could prove it was definitely this software, I would find a replacement. But since it’s not occurring on every machine, I’m hesitant to initiate a company-wide replacement project.

              If you get any further info or leads, I’d be really interested. I’d love to actually resolve this issue rather than just mitigate it.

  11. Larry Mosley says:

    Hi Bruce, this is probably being caused by the use of software metering in the SCCM environment (used to gather info on processes that are executed for reporting in SCCM). It isn’t something you can turn off on your own – your SCCM admins have that power.

    It is fixed in the recent SCCM 1802 release – so you might pass this on to whomever manages your Windows computers.

    https://cloudblogs.microsoft.com/enterprisemobility/2018/03/22/now-available-update-1802-for-system-center-configuration-manager/

    • brucedawson says:

      Thanks Larry. And indeed, our SCCM admins turned off this feature when I told them it was causing problems. They weren’t actually using the data anyway 🙂

      So, my machines are no longer leaking process handles (well, a few, but not from SCCM, and just a few)

  12. Diceman says:

    On the subject of memory, I’m curious if you have dug into the Standby Memory bug affecting the Creators Update and later, which causes stuttering for Windows gamers across a wide array of games.

  13. Gabriel says:

    Hi Bruce, why would the program report that there are XXXX zombies while only 4 processes are listed?

    Sample run:
    C:\Users\XXX\Downloads\Zombie>FindZombieHandles.exe
    7891 total zombie processes.
    1 zombie held by SynTPEnh.exe(6072)
    1 zombie of SynTPEnh.exe
    1 zombie held by explorer.exe(380)
    1 zombie of notepad++.exe
    Pass -verbose to get full zombie names.

    My zombie count seems to increase one per 3s. How should I troubleshoot it?

    • brucedawson says:

      I don’t know/can’t remember why the total zombie processes wouldn’t match those listed. Try running as admin?

      And, if the zombie count is increasing by one per 3 s then after a day or two it should be clearly visible as an elevated handle count in *some* process. So, just wait and then look at task manager or process explorer and see which process has the highest handle count. That’s probably your culprit. Good luck, and let us know what you find.

  14. Gabriel says:

    Hmmm, Task Manager doesn’t give much info either, and I’m already running as admin.

    So far my top 2 handle counts are Lync (3000+) and Outlook (6000+), but restarting them does not reduce the total zombie process count.

    • brucedawson says:

      Curious. I am unaware of a way to have a zombie process without somebody having a handle to it. At one per 3 s you should be leaking ~28,800 zombies per day, which should be very noticeable. If you get up to 30,000+ zombies and that isn’t visible in somebody’s handle count (make sure Task Manager is showing processes from all users) then I’m afraid you are in terra incognita. I guess you could try killing processes (and not restarting them) until the leaking stops, but ???
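The per-day figure above is simple arithmetic. As a minimal sketch (the function names are made up for illustration), this converts a leak interval into zombies per day and estimates how long it takes to reach a given count:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def zombies_per_day(seconds_per_zombie):
    """Zombies leaked per day if one is leaked every seconds_per_zombie seconds."""
    return SECONDS_PER_DAY / seconds_per_zombie


def days_to_reach(target_count, seconds_per_zombie):
    """Days of uptime needed to accumulate target_count zombies at that rate."""
    return target_count / zombies_per_day(seconds_per_zombie)


if __name__ == "__main__":
    print(zombies_per_day(3))        # 28800.0
    print(days_to_reach(30000, 3))   # a little over one day
```

At one zombie every 3 s, a 30,000-zombie leak takes only slightly more than a day to build up, which is why it should show up in some process's handle count fairly quickly.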

      • xkiller213 says:

        Hmmm, anyway, I seem to have solved my problem by updating my system drivers. Before I posted my earlier comments, I had updated my graphics drivers and Lenovo power management drivers and uninstalled a couple of unused programs, but none of that helped.

        The 2 which seemed to solve the problem are the Intel chipset drivers (FKA Intel INF update utility) version 10.1.1.9, and the Intel USB 3.0 drivers version 1.0.9.254.

        Thanks for the help!

  15. Chau Cao says:

    I have the problem too, but FindZombieHandles won’t list my 130000 zombie processes. Task Manager does not show any program with a high handle count (5000 is the max). However, using RAMMap, I found out that cmd.exe is the zombie process, with >300000 sessions.

    How can I kill cmd.exe?

    • brucedawson says:

      Is Task Manager showing processes from all users? Maybe run procexp in admin mode, search for handles to cmd.exe, ???

      You don’t need to kill cmd.exe – it’s already dead. You need to kill the process that is holding a handle to it or otherwise keeping it alive. If no process has enough handles to explain the mystery then I’d guess that you have a driver bug, somehow.

  16. Jeff Stokes says:

    Bruce, any chance of a tool to do this in Linux too?

    • brucedawson says:

      My Linux skills are definitely not up to that challenge.

    • Derek Fawcus says:

      For unix systems, just use ps. Z state indicates a zombie.

      A zombie only exists when its parent process has not waited for it, and in that case the pid / ppid fields from ps will identify the relationship.

      Even then, a Zombie on unix systems is less of an issue, as it consumes little more than a process table slot. No memory, no open files etc.

      So while less of an issue, on unix having zombies around used to eventually prevent creation of new processes due to lack of free process table slots, or lack of an available pid.
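As a rough sketch of the `ps` approach described above: the helper below parses the text output of `ps -A -o pid,ppid,stat,comm` and returns the rows whose state starts with `Z`, along with the parent PID that has not waited for them. The function names are hypothetical, and the column layout is an assumption based on the Linux procps version of `ps`; other Unix systems may format the output differently.

```python
import subprocess


def find_zombies(ps_output):
    """Return zombie rows from `ps -A -o pid,ppid,stat,comm` text output."""
    zombies = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 3)
        # Skip the header and any malformed line.
        if len(parts) < 4 or not parts[0].isdigit():
            continue
        pid, ppid, stat, comm = parts
        if stat.startswith("Z"):  # Z state indicates a zombie
            zombies.append({"pid": int(pid), "ppid": int(ppid), "comm": comm})
    return zombies


def list_zombies():
    """Run ps and report current zombies (Unix-like systems only)."""
    result = subprocess.run(
        ["ps", "-A", "-o", "pid,ppid,stat,comm"],
        capture_output=True, text=True, check=True,
    )
    return find_zombies(result.stdout)
```

Calling `list_zombies()` on a Linux machine returns one dictionary per zombie; the `ppid` field identifies the parent that still needs to wait for it.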

      • brucedawson says:

        > used to eventually prevent creation of new processes

        What was the limit? And is that fixed now? I think Windows could have about a billion zombies without running out of process IDs but you’d need many TB of RAM to support that.

        • Jeff Stokes says:

          Turns out some distros are pretty limited in terms of pid max value

          On a Linux system there is a limit on the maximum number of processes that can run. This limit was introduced in kernel version 2.5. You can find the maximum PID of a Linux system using the command below.

          # cat /proc/sys/kernel/pid_max
          32768

          Above you can see that the maximum number of process IDs is 32768.

          • brucedawson says:

            32 K process IDs is a pretty low limit – heaven help those who have to deal with zombie processes in that environment.

            I checked my Linux box (thanks for the pointer) and it’s got a maximum of 4 M process IDs. None of the PID leaks I’ve seen on Windows would get close to that limit.
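The same check can be scripted. This is a minimal sketch, assuming a Linux system that exposes `/proc/sys/kernel/pid_max` as shown above; the function names are made up for illustration. It reads the limit and estimates how long a steady zombie leak would take to use up every PID.

```python
from pathlib import Path


def read_pid_max(path="/proc/sys/kernel/pid_max"):
    """Read the maximum PID value from the given file (Linux only by default)."""
    return int(Path(path).read_text().strip())


def days_until_pids_exhausted(pid_max, zombies_per_day):
    """Rough upper bound: days for a steady leak to consume pid_max PIDs."""
    return pid_max / zombies_per_day


if __name__ == "__main__":
    limit = read_pid_max()
    print(limit)
    # Leak rate of one zombie every 3 seconds, as discussed earlier in the thread.
    print(days_until_pids_exhausted(limit, 28800))
```

With a limit of 32768 and a leak of 28,800 zombies per day, the PID space would be gone in a little over a day; the estimate ignores PIDs used by live processes, so the real margin is smaller.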

  17. xaviergmail says:

    Oh my god. What the F??

    236,389 total zombie processes.
    80,478 zombies held by synergyd.exe(4668)
    40,239 zombies of syntool.exe
    40,239 zombies of synergys.exe

    The next one down is 44 zombies held by OfficeClickToRun.exe(4884)

    And then a few others that account for fewer than 30 combined.

    I have 24GB RAM and always run into issues after a couple of days uptime. 236k zombie processes and only ~80k of which are accounted for by Synergy? You bet I’ll be uninstalling that and filing a complaint >.<

    Any idea how to figure out what the other 156k has gone towards?

    • brucedawson says:

      Uninstall Synergy and maybe that will solve everything – fingers crossed. Please complain loudly and publicly so that they fix their sloppy software. You can improve the lives of millions!
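The gap between the reported total and the zombies attributed to specific holders can be computed from the tool's output. The sketch below is not part of FindZombieHandles; it assumes report lines shaped like the samples pasted in these comments ("N total zombie processes." and "N zombies held by name(pid)"), and the real output format may differ.

```python
import re

TOTAL_RE = re.compile(r"^([\d,]+) total zombie processes")
HELD_RE = re.compile(r"^([\d,]+) zombies? held by (.+)\((\d+)\)$")


def to_int(text):
    """Convert a count such as '236,389' to an integer."""
    return int(text.replace(",", ""))


def summarize(report):
    """Return (total, held_by_process, unaccounted) for a pasted report."""
    total = 0
    held = {}
    for raw in report.splitlines():
        line = raw.strip()
        match = TOTAL_RE.match(line)
        if match:
            total = to_int(match.group(1))
            continue
        match = HELD_RE.match(line)
        if match:
            # Key on (process name, pid) so two instances stay separate.
            held[(match.group(2), int(match.group(3)))] = to_int(match.group(1))
    return total, held, total - sum(held.values())
```

Feeding it the three report lines from the comment above gives a total of 236,389 with 80,478 held by synergyd.exe, leaving 155,911 zombies not attributed to any listed holder.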

  18. Steve Bardwell says:

    Bruce –

    You mention a bug: “… they reproed the bug using my ProcessCreateTests tool. They also passed the information to a developer at Microsoft who said it was a known issue that ‘happens when many processes are opened and closed very quickly’.”
    Do you know if this bug has been fixed?
    Steve Bardwell

  19. David Harrington says:

    I seem to be over my head a bit here but I got here while finally trying to figure out why I’ve had to reboot at an increasingly frequent rate to restore half decent Win 10 laptop performance. I noticed I had more google chrome processes than windows (1) and tabs (~10) open. Via Task Mgr, I saw 23 google Chrome processes. So I closed Chrome, reopened and restored the tabs and lo and behold I now had 43 google chrome processes running. Can you help me fix this?
    Thanks, Dave

    • brucedawson says:

      It is normal – or at least common – for there to be more Chrome processes than windows. For security and stability reasons Chrome will often put each domain (cnn.com, sketchyads.com, etc.) in a separate process. Since a single page may have ads from ten different sites (yes, really) a single page may end up creating eleven processes. There are also various processes for the main browser, the GPU, networking, etc.

      Yes, it seems like a lot, but the security and stability gains are huge. If the GPU process crashes or hangs, for instance, it will be automatically restarted and you probably won’t even notice. That’s pretty cool.

      When you restart Chrome there are some temporary processes as well – that may be why the process count was so high. If you wait it will settle down.

      It also depends on how many tabs you have open, and whether you have visited them. Chrome tries not to fully reload all tabs because that is expensive. However if you Ctrl+Tab through all of your open tabs then it is forced to load them.

      You might want to try looking at Chrome’s Task Manager. From most web pages you can just press Shift+Esc to open it and see exactly what the processes are doing. You can even kill off ad processes or the GPU process to experiment (not really recommended). You can also see how much memory each process and page is using.

      Slowdowns can come from many sources. One of the best ways to speed up your computer is to buy either more memory or a solid-state drive. Both are fairly affordable these days. These won’t solve all performance problems, but they solve many.

  20. David Harrington says:

    Hi Bruce, thanks for your response. I hadn’t been looking at Chrome as a source of my performance problem(s), I just noticed what I reported. My main suspect is iTunes, but I haven’t been running it lately and I still get the performance degradation over time, just not as fast. I added max memory to my little Lenovo and now have a whopping 12 GB. It noticeably helped but it’s still just a matter of time. Thanks again!

  21. Pingback: Windows, Zombie Processes, and bullshit code | Jeff Stokes

  22. Will says:

    I don’t leave comments often but I have to say Thank you for this write up. Had been wondering why my RAM was just getting murdered out of the blue and it turns out 16GB of RAM was being used by SynTPEnh.exe. Just wanted to say thanks, if you see this or not.
