Tracking the Lost CPU Cycles
I noticed my computer exhibiting a strange sort of behavior today. I recognized exactly what was going on, but I decided to take a few screenshots and write about it because most people are unaware that this happens. Here’s how it goes:
Confusion in the Task Manager
You notice that your computer is behaving as though it’s under heavy load, but you can’t find which application is hogging the CPU. You take a look at your task manager and see something like this. Look, in particular, at the areas that I circled:

Here, the process list shows that 7% of the CPU time is being taken by googletalk, while the remaining 93% is spent idle. Those numbers add up just fine. However, at the bottom, we see that 61% of the processor time is in use — that’s a whole lot more than 7%. So what’s using the other 54%?
I know some of you have seen this before and probably thought something devious was going on. Could it be a virus? Perhaps spyware? I’m sure you’ve heard about rootkits–programs that hide their existence from the user. Could this perhaps be a sign of a rootkit?
Well, the reality is a whole lot less exciting. What we’re really dealing with here is bad reporting. Once again, as in the case of the Sony rootkit fiasco, Mark Russinovich gives us the tools to see what’s really going on. One of his free utilities, Process Explorer, gives us a more accurate view than the built-in task manager. Have a look at the following screenshot, and look, in particular, at the first three processes listed. This screenshot was taken soon after the previous one, so the numbers won’t match.

This will all probably make a lot more sense with a bit of explanation…
Interrupts and DPCs
One of the primary responsibilities of the operating system is to schedule time for each process that requests use of the CPU. Most of a program’s run time is spent waiting: waiting for you to type something, waiting for a file to open, that kind of thing. When a program is ready to do something, the operating system schedules it a time slot. Yet even on computers with over a hundred processes running, most of the time there isn’t any process that’s ready to run. The OS schedules this left-over time to process number zero, the “Idle” process. This special-purpose process issues a halt instruction (HLT on x86) that tells the CPU to go into low-power mode and wait for something to happen (like a keypress, for example).
So, we’ve got X number of programs running, plus process number zero, the “Idle” process. Between these, we can account for all the time that’s allocated by the process scheduler. However, this isn’t necessarily all of the time that gets used by the CPU. The OS kernel itself also uses CPU time, but it doesn’t ever have to wait in line for the scheduler. This kernel code, which is usually part of a hardware driver (the one for your video card, for example), runs under a totally different set of rules.
Kernel CPU time is, for the most part, divided into two categories: time spent on interrupts, and time spent on Deferred Procedure Calls (DPCs). These are really two heads of the same beast; the distinction comes from what kind of code you’re dealing with and exactly when that code has to run. The important point is that interrupts and DPCs aren’t part of the normal process schedule, but do take up (sometimes significant) CPU time.
So, what we saw in the first screenshot was the result of the fact that DPC and interrupt time isn’t reported in Task Manager’s process list. At the time of the screenshot, about 58% of the CPU time was being taken by DPCs and interrupts, leaving about 42% of the CPU time for the scheduler to use as necessary. Of that remaining 42% which the scheduler had to work with, 93% went unused and 7% went to googletalk. Some quick math (42% × 93%) tells us that the real time spent idle was only about 39%. Googletalk used only about 3% of the total CPU time (42% × 7%), which was 7% of the time allocated to the scheduler. Add that 3% to the 58% taken by DPCs and interrupts and you get the 61% usage shown at the bottom of Task Manager.
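If you’d rather let the computer do the arithmetic, here is a minimal Python sketch of the same calculation. The function name `true_cpu_shares` and its parameters are my own invention for illustration, not part of any Windows tool or API; the inputs are simply the approximate numbers from the screenshots.

```python
def true_cpu_shares(kernel_pct, per_process_pct):
    """Rescale Task Manager-style percentages to shares of total CPU time.

    kernel_pct: percent of total CPU time taken by interrupts and DPCs.
    per_process_pct: process name -> percent of the scheduler's time
                     (the numbers shown in Task Manager's process list).
    """
    # Fraction of total CPU time that the scheduler actually had to hand out.
    scheduler_fraction = (100.0 - kernel_pct) / 100.0
    return {name: pct * scheduler_fraction
            for name, pct in per_process_pct.items()}


# Approximate figures from the screenshots discussed above.
shares = true_cpu_shares(58, {"Idle": 93, "googletalk": 7})
for name, pct in shares.items():
    print(f"{name}: {pct:.0f}% of total CPU time")

# Everything that is not truly idle counts as "in use".
busy = 100 - shares["Idle"]
print(f"Total in use: {busy:.0f}%")
```

Running it with these inputs gives roughly 39% idle, 3% for googletalk, and 61% in use overall, which matches the total at the bottom of Task Manager.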
Confused yet? Well, here’s the executive summary: Windows’ built-in Task Manager does a poor job of reporting CPU usage because it doesn’t directly report the time that is used by the Windows kernel (drivers in particular). The per-process percentages are actually calculated based on the remaining time after the drivers have already taken their piece of the pie. This can lead to boatloads of confusion when trying to diagnose a problem, particularly when the real culprit is a driver. Process Explorer by Sysinternals does report DPC and interrupt time, thus bringing balance back to the universe.
If you want to find out more about DPCs, interrupts, and Windows process scheduling, check out Chapter 3 of the book Microsoft Windows Internals.
Excellent explanation. Thanks!
Comment by david — December 28, 2006 @ 11:30 am