SCOMpercentageCPUTimeCounter cause CPU Spike

Blog, Operations Manager

*Originally posted at adatum.no

To be honest this have existed for years, and written about back in 2014. Now, in 2017, SCOM 2016 UR2 is released the problem remains. Perhaps with greater consequence due to virtualization.

If you’re unfamiliar with the problem SCOMpercentageCPUTimeCounter.vbs (.ps1 in SCOM 2016) is a script included in the “System Center Core Monitoring” management pack, and is used as the data source for a rule and a monitor to determinate agent health by gathering ‘HealthService’ CPU usage. The rule and monitor are set to run at a fixed interval of 321 seconds (I assume the person who wrote the MP just tapped 3-2-1 on their numpad 🙂 ) and sync time set to 00:00

If you want to look at the actual code you will find  the data source on SystemCenterCore.com

Running this script every 5 minutes isn’t exactly a problem when you have physical servers or a small amount of virtual machines on your Hypervisor. But if you run 100 or 300VM’s on one host and each single VM start this script simultainiasly it will create it creates unnecessary load on your host. If this host is overcommitted as well CPU wait time could cause a ‘freeze’ on your tenant machines as well.

To illustrate the problem, I have attached a graph, that clearly show spikes during script execution.

vcenter host cpu spike SCOM

On a monitored computer you will see a cscript.exe process executing the following command line “c:\windows\system32\cscript.exe” /nologo “SCOMPerventageCPUTimeCounter.vbs

Cscript.exe running SCOM Cpu percentange script

Unfortunately out of the box there isn’t much to do. Sync time and interval is the only overridable parameters, and these will only help reduce the load on the agent machine itself. So if you experience CPU utilization peaks due to this script, I see only two options

  • Disable the rule and monitor
    • Then you will have to rely on the CPU utilization monitor from the operating system management pack
  • Create a new rule and monitor, using SpreadInitializationOverInterval parameter
    • Reduces load as executions occurs randomly within the set interval
    • Requires authoring skills, but possible. Some information here.

To not let this go into oblivion, I have left feedback on Operations Manager user voice. Hopefully, Microsoft will make some changes in the future. If you have suggestions or other experience please let me know and i will update accordingly.

 

Posted by Sheikvara

+919840688822, +919003270444

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s