on 09-07-2013 1:35 PM
Hi All,
We have set up CCMS alerts on CPU utilization, and we frequently receive the following alert:
"Alert Text : CPU Utilization: 100 % > 98 % 15 min. avg. value over threshold value"
But when we check immediately at OS level, the output of prstat or vmstat looks normal. Not once have we seen utilization at 90% or more. Even when we look at the CPU usage history for that hour, the idle percentage looks fine.
Because of this, we always doubt these alerts. What is this alert actually calculating? Is it really the average CPU usage over the last fifteen minutes, or just an instantaneous value that may have returned to normal within a couple of seconds?
The alert comes from the MTE shown below:
"Performance properties assigned from group CPU_Utilization
Comparison value: Last reported value
Change from GREEN to YELLOW: 95 %
Change from YELLOW to RED: 98 %
Reset from RED to YELLOW: 90 %
Reset from YELLOW to GREEN: 85 %
Alert is triggered if the comparative value
Message class: RT
Message number: 035
Text: CPU Utilization: &1 &3 > &2 &3 15 min. avg. value over threshold value"
Could you please help us understand how this alert is calculated? Is it an instantaneous value or an average, and what is the meaning of "CPU Utilization: &1 &3 > &2 &3 15 min. avg. value over threshold value"?
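On the meaning of the message text: comparing the template with the alert we actually received suggests that &1 is the reported value, &2 is the threshold and &3 is the unit. The small Python sketch below only illustrates that substitution. The function name fill_message is hypothetical and this is not how CCMS itself builds the text.

```python
def fill_message(template: str, values: dict[str, str]) -> str:
    """Replace SAP-style placeholders (&1, &2, ...) with the given values."""
    out = template
    # Replace longer keys first so that e.g. "&10" is not hit by "&1".
    for key in sorted(values, key=len, reverse=True):
        out = out.replace(key, values[key])
    return out


TEMPLATE = "CPU Utilization: &1 &3 > &2 &3 15 min. avg. value over threshold value"

# Assumed mapping, inferred from the alert text quoted above:
# &1 = reported value, &2 = threshold, &3 = unit.
print(fill_message(TEMPLATE, {"&1": "100", "&2": "98", "&3": "%"}))
```

Note that the words "15 min. avg." are a fixed part of the message text, so they appear whatever the comparison value setting is.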
Thanks for your help always
Varun
Hi All,
This is now confirmed.
As long as the comparison value is set to 'Last reported value', the alert is based on the instantaneous value, so even a single spike in CPU usage can trigger the "15 min. avg." alert. To fix this, we changed the comparison value to 'Smoothing over last 15 minutes', and the alert now matches its description exactly.
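To illustrate the difference, here is a minimal Python sketch that compares the two settings on the same readings. It assumes one reading per minute and uses a plain moving average as a stand-in for the smoothing; the exact smoothing formula CCMS uses is not stated in this thread, and the data, the function name red_alerts and the window parameter are all hypothetical. The reset thresholds are ignored to keep it short.

```python
from collections import deque

RED_THRESHOLD = 98.0  # "Change from YELLOW to RED" value from the MTE properties


def red_alerts(samples, window=1):
    """Return the indices at which the mean of the last `window` samples
    exceeds RED_THRESHOLD. window=1 behaves like 'Last reported value'."""
    recent = deque(maxlen=window)  # keeps only the newest `window` readings
    hits = []
    for i, value in enumerate(samples):
        recent.append(value)
        if sum(recent) / len(recent) > RED_THRESHOLD:
            hits.append(i)
    return hits


# Hypothetical data: one reading per minute with a single spike to 100 %.
cpu = [20.0] * 14 + [100.0] + [20.0] * 5

print(red_alerts(cpu, window=1))   # last reported value: the spike alerts
print(red_alerts(cpu, window=15))  # 15-reading average: no alert
```

With window=1 the single spike crosses the 98 % threshold, while the 15-reading average stays at about 25 %. That is consistent with what we saw: alerts in CCMS, but normal values in prstat and vmstat a moment later.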
Thanks
Varun