Delay in Processing Rules

[Deleted User]
Hi,

This morning I noticed a very long delay between a check failing and the notification arriving.



I have 200 rules configured; let's say 30 of those checks are a simple ping to a server name.



All my PING checks are set to:

Interval between 2 checks=60 secs

Number of retries before rule fails=2

Interval when status is down=60 secs.



I was assuming the check is carried out every 60 secs. If the ping fails (let's say the server is turned off), that would count as fail 1, and the check would be tried again 60 secs later...

If the ping still doesn't get a response, that is fail 2; the server is marked as down and an email is sent out...



(Server can be down for 2 mins before notification is sent)
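The arithmetic behind that 2-minute expectation can be sketched as below. This is only a rough model, assuming every check runs exactly on schedule and that "retries before rule fails = 2" means two consecutive failed checks trigger the alert; the function name `time_to_alert` is made up for illustration and is not part of ServersCheck.

```python
def time_to_alert(check_interval_s, down_interval_s, failures_needed):
    """Worst-case seconds between the server going down and the alert.

    Assumes checks run exactly on schedule (hypothetical model,
    not ServersCheck's actual scheduler).
    """
    # The server may go down just after a successful check, so the first
    # failure is seen up to one full check interval later. Each further
    # failure needed adds one "interval when status is down".
    return check_interval_s + (failures_needed - 1) * down_interval_s


# Settings from this post: 60 s interval, 60 s when down, 2 failures.
print(time_to_alert(60, 60, 2))
```

With the settings above this gives 120 seconds, i.e. the 2 minutes I was expecting.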



But



I turned off a server this morning, and did not get a notification for about 15 mins!

From the 'All Rules View' - I can see the 'Last Up' time was about 10 - 15 mins ago for ALL ping checks.



How are rules processed?

Does a rule that fails (for fail 1) try again 60 secs later, or does it wait until all the cycle of checks has been made?

Can ServersCheck perform more than 1 check at a time?



I have the Enterprise version - 5.12.0.



I expect the 'Last Up' time for each ping check to be about 1 min ago (when the check last ran).



I've checked that there is no Internet Explorer caching, and I can prove the check is not executing every minute: I can turn the server off and not receive a notification within 2-3 mins.



Thanks

Comments

  • Administrator
    The number of checks being performed at the same time depends on the purchased version.



    For example the Enterprise MX1 Edition performs 2 checks at the same time, whereas the Enterprise MX2 Edition performs 5.



    You can easily check what it is doing by opening the monitoring_rule.log file for today (logging subdirectory). In there you will see the activity of each thread.



    When you only have 2 threads available within your license, then all the rules have to be performed by those 2 threads.



    The interval set between 2 checks is the minimum interval: a check will not run sooner than that, but it can run later if the threads are busy.


  • [Deleted User]
    Indeed... Checking that log file, the 60-second PING checks are being run every 7 - 10 mins. :(



    Although I have many rules configured, most don't run more frequently than every 5 mins.

    Disk checks for example only run once every 2 hours.



    I've not noticed this much of a delay before.



    What can I do to resolve these issues? I can't rely on a notification 20 mins after the server has gone down...

    (Assuming it requires 2 checks before alert, and each check is every 10 mins).



    :(
  • Administrator
    The number of rules is more than the threads can handle, and therefore I would recommend upgrading to a higher version.
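To get a rough feel for whether a given number of threads can keep up with a rule set, something like the following back-of-the-envelope estimate may help. It is only a sketch: the per-check durations are made-up illustrative values (the thread does not give any), and `thread_utilization` is a hypothetical helper, not a ServersCheck feature. Real durations would have to be read from monitoring_rule.log.

```python
def thread_utilization(rules, threads):
    """Fraction of available thread time the rule set would need.

    rules   -- list of (interval_s, avg_duration_s) tuples, one per rule
    threads -- number of checks the license can run at the same time

    A result above 1.0 means the threads cannot keep up, so the
    configured intervals will stretch. This is a simplified model,
    not ServersCheck's real scheduling logic.
    """
    # Each rule keeps a thread busy for duration/interval of the time.
    demand = sum(duration / interval for interval, duration in rules)
    return demand / threads


# Illustrative only: 30 ping rules every 60 s, each assumed to take 5 s
# (for example when pings to a dead host wait for a timeout).
ping_rules = [(60, 5.0)] * 30

print(round(thread_utilization(ping_rules, threads=2), 2))
print(round(thread_utilization(ping_rules, threads=5), 2))
```

Under these assumed numbers, 2 threads would be overloaded by the ping rules alone, while 5 threads would have spare capacity. The other rules add to the demand in the same way.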
This discussion has been closed.