Series of questions

reverett
Hello,



We are continuing to evaluate SC and have a few more questions:



I saw another post about a memory leak. Almost daily we receive a low-memory warning from Local Machine, and the associated graph shows a constant decline in available memory. Rebooting frees the memory and the cycle starts over. This did not occur before installing SC, and SC is the only thing running on the PC. I truly believe there is a leak.



Is there any way to clear/reset current service level? I think we skewed the percentage while experimenting with configuration and would like to start fresh.



Is there any way to show (via a report?) the configuration of all rules? For example, we'd like to see which response time settings (ping) are set up for all rules to verify they are consistent.



Based on the Response Escalation setting, do we actually need to respond to alerts? I ask because we continue to receive OK status emails with no corresponding Warning or Down email. I understand the need for the first email to let you know everything is OK, but I don't understand why we keep receiving OK emails when we haven't received another Warning or Down before that.



Thanks,



Randall




Comments

  • Administrator
    Memory is not being leaked, but more data is being stored in memory, so you could see a change in free memory. We are looking at storing less data in memory (resulting in more disk I/O) so that less memory is used over time. However, a process jumping from 40 MB to 220 MB in a single day, as the other user described, is not possible.



    As described in the knowledge base, data can be reset by removing the files in the /data subdirectory.



    You could import the serverscheck.conf file into Excel by using |X| as the field delimiter.



    Escalated alerts: please reply with a debug log with only data related to that issue.

    http://kb.serverscheck.com/index.php?page=index_v2&id=33&c=6
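As an alternative to the Excel import suggested above, the same split on |X| can be sketched in Python. This is a minimal illustration, not a documented ServersCheck tool: the function names are hypothetical, and the sample lines are invented because the thread does not describe the actual field layout of serverscheck.conf. Only the |X| delimiter comes from the reply.

```python
import csv
import io

DELIMITER = "|X|"  # field delimiter named in the reply above


def conf_to_rows(text):
    """Split every non-empty line of the config text into its fields."""
    return [line.split(DELIMITER) for line in text.splitlines() if line.strip()]


def rows_to_csv(rows):
    """Render the rows as CSV text that a spreadsheet can open directly."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical sample lines; the real field order is an assumption.
    sample = "rule-a|X|PING|X|500\nrule-b|X|PING|X|750\n"
    print(rows_to_csv(conf_to_rows(sample)), end="")
```

With one row per rule, a column such as the ping response time can then be compared across all rules at a glance, which is what the consistency check in the original question needs.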
  • reverett
    Thanks!



    Is the data being stored in memory eventually written to disk? If so, how often? If it's just a matter of adding enough memory to stay afloat until the write to disk, that would be acceptable.



    Sorry, I'll check the KB on the reset.



    I'll try importing the .conf file.



    I'm afraid I don't understand your last response. Will the debug mode allow me to see what data pertains to multiple instances of "OK" alerts without a prior instance of a "Warning" or "Down" alert?



    Thank you for your timely responses!!!



    Randall
  • Administrator
    Some items are written every 15 seconds and some only every hour.



    We are looking into the issue ourselves and trying to find solutions on different levels. One idea is to do a memory reset every 24 hours if the exact issue cannot be located.



    To reset, simply delete the files in the /data subdirectory and the *.rrd files in the c:\serverscheck_databases directory (the first resets the counters you see in the interface, the second the graph data).



    Isolate the rule that generates those multiple OK alerts by pausing all others. Then run it in debug mode as described at the URL I gave. In that mode it creates a file called debug.log and stores all actions it takes there. That information can help us isolate the issue.
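The reset described above (delete the files in the /data subdirectory and the *.rrd files in the graph database directory) could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, both directory paths must be supplied by the caller, and the thread does not say whether the monitoring service has to be stopped first. It defaults to a dry run that only lists what would be deleted.

```python
from pathlib import Path


def reset_targets(data_dir, rrd_dir):
    """List the files the reset would remove: every regular file directly in
    data_dir, plus every *.rrd file directly in rrd_dir."""
    data_files = [p for p in Path(data_dir).iterdir() if p.is_file()]
    rrd_files = [p for p in Path(rrd_dir).glob("*.rrd") if p.is_file()]
    return sorted(data_files) + sorted(rrd_files)


def reset_statistics(data_dir, rrd_dir, dry_run=True):
    """Delete the reset targets and return them; with dry_run=True nothing is deleted."""
    targets = reset_targets(data_dir, rrd_dir)
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets
```

Calling `reset_statistics(data_dir, rrd_dir)` first and reviewing the returned list before repeating the call with `dry_run=False` keeps an accidental wrong path from wiping unrelated files.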
  • reverett
    I searched the KB for resetting the service level or removing files from the /data subdirectory and could not find anything. Do you by chance have a link to that article, or know its KB number or title?



    Thanks!!!
This discussion has been closed.