Windows checks failing, but testing = ok

jamesb
Hello



Installed version: Professional Edition



It has been working for about 1-2 months. Today I had to add some more devices and monitors, taking my count from 74 to 96 monitors.



All of a sudden all my Windows checks fail, across the board. Some have come back up, but most Windows checks are now red. If you "test" the Windows health monitor it returns status OK, yet the check still shows as down on the hosts.



I have restarted the machine twice and increased the memory by 1 GB (now 4 GB in total). The service "Monitoring_thread2.exe" grabs up to 1.9 GB of memory every time it runs, then the usage just drops off.



Some of the error messages presented on the alert page:

"Not enough storage is available to complete this operation": I have 31 GB of hard drive space on the machine and should have a spare 1 GB of memory now.

"The RPC server is unavailable": yet the test returns OK, and ping and bandwidth monitoring are fine.

"The authentication service is unknown": same as above, the test returns OK.

Another message, not related to the Windows checks, comes from a bandwidth checker that has been working fine: "The method "_addr_loopback" is not supported by this Transport Domain"



This machine has 4 cores and 4 GB of memory, runs Windows Server 2008 R2 Standard (64-bit), and only runs this software (except at 2am, when it does something else for an hour).



As I have been writing this out, all the Windows checks are failing again. I run a second server monitoring tool that is reporting everything is fine. Could I get some assistance as to what might be causing this?

Comments

  • jamesb
    OK, progress update. Because I need this working, I deleted a number of hosts. I am now only looking at around 20 machines with a total of 83 monitors, and my Windows hosts have all come back up.



    Could you confirm whether this is a resource issue on the machine with anything over 85 monitors? I cannot find a minimum spec on the website.
  • Administrator
    Hello,



    What version are you running? Can you provide a debug as well?
  • jamesb
    8.8.11



    To give you some more details, I pushed the monitors up to 95 and 5 of the windowshealth checks started to fail again. I deleted one host (with a windowshealth check) and now I am back to all green. It seems I have hit a limit at 93.



    I will try to get some logs from a debug.
  • Administrator
    1.9 GB of RAM is not abnormal for Monitoring_Thread2.exe. If you'd like to lower that, simply configure it to run in half speed mode, which will still be more than sufficient for your setup.



    With respect to the errors:

    "Not enough storage is available to complete this operation": this is a Windows kernel error and is not related to the actual storage available on the hard drive.

    "RPC server unavailable" is a Windows security issue. The following article might help: http://wiki.serverscheck.com/index.php/Windows_Errors



    Same applies to "The authentication service is unknown"



    If the TEST SETTINGS works fine, then it is most probably the result of incorrect settings for the ServersCheck Monitoring service. See the service account settings instructions: http://wiki.serverscheck.com/index.php/Configuring_Service_for_Windows_Checks

    "The method "_addr_loopback" is not supported by this Transport Domain" => this indicates a problem communicating with your remote system over the SNMP protocol: either the remote host does not support it, or a firewall in between is blocking the UDP communications.

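For anyone chasing the "The RPC server is unavailable" error discussed above: a successful ping does not show that the RPC port itself is reachable. The snippet below is only a rough sketch, not part of ServersCheck, that tries a plain TCP connection to port 135 (the standard Windows RPC endpoint mapper port) from the monitoring machine. The function names and the host names in the usage note are made up for illustration, and a successful connection says nothing about whether the service account settings are correct.

```python
import socket

# TCP 135 is the standard Windows RPC endpoint mapper port.
RPC_ENDPOINT_MAPPER_PORT = 135


def tcp_port_open(host, port=RPC_ENDPOINT_MAPPER_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def report(hosts, port=RPC_ENDPOINT_MAPPER_PORT):
    """Build one status line per host (hypothetical helper for illustration)."""
    lines = []
    for host in hosts:
        state = "reachable" if tcp_port_open(host, port) else "NOT reachable"
        lines.append(f"{host}: TCP {port} {state}")
    return lines
```

To try it, call something like `print("\n".join(report(["winhost1.example.local", "winhost2.example.local"])))` with your own Windows host names. If the port is reachable from the monitoring machine and the checks still fail, that points back towards the service account configuration rather than the network.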

This discussion has been closed.