Monitoring_rule.exe seems to have died

jacobsenm
I am getting a lot of the messages shown below in the Monitoring rules.

Has anyone else had the same experience?

ServersCheck works well, but I can't find out what is causing this issue.



Thanks.







# Mon Oct 16 00:05:14 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 00:35:14 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 00:45:48 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 01:12:20 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 01:24:35 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 01:51:11 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 02:03:21 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 02:30:01 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 02:40:32 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 03:11:59 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 03:19:02 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 03:52:06 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 04:00:56 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 04:30:43 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 04:39:32 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 05:09:55 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 05:20:22 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 05:48:56 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 05:59:24 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 06:25:52 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 06:41:44 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 07:03:15 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 07:19:02 2006 Monitoring_rule.exe seems to have died; it will now be restarted

# Mon Oct 16 07:43:44 2006 Monitoring_rule.exe seems to have died; it will now be restarted
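One way to look for a pattern in entries like these is to measure the time between consecutive restarts. The following is a minimal Python sketch, assuming the log lines keep the `# <weekday> <month> <day> <time> <year> <message>` layout shown above and an English locale for the day and month names. The function names `restart_times` and `restart_gaps` are hypothetical helpers, not part of ServersCheck.

```python
from datetime import datetime

RESTART_TEXT = "Monitoring_rule.exe seems to have died"

def restart_times(lines):
    """Return the timestamps of all restart messages, in file order."""
    times = []
    for line in lines:
        if RESTART_TEXT not in line:
            continue
        # Example line: "# Mon Oct 16 00:05:14 2006 Monitoring_rule.exe seems ..."
        # Drop the leading "#" and keep the first five fields (the timestamp).
        stamp = " ".join(line.lstrip("# ").split()[:5])
        times.append(datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y"))
    return times

def restart_gaps(lines):
    """Return the minutes elapsed between consecutive restarts."""
    times = restart_times(lines)
    return [round((b - a).total_seconds() / 60, 1) for a, b in zip(times, times[1:])]

sample = [
    "# Mon Oct 16 00:05:14 2006 Monitoring_rule.exe seems to have died; it will now be restarted",
    "# Mon Oct 16 00:35:14 2006 Monitoring_rule.exe seems to have died; it will now be restarted",
    "# Mon Oct 16 00:45:48 2006 Monitoring_rule.exe seems to have died; it will now be restarted",
]
print(restart_gaps(sample))
```

For the first three entries above this gives gaps of 30.0 and 10.6 minutes. Running it over the whole log would show whether the restarts follow a regular rhythm.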

Comments

  • jacobsenm
    Mistake:

    Not in the monitoring rules. I meant Monitoring_Manager Logs.



    Thanks, Menno
  • Administrator
    Run it in debug mode and look for the error message, which is sent either to the console or to the debug log (some errors are sent to the console only).



    http://kb.serverscheck.com/index.php?page=index_v2&id=33&c=6
  • jacobsenm
    I ran debug mode a few months ago, when I had a ticket created (ticket number 985). I was, and still am, running version 6.4.4. I tried many things to reproduce the errors, including the newest version. I have put ServersCheck on different machines and they all give the same errors. A clean install on Windows XP and on Windows 2000 (without antivirus) did not help.



    The error shown on the console is:

    Socket could not be created: Unknown error



    Since support cannot replicate the issue, I was wondering whether other customers see this behaviour.

    Although ServersCheck continues to work well (it does the self repair), I hate to see these errors appearing.



    In the dashboard, "Monitoring Since" always says that it has been running for only a few hours. That is because of the self repair.







    Thanks.






  • Administrator
    In the dashboard, "Monitoring Since" always says that it has been running for only a few hours. That is because of the self repair.



    => That is incorrect, because it would mean that the monitoring_manager has restarted, whereas your issue indicates that monitoring_rule.exe is being restarted.



    It is indeed correct that the self repair (the internal fail-over within the Enterprise edition) takes care of it.



    I will leave this open so that other users may report it too.
  • jacobsenm
    You are correct. In my case the Monitoring_manager is restarted a couple of times a day by the internal fail-over.
  • Administrator
    The Monitoring Manager starts itself? That does not appear in the log above. How do you know it restarts itself?
  • jacobsenm
    Because I can see it in the logs, see below.





    # Mon Oct 16 15:47:07 2006 Monitoring_rule.exe seems to have died; it will now be restarted

    # Mon Oct 16 16:10:15 2006 Monitoring_rule.exe seems to have died; it will now be restarted

    # M Mon Oct 16 16:16:44 2006 ServersCheck Monitoring Manager

    # M Mon Oct 16 16:16:44 2006 ENTERPRISE MX1 6.4.2

    # M Mon Oct 16 16:16:44 2006 Started OK

    # Mon Oct 16 16:29:35 2006 Monitoring_rule.exe seems to have died; it will now be restarted

    # Mon Oct 16 16:51:03 2006 Monitoring_rule.exe seems to have died; it will now be restarted
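The excerpt above mixes two kinds of events, which is what the question about Monitoring Manager restarts hinges on. As a rough Python sketch, the lines can be separated by type. It assumes, based only on this excerpt, that Monitoring Manager start-up lines carry an extra `M` after the leading `#` and end with `Started OK`. The helper names `classify` and `count_events` are hypothetical.

```python
RESTART_TEXT = "Monitoring_rule.exe seems to have died"

def classify(line):
    """Label one log line as 'manager_start', 'rule_restart', or 'other'."""
    text = line.strip()
    if RESTART_TEXT in text:
        return "rule_restart"
    # Assumption from the excerpt: manager lines start with "# M" and the
    # start-up banner closes with "Started OK".
    if text.startswith("# M ") and text.endswith("Started OK"):
        return "manager_start"
    return "other"

def count_events(lines):
    """Count how many lines fall into each category."""
    counts = {"manager_start": 0, "rule_restart": 0, "other": 0}
    for line in lines:
        counts[classify(line)] += 1
    return counts

excerpt = """
# Mon Oct 16 15:47:07 2006 Monitoring_rule.exe seems to have died; it will now be restarted
# Mon Oct 16 16:10:15 2006 Monitoring_rule.exe seems to have died; it will now be restarted
# M Mon Oct 16 16:16:44 2006 ServersCheck Monitoring Manager
# M Mon Oct 16 16:16:44 2006 ENTERPRISE MX1 6.4.2
# M Mon Oct 16 16:16:44 2006 Started OK
# Mon Oct 16 16:29:35 2006 Monitoring_rule.exe seems to have died; it will now be restarted
# Mon Oct 16 16:51:03 2006 Monitoring_rule.exe seems to have died; it will now be restarted
""".strip().splitlines()

print(count_events(excerpt))
```

On this excerpt the count is one Monitoring Manager start against four monitoring_rule.exe restarts, with the two remaining banner lines counted as "other".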
  • Administrator
    A lot has changed in the internal fail-over, so I don't know whether the Monitoring Manager will still restart as often as you describe.



    The socket errors will still cause the Monitoring Rule to stop.
  • kgiddings
    I am getting the same message.



    # Tue Oct 17 07:13:12 2006 Monitoring_rule.exe seems to have died; it will now be restarted



    The only difference is that I am getting the application error shown below, and I don't get the above message until I click the "OK" button. As you can see below, the application error occurred at 2:36 AM and the above message appeared at 7:13 AM, once I clicked "OK". The bad part is that the monitoring checks do not run until the "OK" button is pressed. I have opened a ticket on this before and was told that support couldn't replicate the problem and that it must be the PC, so I have also installed this on a few different PCs and still get the same message. Please let me know if you are getting this message in your Windows event log. I'm glad I'm not the only one seeing this error. Maybe now someone can fix it.



    Event Type: Information

    Event Source: Application Popup

    Event Category: None

    Event ID: 26

    Date: 10/17/2006

    Time: 2:36:43 AM

    User: N/A

    Computer: ALKEITHG2

    Description:

    Application popup: monitoring_rule.exe - Application Error : The instruction at "0x76082614" referenced memory at "0x00000008". The memory could not be "read".



    Click on OK to terminate the program

    Click on CANCEL to debug the program



    For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.






  • jacobsenm
    Keith,



    Sorry, but I think your issue is different from mine. My event log is clear and has no entries from ServersCheck. I also do not get an OK button to press to continue monitoring.

    Have you tried a different OS as well?



    Thank you, Menno Jacobsen




  • kgiddings
    I have tried 2000 Pro and XP Pro. Are you running ServersCheck Professional or Enterprise? I was told that upgrading to Enterprise would fix my problem because Professional doesn't have self-healing capabilities. If you are running the Enterprise edition, then I guess it makes sense that you're not getting the application error pop-up windows: the software heals itself after the problem. But in both of our cases this is no doubt a problem with the software.
  • jacobsenm
    Keith, I ran ServersCheck on Windows 2000 and moved to Windows XP because of the problem mentioned.

    I have the Enterprise version MX1.



    I also think that it is a ServersCheck issue, and I hope ServersCheck can resolve it. I have already tried many things to troubleshoot the problem.



    I limited the number of checks to about 5 (all PINGs to local systems within the LAN) and even then I got the errors. My idea was to isolate the problem by removing checks one at a time, but I cannot put my finger on the problem.

    Debug mode does not show the exact rule that fails.



    Menno
  • Administrator
    Keith, Menno



    We can quickly determine whether the issue is the same. Keith, do you get Menno's error:

    "Socket could not be created: Unknown error"



    My guess is no as Keith's message is totally different than Menno's.
  • Administrator
    Menno,



    I asked a developer to look at it, and here is the actual situation. On the Windows platform, ServersCheck uses Winsock to create socket-based connections (for ping checks, etc.).



    If Winsock, a Windows component, is unavailable for any reason, then a socket cannot be created, which results in the above error.



    In an upcoming update, ServersCheck will retry creating the socket in the event of a socket failure (for PING checks).
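The retry idea described above can be illustrated with a short Python sketch. This is not ServersCheck's code: the function name `create_socket_with_retry` and the parameters `attempts`, `delay`, and `factory` are assumptions made for illustration, and a plain TCP socket stands in for whatever socket type a given check needs.

```python
import socket
import time

def create_socket_with_retry(attempts=3, delay=0.5, factory=socket.socket):
    """Try to create a TCP socket, retrying when creation raises OSError.

    `attempts`, `delay`, and `factory` are illustrative parameters; `factory`
    exists so that a failing socket layer can be simulated in tests.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return factory(socket.AF_INET, socket.SOCK_STREAM)
        except OSError as exc:
            last_error = exc
            # Wait before the next try, but not after the final attempt.
            if attempt < attempts:
                time.sleep(delay)
    raise RuntimeError(
        f"Socket could not be created after {attempts} attempts"
    ) from last_error
```

Calling `create_socket_with_retry()` returns an ordinary socket object that the caller is responsible for closing. The point of the design is that one transient failure no longer ends the whole process; only repeated failures are reported.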
  • kgiddings
    Well, I had an open ticket, but it was closed again without resolving the issue. I have quite a few checks and also the environmental sensors, so it would be hard for me to start disabling checks to see what is causing it. I do know that it only happens at night and is okay throughout the day. It would be better if it happened during the day, so I could click "OK" and continue monitoring my systems. As for the socket message, it seems he only received it while debugging, so I will have to do that tonight and see what I get. At this point I honestly can't say that I am not getting this message.



    Menno, I too hope that this can get resolved.
  • jacobsenm
    Administrator, this sounds good to me.



    Keith, the "Socket could not be created: Unknown error" message occurs when running ServersCheck in debug mode. The error is shown in a DOS box. See if you have similar issues.



    I run debug mode as follows:

    1/ Stop the ServersCheck Monitoring Service

    2/ Go to the ServersCheck main directory

    3/ Double click on s-graphs.exe (this will start the graphing component)

    4/ Double click on s-alerts.exe (this will start the alerting component)

    5/ Open the command prompt and cd to the main ServersCheck directory

    6/ Type the following in the command prompt:

    monitoring_manager.exe > debug.txt



    Thank you !
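Once the steps above have produced a debug.txt, the file can be searched for the socket error instead of being read by hand. The Python sketch below assumes only that the error text appears on a line of its own, as quoted in this thread; the helper name `find_socket_errors` is hypothetical. As noted earlier in the thread, some errors are sent to the console only, so an empty result does not prove that the error did not occur.

```python
from pathlib import Path

ERROR_TEXT = "Socket could not be created"

def find_socket_errors(path):
    """Return (line_number, line) pairs from the file that mention the socket error."""
    hits = []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if ERROR_TEXT in line:
                hits.append((number, line.rstrip("\n")))
    return hits

if __name__ == "__main__":
    log = Path("debug.txt")
    if log.exists():
        for number, line in find_socket_errors(log):
            print(f"{number}: {line}")
    else:
        print("debug.txt not found; run the debug steps above first")
```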
  • jacobsenm
    Keith,



    I have ServersCheck installed on a test machine (a notebook) to replicate the issue, and started eliminating checks from there. Maybe that is worth doing, so that it does not affect your production system.
  • kgiddings
    Sounds like a good idea, but I have not gone that route due to license restrictions. If I can get the okay from ServersCheck, I would like to do that myself.
  • Administrator
    Menno,



    Can you download the internal build of monitoring_rule.exe, which will hopefully prevent monitoring_rule from dying because of the socket error?



    http://www.serverscheck.com/files/monitoring_rule.zip



    Can you let me know what types of checks other than PING you use?
  • Administrator
    Keith,



    There is no problem with running the same license on a backup machine while isolating the issue.
  • jacobsenm
    I will do the test using the new monitoring_rule.zip. I need some time to set it up on my test machine. Today is very hectic, so I will probably do this tomorrow. I think this file can only be used on version 6.7.x, right?



    Thanks, Menno



    By the way, the checks that I perform are:

    TCP

    ODBC

    PING

    PINGAVG

    DRIVESPACE

    EVENTLOG

    URLEXISTS

    DNS

    FTP

    LINUXDISK

    PROCESS

    SERVICES

    SNMP

    MEMORY

    CPU

    SMTPPOP3

    HTTPSTATUS
  • Administrator
    Yes, it is only for 6.7 deployments.



    The following checks use Winsock:

    TCP

    PING

    DNS

    FTP

    SNMP

    SMTPPOP3



    We will do some testing on a test machine to see whether the same error occurs.
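Putting the two lists from this thread side by side narrows down which configured checks could be affected by a Winsock failure. The Python sketch below uses only the check names posted above; the function name `split_by_winsock` is a hypothetical helper. PINGAVG is not on the Winsock list as posted, so it is treated as "other" here even though the thread does not say whether it uses Winsock.

```python
# Checks Menno listed earlier in the thread.
configured_checks = [
    "TCP", "ODBC", "PING", "PINGAVG", "DRIVESPACE", "EVENTLOG", "URLEXISTS",
    "DNS", "FTP", "LINUXDISK", "PROCESS", "SERVICES", "SNMP", "MEMORY",
    "CPU", "SMTPPOP3", "HTTPSTATUS",
]

# Checks the Administrator says use Winsock.
winsock_checks = {"TCP", "PING", "DNS", "FTP", "SNMP", "SMTPPOP3"}

def split_by_winsock(checks, winsock):
    """Split the configured checks into Winsock-based and other checks."""
    uses = [name for name in checks if name in winsock]
    others = [name for name in checks if name not in winsock]
    return uses, others

uses, others = split_by_winsock(configured_checks, winsock_checks)
print("Winsock:", uses)
print("Other:", others)
```

Disabling the Winsock group first, rather than removing checks one at a time, would be one way to test whether the socket error is tied to these check types.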
  • jacobsenm
    Administrator / Keith

    I have set up SC v6.7.1 and applied the monitoring_rule.exe patch. Tomorrow there will be more news about the died processes. For now it is running fine, but usually the logs fill up with the messages after 4-5 hours.



    Thanks
  • Administrator
    In the meantime, I checked with a developer, and the "died processes" are most likely linked to the following:

    In the Enterprise edition, the internal fail-over watches the performance and response times of monitoring_rule.exe.



    If it is deemed not to be answering properly, then the thread is killed. The monitoring manager detects the killed thread, logs the line

    "Monitoring_rule.exe seems to have died; it will now be restarted"

    and starts a new thread.



    When this happens, an entry should also appear in the watcher.log file.
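The behaviour described above, a watcher that restarts a worker which stops answering in time, can be modelled with a small Python sketch. This is a toy model and not ServersCheck's implementation: it collapses the watcher and the monitoring manager into one hypothetical `Watchdog` class, and the 30-second timeout is an invented value.

```python
import time

class Watchdog:
    """Toy model of a fail-over watcher; not ServersCheck's implementation."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock          # injectable so the demo can fake time
        self.last_reply = clock()
        self.log = []

    def reply_received(self):
        """Record that the worker answered just now."""
        self.last_reply = self.clock()

    def check(self, restart_worker):
        """Restart the worker if it has been silent for longer than the timeout."""
        if self.clock() - self.last_reply <= self.timeout:
            return False
        self.log.append("Monitoring_rule.exe seems to have died; it will now be restarted")
        restart_worker()
        self.last_reply = self.clock()
        return True

# Demo with a fake clock so the example runs instantly.
now = [0.0]
restarts = []
dog = Watchdog(timeout_seconds=30, clock=lambda: now[0])

now[0] = 20.0
dog.check(lambda: restarts.append(now[0]))  # 20 s of silence: within the timeout

now[0] = 55.0
dog.check(lambda: restarts.append(now[0]))  # 55 s of silence: worker is restarted

print(restarts, dog.log)
```

In this model a slow worker produces the same log line as a crashed one, which matches the explanation that the message does not necessarily mean the process crashed on its own.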
  • jacobsenm
    On my test machine (v6.7.1) I get the same messages again:

    # Thu Oct 19 14:57:25 2006 Monitoring_rule.exe seems to have died; it will now be restarted

    # Thu Oct 19 14:59:40 2006 Monitoring_rule.exe seems to have died; it will now be restarted

    # Thu Oct 19 16:34:48 2006 Monitoring_rule.exe seems to have died; it will now be restarted

    # Thu Oct 19 16:43:07 2006 Monitoring_rule.exe seems to have died; it will now be restarted

    # Thu Oct 19 16:45:07 2006 Monitoring_rule.exe seems to have died; it will now be restarted





    Administrator, do you have other suggestions that I can test?



    Thank you
  • Administrator
    But do you still get the socket errors?



    I explained the "Monitoring_rule.exe seems to have died; it will now be restarted" message above.



    As it causes confusion, that type of logging will be removed in the next release.
  • jacobsenm
    I had not put ServersCheck in debug mode. I have done so now and will send you updates later today.



    Thanks
This discussion has been closed.