SIM (System Integrity Monitor) is a useful tool for monitoring services and ensuring that they are running and responding. SIM can also monitor system resources, and it can be set up to send you email alerts.
Servers running cPanel already have a service monitoring script running by default, called chkservd. SIM may still be useful to cPanel users for its monitoring and alert system. If you are running cPanel, you should disable SIM's auto-restart of downed services.
Directions:
1. Download SIM (System Integrity Monitor) from here: http://www.rfxnetworks.com/sim.php
2. Extract the files.
tar -xzvf sim-current.tar.gz
3. Execute the install script 'setup' that was extracted, passing it the '-i' parameter.
You should get something similar to this: (the bolded text is user input, and anything in the square brackets is a description of what is performed)
SIM 2.5-3
This is followed by the usual 'GNU General Public License', which you should read over if you never have before.
If you do not agree with the implied and expressed agreements in the GNU GPL, please terminate your use of this software. Press return, to view the SIM 2.5-3 README. [ENTER]
This is followed by the README file included with the program, which describes the software, installation procedures and configuration.
SIM 2.5-3
So far we have managed to do everything using only the Enter key, but now comes the configuration part. This is where we will go into more depth.
Running the Auto-Config Script
If you are coming from step 3, you do not have to do this, but if you want to redo the configuration later, you can run this script by executing:
Most of the questions are self explanatory.
When the log file reaches this size, the log is renamed and a new one is started, which is how the logs are rotated. SIM logs generally do not grow very fast, so the default value of 128KB may be a convenient number for you.
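The rotation behaviour described above can be sketched in a few lines of shell. This is only an illustration of size-based rotation, not SIM's own code; the function name `rotate_if_larger` and the `.old` suffix are assumptions made for the example.

```shell
# Illustrative sketch of size-based log rotation (not SIM's implementation).
# rotate_if_larger FILE MAX_KB
#   Renames FILE to FILE.old and starts a new, empty FILE once FILE has
#   grown beyond MAX_KB kilobytes. Prints "rotated" or "kept".
rotate_if_larger() {
    file=$1
    max_kb=$2
    [ -f "$file" ] || return 1
    # wc -c reports the size in bytes; tr strips padding some wc versions add
    size_bytes=$(wc -c < "$file" | tr -d ' ')
    if [ "$size_bytes" -gt $((max_kb * 1024)) ]; then
        mv "$file" "$file.old"
        : > "$file"
        echo "rotated"
    else
        echo "kept"
    fi
}
```

Calling `rotate_if_larger /path/to/some.log 128` would mimic the default 128KB threshold; the path here is a placeholder.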
Email to send alerts to
This is the email account where you want to receive notifications. You can use root, or you can send the alerts to an email account on another server (in case an attacker would rather you did not see those alerts).
After how many events to disable email alerts
I have been email flooded by my own server before, so be careful with this value. You can set this value high, and if you do end up getting swarmed you can just lower it later. (Some ISPs may not be happy with your email account getting bombed, so you may want to leave it lower in that case.)
Auto-restart services found to be offline
This is a useful feature, so I recommend setting this value to 'true'. Keep in mind, though, that during high CPU load a service may merely be unresponsive or lagging, in which case restarting it would be unnecessary.
Disable auto-restart after how many downed service events
If the service just won't restart, why keep trying, right? This number is not the number of consecutive failed attempts; it is how many times the service can be found down within a day before SIM gives up.
Name of service in 'ps'
Generally the value to be used here is the same as the name of the script used to restart the service. If you execute '/etc/init.d/httpd restart' to restart the HTTP service, then 'httpd' should be the correct value. You can always check using 'ps' if you know what to look for.
You may need to change the name of the FTP service, since you may be using proftpd, or pure-ftpd. (default is for proftpd)
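One way to check the name is to list the command names that 'ps' reports and filter for the likely candidates. A minimal sketch, assuming a 'ps' that supports the `-e` and `-o comm=` options (the procps version on Linux does); the helper name `match_service_names` is made up for this example, and the candidate names are simply the ones mentioned above.

```shell
# match_service_names 'name1|name2|...'
#   Reads command names on standard input (one per line) and prints each
#   distinct name that exactly matches one of the candidates.
match_service_names() {
    sort -u | grep -E "^($1)\$"
}

# Typical use: feed it the command names of all running processes.
#   ps -e -o comm= | match_service_names 'httpd|proftpd|pure-ftpd'
```

Whatever name is printed is the one to enter at the prompt. If nothing is printed, none of the candidates is currently running under that name.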
Ports for the services
The default values should be correct, unless you set up a service to run on a non-default port (e.g. FTP on 2121).
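If you are unsure which ports your services actually listen on, the output of 'netstat -ltn' can be reduced to a plain list of port numbers. The following is a rough sketch that assumes the Linux net-tools 'netstat' output layout (local address in the fourth column, 'LISTEN' in the last); `listening_ports` is a hypothetical helper name.

```shell
# listening_ports
#   Reads `netstat -ltn` style output on standard input and prints the
#   distinct TCP port numbers in the LISTEN state, in numeric order.
listening_ports() {
    awk '$1 ~ /^tcp/ && $NF == "LISTEN" {
             n = split($4, parts, ":")   # port is the text after the last ":"
             print parts[n]
         }' | sort -un
}

# Typical use:
#   netstat -ltn | listening_ports
```

A non-default port such as 2121 would show up in this list in place of the usual 21.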
Enable this if you want to get notifications when your server load is high. Processes usually start to wait or lag when the server load is higher than the number of CPUs found. (Remember that a HyperThreaded CPU is counted as 2.)
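To get a feel for a sensible threshold, you can compare the current load against the CPU count yourself. This is a small sketch of that comparison, not how SIM performs its check; it assumes a Linux system with /proc/loadavg and /proc/cpuinfo, and `load_exceeds_cpus` is a made-up helper name.

```shell
# load_exceeds_cpus LOAD CPUS
#   Succeeds (exit status 0) when LOAD is greater than CPUS.
#   awk does the comparison because shell arithmetic cannot handle decimals.
load_exceeds_cpus() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { exit !(load + 0 > cpus + 0) }'
}

if [ -r /proc/loadavg ] && [ -r /proc/cpuinfo ]; then
    # First field of /proc/loadavg is the 1-minute load average
    load=$(cut -d ' ' -f 1 /proc/loadavg)
    # Each logical CPU (including HyperThreading siblings) has a "processor" line
    cpus=$(grep -c '^processor' /proc/cpuinfo || true)
    if load_exceeds_cpus "$load" "$cpus"; then
        echo "load $load is higher than the $cpus CPUs found"
    fi
fi
```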
If you decide you want to modify the settings by hand, and not through the script, the file is located at: (default location)
In order for SIM to work properly, it must be executed regularly, and the best method for this is by using a cronjob.
By default, a SIM cronjob is automatically added during setup and set to run every 5 minutes. To add it or remove it later, you can execute this:
After executing, it should tell you whether it was 'Installed' or 'Removed'.