How to Build a Distributed Monitoring Solution with Nagios

  
  
  

With Nagios, the leading open source infrastructure monitoring application, you can monitor your whole enterprise by using a distributed monitoring scheme in which local slave instances of Nagios perform monitoring tasks and report the results back to a single master. You manage all configuration, notification, and reporting from the master, while the slaves do all the work.



This design takes advantage of Nagios's ability to utilize passive checks – that is, external applications or processes that send results back to Nagios. In a distributed configuration, these external applications are other instances of Nagios.



Why use a distributed configuration? You might need a distributed solution if you have hosts and services on a separate LAN that are not reachable by your main Nagios instance. A distributed implementation also gives you enhanced privacy, in that users can be defined to access only certain services and hosts. You get centralized configuration and notification, giving you a single consolidated view of the status of all your devices and subnets.



A distributed solution is often called a master/slave solution. On the master you have a copy of every service that you want to check on the slaves, but the copy on the master has the active check disabled and notification enabled, while on the slaves both active and passive checks are enabled and notification is disabled.



A look at configuration files for a slave and a master, which in this case check a web page, helps illustrate the difference. You can usually find them in the Nagios configuration directory, often in /etc/nagios3/conf.d/services.cfg:



Slave Configuration:




# Generic service definition template
define service{
    name                        generic-service
    active_checks_enabled       1       ; the slave runs the checks itself
    passive_checks_enabled      1
    parallelize_check           1
    obsess_over_service         1       ; run the OCSP command after each check
    check_freshness             0
    notifications_enabled       0       ; the slave never notifies anyone
    event_handler_enabled       1
    flap_detection_enabled      1
    process_perf_data           1
    register                    0       ; template only, not a real service
}

# Service definition for http
define service{
    use                         generic-service
    host_name                   www.mysite.com
    service_description         HTTP
    is_volatile                 0
    check_period                24x7
    max_check_attempts          3
    normal_check_interval       1
    retry_check_interval        5
    contact_groups              admins,webmaster
    notification_options        w,u,c,r
    notification_interval       960
    notification_period         never
    check_command               check_http
}



Master Configuration:



# Generic service definition template
define service{
    name                        generic-service
    active_checks_enabled       0       ; the master only receives results
    passive_checks_enabled      1
    parallelize_check           1
    obsess_over_service         1
    check_freshness             0
    notifications_enabled       1       ; all notifications come from the master
    event_handler_enabled       1
    flap_detection_enabled      1
    process_perf_data           1
    register                    0       ; template only, not a real service
}

# Service definition for http
define service{
    use                         generic-service
    host_name                   www.mysite.com
    service_description         HTTP
    is_volatile                 0
    check_period                24x7
    max_check_attempts          3
    normal_check_interval       1
    retry_check_interval        5
    contact_groups              admins,webmaster
    notification_options        w,u,c,r
    notification_interval       960
    notification_period         24x7
    check_command               check_http
}



A good candidate to become a Nagios slave is one close (network-wise) to the services you want to monitor. A good master must be reachable by all of its slaves. The hardware resources required both for slaves and master depend on the number and kind of checks you do. Usually Nagios is not especially resource-hungry, so you should be able to monitor up to 5,000 services from any slave and collect 20,000 to 30,000 services on a master.




Once you have chosen the appropriate machines, follow the standard Nagios installation procedure on both types of Nagios hosts, editing the configuration files as above.



Connecting the Pieces



The basic concept behind connecting all the pieces is the passive service check, implemented with the Nagios Service Check Acceptor (NSCA). This tool runs a daemon on the master that waits for information about services. Slaves use the NSCA client, send_nsca, to send information about their services to the master. To do that, you specify in the main configuration file what Nagios calls an obsessive compulsive service processor (OCSP) command, which is executed after every service check; the obsessive compulsive host processor (OCHP) command does the same for host checks. In its most basic form, the relevant part of the configuration file for a Nagios slave might look like this:



# /etc/nagios/nagios.cfg
. . .
obsess_over_services=1
ocsp_command=submit_service_check
ocsp_timeout=5
obsess_over_hosts=1
ochp_command=submit_host_check
ochp_timeout=5




You then have to define the OCSP command submit_service_check in the /etc/nagios/conf.d/commands.cfg file like this:




define command{
    command_name    submit_service_check
    command_line    /usr/lib/nagios/plugins/submit_service_check.sh $HOSTNAME$ '$SERVICEDESC$' $SERVICESTATEID$ '$SERVICEOUTPUT$'
}


Finally, you need the shell script you defined for the command_line. The following could be your /usr/lib/nagios/plugins/submit_service_check.sh:




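#!/bin/bash
# Arguments: host name, service description, state ID, plugin output.
# MASTER_HOST is a placeholder; set it to the address of your Nagios master.
MASTER_HOST="nagios-master.example.com"
/usr/bin/printf "%s\t%s\t%s\t%s\n" "$1" "$2" "$3" "$4" | /usr/lib/nagios/plugins/send_nsca -H "$MASTER_HOST" -c /usr/lib/nagios/send_nsca.cfg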
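
The nagios.cfg fragment above also names an OCHP command, submit_host_check, which needs its own command definition and script. The following is a minimal sketch of such a script, not something taken from a Nagios package. It assumes a command_line that passes $HOSTNAME$, $HOSTSTATEID$, and '$HOSTOUTPUT$', and it relies on send_nsca accepting a three-field line (host name, return code, plugin output) for host results. The script name, the function names, and the master address are placeholders.

```shell
#!/bin/bash
# Hypothetical /usr/lib/nagios/plugins/submit_host_check.sh
# Assumed arguments: $1 host name, $2 host state ID, $3 plugin output.

# Build one tab-separated line in the three-field form used for host results.
format_host_result() {
    printf '%s\t%s\t%s\n' "$1" "$2" "$3"
}

# Pipe the formatted line to send_nsca. MASTER_HOST is a placeholder
# that must be set to the address of your Nagios master.
submit_host_check() {
    format_host_result "$1" "$2" "$3" |
        /usr/lib/nagios/plugins/send_nsca \
            -H "${MASTER_HOST:-nagios-master.example.com}" \
            -c /usr/lib/nagios/send_nsca.cfg
}

# Submit only when called with all three arguments, so the file can
# also be sourced without sending anything.
if [ "$#" -ge 3 ]; then
    submit_host_check "$@"
fi
```

The matching command definition would follow the same pattern as submit_service_check, with the host macros in place of the service macros.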


While simple, this solution is inefficient and uses a lot of resources, because for every check on every service the slave must start a new shell process and run send_nsca. Unless you have fewer than 100 services on the slave, don't use it. Instead, try one of these more efficient alternatives:




    • OCSP Sweeper is a utility that runs on the slave and creates a FIFO queue to which OCSP events are sent. It reads the contents of the queue every N seconds and sends the data to the NSCA on the master.

    • With OCP_Daemon, Nagios writes host and service check data into a named pipe instead of running a command every time to send the information to the master. A daemon that polls the pipe takes care of sending the data to the master Nagios server.



If you use either alternative, edit the file /etc/nsca/nsca.conf on the master and set the aggregate_writes option for the NSCA daemon to 1. With this set, NSCA processes multiple results at one time, which gives you a small performance boost on the master.
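
The queue-and-flush idea behind these tools can be sketched in a few lines of shell. This is not the code of OCSP Sweeper or OCP_Daemon: it swaps the FIFO for a plain spool file to keep the example short. The names SPOOL, SEND_CMD, queue_result, and flush_queue, as well as the spool path and the master address, are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch only: a spool file stands in for the FIFO used by the real tools.
# SPOOL and SEND_CMD are hypothetical settings; adjust both for your site.
SPOOL="${SPOOL:-/var/spool/nagios/ocsp.queue}"
SEND_CMD="${SEND_CMD:-/usr/lib/nagios/plugins/send_nsca -H nagios-master.example.com -c /usr/lib/nagios/send_nsca.cfg}"

# Cheap OCSP command body: append one result line to the spool.
# No network connection is opened here.
queue_result() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4" >> "$SPOOL"
}

# Move the spool aside and hand all queued lines to a single sender
# invocation. If sending fails, the batch file is left in place.
flush_queue() {
    [ -s "$SPOOL" ] || return 0
    local batch="$SPOOL.$$"
    mv "$SPOOL" "$batch" || return 1
    $SEND_CMD < "$batch" && rm -f "$batch"
}
```

On a slave, the OCSP command would call queue_result with the same four macros as before, and a background loop such as `while sleep 15; do flush_queue; done` would send the batches. The dedicated tools handle failures and restarts more carefully than this sketch does.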



Central Configuration



Now you know all you need to know to set up service checks on the slaves and send information from the slaves to the master.



A benefit of a master/slave configuration is the ability to centrally configure all the Nagios nodes, both master and slaves. There are many ways to do this.



One of my favorite ways to manage distributed Nagios configuration is to use a version control system (VCS) such as Subversion. In this setup you store all the configurations under the VCS (which is a good practice anyway, to keep your configuration file with a version number and a change history). The various Nagios sites each have their own directories where they can put their files; I suggest a setup like this:



/etc/nagios/conf.d/
    master/
    site1/
    site2/
    ...


In this way the people in charge of each site can manage only their files and commit them to the main repository once they're done. You can also add a hook to this operation to update all the other sites.



This configuration could work but is really hard to maintain if many people work on it, and you can encounter problems with templates and names that overlap. To keep a solution like this working you need to enforce a strong configuration policy, such as requiring the use of the fully qualified domain name of the server as a prefix for every check name.
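
A naming policy like this is easier to enforce when a script checks it, for example from a commit hook. The following is a minimal sketch that assumes the policy applies to template names (the name directive) and that the prefix is passed in by the caller. The function name find_unprefixed_names is hypothetical.

```shell
#!/bin/bash
# Print every template "name" value in a Nagios config file that does
# not start with the required prefix. Prints nothing if all names comply.
# Usage: find_unprefixed_names PREFIX FILE
find_unprefixed_names() {
    local prefix="$1" file="$2"
    awk -v p="$prefix" '
        { sub(/;.*/, "") }                        # drop trailing ";" comments
        $1 == "name" && index($2, p) != 1 { print $2 }
    ' "$file"
}
```

A hook could run, for instance, `find_unprefixed_names site1.example.com- /etc/nagios/conf.d/site1/services.cfg` and reject the commit when the output is not empty. The prefix and path in that call are examples only.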



Another approach is to use a configuration tool that can manage multiple Nagios installations, such as NagiosQL, opcfg, or Nconf. I've personally tested and used NagiosQL (full disclosure: I also help the project with the Italian translation). With it (or with the other projects) you can configure templates, checks, and services for all your Nagios installations from a single point. NagiosQL supports FTP and SCP to copy files remotely, and keeps all the configurations in a MySQL database.



Performance and Privacy



Now you're in business, running slaves that report information back to your master. You might think, "I should display all the performance graphs on the master so I can check all the information there." I had this idea too, but usually it will lead to a disaster. Your master is the machine that aggregates all the checks, so in most cases displaying the graphs there too will slow down the machine and make it useless for its real job of reporting and notification. Instead, I suggest you keep the performance tools on the slaves, because every peripheral Nagios will have less work than the master. This way you won't have to worry about sending performance information to the master. But I know, you really want all the information there. You can do that, but to do it right you should think about things like disk optimization, using a RAM disk, and parallel processes – topics beyond the scope of this article.



You can install any of several plugins to monitor performance on your slaves. Some of the most common are PNP4Nagios, nagiosgraph, and n2rrd.



As I mentioned earlier, having multiple Nagios installations can also help you with privacy. On the master, you can define contacts who see in the web front end only the information coming from particular slaves. Alternatively, you could define a contact on a slave who would be able to see only the services defined there.



Setting up a distributed monitoring solution with Nagios, and using tools and plugins to make your task easier, can give you many benefits, but you need careful planning and strict policy guidelines for the staff who manage the configurations. I suggest using a graphical configuration tool such as NagiosQL that supports templates; using templates makes it easier to keep things in order.





This work is licensed under a Creative Commons Attribution 3.0 Unported License
