Website Monitoring with Check_MK

Most enterprises or ISPs are more concerned with "How do I monitor hundreds or thousands of servers, which may or may not be web servers?" than with "How do I monitor 10k+ Apache Vhosts on our 4-node web cluster?"

Both are valid concerns. I'll try to give ideas for both scenarios.

There's no ideal solution, especially since app monitoring has been a stepchild in Check_MK until recently.

Checking a plain old website shouldn't be all too hard though - and it isn't!

 

Discuss this!

This article is a work in progress, and the experiences and best practices in it need more discussion. Join the discussion in the #Check_MK IRC channel on freenode. Serverfault.com might also be a good place.

 

 

The HTTP checks

Check_MK's WATO GUI lets you configure HTTP checks. Behind the scenes, it uses the classical check_http Nagios plugin.

The plugin is really powerful; what Check_MK adds is a nice GUI with input validation, so misconfiguring the check is really hard.

The checks will show up like this:

No, my websites aren't that slow.

Monitored via Wifi from my tablet :)

Here you see the actual rulesets:

Go to "Active Checks". In that menu you can select "Check HTTP service" and configure the check.


 

I tried to split the checks into three groups to allow appropriate reaction and different intervals.

  • Default Vhost
  • Website availability per-Vhost
  • SSL Cert validity per Vhost

I've taken the time to do a (carefully chosen) content query on each per-Vhost availability check. Querying for the domain name in the body might be silly, since it could also appear in the error message of a 40x or 500 error. Instead I try to find things like brand slogans.
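To make the point about content queries concrete, here is a small sketch in plain Python. It is not Check_MK code: the helper body_contains, the sample pages and the slogan are all made up for illustration, and the real check is done by check_http with its expected-string option.

```python
# Illustration only: body_contains, the sample pages and the slogan are
# hypothetical. The real content query is performed by check_http.

def body_contains(body, expected):
    """Return True if the expected text occurs in the response body."""
    return expected in body

DOMAIN = "demosites.public.de"
SLOGAN = "Hosting that just works"  # made-up brand slogan

good_page = "<html><h1>Demo Sites</h1><p>Hosting that just works</p></html>"
error_page = (
    "<html><h1>500 Internal Server Error</h1>"
    "<p>demosites.public.de could not process your request</p></html>"
)

# Searching for the domain name also matches the error page (false positive).
domain_found_on_error = body_contains(error_page, DOMAIN)

# Searching for the slogan only matches the healthy page.
slogan_found_on_error = body_contains(error_page, SLOGAN)
slogan_found_on_good = body_contains(good_page, SLOGAN)
```

The domain-name query cannot tell the two pages apart, while the slogan query can.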

The Apache Check


Process inventory

Using inventory_processes_perf:

# webservers
inventory_processes_perf += [
    ( "Apache2", "~.*sbin/(apache2|httpd)", ANY_USER, 1, 4, 90, 250),
# add tomcat
# add nginx
# add ruby thing
# add lighty
]
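If you are unsure which processes the pattern in the rule above would pick up, you can try it out in plain Python first. This is only a sketch: the command lines are made up, and it assumes that the leading "~" marks the rest of the string as a regular expression that is matched from the start of the command line.

```python
import re

# Pattern from the rule above, without the leading "~" marker.
APACHE_PATTERN = re.compile(r".*sbin/(apache2|httpd)")

# Made-up command lines as they might appear in a process table.
samples = [
    "/usr/sbin/apache2 -k start",
    "/usr/sbin/httpd -DFOREGROUND",
    "/usr/sbin/nginx -g daemon off;",
    "/usr/bin/python /opt/app/httpd_wrapper.py",
]

def matches_apache(cmdline):
    # re.match() anchors at the beginning of the string (assumed behaviour).
    return APACHE_PATTERN.match(cmdline) is not None

matched = [cmdline for cmdline in samples if matches_apache(cmdline)]
```

Only the apache2 and httpd binaries below an sbin directory end up in matched; the nginx process and the wrapper script do not.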

 

mk_apache

Can't show a GUI screenshot of that atm, I don't want to update my tablet install for it.

 

 

Further Topics

Config file, check deployment on server

ext_status enabling

Include note about IPv6 issues.

 

 

Questions to think about:

 

23:04 < hostdream> hello
23:04 < hostdream> what's the best practice to use check_http on all web server 
and to test all website running on servers ?
23:14 < darkfader> i have an unfinished article on my wiki about that, i'll add 
a screenshot of the wato rules i conf'd
23:16 < darkfader> if you wanna monitor many vhosts that will not work as 
simple, since i manually added those sites
23:16 < darkfader> and the "vhost name" field in the gui can't autofill
23:16 < darkfader> so you'll need to invest some work there
23:16 < darkfader> http://confluence.wartungsfenster.de/display/Adminspace/Website+Monitoring+with+Check_MK
23:17 < darkfader> screenshot of what i monitor and how it's configured
23:17 < darkfader> and see the note about mk_apache further down, that's an 
interesting piece, although it didn't have alerting support 
when i last tried it
23:19 < darkfader> so what you need to think about: if you have 100 vhosts / 
sites on a webserver, how will your monitoring know
23:19 < darkfader> if you don't need per vhost check it will be very very easy 
though
23:20 < darkfader> assign http check to all servers tagged "web", done :)

 

Let's use a list of Vhosts!

1. The "legacy" check 

HTTP Vhost checks using inline scripting and a classical Nagios check:

websites.mk
OMD[sitename]:~/etc/check_mk/conf.d$ cat websites.mk 
# define a check command for this job
extra_nagios_conf += """
define command {
    command_name check_httpvhost
    command_line $USER1$/check_http -H $ARG1$
}
"""

_vhosts = {
"webserver1": "demosites.public.de",
"webserver2": "www.website1.de",
}

for _host in _vhosts.keys():
   legacy_checks += [ 
      (( "check_httpvhost!%s" % _vhosts[_host], "Vhost_%s" % _vhosts[_host], True), [ "websites" ])
   ]

The great thing is that you can attach more than one check like this. For example, add one that checks the SSL certs and another that does a DNS check on the physical webserver, all from the same inline script.

Don't forget to set a longer interval for the SSL and DNS checks, unless they really need to run every minute.

2. 2012 style... "active checks"

Using inline scripting and the more modern "active checks". This is the same method as WATO uses and can be further enhanced to be visible in WATO.

websites.mk
OMD[sitename]:~/etc/check_mk/conf.d$ cat websites.mk 
_vhosts = {
"webserver1": "demosites.public.de",
"webserver2": "www.website1.de",
}

for _host in _vhosts.keys():
    active_checks['http'] = [
        (( u'Vhost_%s' % _vhosts[_host], {
            'virthost': (_vhosts[_host], False),
            'no_body': True,
            'uri': '/',
            'omit_ip': True,
            'onredirect': 'follow',
            'response_time': (600.0, 3600.0),
        }), [], [ "websites" ] ),
    ] + active_checks['http']

Note: this method may use the $HOSTADDRESS$ Nagios macro, so it won't be usable if your host addresses are all set to 127.0.0.1.

 

3. What's next?

Dynamic data collection:

You can also import python modules during config compile time.

Use an underscore prefix so the modules are not loaded into Check_MK config/precompiled checks.

import os as _os

This way you can dynamically load the list of vhosts from a different location (a file, or redis, ...)

 

Addressing multiple vhosts more easily

Change the data structure just a little:

nesting
_vhosts = {
    "webserver1": [
        ("demosite1.public.de", "https"),
        ("demosite2.public.de", "http"),
        ("demosite3.public.de", "http"),
    ],
    "webserver2": [("www.website1.de", "https")],
}

 

 

Beta version:

_webservers = {
"vm001.xyz.de": [ "demosites.public.de", "www.website1.de" ],
}


def add_site(_host, _vhost):
    # Prepend one HTTP check for this vhost, bound to the given host.
    active_checks['http'] = [
        (( u'Vhost_%s' % _vhost, {
            'virthost': (_vhost, False),
            'no_body' : True, 'uri': '/',
            'omit_ip' : True, 'response_time' : (600.0, 3600.0),
        }), [], [ _host ] ),
    ] + active_checks['http']

# One check per vhost on each webserver.
for _host, _vhosts in _webservers.items():
    for _vhost in _vhosts:
        add_site(_host, _vhost)