Automatic Inventory

Now I have four machines.  Keeping them in sync is the challenge.  Worse yet, knowing whether they are in sync or out of sync is a challenge.

So the first step is to make a tool to inventory each machine.  In order to use the inventory utility in a scalable way, I want to design it to produce machine-readable results so that I can easily incorporate them into whatever I need.

What I want is a representation that is both friendly to humans and to computers.  This suggests a self-describing text representation like XML or JSON.  After a little thought I picked JSON.

What sorts of things do I want to know about the machine?  Well, let’s start with the hardware and the operating system software plus things like the quantity of RAM and other system resources.  Some of that information is available from uname and other is availble from the sysinfo(2) function.

To get the information from the sysinfo(2) function I had to do several things:

  • Install sysinfo on each machine
    • sudo apt-get install sysinfo
  • Write a little program to call sysinfo(2) and report out the results
    • getSysinfo.c

Of course this program, getSysinfo.c is a quick-and-dirty – the error handling is almost nonexistent and I ought to have generalized the mechanism to work from a data structure that includes the name of the flag and the attribute name and doesn’t have the clumsy sequence of if statements.

/*
 * getSysinfo.c
 *
 * $Id: getSysinfo.c,v 1.4 2014/08/31 17:29:43 marc Exp $
 *
 * Started 2014-08-31 by Marc Donner
 *
 * Using the sysinfo(2) call to report on system information
 *
 */

#include <stdio.h> /* for printf */
#include <stdlib.h> /* for exit */
#include <unistd.h> /* for getopt */
#include <sys/sysinfo.h> /* for sysinfo */

int main(int argc, char **argv) {

   /* Call the sysinfo(2) system call with a pointer to a structure */
   /* and then display the results */
   struct sysinfo toDisplay;
   int rc;

   if ( rc = sysinfo(&toDisplay) ) {
      printf("  rc: %dn", rc);
      exit(rc);
   }

   int c;
   int opt_a = 0;
   int opt_b = 0;
   int opt_f = 0;
   int opt_g = 0;
   int opt_h = 0;
   int opt_m = 0;
   int opt_r = 0;
   int opt_s = 0;
   int opt_u = 0;
   int opt_w = 0;
   int opt_help = 0;
   int opt_none = 1;

   while ( (c = getopt(argc, argv, "abfghmrsuw?")) != -1) {
      opt_none = 0;
      switch (c) {
         case 'a':
            opt_a = 1;
            break;
         case 'b':
            opt_b = 1;
            break;
         case 'f':
            opt_f = 1;
            break;
         case 'g':
            opt_g = 1;
            break;
         case 'h':
            opt_h = 1;
            break;
         case 'm':
            opt_m = 1;
            break;
         case 'r':
            opt_r = 1;
            break;
         case 's':
            opt_s = 1;
            break;
         case 'u':
            opt_u = 1;
            break;
         case 'w':
            opt_w = 1;
            break;
         case '?':
            opt_help = 1;
            break;
      }
   }

   if ( opt_none || opt_help ) {
      showHelp();
      return 100;
   } else {
      if ( opt_u || opt_a ) { printf("  "uptime": %lun", toDisplay.uptime); }
      if ( opt_r || opt_a ) { printf("  "totalram": %lun", toDisplay.totalram); }
      if ( opt_f || opt_a ) { printf("  "freeram": %lun", toDisplay.freeram); }
      if ( opt_b || opt_a ) { printf("  "bufferram": %lun", toDisplay.bufferram); }
      if ( opt_s || opt_a ) { printf("  "sharedram": %lun", toDisplay.sharedram); }
      if ( opt_w || opt_a ) { printf("  "totalswap": %lun", toDisplay.totalswap); }
      if ( opt_g || opt_a ) { printf("  "freeswap": %lun", toDisplay.freeswap); }
      if ( opt_h || opt_a ) { printf("  "totalhigh": %lun", toDisplay.totalhigh); }
      if ( opt_m || opt_a ) { printf("  "mem_unit": %dn", toDisplay.mem_unit); }
      return 0;
   }
}

int showHelp() {
   printf( "Syntax: getSysinfo [options]n" );
   printf( "nDisplay results from the sysinfo(2) result structurenn" );
   printf( "Options:n" );
   printf( " -b : bufferramn" );
   printf( " -f : freeramn" );
   printf( " -g : freeswapn" );
   printf( " -h : totalhighn" );
   printf( " -m : mem_unitn" );
   printf( " -r : totalramn" );
   printf( " -s : sharedramn" );
   printf( " -u : uptimen" );
   printf( " -w : totalswapnn" );
   printf( "getSysinfo also accepts arbitrary combinations of permitted options." );
   return 100;
}

And with this in place, the python program sysinfo.py required to pull together various other bits and pieces becomes possible:

#
# sysinfo
#
# report a JSON object describing the current system
#
# $Id: sysinfo.py,v 1.8 2014/08/31 21:04:30 marc Exp $
#

from subprocess import call
from subprocess import check_output
import time

# First we get the uname information
#
# kernel_name : -s
# nodename : -n
# kernel_release : -r
# kernel_version : -v
# machine : -m
# processor : -p
# hardware_platform : -i
# operating_system : -o
#

operating_system = check_output( ["uname", "-o"] ).rstrip()
kernel_name = check_output( ["uname", "-s"] ).rstrip()
kernel_release = check_output( ["uname", "-r"] ).rstrip()
kernel_version = check_output( ["uname", "-v"] ).rstrip()
nodename = check_output( ["uname", "-n"] ).rstrip()
machine = check_output( ["uname", "-m"] ).rstrip()
processor = check_output( ["uname", "-p"] ).rstrip()
hardware_platform = check_output( ["uname", "-i"] ).rstrip()

# now we get the boot time using who -b
boot_time = check_output( ["who", "-b"]).rstrip().lstrip()

# now we get information from our handy-dandy getSysinfo program
GETSYSINFO = "/home/marc/projects/s/sysinfo/getSysinfo"
getsysinfo_uptime = check_output( [GETSYSINFO, "-u"] ).rstrip().lstrip()
getsysinfo_totalram = check_output( [GETSYSINFO, "-r"] ).rstrip().lstrip()
getsysinfo_freeram = check_output( [GETSYSINFO, "-f"] ).rstrip().lstrip()
getsysinfo_bufferrram = check_output( [GETSYSINFO, "-b"] ).rstrip().lstrip()
getsysinfo_sharedram = check_output( [GETSYSINFO, "-s"] ).rstrip().lstrip()
getsysinfo_totalswap = check_output( [GETSYSINFO, "-w"] ).rstrip().lstrip()
getsysinfo_freeswap = check_output( [GETSYSINFO, "-g"] ).rstrip().lstrip()
getsysinfo_totalhigh = check_output( [GETSYSINFO, "-h"] ).rstrip().lstrip()
getsysinfo_mem_unit = check_output( [GETSYSINFO, "-m"] ).rstrip().lstrip()

print "{"
print "  "report_date": "" + time.strftime("%Y-%m-%d %H:%M:%S") + "","
print "  "operating_system": " + """ + operating_system + "","
print "  "kernel_name": " + """ + kernel_name + "","
print "  "kernel_release": " + """ + kernel_release + "","
print "  "kernel_version": " + """ + kernel_version + "","
print "  "nodename": " + """ + nodename + "","
print "  "machine": " + """ + machine + "","
print "  "processor": " + """ + processor + "","
print "  "hardware_platform": " + """ + hardware_platform + "","
print "  "boot_time": " + """ + boot_time + "","
print "  " + getsysinfo_uptime + ","
print "  " + getsysinfo_totalram + ","
print "  " + getsysinfo_freeram + ","
print "  " + getsysinfo_sharedram + ","
print "  " + getsysinfo_totalswap + ","
print "  " + getsysinfo_totalhigh + ","
print "  " + getsysinfo_freeswap + ","
print "  " + getsysinfo_mem_unit
print "}"

which in turn enables the Makefile:

#
# Makefile for sysinfo
#
# $Id: Makefile,v 1.9 2014/08/31 21:27:35 marc Exp $
#

FORCE := force

HOST := $(shell hostname)
HOSTS := flapjack waffle pancake frenchtoast
SSH_FILES := $(HOSTS:%=.%_ssh)
PUSH_HOSTS := $(filter-out ${HOST}, ${HOSTS})
PUSH_FILES := $(PUSH_HOSTS:%=.%_push)

help: ${FORCE}
	cat Makefile

FILES := Makefile sysinfo.py sysinfo.bash getSysinfo.c

checkin: ${FILES}
	ci -l ${FILES}

install: ~/bin/sysinfo

~/bin/sysinfo: ./sysinfo.bash
	cp $< $@
	chmod +x $@

getSysinfo: getSysinfo.c
	cc $ $*.sysinfo
	touch $@

test: ${FORCE}
	time python sysinfo.py

force:

Notice the little trick with the Makefile variables HOST, HOSTS, SSH_FILES, PUSH_HOSTS, and PUSH_FILES that lets one host push to the others for distributing the code but lets it call on all of the hosts when gathering data.

With all of this machinery in place and distributed to all of the UNIX machines in my little network, I was now able to type ‘make ssh’ and get the resulting output:

marc@flapjack:~/projects/s/sysinfo$ more *.sysinfo
::::::::::::::
flapjack.sysinfo
::::::::::::::
{
  "report_date": "2014-09-01 10:37:30",
  "operating_system": "GNU/Linux",
  "kernel_name": "Linux",
  "kernel_release": "3.2.0-52-generic",
  "kernel_version": "#78-Ubuntu SMP Fri Jul 26 16:21:44 UTC 2013",
  "nodename": "flapjack",
  "machine": "x86_64",
  "processor": "x86_64",
  "hardware_platform": "x86_64",
  "boot_time": "system boot  2014-08-07 22:01",
  "uptime": 2118958,
  "totalram": 2089889792,
  "freeram": 145928192,
  "sharedram": 0,
  "totalswap": 2134896640,
  "totalhigh": 0,
  "freeswap": 2062192640,
  "mem_unit": 1
}
::::::::::::::
frenchtoast.sysinfo
::::::::::::::
{
  "report_date": "2014-09-01 10:37:31",
  "operating_system": "GNU/Linux",
  "kernel_name": "Linux",
  "kernel_release": "3.13.0-32-generic",
  "kernel_version": "#57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014",
  "nodename": "frenchtoast",
  "machine": "x86_64",
  "processor": "x86_64",
  "hardware_platform": "x86_64",
  "boot_time": "system boot  2014-07-19 14:58",
  "uptime": 3785970,
  "totalram": 16753840128,
  "freeram": 14150377472,
  "sharedram": 0,
  "totalswap": 17103319040,
  "totalhigh": 0,
  "freeswap": 17103319040,
  "mem_unit": 1
}
::::::::::::::
pancake.sysinfo
::::::::::::::
{
  "report_date": "2014-09-01 10:37:31",
  "operating_system": "GNU/Linux",
  "kernel_name": "Linux",
  "kernel_release": "3.13.0-35-generic",
  "kernel_version": "#62-Ubuntu SMP Fri Aug 15 01:58:42 UTC 2014",
  "nodename": "pancake",
  "machine": "x86_64",
  "processor": "x86_64",
  "hardware_platform": "x86_64",
  "boot_time": "system boot  2014-08-31 09:06",
  "uptime": 91840,
  "totalram": 16753819648,
  "freeram": 15609884672,
  "sharedram": 0,
  "totalswap": 17104367616,
  "totalhigh": 0,
  "freeswap": 17104367616,
  "mem_unit": 1
}
::::::::::::::
waffle.sysinfo
::::::::::::::
{
  "report_date": "2014-09-01 10:37:30",
  "operating_system": "GNU/Linux",
  "kernel_name": "Linux",
  "kernel_release": "3.13.0-35-generic",
  "kernel_version": "#62-Ubuntu SMP Fri Aug 15 01:58:42 UTC 2014",
  "nodename": "waffle",
  "machine": "x86_64",
  "processor": "x86_64",
  "hardware_platform": "x86_64",
  "boot_time": "system boot  2014-08-31 09:07",
  "uptime": 91784,
  "totalram": 16752275456,
  "freeram": 15594139648,
  "sharedram": 0,
  "totalswap": 17104367616,
  "totalhigh": 0,
  "freeswap": 17104367616,
  "mem_unit": 1
}

So now I have the beginning of a structured inventory of all of my machines, and an easy way to scale it up.

Log consolidation

Well, my nice DNS service with two secondaries and a primary is all well and good, but my logs are now scattered across three machines. If I want to play with the stats or diagnose a problem or see when something went wrong, I now have to grep around on three different machines.

Obviously I could consolidate the logs using syslog. That’s what it’s designed for, so why don’t I do that. Let’s see what I have to do to make that work properly:

  1. Set up rsyslogd on flapjack to properly stash the DNS messages
  2. Set up DNS on flapjack to log to syslog
  3. Set up the rsyslogd service on flapjack to receive syslog messages over the network
  4. Set up rsyslog on waffle to forward dns log messages to flapjack
  5. Set up rsyslog on pancake to forward dns log messages to flapjack
  6. Set up the DNS secondary configurations to use syslog instead of local logs
  7. Distribute the updates and restart the secondaries
  8. Test everything

A side benefit of using syslog to accumulate my dns logs is that they’ll now be timestamped so I can do more sophisticated data analysis if I ever get a Round Tuit.

Here’s the architecture of the setup I’m going to pursue:

2014-08-04-dns-syslog-architecture

So the first step is to set up the primary DNS server on flapjack to write to syslog.  This has several parts:

  • Declare a “facility” in syslog that DNS can write to.  For historical reasons (Hi, Eric!) syslog has a limited number of separate facilities that can accumulate logs.  The configuration file links sources to facilities, allowing the configuration master to do various clever filtering of the log messages that come in.
  • Tell DNS to log to the “facility”
  • Restart both bind9 and rsyslogd to get everything working.

The logging for Bind9 is specified in a file called at /etc/bind/named.conf.local.  The default setup involves appending log records to a file named /var/log/named/query.log.

We’ll keep using that file for our logs going forward, since some other housekeeping knows about that location and no one else is intent on interfering with it.

The old logging stanza was:

logging {
    channel query.log {
        file "/var/log/named/query.log";
        severity debug 3;
    };
    category queries { query.log; };
};

What I want will be this:

logging {
    channel query.log {
	syslog local6;
        severity debug 3;
    };
    category queries { query.log; };
};

Because I have decided to use the facility named local6 for DNS.

In order to make the rsyslogd daemon on flapjack listen to messages from DNS, I have to declare the facility active.

The syslog service on flapjack is provided by a server called rsyslogd.  It’s an alternative to the other two main stream syslog products – syslog-ng and sysklogd.  I picked rsyslogd because it comes as the standard logging service on Ubuntu 12.04 and 14.04, the distros I am using in my house.  You might call me lazy, you might call me pragmatic, but don’t call me late for happy hour.

In order to make rsyslogd do what I need, I have to take control of the management of two configuration files: /etc/rsyslog.conf and /etc/rsyslog.d/50-default.conf.  As is my wont, I do this by creating a project directory ~/projects/r/rsyslog/ with a Makefile and the editable versions of the two files under RCS control.  Here’s the Makefile:

cat Makefile
#
# rsyslog setup file
#
# As of 2014-08-01 syslog host is flapjack
#
# $Id: Makefile,v 1.4 2014/08/02 12:11:52 marc Exp $
#

TARGETS = /etc/rsyslog.conf /etc/rsyslog.d/50-default.conf

FILES = Makefile rsyslog.conf 50-default.conf

help: ${FORCE}
	cat Makefile

# sudo
/etc/rsyslog.conf: rsyslog.conf
	cp $< $@ 

/etc/rsyslog.d/50-default.conf: 50-default.conf
	cp $< $@ 

# sudo
push: ${TARGETS}

# sudo
restart: ${FORCE}
	service rsyslog restart

verify: ${FORCE}
	rsyslogd -c5 -N1

compare: ${FORCE}
	diff /etc/rsyslog.conf rsyslog.conf
	diff /etc/rsyslog.d/50-default.conf 50-default.conf

checkin: ${FORCE}
	ci -l ${FILES}

FORCE:

Actually, this Makefile ends up in ~/projects/r/rsyslog/flapjack, since waffle and pancake will end up with different rsyslogd configurations and I separate the different control directories this way.

In order to log using syslog I need to define a facility, local6, in the 50-default.conf file. The new assertion looks like this:

local6.*	-/var/log/named/query.log

With a restart of each of the appropriate daemons, we’re off to the races and the new logs appear in the log file. I needed to change the ownership of the /var/log/named/query.log from bind to syslog in order for the new writer to be able to write, but that was the work of a moment.

Now comes the task of making the logs from the two secondary DNS servers go across the network to flapjack. This involved a lot of little bits and pieces.

First of all, I had to tell the rsyslogd daemon on flapjack to listen to the rsyslog UDP port. I could have turned on the more reliable TCP logging facility or the even more reliable queueing facility, but let’s get real. These are DNS query logs we’re talking about. I don’t really care if some of them fall on the floor. And anyway, the traffic levels on donner.lan are so low that I’d be very surprised if the loss rate is significant anyway.

To turn on UDP listening on flapjack all I had to do was uncomment two lines in the /etc/rsyslog.conf file:

# provides UDP syslog reception
$ModLoad imudp
$UDPServerRun 514

One more restart of rsyslogd on flapjack and we’re good to go.

The next step is to make the DNS name service on waffle and pancake send their logs to the local6 facility. In addition, I had to set up rsyslog on waffle and flapjack with a local6 facility, though this time the facility has to know to send the logs across to flapjack by UDP rather than writing locally.

The change to the named.conf.local file for waffle and pancake’s DNS secondary service was identical to the change to flapjack’s primary service, so kudos to the designers of bind9 and syslogd for good modularization.

To make waffle and pancake forward their logs over to flapjack required that the /etc/rsyslog.d/50-default.conf file define local6 in this way:

local6.*	@syslog

Notice that the @ tells rsyslogd to forward logs to local6 via UDP. I could have put the IP address of flapjack right after the @ or I could have put in flapjack. Instead, I created a DNS listing for a service host named syslog … it happens to have the same IP address as flapjack, but it gives me a level of indirection if I should desire to relocate the syslog service to another host.

With a restart of rsyslogd and bind9 on both waffle and pancake, we are up and running. All DNS logs are now consolidated on a single host, namely flapjack.