Well, I now have four different UNIX machines and I’ve been doing sysadmin tasks on all of them. As a result I now have four home directories that are out of sync.
How annoying.
Ultimately I plan to create a file server on one of my machines and provide the same home directory on all of them, but I haven’t done that yet, so I need some temporary crutches to tide me over until I get the file server built. In particular, I need to find out what is where.
The first thing I did was establish trust among the machines, making flapjack, the oldest, into the ‘master’ trusted by the others. This I did by creating an SSH key pair with ssh-keygen on the master and putting the public key in .ssh/authorized_keys on the other machines.
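A minimal sketch of that key setup, run on flapjack, might look like the following. The post does not say which key type was used or how the public key was copied, so the ed25519 key, the KEY path, the run and distribute_key helpers, and the use of ssh-copy-id are all assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: distribute the master's public key to the other hosts.
# Host names come from the post; the key type and file name are assumptions.
HOSTS="frenchtoast pancake waffle"
KEY=${KEY:-$HOME/.ssh/id_ed25519}

# Hypothetical helper: with DRY_RUN set, print each command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$@"
    else
        "$@"
    fi
}

distribute_key() {
    # Generate the key pair once on the master if it does not exist yet.
    [ -f "$KEY" ] || run ssh-keygen -t ed25519 -f "$KEY"
    for h in $HOSTS; do
        # ssh-copy-id appends the public key to ~/.ssh/authorized_keys on $h.
        run ssh-copy-id -i "$KEY.pub" "$h"
    done
}
```

Setting DRY_RUN=1 before calling distribute_key prints the commands without touching any host, which is a cheap way to check the host list first.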
Then I decided to automate the discovery of which directories were on which machine. This is made easier by my personal trick for organizing files, namely to have a set of top-level subdirectories named org/, people/, and projects/ in my home directory. Each of these has twenty-six subdirectories named a through z, with appropriately named subdirectories under them. I find this helps me put related things together. It is not an alternative to search but rather a complement.
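For anyone who wants to try the same layout, here is a small sketch that creates the three trees with their a through z subdirectories. The make_layout function and its optional root argument are hypothetical names, not something from my own setup.

```shell
#!/bin/sh
# Sketch only: create org/, people/, and projects/, each with
# single-letter subdirectories a through z.
# make_layout is a hypothetical helper; it defaults to $HOME.
make_layout() {
    root=${1:-$HOME}
    for top in org people projects; do
        for letter in a b c d e f g h i j k l m n o p q r s t u v w x y z; do
            mkdir -p "$root/$top/$letter"
        done
    done
}
```

Because mkdir -p leaves existing directories alone, running make_layout against a home directory that already has some of the structure should be harmless.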
Anyway, the result is that I could build a Makefile that automates reaching out to all of my machines and gathering information. Here’s the Makefile:
# $Id: Makefile,v 1.7 2014/07/04 18:57:44 marc Exp marc $

FORCE = force

HOSTS = flapjack frenchtoast pancake waffle

FILES = Makefile

checkin: ${FORCE}
	ci -l ${FILES}

uname: ${FORCE}
	for h in ${HOSTS}; do ssh $$h uname -a | sed -e 's/^/'$$h': /'; done

host_find: ${FORCE}
	echo > host_find.txt
	for h in ${HOSTS}; do ssh $$h find -print | sed -e 's/^/'$$h': /' >> host_find.txt; done

clusters.txt: host_find.txt
	sed -e 's|\(/[^/]*/[a-z]/[^/]*\)/.*$$|\1|' host_find.txt | uniq -c | grep -v '^ *1 ' > clusters.txt

force:
Ideally, of course, I’d get the list of host names in the variable HOSTS from my configuration database, but having neglected to build one yet, I am just listing my machines by name there.
The first important target, host_find, does an ssh to each of the machines, including the master itself, and runs find, prefixing each line with the host name so that I can determine which files exist on which machine. This creates a file named host_find.txt, which I can probably dispense with now that the machinery is working.
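The prefixing step is easy to try in isolation. The sketch below wraps the same sed substitution in a hypothetical prefix_lines helper; it assumes host names contain no characters that are special to sed, such as / or &.

```shell
#!/bin/sh
# Sketch only: prefix every input line with "<host>: ", as the
# host_find target does. prefix_lines is a hypothetical helper name.
prefix_lines() {
    sed -e 's/^/'"$1"': /'
}
```

With that defined, something like `ssh pancake find . -print | prefix_lines pancake` would produce lines in the same format as host_find.txt.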
The second important target, clusters.txt, passes the host_find.txt output through a sed script. This script does a rather careful substitution, replacing paths like /org/z/zodiac/blah-blah-blah with /org/z/zodiac. Then the pipe through uniq -c counts up the number of identical path prefixes. That’s fine, but there are lots of subdirectories like /org/f that are empty and I don’t want them cluttering up my result, so the grep -v '^ *1 ' pipe segment excludes the lines with a count of 1.
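To see the pipeline at work without touching any real machine, here is a sketch that runs it over a few made-up lines in host_find.txt format. The cluster and sample helpers and the zodiac file names are invented for illustration. Note that in the Makefile the `$` in the sed expression is doubled to `$$`; in plain shell a single `$` is what you want.

```shell
#!/bin/sh
# Sketch only: the clusters.txt pipeline as a shell function.
# cluster and sample are hypothetical helper names.
cluster() {
    # Truncate each path to /<top>/<letter>/<name>, count adjacent
    # duplicates, and drop anything that was seen only once.
    sed -e 's|\(/[^/]*/[a-z]/[^/]*\)/.*$|\1|' | uniq -c | grep -v '^ *1 '
}

# Made-up input in the same "host: path" format as host_find.txt.
sample() {
    cat <<'EOF'
flapjack: .
flapjack: ./org
flapjack: ./org/f
flapjack: ./org/z
flapjack: ./org/z/zodiac
flapjack: ./org/z/zodiac/notes.txt
flapjack: ./org/z/zodiac/charts/aries.txt
EOF
}
```

Running `sample | cluster` leaves a single line reporting a count of 3 for flapjack: ./org/z/zodiac (the directory itself plus the two files beneath it), while the empty ./org/f and the other single-occurrence lines are filtered out. Since uniq -c only merges adjacent lines, this relies on find listing each directory's contents together.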
The result of running that tonight is the following report:
   8 flapjack: ./org/c/coursera
 351 flapjack: ./org/s/studiopress
3119 flapjack: ./org/g/gnu
1312 flapjack: ./org/f/freedesktop
 293 flapjack: ./org/m/minecraft
   9 flapjack: ./org/b/brother
   2 flapjack: ./org/n/national_center_for_access_to_justice
1168 flapjack: ./org/w/wordpress
   4 flapjack: ./projects/c/cron
  10 flapjack: ./projects/c/cups
   6 flapjack: ./projects/d/dhcp
  33 flapjack: ./projects/d/dns
  15 flapjack: ./projects/s/sysadmin
   5 flapjack: ./projects/f/ftp
   3 flapjack: ./projects/p/printcap
   8 flapjack: ./projects/p/programming
   8 flapjack: ./projects/t/tftpd
  35 flapjack: ./projects/n/netboot
   7 flapjack: ./projects/l/logrotate
   8 flapjack: ./projects/r/rolodex
 189 flapjack: ./projects/h/html5reset
   6 frenchtoast: ./projects/p/printcap
   5 frenchtoast: ./projects/c/cups
 380 pancake: ./org/m/minecraft
   3 pancake: ./projects/l/logrotate
  15 pancake: ./projects/d/dns
   9 pancake: ./projects/s/sysadmin
  11 waffle: ./projects/s/sysadmin
   8 waffle: ./projects/t/tftpd
  15 waffle: ./projects/d/dns
   3 waffle: ./projects/l/logrotate
 375 waffle: ./org/m/minecraft
And … voila! I have a map that I can use to figure out how to consolidate the many scattered parts of my home directory.
[2014-07-04 – updated the Makefile so that it is more friendly to web browsers.]
[2014-07-29 – a friend of mine critiqued my Makefile code and pointed out that gmake has powerful iteration functions of its own, eliminating the need for me to incorporate shell code in my targets. The result is quite elegant, I must say!]
#
# Find out what files exist on all of the hosts on donner.lan
# Started in June 2014 by Marc Donner
#
# $Id: Makefile,v 1.12 2014/07/30 02:07:07 marc Exp $
#

FORCE = force

# This ought to be the result of a call to the CMDB
HOSTS = flapjack frenchtoast pancake waffle

FILES = Makefile host_find.txt clusters.txt

#
# This provides us with the ISO 8601 date (YYYY-MM-DD)
#
DATE := $(shell /bin/date +"%Y-%m-%d")

help: ${FORCE}
	cat Makefile

checkin: ${FORCE}
	ci -l ${FILES}

# A finger exercise to ensure that we can see the base info on the hosts
HOSTS_UNAME := $(HOSTS:%=.%_uname.txt)

uname: ${HOSTS_UNAME}
	cat ${HOSTS_UNAME}

.%_uname.txt: ${FORCE}
	ssh $* uname -a | sed -e 's/^/:'$*': /' > $@

HOSTS_UPTIME := $(HOSTS:%=.%_uptime.txt)

uptime: ${HOSTS_UPTIME}
	cat ${HOSTS_UPTIME}

.%_uptime.txt: ${FORCE}
	ssh $* uptime | sed -e 's/^/:'$*': /' > $@

# Another finger exercise to verify the location of the ssh landing
# point home directory
HOSTS_PWD := $(HOSTS:%=.%_pwd.txt)

pwd: ${HOSTS_PWD}
	cat ${HOSTS_PWD}

.%_pwd.txt: ${FORCE}
	ssh $* pwd | sed -e 's/^/:'$*': /' > $@

# Run find on all of the ${HOSTS} and prefix mark all of the results,
# accumulating them all in host_find.txt
HOSTS_FIND := $(HOSTS:%=.%_find.txt)

find: ${HOSTS_FIND}

.%_find.txt: ${FORCE}
	echo '# ' ${DATE} > $@
	ssh $* find -print | sed -e 's/^/:'$*': /' >> $@

# Get rid of the empty directories and report the number of files in each
# non-empty directory
clusters.txt: ${HOSTS_FIND}
	cat ${HOSTS_FIND} | sed -e 's|\(/[^/]*/[a-z]/[^/]*\)/.*$$|\1|' | uniq -c | grep -v '^ *1 ' | sort -t ':' -k 3 > clusters.txt

force:
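For readers more at home in shell than in gmake, the two make features doing the work in the revised Makefile can be approximated as below. This is only an analogy: uname_targets mimics what the substitution reference $(HOSTS:%=.%_uname.txt) expands to, and stem_of mimics how the pattern rule .%_uname.txt recovers the host name as $*. Both function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: shell approximations of two GNU make features.
HOSTS="flapjack frenchtoast pancake waffle"

# Roughly what $(HOSTS:%=.%_uname.txt) produces: one target file
# name per host, built by wrapping the host name in the pattern.
uname_targets() {
    for h in $HOSTS; do
        printf '.%s_uname.txt\n' "$h"
    done
}

# Roughly what $* holds inside the rule ".%_uname.txt": the part of
# the target name matched by %, found by stripping the fixed prefix
# "." and the fixed suffix "_uname.txt".
stem_of() {
    t=${1#.}
    printf '%s\n' "${t%_uname.txt}"
}
```

So a target such as .pancake_uname.txt has the stem pancake, which is why the recipe can simply say ssh $* without any explicit loop over the hosts.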