Planet SysAdmin

November 28, 2014

Chris Siebenmann

How I made IPSec IKE work for a point to point GRE tunnel on Fedora 20

The basic overview of my IPSec needs is that I want to make my home machine (with an outside address) appear as an inside IP address on the same subnet as my work machine is on. Because of Linux proxy ARP limitations, the core mechanics of this involve a GRE tunnel, which must be encrypted and authenticated by IPSec. Previously I was doing this with a static IPSec configuration created by direct use of setkey, which had the drawback that it didn't automatically change encryption keys or notice if something went wrong with the IPSec stuff. The normal solution to these drawbacks is to use an IKE daemon to automatically negotiate IPSec (and time it out if the other end stops), but unfortunately this is not a configuration that IKE daemons such as Fedora 20's Pluto support directly. I can't really blame them; anything involving proxy ARP is at least reasonably peculiar and most sane people either use routing on subnets or NAT the remote machines.

My first step to a working configuration came about after I fixed my configuration to block unprotected GRE traffic. Afterwards I realized this meant that I could completely ignore managing GRE in my IKE configuration and only have it deal with IPSec stuff; I'd just leave the GRE tunnel up all the time and if IPSec was down, the iptables rules would stop traffic. After I gritted my teeth and read through the libreswan ipsec.conf manpage, this turned out to be a reasonably simple configuration. The core of it is this:

conn cksgre
    left=<work IP alias>
    leftsourceip=<work IP alias>
    right=<home public IP>
    # auto=start is what you want for always-up IPSec
    auto=start
    # I only want to use IPSec on GRE traffic
    type=transport
    leftprotoport=gre
    rightprotoport=gre
    # authentication is by RSA signature (public keys)
    authby=rsasig
    leftrsasigkey=<work public key>
    rightrsasigkey=<home public key>

The two IP addresses used here are the two endpoints of my GRE tunnel (the 'remote' and 'local' addresses in 'ip tunnel <...>'). Note that this configuration has absolutely no reference to the local and peer IP addresses that you set on the inside of the tunnel; in my setup IPSec is completely indifferent to them.
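For concreteness, a GRE tunnel of this sort is set up with commands along these lines (the interface name and all of the addresses here are made-up placeholders, not my real ones):

```
# Outer addresses: the tunnel endpoints, which are what IPSec protects.
# = work IP alias, = home public IP.
ip tunnel add cksgre0 mode gre local remote ttl 64
ip link set cksgre0 up
# Inner addresses: the local and peer IPs carried inside the tunnel.
# IPSec is completely indifferent to these.
ip addr add peer dev cksgre0
```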

I initially attempted to do authentication via PSK aka a (pre) shared secret. This caused my setup of the Fedora 20 version of Pluto to dump core with an assertion failure (for what seems to be somewhat coincidental reasons), which turned out to be lucky because there's a better way. Pluto supports what it calls 'RSA signature authentication', which people who use SSH also know as 'public key authentication'; just as with SSH, you give each end its own keypair and then list the public key(s) in your configuration and you're done. How to create the necessary RSA keypairs and set everything up is not well documented in the Fedora 20 manpages; in fact, I didn't even realize it was possible. Fortunately I stumbled over this invaluable blog entry on setting up a basic IPSec connection which covered the magic required.

This got the basic setup working, but after a while the novelty wore off and my urge to fiddle with things got the better of me so I decided to link the GRE tunnel to the IKE connection, so it would be torn down if the connection died (and brought up when the connection was made). You get your commands run on such connection events through the leftupdown="..." or rightupdown="..." configuration settings; your command gets information about what's going on through a pile of environment variables (which are documented in the ipsec_pluto manpage). For me this is a script that inspects $PLUTO_VERB to find out what's going on and runs one of my existing scripts to set up or tear down things on up-host and down-host actions. As far as I can tell, my configuration does not need to run the default 'ipsec _updown' command.

(My existing scripts used to do both GRE setup and IPSec setup, but of course now they only do the GRE setup and the IPSec stuff is commented out.)
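A minimal sketch of such an updown script looks like this (the echoed messages stand in for my real GRE setup and teardown scripts):

```shell
#!/bin/sh
# Sketch of a Pluto leftupdown script. Pluto invokes it with
# $PLUTO_VERB set to the event name; only the host up/down events
# matter here, everything else is deliberately ignored.
handle_pluto_event() {
    case "$1" in
        up-host)
            # connection negotiated: bring the GRE tunnel up
            echo "bringing GRE tunnel up"
            ;;
        down-host)
            # connection gone: tear the GRE tunnel down
            echo "taking GRE tunnel down"
            ;;
        *)
            # prepare-host, route-host, and so on: do nothing
            :
            ;;
    esac
}

handle_pluto_event "$PLUTO_VERB"
```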

This left IPSec connection initiation (and termination) itself. On my home machine I used to bring up and tear down the entire IPSec and GRE stuff when my PPPoE DSL link came up or went down. In theory one could now leave this up to a running Pluto based on its normal keying retries and timeouts; in practice this doesn't really work well and I wound up needing to do manual steps. Manual control of Pluto is done through 'ipsec whack' and if everything is running smoothly doing the following on DSL link up or down is enough:

ipsec whack --initiate|--terminate --name cksgre >/dev/null 2>&1

Unfortunately this is not always sufficient. Pluto does not notice dynamically appearing and disappearing network links and addresses, so if it's (re)started while my DSL link is down (for example on boot) it can't find either IP address associated with the cksgre connection and then refuses to try to do anything even if you explicitly ask it to initiate the connection. To make Pluto re-check the system's IP addresses and thus become willing to activate the IPSec connection, I need to do:

ipsec whack --listen

Even though the IPSec connection is set to autostart, Pluto does not actually autostart it when --listen causes it to notice that the necessary IP address now exists; instead I have to explicitly initiate it with 'ipsec whack --initiate --name cksgre'. My current setup wraps this all up in a script and runs it from /etc/ppp/ip-up.local and ip-down.local (in the same place where I previously invoked my own IPSec and GRE setup and stop scripts).

So far merely poking Pluto with --listen has been sufficient to get it to behave, but I haven't extensively tested this. My script currently has a fallback that will do a 'systemctl restart ipsec' if nothing else works.
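Putting the pieces together, the core logic of that wrapper script looks something like this (a simplified sketch; the ipsec shell function here is a stub standing in for the real ipsec(8) command so the flow can be followed):

```shell
#!/bin/sh
# Sketch of the PPP link up/down hook for the IPSec connection.
# Stub: the real script runs the actual ipsec command instead.
ipsec() { echo "ipsec $*"; }

link_change() {
    case "$1" in
        up)
            # Pluto may have started while the link was down, so make it
            # re-scan the system's IP addresses before initiating.
            ipsec whack --listen
            ipsec whack --initiate --name cksgre
            ;;
        down)
            ipsec whack --terminate --name cksgre
            ;;
    esac
}
```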

PS: Note that taking down the GRE tunnel on IPSec failure has some potential security implications in my environment. I think I'm okay with them, but that's really something for another entry.

Sidebar: What ipsec _updown is and does

On Fedora 20 this is /usr/libexec/ipsec/_updown, which runs one of the _updown.* scripts in that directory depending on what the kernel protocol is; on most Linux machines (and certainly on Fedora 20) this is NETKEY, so _updown.netkey is what gets run in the end. What these scripts can do for you and maybe do do for you is neither clear nor documented and they make me nervous. They certainly seem to have the potential to do any number of things, some of them interesting and some of them alarming.

Having now scanned _updown.netkey, it appears that the only thing it might possibly be doing for me is mangling my /etc/resolv.conf. So, uh, no thanks.

by cks at November 28, 2014 08:01 AM

November 27, 2014

Chris Siebenmann

Using iptables to block traffic that's not protected by IPSec

When I talk about my IPSec setup, I often say that I use GRE over IPSec (or 'an IPSec based GRE tunnel'). However, this is not really what is going on; a more accurate but more opaque description is that I have a GRE tunnel that is encrypted and protected by IPSec. The problem, and the reason that the difference matters, is that there is nothing that intrinsically ties the two pieces together, unlike something where you are genuinely running X over Y such as 'forwarding X11 over SSH'. In the X11 over SSH case, if SSH is not working you do not get anything. But in my case if IPSec isn't there for some reason my GRE tunnel will cheerfully continue working, just without any protection against either eavesdropping or impersonation.

In theory this is undoubtedly not supposed to happen, since you (I) designed your GRE setup to work in conjunction with IPSec. Unfortunately in practice there are any number of ways for IPSec to go away on you, possibly without destroying the GRE tunnel in the process. Your IPSec IKE daemon probably removes the IPSec security policies that reject unencrypted traffic when it shuts down, for example, and if you're manually configuring IPSec with setkey you can do all sorts of fun things like accidentally leaving a 'spdflush;' command in a control file that only (re)loads keys and is no longer used to set up the security policies.
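For reference, the setkey security policies in question look something like this (the addresses are placeholders for the two tunnel endpoints; see setkey(8) for the full syntax):

```
# require ESP for all GRE traffic between the two endpoints;
# without these policies, unencrypted GRE flows happily.
spdadd gre -P out ipsec esp/transport//require;
spdadd gre -P in ipsec esp/transport//require;
```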

The obvious safety method is to add some iptables rules that block unencrypted GRE traffic. If you are like me, you'll start out by writing the obvious iptables ruleset:

iptables -A INPUT -p esp -j ACCEPT
iptables -A INPUT -p gre -j DROP

This doesn't work. As far as I can tell, the Linux IPSec system effectively re-injects the decrypted packets into the IP stack, where they will be seen in their unencrypted state by iptables rules (as well as by tcpdump, which can be both confusing and alarming). The result is that after the re-injection the iptables rules see a plain GRE packet and drop it.

Courtesy of this netfilter mailing list message, it turns out that what you need is to match packets that will be or have been processed by IPSec. This is done with a policy match:

iptables -A INPUT -m policy --dir in --pol ipsec -j ACCEPT
iptables -A INPUT -p gre -j DROP

# and for outgoing packets:
iptables -A OUTPUT -m policy --dir out --pol ipsec -j ACCEPT
iptables -A OUTPUT -p gre -j DROP

Reading the iptables-extensions manpage suggests that I should add at least '--proto esp' to the policy match for extra paranoia.

I've tested these rules and they work. They pass GRE traffic that is protected by IPSec, but if I remove the IPSec security policies that force IPSec for my GRE traffic these iptables rules block the unprotected GRE traffic as I want.

(Extension to non-GRE traffic is left as an exercise to the reader. I have a simple IPSec story in that I'm only using it to protect GRE and I never want GRE traffic to flow without IPSec to any destination under any circumstances. Note that there are potentially tricky rule ordering issues here and you probably want to always put this set of rules at the end of your processing.)
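Spelled out, that extra-paranoid variant would look like the following (a sketch of the manpage's suggestion; only the plain --pol ipsec version above has actually been tested here):

```
iptables -A INPUT  -m policy --dir in  --pol ipsec --proto esp -j ACCEPT
iptables -A INPUT  -p gre -j DROP
iptables -A OUTPUT -m policy --dir out --pol ipsec --proto esp -j ACCEPT
iptables -A OUTPUT -p gre -j DROP
```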

by cks at November 27, 2014 04:16 AM

November 26, 2014

Ben's Practical Admin Blog

HP iLO SSL Certificate Script v3 now available

It’s fair to say that I have not done any scripting for HP iLO since the release of their PowerShell scripting toolkit. I simply didn’t have a need. However in the past week I received a request to update my SSL Signing script for iLO to use the HP toolkit, and so I have. As the HP […]

by Ben at November 26, 2014 09:38 AM

Chris Siebenmann

Using go get alone is a bad way to keep track of interesting packages

When I was just starting with Go, I kept running into interesting Go packages that I wanted to keep track of and maybe use someday. 'No problem', I thought, 'I'll just go get them so I have them sitting around and maybe I'll look at them too'.

Please allow yourself to learn from my painful experience here and don't do this. Specifically, don't rely on 'go get' as your only way to keep track of packages you want to keep an eye on, because in practice doing so is a great way to forget what those packages are. There's no harm in go get'ing packages you want to have handy to look through, but do something in addition to keep track of what packages you're interested in and why.

At first, there was nothing wrong with what I was doing. I could easily look through the packages and even if I didn't, they sat there in $GOPATH/src so I could keep track of them. Okay, they were about three levels down from $GOPATH/src itself, but no big deal. Then I started getting interested in Go programs like vegeta, Go Package Store, and delve, plus I was installing and using more mundane programs like goimports and golint. The problem with all of these is that they have dependencies of their own, and all of these dependencies wind up in $GOPATH/src too. Pretty soon my Go source area was a dense thicket of source trees that intermingled programs, packages I was interested in in their own right, and dependencies of these first two.

After using Go seriously for not very long I've wound up with far too many packages and repos in $GOPATH/src to keep any sort of track of, and especially to remember off the top of my head which packages I was interested in. Since I was relying purely on go get to keep track of interesting Go packages, I have now essentially lost track of most of them. The interesting packages I wanted to keep around because I might use them have become lost in the noise of the dependencies, because I can't tell one from the other without going through all 50+ of the repos to read their READMEs.

As you might guess, I'd be much better off if I'd kept an explicit list of the packages I found interesting in some form. A text file of URLs would be fine; adding notes about what they did and why I thought they were interesting would be better. That would make it trivial to sort out the wheat from the chaff that's just there because of dependencies.

(These days I've switched to doing this for new interesting packages I run across, but there's some number of packages from older times that are lost somewhere in the depths of $GOPATH/src.)
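One low-tech way to keep such a list is a plain text file plus a scrap of shell to query it (the file format and location here are just an invented convention, not anything standard):

```shell
#!/bin/sh
# List the import paths of packages I've explicitly marked interesting.
# Each line of the list file is 'import/path  optional note'; blank
# lines and '#' comments are ignored.
LIST="${GO_INTERESTING:-$HOME/.go-interesting}"

interesting_pkgs() {
    # strip comments and blank lines, print just the import path column
    sed -e 's/#.*//' "$LIST" | awk 'NF > 0 { print $1 }'
}
```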

PS: This can happen with programs too, but at least there tends to be less in $GOPATH/bin than in $GOPATH/src so it's easier to keep track of them. But if you have an ever growing $GOPATH/bin with an increasing amount of programs you don't actually care about, there's the problem again.

by cks at November 26, 2014 06:45 AM

RISKS Digest

November 25, 2014

Chris Siebenmann

My Linux IPSec/VPN setup and requirements

In response to my entry mentioning perhaps writing my own daemon to rekey my IPSec tunnel, a number of people made suggestions in comments. Rather than write a long response, I've decided to write up how my current IPSec tunnel works and what my requirements are for it or any equivalent. As far as I know these requirements rule out most VPN software, at least in its normal setup.

My IPSec based GRE tunnel runs between my home machine and my work machine and its fundamental purpose is to cause my home machine to appear on the work network as just another distinct host with its own IP address. Importantly this IP address is publicly visible, not just an internal one. My home machine routes some but not all of its traffic over the IPSec tunnel and for various reasons I need full dual identity routing for it; traffic to or from the internal IP must flow over the IPSec tunnel while traffic to or from the external IP must not. My work machine also has additional interfaces that I need to isolate, which can get a bit complicated.

(The actual setup of this turns out to be kind of tangled, with some side effects.)

This tunnel is normally up all of the time, although under special circumstances it needs to be pulled down locally on my work machine (and independently on my home machine). Both home and work machines have static IPs. All of this works today; the only thing that my IPSec setup lacks is periodic automatic rekeying of the IPSec symmetric keys used for encryption and authentication.

Most VPN software that I've seen wants to either masquerade your traffic as coming from the VPN IP itself or to make clients appear on a (virtual) subnet behind the VPN server with explicit routing. Neither is what I want. Some VPNs will bridge networks together; this is not appropriate either because I have no desire to funnel all of the broadcast traffic running around on the work subnet over my DSL PPPoE link. Nor can I use pure IPSec alone, due to a Linux proxy ARP limitation (unless this has been fixed since then).

I suspect that there is no way to tell IKE daemons 'I don't need you to set things up, just to rekey this periodically'; this would be the minimally intrusive change. There is probably a way to configure a pair of IKE daemons to do everything, so that they fully control the whole IPSec and GRE tunnel setup; there is probably even a way to tell them to kick off the setup of policy based routing when a connection is negotiated. However for obvious reasons my motivation for learning enough about IKE configuration to recreate my whole setup is somewhat low, as much of the work is pure overhead that's required just to get me to where I already am now. On the other hand, if a working IKE based configuration for all of this fell out of the sky I would probably be perfectly happy to use it; I'm not intrinsically opposed to IKE, just far from convinced that investing a bunch of effort into decoding how I need to set it up will get me much or be interesting.

(It would be different if documentation for IKE daemons was clear and easy to follow, but so far I haven't found any that is. Any time I skim any of it I can see a great deal of annoyance in my future.)

PS: It's possible that my hardcoded IPSec setup is not the most modern in terms of security, since it dates from many years ago. Switching to a fully IKE-mediated setup would in theory give me a free ride on future best practices for IPSec algorithm choices so I don't have to worry about this.

Sidebar: why I feel that writing my own rekeying daemon is safe

The short version is that the daemon would not be involved in setting up the secure tunnel itself, just getting new keys from /dev/urandom, telling the authenticated other end about them, writing them to a setkey script file, and running the necessary commands to (re)load them. I'd completely agree with everyone who is telling me to use IKE if I was attempting to actively negotiate a full IPSec setup, but I'm not. The IPSec setup is very firmly fixed; the only thing that varies is the keys. There are ways to lose badly here, but they're almost entirely covered by using a transport protocol with strong encryption and authentication and then insisting on fixed IP addresses on top of it.

(Note that I won't be negotiating keys with the other end as such. Whichever end initiates a rekey will contact the other end to say more or less 'here are my new keys, now use them'. And I don't intend to give the daemon the ability to report on the current keys. If I need to know them I can go look outside of the daemon. If the keys are out of sync or broken, well, the easy thing is to force an immediate rekey to fix it, not to read out current keys to try to resync each end.)
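To make the key handling concrete, here's roughly what generating a key and emitting the corresponding setkey line involves (the SPI, addresses, and algorithm are illustrative placeholders, not my actual configuration):

```shell
#!/bin/sh
# Generate $1 bytes from /dev/urandom as a 0x-prefixed hex string.
genkey() {
    echo "0x$(od -An -tx1 -N"$1" /dev/urandom | tr -d ' \n')"
}

# Emit one SA 'add' line for a setkey script file.
# Usage: emit_sa <src> <dst> <spi> <enckey>
emit_sa() {
    echo "add $1 $2 esp $3 -E rijndael-cbc $4;"
}

# e.g. emit_sa 0x1001 "$(genkey 32)"
```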

by cks at November 25, 2014 05:26 AM

November 24, 2014

Chris Siebenmann

Delays on bad passwords considered harmful, accidental reboot edition

Here is what I just did to myself, in transcript form:

$ /bin/su
Password: <type>
['oh, I must have mistyped the password']
[up-arrow CR to repeat the su]
bash# reboot <CR>

Cue my 'oh damn' reaction.

The normal root shell is bash and it had a bash history file with 'reboot' as the most recent command. When my su invocation didn't drop me into a root shell immediately I assumed that I'd fumbled the password and it was forcing a retry delay (as almost all systems are configured to do). These retry delays have trained me so that practically any time su stalls on a normal machine I just repeat the su; in a regular shell session this is through my shell's interactive history mechanism with an up-arrow and a CR, which I can type ahead before the shell prompt reappears (and so I do).

Except this time around su had succeeded and either the machine or the network path to it was slow enough that it had just looked like it had failed, so my 'up-arrow CR' sequence was handled by the just started root bash and was interpreted as 'pull the last command out of history and repeat it'. That last command happened to be a 'reboot', because I'd done that to the machine relatively recently.

(The irony here is that following my own advice I'd turned the delay off on this machine. But we have so many others with the delay on that I've gotten thoroughly trained in what to do on a delay.)

by cks at November 24, 2014 09:01 PM

Racker Hacker

Trust an IP address with firewalld’s rich rules

Managing firewall rules with iptables can be tricky at times. The rule syntax itself isn’t terribly difficult but you can quickly run into problems if you don’t save your rules to persistent storage after you get your firewall configured. Things can also get out of hand quickly if you run a lot of different tables with jumps scattered through each.

Why FirewallD?

FirewallD’s goal is to make this process a bit easier by adding a daemon to the mix. You can send firewall adjustment requests to the daemon and it handles the iptables syntax for you. It can also write firewall configurations to disk. It’s especially useful on laptops since you can quickly jump between different firewall configurations based on the network you’re using. You might run a different set of firewall rules at a coffee shop than you would run at home.

Adding a trusted IP address to a device running firewalld requires the use of rich rules.

An example

Consider a situation where you have a server and you want to allow unrestricted connectivity to that server from a bastion or from your home internet connection. First off, determine your default zone (which is most likely “public” unless you’ve changed it to something else):

# firewall-cmd --get-default-zone

We will use as our example IP address. Let's add the rich rule:

firewall-cmd --zone=public --add-rich-rule='rule family="ipv4" source address="" accept'

Let’s break down what we’re asking firewalld to do. We’re asking to allow IPv4 connectivity from to all ports on the server and we’re asking for that rule to be added to the public (default) zone. If you list the contents of your public zone, it should look like this:

# firewall-cmd --list-all --zone=public
public (default, active)
  interfaces: eth0
  services: dhcpv6-client mdns ssh
  masquerade: no
  rich rules:
	rule family="ipv4" source address="" accept
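Worth keeping in mind with firewalld generally: --add-rich-rule on its own changes only the runtime configuration, so for the rule to survive a reload or reboot it also needs to be added to the permanent configuration:

```
# same rule, but written to the persistent configuration on disk
firewall-cmd --permanent --zone=public --add-rich-rule='rule family="ipv4" source address="" accept'
```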


by Major Hayden at November 24, 2014 02:44 PM

Chris Siebenmann

Using the SSH protocol as a secure transport protocol

I have an IPSec problem: my IPSec tunnel uses constant keys, with no periodic automatic rekeying. While IPSec has an entire protocol to deal with this called IKE, in practice IKE daemons (at least on Linux) are such a swamp to wade into that I haven't been willing to spend that much time on it. Recently I had a realization; rather than wrestle with IKE, I could just write a special purpose daemon to rekey the tunnel for me. Since both ends of the IPSec tunnel need to know the same set of keys, I need to run the daemon at either end and the ends have to talk to each other. Since they'll be passing keys back and forth, this conversation needs to be encrypted and authenticated.

The go-to protocol for encryption and authentication is TLS. But TLS has a little problem for my particular needs here in that it very much wants to do authentication through certificate authorities. I very much don't want to. The two endpoints are fixed and so are their authentication keys, and I don't want to have to make up a CA to sign two certificates and (among other things) leave myself open to this CA someday signing a third key. In theory TLS can be used with direct verification of certificate identities, but in practice TLS libraries generally make this either hard or impossible depending on their APIs.

(I've written about this before.)

As I was thinking about this it occurred to me that there is already a secure transport protocol that does authentication exactly the way I want it to work: SSH. SSH host keys and SSH public key authentication are fundamentally based on known public keys, not on CAs. I don't want to literally run my rekeying over SSH for various reasons (including security), but these days many language environments have SSH libraries with support for both the server and client sides. The SSH protocol even has a 'do command' operation that can be naturally used to send operations to a server, get responses, and perhaps supply additional input.

It's probably a little bit odd to use SSH as a secure transport protocol for your own higher level operations that have nothing to do with SSH's original purpose. But on the other hand, why not? If the protocol fits my needs, I figure that I might as well be flexible and repurpose it for this.

(The drawback is that SSH is relatively complex at the secure transport layer if all that you want is to send some text back and forth. Hopefully the actual code complexity will be minimal.)
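As a point of comparison, stock OpenSSH can already pin a key to a single operation via a forced command in authorized_keys; it's the same 'known key, one action' shape, even though a custom daemon wouldn't literally use sshd (the command path and key here are placeholders):

```
command="/local/sbin/ipsec-rekey-receive",no-port-forwarding,no-agent-forwarding,no-pty ssh-ed25519 AAAAC3... rekey@peer
```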

by cks at November 24, 2014 05:16 AM

November 23, 2014

Chris Siebenmann

I'm happier ignoring the world of spam and anti-spam

As I've mentioned a couple of times, I'm currently running a sinkhole SMTP server to collect spam samples. Doing this has let me learn or relearn a valuable lesson about anti-spam work.

My sinkhole SMTP server has several sorts of logging and monitoring, including a log of SMTP commands, and of course I can run it or turn it off as I feel like. When I first set it up, I configured it to be auto-started on system reboot and I watched the SMTP command log a lot of the time with 'tail -f'. The moment a new spam sample showed up I'd go read it.

The problem with this wasn't the time it took. Instead the problem is simpler; actively monitoring my sinkhole SMTP server all the time made me think about spam a lot, and it turns out that having spam on my mind wasn't really a great experience. In theory, well, I told myself that watching all of the spam attempts was somewhere between interesting (to see their behavior) and amusing (when they failed in various ways). In practice it was quietly wearying. Not in any obvious way that I really noticed much; instead it was a quiet drag that got me a little bit down.

Fortunately I did notice it a bit, so at a couple of points I decided to just turn things off (once this was prompted by a persistent, unblockable run of uninteresting spam that was getting on my nerves). What I found is that I was happier when I wasn't keeping an eye on the sinkhole SMTP server all the time, or even checking in on it very much. Pretty much the less I looked at the sinkhole server, the better or at least more relaxed I felt.

So what I (re)learned from all of this is that not thinking very much about the cat and mouse game played between spammers and everyone else makes me happier. If I can ignore the existence of spammers entirely, that's surprisingly fine.

As a result of this my current approach with my sinkhole SMTP server is to ignore it as much as possible. Currently I'm mostly managing to only check new samples once every few days and not to do too much with them.

(I probably wouldn't have really learned this without my sinkhole SMTP server because it has the important property that I can vary the attention I pay to it without any bad consequences for my real mail. Even running it at all is completely optional, so sometimes I don't.)

by cks at November 23, 2014 07:27 AM

Build a FreeBSD 10.1-release Openstack Image with bsd-cloudinit

We are going to prepare a FreeBSD image for Openstack deployment. We do this by creating a FreeBSD 10.1-RELEASE instance, installing it and converting it using bsd-cloudinit. We'll use the CloudVPS public Openstack cloud for this. We'll be using the Openstack command line tools, like nova, cinder and glance.

November 23, 2014 12:00 AM

November 22, 2014

Steve Kemp's Blog

Lumail 2.x ?

I've continued to ponder the idea of reimplementing the console mail-client I wrote, lumail, using a more object-based codebase.

For one thing having loosely coupled code would allow testing things in isolation, which is clearly a good thing.

I've written some proof of concept code which will allow the following Lua to be evaluated:

-- Open the maildir.
users = Maildir.new( "/home/skx/Maildir/.debian.user" )

-- Count the messages.
print( "There are " .. users:count() .. " messages in the maildir " .. users:path() )

-- Now we want to get all the messages and output their paths.
for k,v in ipairs( users:messages()) do
    -- Here we could do something like:
    --   if ( string.find( v:headers["subject"], "troll", 1, true ) ) then v:delete()  end
    -- Instead play-nice and just show the path.
    print( k .. " -> " .. v:path() )
end

This is all a bit ugly, but I've butchered some code together that works, and tried to solicit feedback from lumail users.

I'd write more but I'm tired, and intending to drink whisky and have an early night. Today I mostly replaced pipes in my attic. (Is it "attic", or is it "loft"? I keep alternating!) Fingers crossed this will mean a dry kitchen in the morning.

November 22, 2014 09:39 PM

Chris Siebenmann

The effects of a moderate Hacker News link to here

A few days ago my entry on Intel screwing up their DC S3500 SSDs was posted to Hacker News here and rose moderately highly up the rankings, although I don't think it made the front page (I saw it on the second page at one point). Fulfilling an old promise, here's a report of what the resulting traffic volume looked like.

First, some crude numbers from this Monday onwards for HTTP requests for Wandering Thoughts, excluding Atom feed requests. As a simple measurement of how many new people visited, I've counted unique IPs fetching my CSS file. So the numbers:

(day)          (that entry)   (other pages)   (CSS fetches)
November 17           0            5041             453
November 18       18255            6178           13585
November 19       17112           10141           11940
November 20         908            6341             876
November 21         228            4811             530

(Some amount of my regular traffic is robots and some of it is from regular visitors who already have my CSS file cached and don't re-fetch it.)

Right away I can say that it looks like people spilled over from the directly linked entry to other parts of Wandering Thoughts. The logs suggest that this mostly went to the blog's main page and my entry on our OmniOS fileservers, which was linked to in the entry (much less traffic went to my entry explaining why 4K disks can be a problem for ZFS). Traffic for the immediately preceding and following entries also went up, pretty much as I'd expect, but this is nowhere near all of the extra traffic so people clearly did explore around Wandering Thoughts to some extent.

Per-day request breakdowns are less interesting for load than per minute or even per second breakdowns. At peak activity, I was seeing six to nine requests for the entry per second and I hit 150 requests for it a minute (for only one minute). The activity peak came very shortly after I started getting any significant volume of hits; things start heating up around 18:24 on the 18th, go over 100 views a minute at 18:40, peak at 19:03, and then by 20:00 or so I'm back down to 50 a minute. Unfortunately I don't have latency figures for DWiki so I don't know for sure how well it responded while under this load.

(Total page views on the blog go higher than this but track the same activity curve. CSpace as a whole was over 100 requests a minute by 18:39 and peaked at 167 requests at 19:05.)

The most surprising thing to me is the amount of extra traffic to things other than that entry on the 19th. Before this happened I would have (and did) predict a much more concentrated load profile, with almost all of the traffic going to the directly linked entry. This is certainly the initial pattern on the 18th, but then something clearly changed.

(I was surprised by the total amount of traffic and how many people seem to have visited but that's just on a personal basis where it's surprising for so many people to be interested in looking at something I've written.)

This set of stats may well still leave people with questions. If so, let me know and I'll see if I can answer them. Right now I've stared at enough Apache logs for one day and I've run out of things to say, so I'm stopping this entry here.

Sidebar: HTTP Referers

HTTP Referers for that entry over the 18th to the 20th are kind of interesting. There were 17,508 requests with an empty Referer, 13,908 from the HTTPS Hacker News front page, 592 from a redirector of some sort, 314 from the link in this HN repeater tweet, and then we're down to a longer tail (including reddit's /r/sysadmin, where it was also posted). The Referers feature a bunch of various alternate interfaces and apps for Hacker News and so on ( was surprisingly popular). Note that there were basically no Referers from any Hacker News page except the front page, despite that as far as I know the story never made it to the front page. I don't have an explanation for this.

by cks at November 22, 2014 05:47 AM

RISKS Digest

November 21, 2014

Chris Siebenmann

Lisp and data structures: one reason it hasn't attracted me

I've written before about some small scale issues with reading languages that use Lisp style syntax, but I don't think I've said what I did the other day on Twitter, which is that the syntax of how Lisp languages are written is probably the primary reason that I slide right off any real interest in them. I like the ideas and concepts of Lisp style languages, the features certainly sound neat, and I often use all of these in other languages when I can, but actual Lisp syntax languages have been a big 'nope' for a long time.

(I once wrote some moderately complex Emacs Lisp modules, so I'm not coming from a position of complete ignorance on Lisp. Although my ELisp code didn't exactly make use of advanced Lisp features.)

I don't know exactly why I really don't like Lisp syntax and find it such a turn-off, but I had an insight on Twitter. One of the things about the syntax of S-expressions is that they very clearly are a data structure. Specifically, they are a list. In effect this gives lists (yes, I know, they're really cons cells) a privileged position in the language. Lisp is lists; you cannot have S-expressions without them. Other languages are more neutral on what they consider to be fundamental data structures; there is very little in the syntax of, say, C that privileges any particular data structure over another.

(Languages like Python privilege a few data structures by giving them explicit syntax for initializers, but that's about it. The rest is in the language environment, which is subject to change.)

Lisp is very clearly in love with lists. If it's terribly in love with lists, it doesn't feel as if it can be fully in love with other data structures; whether or not it's actually true, it feels like other data structures are going to be second class citizens. And this matters to how I feel about the language, because lists are often not the data structure I want to use. Even being second class in just syntax matters, because syntactic sugar matters.

(In case it's not clear, I do somewhat regret that Lisp and I have never clicked. Many very smart people love Lisp a lot and so it's clear that there are very good things there.)

by cks at November 21, 2014 06:36 AM

November 20, 2014


OS X Not Appending Search Domains - Yosemite Edition

It seems this problem has resurfaced with the new version of Mac OS X. More specifically, this problem seems to affect appending search domains when the hostname contains a dot. In Yosemite (10.10), mDNSResponder has been replaced with discoveryd. Fortunately, all we need to do here is add the --AlwaysAppendSearchDomains argument to the LaunchDaemon startup file and everything should work as expected.

  1. Before you do anything, make sure you have updated to at least OS X 10.10.1.
  2. You will need to edit /System/Library/LaunchDaemons/ Add <string>--AlwaysAppendSearchDomains</string> to the list of strings in the ProgramArguments <array>.
  3. Restart discoveryd to see your changes take effect.
    sudo launchctl unload -w /System/Library/LaunchDaemons/
    sudo launchctl load -w /System/Library/LaunchDaemons/
  4. Profit!
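For reference, the relevant portion of the edited plist should end up looking something like the following. The daemon path shown is an assumption about your OS build; keep whatever arguments the file already lists and only add the new <string> element:

```xml
<key>ProgramArguments</key>
<array>
    <string>/usr/libexec/discoveryd</string>
    <!-- existing arguments stay as they are -->
    <string>--AlwaysAppendSearchDomains</string>
</array>
```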

by Scott Hebert at November 20, 2014 04:20 PM

Steve Kemp's Blog

An experiment in (re)building Debian

I've rebuilt many Debian packages over the years, largely to fix bugs which affected me, or to add features which didn't make the cut in various releases. For example I made a package of fabric available for Wheezy, since it wasn't in the release. (Happily in that case a wheezy-backport became available. Similar cases involved repackaging gtk-gnutella when the protocol changed and the official package in the lenny release no longer worked.)

I generally release a lot of my own software as Debian packages, although I'll admit I've started switching to publishing Perl-based projects on CPAN instead - from which they can be debianized via dh-make-perl.

One thing I've not done for many years is a mass-rebuild of Debian packages. I did that once upon a time when I was trying to push for the stack-smashing-protection inclusion all the way back in 2006.

Having had a few interesting emails this past week I decided to do the job for real. I picked a random server of mine, which stores backups, and decided to rebuild it using "my own" packages.

The host has about 300 packages installed upon it:

root@rsync ~ # dpkg --list | grep ^ii | wc -l

I got the source to every package, patched the changelog to bump the version, and rebuilt every package from source. That took about three hours.

Every package has a "skx1" suffix now, and all the build-dependencies were also determined by magic and rebuilt:

root@rsync ~ # dpkg --list | grep ^ii | awk '{ print $2 " " $3}'| head -n 4
acpi 1.6-1skx1
acpi-support-base 0.140-5+deb7u3skx1
acpid 1:2.0.16-1+deb7u1skx1
adduser 3.113+nmu3skx1

The process was pretty quick once I started getting more and more of the packages built. The only shortcut was not explicitly updating the dependencies to rely upon my updates. For example bash has a Debian control file that contains:

Depends: base-files (>= 2.1.12), debianutils (>= 2.15)

That should have been updated to say:

Depends: base-files (>= 2.1.12skx1), debianutils (>= 2.15skx1)
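(Rewriting the versioned dependencies mechanically wouldn't be hard, for what it's worth; a sed along these lines handles the common '>=' constraints, though it's only a sketch and ignores the other relation operators:)

```shell
# Append the skx1 suffix to every '>=' versioned dependency on a Depends line.
echo 'Depends: base-files (>= 2.1.12), debianutils (>= 2.15)' |
    sed -E 's/(>= [0-9][^)]*)\)/\1skx1)/g'
# prints: Depends: base-files (>= 2.1.12skx1), debianutils (>= 2.15skx1)
```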

However I didn't do that, because I suspect that if I did want to do this decently, and I wanted to share the source trees and the generated packages, the way to go would not be messing about with Debian versions; instead I'd create a new Debian release "alpha-apple", "beta-bananna", "crunchy-carrot", "dying-dragonfruit", "easy-elderberry", or similar.

In conclusion: Importing Debian packages into git, much like Ubuntu did with bzr, is a fun project, and it doesn't take much to mass-rebuild if you're not making huge changes. Whether it is worth doing is an entirely different question of course.

November 20, 2014 01:28 PM

Chris Siebenmann

Sometimes the way to solve a problem is to rethink the problem

After a certain amount of exploration and discussion, we've come up with what we feel is a solid solution for getting our NFS mount authentication working on Linux. Our solution is to not use Linux; instead we'll use OmniOS, where we already have a perfectly working NFS mount authentication system.

To get there we had to take a step back and look at our actual objectives and constraints. The reason we wanted our NFS mount authentication on Linux is that we want to offer a service where people give us disks (plus some money for overhead) and we put them into something and make them available via NFS and Samba and so on. The people involved very definitely want their disk space available via NFS because they want it to be conveniently usable (and fast) from various existing Linux compute machines and so on. We wanted to do this on Linux (as opposed to OmniOS (or FreeBSD)) because we trust Linux's disk drivers the most and in fact we already have Linux running happily on 16-bay and 24-bay SuperMicro chassis.

(I did some reading and experimentation with OmniOS management of LSI SAS devices and was not terribly enthused by it.)

We haven't changed our minds about using Linux instead of OmniOS to talk to the disks; we've just come to the blindingly obvious realization that we've already solved this problem and all it takes to reduce our current situation to our canned solution is adding a server running OmniOS in front of the Linux machine with the actual disks. Since we don't view this bulk disk hosting as a critical service and it doesn't need 10G Ethernet (even if that worked for us right now), this second server can be one of our standard inexpensive 1U servers that we have lying around (partly because we tend to buy in bulk when we have some money).

(Our first round implementation can even take advantage of existing hardware; since we're starting to decommission our old fileserver environment we have both spare servers and more importantly spare disk enclosures. These won't last forever, but they should last long enough to find out if there's enough interest in this service for us to buy 24-bay SuperMicro systems to be the disk hosts.)

This rethinking of the problem is not as cool and interesting as, say, writing a Go daemon to do efficient bulk authentication of machines and manage Linux iptables permissions to allow them NFS access, but it solves the problem and that's the important thing. And we wouldn't have come up with our solution if we'd stayed narrowly focused on the obvious problem in front of us, the problem of NFS mount authentication on Linux. Only when one of my coworkers stepped back and started from the underlying problem did we pivot to 'is there any reason we can't throw hardware at the problem?'.

There is a valuable lesson for me here. I just hope I remember it for the next time around.

by cks at November 20, 2014 05:30 AM

November 19, 2014

Racker Hacker

A response to Infoworld’s confusing article about Fedora

Working with the Fedora community is something I really enjoy in my spare time, and I was baffled by an article I saw in Infoworld last week.

The article dives into the productization of Fedora 21, which aims to deliver a better experience for workstation, server, and cloud users. It suggests that Red Hat drove Fedora development and that the goals of Red Hat and Fedora are closely aligned.

That couldn’t be further from the truth.

I was heavily involved with the changes as a Fedora board member in 2013 and we had many lively discussions about which products should be offered, the use cases for each, and how development would proceed for each product. FESCo and the working groups trudged through the process and worked diligently to ensure that users and developers weren’t alienated by the process. It was impressive to see so many people from different countries, companies and skill levels come together and change the direction of the project into a more modern form.

Some of those board members, FESCo members, and working group members worked for Red Hat at the time. Based on the discussions, it was obvious to me that these community members wanted to make changes to improve the project based on their own personal desires. I never heard a mention of “Red Hat wants to do…” or “this doesn’t align with …” during any part of the process. It was entirely community driven.

Some projects and products from Fedora eventually make it into the Red Hat product list (Red Hat Atomic is an example) but that usually involves Red Hat bringing a community effort under their umbrella and adding formal processes so they can offer it to their customers (and support it).

Fedora’s community is vibrant, independent, and welcoming. If anyone is ever confused by the actions of the community, there are many great ways to join the community and learn more.


by Major Hayden at November 19, 2014 02:24 PM


Breath-taking honesty

An essay has been making the rounds about how much you (developer type person) are worth for hourly-rate. It is breath-taking in its honesty. This person sets out a pay-rate scale based strictly on public-access reputation markets and evidence of community activity, with a premium on community contributions. To get top dollar, you will need to tick all of these boxes:

  • Have high-rated open-source libraries on GitHub.
  • Have a StackOverflow reputation over 20K.
  • Vendor-based code certifications, such as those from Oracle (Java) or Zend (PHP).
  • Evidence of mastery in multiple languages. So, Ruby AND Erlang, not Erlang and maybe Ruby if you have to.
  • Published talks at conferences.

If you don't have all those boxes ticked, you can still get paid. It just won't be enough to live on in most hot technical job-markets. The author is also very explicit in what they don't care about:

  • Cost-of-living. With fully remote work, location is elective. Want to make boatloads of cash? Move to the Montana prairie. You won't get more money by living in London.
  • Education. Masters, BA, BS, whatever. Don't care.
  • Past employment. Blah blah blah corporate blah.
  • Years of experience. I call bull on this one, since I'm dead certain that if this job-reviewer sees "10 years of experience in Blah", they're going to look more critically at the lack of an auditable career than they would for someone with 2 years.

Before long we're going to get a startup somewhere that will take evidence of all of the first list of bullet-points and distil it down to a Klout-like score.

One not-mentioned feature of this list is it means there are a variety of career suicide moves if more companies start adopting this method of pricing developer talent:

  • Working on closed-source software.
  • Working for a company that doesn't contribute to open source projects.
  • Working for a company that doesn't pay to present at conferences.
  • Working for a company that doesn't pay for continuing education.
  • Working for a company that has strict corporate communications rules, which prevent personal blogging on technical topics.
  • Working for a company with employment contracts that prohibit technical contributions to anything, anywhere that isn't the company (often hidden as the no-moonlighting rule in the employment contract).

Career suicide, all of it. I'm glad the systems engineering market is not nearly as prone to these forces.

by SysAdmin1138 at November 19, 2014 12:40 PM

Chris Siebenmann

Finding free numbers in a range, crudely, with Unix tools

Suppose, not entirely hypothetically, that you have the zone file for the reverse mapping of a single /24 subnet (in Bind format) and you want to find a free IP address in that subnet. The first field in the zone file is the last octet of the IP (ie for '' it is '1'), so this problem reduces to finding what numbers are not used in the file out of the range 1 to 254.

So let's do this really crudely with readily available Unix tools:

grep '^[0-9]' ptrzone | awk '{print $1}' | sort >/tmp/used
seq 1 254 | sort >/tmp/all
comm -13 /tmp/used /tmp/all | sort -n

(We must do a 'sort -n' only at the end because of a stupid GNU comm decision.)

We can get rid of one of the two files here by piping directly into comm, but unfortunately we can't get rid of both of them in standard Bourne shell. In Bash and zsh (and AT&T ksh), we can use what they each call 'process substitution' to directly embed each pipeline into the comm arguments:

comm -13 <(grep '^[0-9]' ptrzone | awk '{print $1}' | sort) <(seq 1 254 | sort) | sort -n

(This also works in rc, with a different syntax.)

Once upon a time I would have happily written the single-line version. These days my tendencies are towards the multi-line version unless I'm feeling excessively clever, partly because it's easier to check the intermediate steps.

(A neater version would condense runs of numbers into ranges, but I don't think you can do that simply with any existing common Unix tools. Perl experts can maybe do it in a single line.)
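(Well, awk alone can do the condensing if you're willing to argue about what counts as simple; here's one sketch that turns a sorted list of numbers into ranges:)

```shell
# Condense sorted numbers into ranges: 1 2 3 7 9 10 becomes 1-3, 7, 9-10.
printf '%s\n' 1 2 3 7 9 10 |
    awk '
    NR == 1     { s = p = $1; next }          # start the first run
    $1 == p + 1 { p = $1; next }              # extend the current run
                { printf "%s\n", (s == p ? s : s "-" p); s = p = $1 }
    END         { printf "%s\n", (s == p ? s : s "-" p) }'
```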

PS: One hazard of a very well stuffed $HOME/bin directory is that it turns out I already had a command to do this for me (and with the neater range-based output). Oh well, such things happen. Perhaps I should thin out my $HOME/bin on the grounds of disuse of much of it, but it's always hard to get rid of your personal little scripts.

by cks at November 19, 2014 04:27 AM

November 18, 2014

Everything Sysadmin

Book Excerpt: Organizing Strategy for Operational Teams

When Esther Schindler asked for permission to publish an excerpt from The Practice of Cloud System Administration on the Druva Blog, we thought this would be the perfect piece. We're glad she agreed. Check out this passage from Chapter 7, "Operations in a Distributed World".

If you manage a sysadmin team that manages services, here is some advice on how to organize the team and their work:

Organizing Strategy for Operational Teams

November 18, 2014 08:59 PM

Standalone Sysadmin

#SysChat discussion tonight on Twitter, Webcast on Friday

Well, this is new and interesting.

SolarWinds approached me a little while ago and asked me if I'd be interested in taking part in a combination Twitter discussion and webcast, talking about system administration, monitoring, and so on. I said yes, but what I really meant was, "You've seen my twitter stream...I was probably going to be talking about that stuff anyway, so sure!"

The details are that the twitter discussion will be using the #syschat hashtag. The idea is for this to be an interactive thing, with other twittererrers...twittites....twitterati...whatever, with other folks taking part. So if you see me tweeting with that hashtag, don't be surprised, and join in!

Watching a hashtag on Twitter is tough, and I find even HootSuite or TweetDeck slow for these kinds of things, even though I use HootSuite all the time. Whenever I'm watching a realtime thing like this (or, during launch events, the #nasa tag), I will use TwitterFall, which has a realtime display that kind of rains tweets down it, and since you're signed in, you can reply and retweet and interact the same way that you would on anything else. Once you're signed in, on the left side, just add "#syschat" to the searches, and you'll immediately get all of the past tweets falling.

Tonight's event is at 8:30pm Eastern time, and it'll probably last an hour or an hour and a half. And honestly, it's not like I ever get OFF of the internet, so I'll probably be watching for messages for the next day or so.

The event will be co-hosted by SolarWinds Head Geek Lawrence Garvin.

The webcast will be happening on Friday, 12pm-1pm. It's going to be a presentation on the topic of "Evolution of SysAdmin" and "Holistic Monitoring for Servers and Applications". You need to sign up to watch live, but it's free.

I want to be very up front about this: SolarWinds IS paying me to do these things, but at no point have they ever even suggested that I add mentions of SolarWinds products, or asked me to even say nice things about them, or about the company in general. The talk isn't even going to be about SolarWinds - it's on the topic of system administration, and questions about the kinds of things we deal with. This isn't me advocating for SolarWinds, because I can't. I've never had an environment where it made sense to use their products.

But I do thank them for asking me to take part in this thing. I have to imagine that being paid to tweet about things I would have talked about anyway is a little like how comedians feel when they get paid to host a stand-up special. I just want to say something like, "and you thought all that time tweeting was a waste. Ha!"

Anyway, see you today at 8:30pm ET, and Friday at noon. Don't forget to sign up!

by Matt Simmons at November 18, 2014 11:20 AM

Chris Siebenmann

Why I need a browser that's willing to accept bad TLS certificates

One of my peculiarities is that I absolutely need a browser that's willing to accept 'bad' TLS certificates, probably for all species of bad that you can imagine: mismatched host names, expired certificates, self-signed or signed by an unknown certificate authority, or some combination of these. There are not so much two reasons for this as two levels of the explanation.

The direct reason is easy to state: lights out management processors. Any decent one supports HTTPS (and you really want to use it), but we absolutely cannot give them real TLS certificates because they all live on internal domain names and we're not going to change that. Even if we could get proper TLS certificates for them somehow, the cost is prohibitive since we have a fair number of LOMs.

(Our ability to get free certificates has gone away for complicated reasons.)

But in theory there's a workaround for that. We could create our own certificate authority, add it as a trust root, and then issue our own properly signed LOM certificates (all our LOMs accept us giving them new certificates). This would reduce the problem to doing an initial certificate load in some hacked-up environment that accepted each LOM's out-of-box bad certificate (or using another interface for it, if and where one exists).

The problem with this is that as far as I know, certificate authorities are too powerful. Our new LOM certificate authority should only be trusted for hosts in a very specific internal domain, but I don't believe there's any way to tell browsers to actually enforce that and refuse to accept TLS certificates it signs for any other domain. That makes it a loaded gun that we would have to guard exceedingly carefully, since it could be used to MITM any of our browsers for any or almost any HTTPS site we visit, even ones that have nothing to do with our LOMs. And I'm not willing to take that sort of a risk or try to run an internal CA that securely (partly because it would be a huge pain in practice).

So that's the indirect reason: certificate authorities are too powerful, so powerful that we can't safely use one for a limited purpose in a browser.
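(As an aside, X.509 does define a name constraints extension that expresses exactly this sort of restriction; the problem is that browser enforcement of it has historically been inconsistent, which is the crux of the issue. Creating a constrained CA certificate is easy enough; the domain and file names in this sketch are made up:)

```shell
# Generate a self-signed CA certificate constrained to one internal domain.
cat > ca.cnf <<'EOF'
[req]
distinguished_name = req_dn
x509_extensions = v3_ca
prompt = no

[req_dn]
CN = Internal LOM CA

[v3_ca]
basicConstraints = critical, CA:TRUE
keyUsage = critical, keyCertSign, cRLSign
nameConstraints = critical, permitted;DNS:.lom.sandbox.example
EOF

openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -keyout ca.key -out ca.crt -config ca.cnf

# The X509v3 Name Constraints section should now show the permitted domain.
openssl x509 -in ca.crt -noout -text | grep -A 2 'Name Constraints'
```

Whether a given browser actually honours the constraint is exactly the open question, so this doesn't remove the need to guard such a CA extremely carefully.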

(I admit that we might not go to the bother of making our own CA and certificates even if we could, but at least it would be a realistic possibility and people could frown at us for not doing so.)

by cks at November 18, 2014 04:12 AM

RISKS Digest