Planet SysAdmin


April 23, 2014

Everything Sysadmin

Racker Hacker

Configure static IP addresses for Project Atomic’s KVM image

Amid all of the Docker buzz at the Red Hat Summit, Project Atomic was launched. It’s a minimalistic Fedora 20 image with a few tweaks, including rpm-ostree and geard.

There are great instructions on the site for firing up a test instance under KVM but my test server doesn’t have a DHCP server on its network. You can use Project Atomic with static IP addresses fairly easily:

Create a one-line /etc/sysconfig/network:


Drop a basic network configuration into /etc/sysconfig/network-scripts/ifcfg-eth0:


All that’s left is to set DNS servers and a hostname:

echo "nameserver" > /etc/resolv.conf
hostnamectl set-hostname

Bring up the network interface:

ifup eth0
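
As a rough sketch of the steps above: the function below writes both files under a root directory of your choosing, so you can try it against a scratch directory first. The helper name write_static_net and every address in the usage note are placeholders I made up for illustration (the addresses come from the 192.0.2.0/24 documentation range), and BOOTPROTO=none is one common way to mark a static setup, not necessarily what the original files contained.

```shell
# Sketch only: helper name and all values are illustrative assumptions.
write_static_net() {
    root=$1      # target root: a scratch dir for a dry run, "" for the live system
    ipaddr=$2    # static address for eth0
    netmask=$3   # netmask for that address
    gateway=$4   # default gateway

    mkdir -p "$root/etc/sysconfig/network-scripts"

    # The one-line /etc/sysconfig/network
    printf 'NETWORKING=yes\n' > "$root/etc/sysconfig/network"

    # A basic static configuration for eth0
    cat > "$root/etc/sysconfig/network-scripts/ifcfg-eth0" <<EOF
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=none
IPADDR=$ipaddr
NETMASK=$netmask
GATEWAY=$gateway
EOF
}
```

For a dry run, something like write_static_net /tmp/atomic-test 192.0.2.10 255.255.255.0 192.0.2.1 lets you inspect the generated files before touching the real /etc.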

Of course, you could do all of this via the nmcli tool if you prefer to go that route.

Configure static IP addresses for Project Atomic’s KVM image is a post from: Major Hayden's blog.

Thanks for following the blog via the RSS feed. Please don't copy my posts or quote portions of them without attribution.

by Major Hayden at April 23, 2014 03:14 PM

Standalone Sysadmin

No updates lately - super busy!

I've been writing less lately than normal, and given my habits of not posting, that's saying something!

Lately, I've been feeling less like a sysadmin and more like a community manager, honestly. On top of the normal LOPSA Board Member duties I've had, I'm serving as co-chair of the LISA'14 Tutorials committee AND the Invited Talks committee. PLUS I've been doing a lot of work with PICC, the company that manages the Cascadia IT Conference and LOPSA-East (which is next week, so if you haven't yet, register now. Prices are going up starting on Monday!).

All of this leaves not much time at all for doing actual sysadmin work, and even less for writing about it.

As an overview of the stuff I've been dealing with at work, let me just implore you, if you're using Cisco Nexus switches, Do Not Use Switch Profiles. I've written about them before, but it would be impossible for me to tell you not to use them emphatically enough. They're terrible. I'll talk about how terrible some time later, but trust me on this.

Also, I've been doing a whole lot of network migration that I'll also write about at some time in the future. For now I'll just say that it's really demoralizing to perform the same migration three times, but I'm awfully glad that I had a plan to roll back. At the moment, I'm working on writing some Python scripts to make per-port changes simpler so that I can offload it to students. I'm glad that Cisco has the native Nexus Python API, but their configuration support is severely lacking - basically equivalent to cli(). Also, students migrating hosts to the new core...what could possibly go wrong? ;-)

Alright, no time to write more. I will work on writing more frequently, anyway!

by Matt Simmons at April 23, 2014 11:18 AM

Chris Siebenmann

At least partially understanding DMARC

DMARC is suddenly on my mind because of the news that AOL changed its DMARC policy to 'reject', following the lead of Yahoo which did this a couple of weeks ago. The short version is that a DMARC 'reject' policy is what I originally thought DKIM was doing: it locks down email with a From: header of your domain so that only you can send it. More specifically, all such email must not merely have a valid DKIM signature but a signature that is for the same domain as the From: domain; in DMARC terminology this is called being 'aligned'. Note that the domain used to determine the DMARC policy is the From: domain, not the DKIM signature domain.

(I think that DMARC can also be used to say 'yes, really, pay attention to my strict SPF settings' if you're sufficiently crazy to break all email forwarding.)

This directly affects anyone who wants to send email with a From: of their Yahoo or AOL address but not do it through Yahoo/AOL's SMTP servers. Yahoo and AOL have now seized control of that and said 'no you can't, we forbid it by policy'. Any mail system that respects DMARC policies will automatically enforce this for AOL and Yahoo.

(Of course this power grab is not the primary goal of the exercise; the primary goal is to cut off all of the spammers and other bad actors that are attaching Yahoo and AOL From: addresses to their email.)

This indirectly affects anyone who has, for example, a mailing list (or a mail forwarding setup) that modifies the message Subject: or adds a footer to the message as it goes through the list. Such modifications will invalidate the original DKIM signature of legitimate email from a Yahoo or AOL user and then this bad DKIM signature will cause the message to be rejected by downstream mailers that respect DMARC. The only way to get such modified emails past DMARC is to change the From: header away from Yahoo or AOL, at which point their DMARC 'reject' policies don't apply.

DMARC by itself does not break simple mail relaying and forwarding (including for simple mailing lists), ie all things where the message and its headers are unmodified. An unmodified message's DKIM signature is still valid even if it doesn't come directly from Yahoo or AOL (or whoever) so everything is good as far as DMARC is concerned (assuming SPF sanity).

Note that Yahoo and AOL are not the only people with a DMARC 'reject' policy. Twitter has one, for example. You can check a domain's DMARC policy (if any) by looking at the TXT record on _dmarc.<domain>. I believe the 'p=' bit is the important part.
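
To make the 'p=' part concrete, here is a small sketch that pulls the policy tag out of a DMARC TXT record string. The function name dmarc_policy and the sample domain in the usage note are made up for illustration, and the parsing is deliberately simple; it is not a full DMARC record parser.

```shell
# Sketch only: extract the "p=" tag from a DMARC TXT record string.
# Splits the record on ';' and prints the value of the first tag named exactly "p",
# so "sp=" and "pct=" are not mistaken for it.
dmarc_policy() {
    printf '%s\n' "$1" | tr -d '"' | tr ';' '\n' |
        sed -n 's/^[[:space:]]*p[[:space:]]*=[[:space:]]*\([A-Za-z]*\).*$/\1/p' |
        head -n 1
}
```

With a resolver tool such as dig installed, you could feed it a live record along the lines of dmarc_policy "$(dig +short TXT _dmarc.example.com)", where example.com stands in for whatever domain you care about.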

PS: I suspect that more big free email providers are going to move to publishing DMARC 'reject' policies, assuming that things don't blow up spectacularly for Yahoo and AOL. Which I doubt they will.

by cks at April 23, 2014 05:13 AM

April 22, 2014

Security Monkey

RAWR: Rapid Assessment of Web Resources

Here's another one for the toolkit folks!


Anything that can make our lives easier when doing web assessments is a good thing. The more information one can extract in the early phases of an assessment, the better. And nobody wants to do this by hand, right? You might remember EyeWitness that I blogged about 2 months ago?

April 22, 2014 11:22 PM

Steve Kemp's Blog

I've not commented on security for a while

Unless you've been living under a rock, or in a tent (which would make me slightly jealous) you'll have heard about the recent Heartbleed attack many times by now.

The upshot of that attack is that lots of noise was made about hardening things, and there is now a new fork of OpenSSL being developed. Many people have commented about "hardening Debian" in particular, as well as random musing on hardening software. One or two brave souls have even made noises about auditing code.

Once upon a time I tried to set up a project to audit Debian software. You can still see the Debian Security Audit Project webpages if you look hard enough for them.

What did I learn? There are tons of easy security bugs, but finding the hard ones is hard.

(If you get bored some time just pick your favourite Editor, which will be emacs, and look how /tmp is abused during the build-process or in random libraries such as tramp [tramp-uudecode].)

These days I still poke at source code, and I still report bugs, but my enthusiasm has waned considerably. I tend to only commit to auditing a package if it is a new one I install in production, which limits my efforts considerably, but makes me feel like I'm not taking steps into the dark. It looks like I reported only three security issues this year, and before that you have to go down to 2011 to find something I bothered to document.

What would I do if I had copious free time? I wouldn't audit code. Instead I'd write test-cases for code.

Many, many large projects have rudimentary test-cases at best, and zero coverage at worst. I appreciate that writing test-cases is hard, because lots of times it is hard to test things "for real". For example I once wrote a filesystem using FUSE, and it has some built-in unit-tests (I was pretty pleased with that: you could launch the filesystem with a --test argument and it would invoke the unit-tests on itself. No separate steps or source code required. If it was installed you could use it and you could test it in-situ). Beyond that I also put together a simple filesystem-stress script, which read/wrote/found random files, computed MD5 hashes of contents, etc. I've since seen similar random-filesystem-stresstest projects, and if they had existed then I'd have used them. Testing filesystems is hard.

I've written kernel modules that have only a single implicit test case: It compiles. (OK that's harsh, I'd usually ensure the kernel didn't die when they were inserted, and that a new node in /dev appeared ;)

I've written a mail client, and beyond some trivial test-cases to prove my MIME-handling wasn't horrifically bad there are zero tests. How do you simulate all the mail that people will get, and the funky things they'll do with it?

But that said I'd suggest if you're keen, if you're eager, if you want internet-points, writing test-cases/test-harnesses would be more useful than randomly auditing source code.

Still what would I know, I don't even have a beard..

April 22, 2014 09:14 PM

LZone - Sysadmin

Website Technology Changes in March 2014

As in the last months let's look into changes visible at the frontend pages of the biggest websites. This time I compared the changes between February and April.

These last two months saw the usual lot of version upgrades, along with some probably unintended un-hiding of server versions, several sites going to CloudFlare as well as a premiere with IPv6 being available on the first adult movie site.

The detailed results can be found here:

What Changed?

DNS-Prefetching The HTML header based DNS prefetching is expanding once more and for the first time used on adult site:
IPv6 An AAAA record was sighted for the first time for That makes IPv6 available for the first time on a major adult site!
CDN Changes
Version Upgrades
  • upgrades from PHP 5.3.2-1ubuntu4.15 to recent 4.23
  • upgrades from PHP 5.4.21 to 5.5.8
  • upgrades from PHP 5.3.3-7+squeeze18 to 19
  • upgrades from PHP 4.4.6 to 5.4.16
  • upgrades from PHP 5.3.3-7+squeeze9 to 19
  • upgrades from PHP 5.3.19-1ubuntu3.9+wmf1 to 3.10+wmf1
  • upgrades from PHP 5.3.21 to 5.3.26
  • upgrades from nginx 1.4.4 to 1.4.7
  • upgrades from nginx 1.4.4 to 1.4.7
  • upgrades from squid 3.1.18 to 3.2.1
Hiding Server Version Against the trend in the last month this month three sites have unhidden the previously hidden server details:
  • normally not showing any server version displayed "PHP/5.4.4-14+deb7u7" in February.
  • previously not indicating the Apache version now shows CentOS Apache 2.2.15
  • previously hiding the server version in April indicated "Servlet 2.4; JBoss-4.3.0.GA_CP06 (build"

Note: the website links lead to a history page for the different sites where you can see the change details.


All the results listed above are based on a simple scanning script. The results present a snapshot of the websites and a single response only. This is of course not necessarily an indication of which techniques the site uses in daily operations!

by Lars Windolf at April 22, 2014 08:24 PM

Who is using which CDN in 04/2014

Recent CDN usage for top 200 Alexa ranked sites and major German sites. For measurement method read more here...

CDN Sites
Akamai sport1.DE

by Lars Windolf at April 22, 2014 05:43 PM

The Nubby Admin

Definitive List of Web-Based Server Control Panels

(Updated April 21, 2014)

As someone who has worked in web hosting, I’ve had my eye on just about every web-based control panel ever created. Most people will likely think of cPanel when they hear the phrase “server control panel” and have visions of web hosts dancing in their heads. Server control panels can be used for much more than web hosting, however. Control panels can allow people to administer systems with the click of a button having little interaction with the gorier details. Some might think that kind of scenario is categorically wrong, but I disagree in some circumstances. There are some *NIX oriented colleagues that I’d tackle before they got too close to a Windows server. For them, WebsitePanel might be a better option. There are also some folks that have need of their own server(s) and are happy to perform their own button mashing to reboot services and etc. I’m reminded of Jordan Sissel’s SysAdvent post “Share Skills and Permissions with Code.” In those scenarios, server control panels are excellent.

The nature of server control panels makes them most desirable by web hosting companies. As such, most of the web-based server control panels that I have found are slanted in that direction and might take some creativity to warp to your needs. Others appear to be more easily used as a general “E-Z Mode” SysAdmin front-end (Open Panel comes to mind). Don’t discard a control panel simply because it is slanted to web hosting. Some of them are much fuller than that.

Nevertheless, here is the latest version of my ever-growing list of web-based server control panels:

FOSS Control Panels

  • CentOS Web Panel (AKA CWP. CentOS Linux only [duh]. Unknown license, but it’s “Free”)
  • DTC (Linux, FreeBSD and Mac OS X Server. GPL license. Stands for “Domain Technologie Control.” Looks like a great feature set. I don’t know why it’s not more popular.)
  • EHCP (Linux only. GPL license. Stands for “Easy Hosting Control Panel”)
  • Froxlor (Linux and BSD. GPL License. A fork of SysCP. )
  • GNU Panel (Linux only. BSD license. Just kidding! It’s GPL.)
  • ISPConfig (Linux only. BSD license. Made by the HowToForge folks. HTTP, SMTP, FTP, DNS and OpenVZ virtualization are supported among many other features)
  • IspCP Omega (Linux only. Fork of VHCS. Old VHCS code is MPL, new code is GPL2. The goal is to port everything and make it GPL2.)
  • Open Panel (Linux only. GPL license. Their pre-made OpenApps looks cool. I don’t know why this hasn’t made more waves than it has!)
  • RavenCore (Linux and BSD. GPLv2.) RavenCore’s only home on the internet is apparently SourceForge. The domain listed for the project doesn’t respond. Take that for what it’s worth.
  • SysCP (Linux only. GPL license.)
  • VestaCP (Linux only, GPLv3 License) Has paid support options, but the control panel itself is free.
  • VHCS (URL Removed! Google says that the domain has been harboring viruses and other evil things) (Linux only. MPL license. Stands for “Virtual Hosting Control System”)
  • WebController (Windows only. GPL. SourceForge project with an appalling website. Looks like it’s abandoned but I’m not sure.)
  • Web-CP (Linux only. Not sure what license, but I assume GPL since it was a fork of the older web://cp product that itself was GPL. Web-CP looks abandoned. The last update on the site was 2005 and the latest bug closed in Mantis is 2006. The wiki is full of spam [I've never seen spam for breast enlargement and pistachios on the same page before - Thanks Web-CP!])
  • zpanel (Windows and POSIX-based OSs – that supposedly includes Mac OS X, but a commenter below disputes that.)

Control Panels with a Free and Paid Edition

  • Ajenti (LGPLv3 with special clauses. Linux and BSD.) Annoying licensing model that’s free for your own servers at home or internal work servers, however as soon as you attempt to do any kind of hosting on it you have to cough up money. Seems like a decent product though.
  • ApPHP Admin Panel (Free, Advanced, and Pro version. Linux. )
  • ServerPilot (Ubuntu Linux only) This isn’t so much a server control panel, as it is a management pane for developers who deploy PHP applications on Ubuntu. It is not self hosted. There’s a free edition that has basic management features for your server, and paid editions with more features.
  • Webmin (Primarily POSIX-based OSs, however a limited Windows version exists)
    • Usermin Module (POSIX only. Simple webmail interface and user account modification for non-root users)
    • Virtualmin Module (POSIX only. Allows for multi-tenant use of a server much like a shared web host)
    • Cloudmin Module (POSIX only. Creates VPSs using Xen, KVM and OpenVZ among others)

Commercial Control Panels

  • Core Admin: Commercial control panel, but has a free web edition. Manage many servers from one portal and delegate permissions to different users.
  • cPanel / WHM (Linux and FreeBSD. The granddaddy of control panels started back in 1996 as an in-house app that eventually got licensed. WHM controls the entire server. cPanel is user-oriented.)
    • WHMXtra (Not a control panel on its own, but it’s a significant third-party add-on to WHM)
  • DirectAdmin (Linux and BSD.)
  • Ensim (Control panel that handles the management of cloud services Microsoft Hyper-V, Active Directory, Lync, Mozy, Anti Virus / Anti Spam Solutions like F-Secure, MessageLabs, Barracuda and a ton of other things. It’s really for $n aaS providers to build a business around.)
  • Enkompass (Windows only. cPanel’s Windows product.)
  • H-Sphere (Windows, Linux and BSD. Originally made by Positive Software before being bought by Parallels. I’m not sure how this software compares / competes with Parallels’s Plesk. This is an all-in-one provisioning, billing and control panel tool. Obviously focused solely on web hosts.)
  • HMS Panel (Linux only.)
  • Hosting Controller (Windows and Linux. Also supports managing Microsoft Exchange, BlackBerry Enterprise Server, SharePoint, Office Communication Server, Microsoft Dynamics and more.)
  • HyperVM (Linux only. Virtualization management platform. Uses Xen and OpenVZ. Sister product to Kloxo.)
  • InterWorx (Linux only. Can manage Ruby on Rails.)
  • Kloxo (Linux only. More than just a server management platform, this is a large web hosting platform that is geared very much for a client / provider relationship.)
  • Layered Panel (Control panel geared towards free hosts that inject ads into their customers’ sites. Linux.)
  • Live Config (Linux)
  • Machsol (Unusual in this list because it’s a control panel to manage the hosting of major enterprise server applications like Exchange, Sharepoint and BES.)
  • Parallels Helm (Windows. One of the many acquisitions that Parallels has made.)
  • Parallels Plesk (Linux and Windows. Probably the biggest competitor to cPanel.)
  • SolusVM (Linux only. Manages VPSs and VPS clusters using OpenVZ, Xen and KVM.)
  • vePortal VPS Control Panel
  • vePortal veCommander
  • WebsitePanel (Windows only. The former dotnetpanel after it was revised by SMB SAAS Systems Inc. and released as a SourceForge project.)
  • xopanel (Windows, Linux, BSD. Unsure about license.) Actually, I’m unsure about a lot concerning this product. The product and website are all in Turkish and don’t seem to have an English counterpart. That’s a shame because the product looks good.
  • xpanel (Rather emaciated looking control panel with very low price. Only advertised to run on Fedora.)

Billing / Automation Tools for Control Panels

These are billing and automation tools that tightly integrate with control panels.

Misc. Inclusions

  • Aventurin{e} (Linux only. This is actually a pre-made image that you drop onto a server. It allows you to provision VPSs.)
  • BlueOnyx (Linux only. The successor to BlueQuartz below. This isn’t a control panel itself, but a full-fledged Linux distribution. However, since it’s geared to web hosting companies, it has a web interface for you to manage most of the server’s functions. I debated if I should include it, but decided in the affirmative for the sake of being thorough.)
  • BlueQuartz (Linux appliance. Based on the EOL CentOS 4.)
  • Cast-Control (Streaming media control panel. Does ShoutCast, Icecast and more.)
  • CentovaCast (Internet Radio streaming control panel. Based on ShoutCast.)
  • Fantastico (Automated application installation tool)
  • Installatron (Automated application installation tool)
  • SCPanel (ShoutCast internet radio hosting panel)
  • Softaculous (Automated application installation tool)
  • WHMXtra (Additional features for WHM)

Gaming Control Panels

Included because, hey, they’re control panels too!

Defunct Control Panels

  • CP+ (Linux only. Ancient control panel that has since been abandoned. The developer, psoft, is yet another Parallels acquisition. Only included for thoroughness.)

I’d like for this to become a definitive list of web-based control panels (regardless of their focus; server management or web hosting). Basically, if it can manage a server or services and has a web front-end, I’d like to know about it. I’d appreciate any social shares. Likes, +1s, Tweets, Stumbles, Digg’s and etc. are awesome. If you know of any control panels that I’ve missed (active or defunct, since I love history), or if you spot a control panel that I mis-categorized, please let me know in the comments below.


by WesleyDavid at April 22, 2014 11:35 AM

Racker Hacker

Launch secure LXC containers on Fedora 20 using SELinux and sVirt

Getting started with LXC is a bit awkward and I’ve assembled this guide for anyone who wants to begin experimenting with LXC containers in Fedora 20. As an added benefit, you can follow almost every step shown here when creating LXC containers on Red Hat Enterprise Linux 7 Beta (which is based on Fedora 19).

You’ll need a physical machine or a VM running Fedora 20 to get started. (You could put a container in a container, but things get a little dicey with that setup. Let’s just avoid talking about nested containers for now. No, really, I shouldn’t have even brought it up. Sorry about that.)

Prep Work

Start by updating all packages to the latest versions available:

yum -y upgrade

Verify that SELinux is in enforcing mode by running getenforce. If you see Disabled or Permissive, get SELinux into enforcing mode with a quick configuration change:

sed -i 's/^SELINUX=.*/SELINUX=enforcing/' /etc/selinux/config

I recommend installing setroubleshoot-server to make it easier to find the root cause of AVC denials:

yum -y install setroubleshoot-server

Reboot now. This will ensure that SELinux comes up in enforcing mode (verify that with getenforce after reboot) and it ensures that auditd starts up sedispatch (for setroubleshoot).

Install management libraries and utilities

Let’s grab libvirt along with LXC support and a basic NAT networking configuration.

yum -y install libvirt-daemon-lxc libvirt-daemon-config-network

Launch libvirtd via systemd and ensure that it always comes up on boot. This step will also adjust firewalld for your containers and ensure that dnsmasq is serving up IP addresses via DHCP on your default NAT network.

systemctl start libvirtd.service
systemctl enable libvirtd.service

Bootstrap our container

Installing packages into the container’s filesystem will take some time.

yum -y --installroot=/var/lib/libvirt/filesystems/fedora20 --releasever=20 --nogpg install systemd passwd yum fedora-release vim-minimal openssh-server procps-ng

This step fills in the filesystem with the necessary packages to run a Fedora 20 container. We now need to tell libvirt about the container we’ve just created.

virt-install --connect lxc:// --name fedora20 --ram 512 --filesystem /var/lib/libvirt/filesystems/fedora20/,/

At this point, libvirt will know enough about the container to start it and you’ll be connected to the console of the container! We need to adjust some configuration files within the container to use it properly. Detach from the console with CTRL-].

Let’s stop the container so we can make some adjustments.

virsh -c lxc:// shutdown fedora20

Get the container ready for production

Hop into your container and set a root password.

chroot /var/lib/libvirt/filesystems/fedora20 /bin/passwd root

We will be logging in as root via the console occasionally and we need to allow that access.

echo "pts/0" >> /var/lib/libvirt/filesystems/fedora20/etc/securetty

Since we will be using our NAT network with our auto-configured dnsmasq server (thanks to libvirt), we can configure a simple DHCP setup for eth0:

cat << EOF > /var/lib/libvirt/filesystems/fedora20/etc/sysconfig/network
cat << EOF > /var/lib/libvirt/filesystems/fedora20/etc/sysconfig/network-scripts/ifcfg-eth0
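
The heredoc bodies for those two files aren't shown here, so as a hedged sketch, this is roughly what a minimal DHCP setup for eth0 could look like. The helper name write_dhcp_net and the file contents are my assumptions about a typical minimal configuration, not a copy of the original files. It takes the container root as an argument so it can be pointed at a scratch directory first.

```shell
# Sketch only: helper name and file contents are assumptions.
write_dhcp_net() {
    root=$1   # container root, e.g. /var/lib/libvirt/filesystems/fedora20

    mkdir -p "$root/etc/sysconfig/network-scripts"

    # Assumed contents: turn networking on globally.
    cat > "$root/etc/sysconfig/network" <<EOF
NETWORKING=yes
EOF

    # Assumed contents: bring eth0 up at boot and request a lease via DHCP.
    cat > "$root/etc/sysconfig/network-scripts/ifcfg-eth0" <<EOF
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=dhcp
EOF
}
```

Calling write_dhcp_net /var/lib/libvirt/filesystems/fedora20 would then drop both files into the container's filesystem in one step.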

Using ssh makes the container a lot easier to manage, so let’s ensure that it starts when the container boots. (You could do this via systemctl after logging in at the console, but I’m lazy.)

chroot /var/lib/libvirt/filesystems/fedora20/
ln -s /usr/lib/systemd/system/sshd.service /etc/systemd/system/multi-user.target.wants/
exit


Cross your fingers and launch the container.

virsh -c lxc:// start --console fedora20

You’ll be attached to the console during boot but don’t worry, press CTRL-] to get back to your host prompt. Check the dnsmasq leases to find your container’s IP address and you can log in as root over ssh.

cat /var/lib/libvirt/dnsmasq/default.leases
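
If you'd rather not eyeball the file, here's a small sketch that pulls the address for one hostname out of a dnsmasq leases file. It assumes the usual dnsmasq lease layout (expiry time, MAC address, IP address, hostname, client ID, separated by whitespace) and that the container actually sent a hostname with its DHCP request; the function name lease_ip is made up for illustration.

```shell
# Sketch only: print the leased IP address(es) for a given hostname.
# Assumes dnsmasq's lease format: expiry MAC IP hostname client-id
lease_ip() {
    host=$1     # hostname the container registered via DHCP
    leases=$2   # path to the leases file
    awk -v host="$host" '$4 == host { print $3 }' "$leases"
}
```

On the host that would be something like lease_ip fedora20 /var/lib/libvirt/dnsmasq/default.leases, with fedora20 swapped for whatever hostname your container reports. Clients that send no hostname show up as * in the file, in which case you'll need to match on the MAC address instead.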


After logging into your container via ssh, check the process labels within the container:

# ps aufxZ
LABEL                           USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 1 0.0  1.3 47444 3444 ?      Ss   03:18   0:00 /sbin/init
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 18 0.0  2.0 43016 5368 ?     Ss   03:18   0:00 /usr/lib/systemd/systemd-journald
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 38 0.4  7.8 223456 20680 ?   Ssl  03:18   0:00 /usr/bin/python -Es /usr/sbin/firewalld -
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 40 0.0  0.7 26504 2084 ?     Ss   03:18   0:00 /usr/sbin/smartd -n -q never
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 41 0.0  0.4 19268 1252 ?     Ss   03:18   0:00 /usr/sbin/irqbalance --foreground
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 44 0.0  0.6 34696 1636 ?     Ss   03:18   0:00 /usr/lib/systemd/systemd-logind
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 46 0.0  1.8 267500 4832 ?    Ssl  03:18   0:00 /sbin/rsyslogd -n
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 dbus 47 0.0  0.6 26708 1680 ?     Ss   03:18   0:00 /bin/dbus-daemon --system --address=syste
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 rpc 54 0.0  0.5 41992 1344 ?      Ss   03:18   0:00 /sbin/rpcbind -w
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 55 0.0  0.3 25936 924 ?      Ss   03:18   0:00 /usr/sbin/atd -f
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 56 0.0  0.5 22728 1488 ?     Ss   03:18   0:00 /usr/sbin/crond -n
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 60 0.0  0.2 6412 784 pts/0   Ss+  03:18   0:00 /sbin/agetty --noclear -s console 115200 
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 74 0.0  3.2 339808 8456 ?    Ssl  03:18   0:00 /usr/sbin/NetworkManager --no-daemon
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 394 0.0  5.9 102356 15708 ?  S    03:18   0:00  \_ /sbin/dhclient -d -sf /usr/libexec/nm
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 polkitd 83 0.0  4.4 514792 11548 ? Ssl 03:18   0:00 /usr/lib/polkit-1/polkitd --no-debug
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 rpcuser 110 0.0  0.6 46564 1824 ? Ss   03:18   0:00 /sbin/rpc.statd
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 111 0.0  1.3 82980 3620 ?    Ss   03:18   0:00 /usr/sbin/sshd -D
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 409 0.0  1.9 131576 5084 ?   Ss   03:18   0:00  \_ sshd: root@pts/1
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 413 0.0  0.9 115872 2592 pts/1 Ss 03:18   0:00      \_ -bash
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 438 0.0  0.5 123352 1344 pts/1 R+ 03:19   0:00          \_ ps aufxZ
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 411 0.0  0.8 44376 2252 ?    Ss   03:18   0:00 /usr/lib/systemd/systemd --user
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 412 0.0  0.5 66828 1328 ?    S    03:18   0:00  \_ (sd-pam)
system_u:system_r:virtd_lxc_t:s0-s0:c0.c1023 root 436 0.0  0.4 21980 1144 ?    Ss   03:19   0:00 /usr/lib/systemd/systemd-hostnamed

You’ll notice something interesting if you run getenforce now within the container — SELinux is disabled. Actually, it’s not really disabled. The processing of SELinux policy is done on the host. The container isn’t able to see what’s going on outside of its own files and processes. The libvirt documentation for LXC hints at the importance of this isolation:

A suitably configured UID/GID mapping is a pre-requisite to making containers secure, in the absence of sVirt confinement.

In the absence of the “user” namespace being used, containers cannot be considered secure against exploits of the host OS. The sVirt SELinux driver provides a way to secure containers even when the “user” namespace is not used. The cost is that writing a policy to allow execution of arbitrary OS is not practical. The SELinux sVirt policy is typically tailored to work with an simpler application confinement use case, as provided by the “libvirt-sandbox” project.

This leads to something really critical to understand:

Containers don’t contain

Dan Walsh has a great post that goes into the need for sVirt and the protections it can provide when you need to be insulated from potentially dangerous virtual machines or containers. If a user is root inside a container, they’re root on the host as well. (There’s an exception: UID namespaces. But let’s not talk about that now. Oh great, first it was nested containers and now I brought up UID namespaces. Sorry again.)

Dan’s talk about securing containers hasn’t popped up on the Red Hat Summit presentations page quite yet but here are some notes that I took and then highlighted:

  • Containers don’t contain. The kernel doesn’t know about containers. Containers simply use kernel subsystems to carve up namespaces for applications.
  • Containers on Linux aren’t complete. Don’t compare directly to Solaris zones yet.
  • Running containers without Mandatory Access Control (MAC) systems like SELinux or AppArmor opens the door for full system compromise via untrusted applications and users within containers.

Using MAC gives you one extra barrier to keep a malicious container from getting higher levels of access to the underlying host. There’s always a chance that a kernel exploit could bypass MAC but it certainly raises the level of difficulty for an attacker and allows server operators extra time to react to alerts.

Launch secure LXC containers on Fedora 20 using SELinux and sVirt is a post from: Major Hayden's blog.


by Major Hayden at April 22, 2014 04:11 AM

Chris Siebenmann

The question of language longevity for new languages

Every so often I feel a temptation to rewrite DWiki (the engine behind this blog) in Go. While there are all sorts of reasons not to (so many that it's at best a passing whimsy), one concern that immediately surfaces is the question of Go's likely longevity. I'd like the blog to still be here in, say, ten years, and if the engine is written in Go that needs Go to be a viable language in ten years (and on whatever platform I want to host the blog on).

Of course this isn't just a concern for Go; it's a concern for any new language and there's a number of aspects to it. To start with there's the issue of the base language. There are lots of languages that have come and gone, or come and not really caught on very much so that they're still around but not really more than a relatively niche language (even though people often love them very much and are very passionate about them). Even when a language is still reasonably popular there's the question of whether it's popular enough to be well supported on anything besides the leading few OS platforms.

(Of course the leading few OS platforms are exactly the ones that I'm most likely to be using. But that's not always the case; this blog is currently hosted on FreeBSD, for example, not Linux, and until recently it was on a relatively old FreeBSD.)

But you'd really like more than just the base language to still be around, because these days the base language is an increasingly small part of the big picture of packages and libraries and modules that you can use. We also want a healthy ecology of addons for the language, so that if you need support for, say, a new protocol or a new database binding or whatever you probably don't have to write it yourself. The less you have to do to evolve your program the more likely it is to evolve.

Finally there's a personal question: will the language catch on with you so that you'll still be working with it in ten years? Speaking from my own experience I can say that it's no fun to be stuck with a program in a language that you've basically walked away from, even if the language and its ecology is perfectly healthy.

Of course, all of this is much easier if you're writing things that you know will be superseded and replaced before they get anywhere near ten years old. Alternately you could be writing an implementation of a standard so that you could easily swap it out for something written in another language. In this sense a dynamically rendered blog with a custom wikitext dialect is kind of a worst case.

(For Go specifically I think it's pretty likely to be around and fully viable in ten years, although I have less of a sense of my own interest in programming in it. Of course ten years can be a long time in computing and some other language could take over from it. I suspect that Rust would like to, for example.)

by cks at April 22, 2014 03:46 AM

April 21, 2014

Chris Siebenmann

Thinking about how to split logging up in multiple categories et al

I've used programs that do logging (both well and badly) and I've also written programs that did logging (also both reasonably well and badly) and the whole experience has given me some views on how I like logging split up to make it more controllable.

It's tempting to say that controlling logging is only for exceptional cases, like debugging programs. This is not quite true. Certainly this is the dominant case, but there are times when people have different interests about what to log even in routine circumstances. For example, on this blog I log detailed information about conditional GETs for syndication feeds because I like tracking down why (or why not) feed fetchers succeed at this. However this information isn't necessarily of interest to someone else running a DWiki instance so it shouldn't be part of the always-on mandatory logging; you should be able to control it.

The basic breakdown of full featured logging in a large system is to give all messages both a category and a level. The category is generally going to be the subsystem that they involve, while the level is the general sort of information that they have (informational, warnings, progress information, debugging details, whatever). You should be able to control the two together and separately, to say that you want only progress reports from all systems or almost everything from only one system and all the way through.

My personal view is that this breakdown is not quite sufficient by itself and there are a bunch of cases where you'll also want a verbosity level. Even if verbosity could in theory be represented by adding more categories and levels, in practice it's much easier for people to crank up the verbosity (or crank it down) rather than try to do more complex manipulations of categories and levels. As part of making life easier on people, I'd also have a single option that means 'turn on all logging options and log all debugging information (and possibly everything)'; this gives people a simple big stick to hit a problem with when they're desperate.

If your language and programming environment doesn't already have a log system that makes at least the category plus level breakdown easy to do, I wouldn't worry about this for relatively small programs. It's only fairly large and complex programs with a lot of subsystems where you start to really want this sort of control.
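As an illustration of the category plus level breakdown, here is a minimal sketch using Python's standard logging module, where dotted logger names act as categories. The "app" prefix, the category names and the configure_logging helper are all hypothetical, and this is not how DWiki itself does logging.

```python
import io
import logging


def configure_logging(stream, default_level=logging.WARNING,
                      overrides=None, debug_all=False):
    """Set one default level for all categories, plus per-category overrides.

    debug_all is the 'big stick': it forces everything to DEBUG.
    """
    root = logging.getLogger("app")  # hypothetical top-level category
    root.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
    root.addHandler(handler)
    root.propagate = False
    root.setLevel(logging.DEBUG if debug_all else default_level)
    # Child loggers without an override inherit the level of "app".
    for category, level in (overrides or {}).items():
        child = logging.getLogger("app." + category)
        child.setLevel(logging.DEBUG if debug_all else level)
    return root


# Routine setup: warnings from everything, full detail from 'feeds' only.
buf = io.StringIO()
configure_logging(buf, overrides={"feeds": logging.DEBUG})
logging.getLogger("app.feeds").debug("conditional GET matched ETag")
logging.getLogger("app.render").debug("cache miss")
logging.getLogger("app.render").warning("slow page")
output = buf.getvalue()
print(output, end="")
```

With this setup the debug message from the feeds category is logged while the debug message from the render category is dropped, and passing debug_all=True turns everything on at once.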

Sidebar: the two purposes of specific control

There are two reasons to offer people specific control over logging. The first is what I mentioned: sometimes not all information is interesting to a particular setup. I may want information on syndication feed conditional GETs while you may want 'time taken' information for all requests. Control over logging allows the program to support both of us (and the person who doesn't care about either) without cluttering up logs with stuff that we don't want. This is log control for routine logs, stuff that you're going to use during normal program operation.

The second reason is that a big system can produce too much information at full logging flow when you're trying to troubleshoot it, so much that useful problem indicators are lost in the overall noise. Here categorization and levels are a way of cutting down on the log volume so that people can see the important things. This is log control for debugging messages.

(There is an overlap between these two categories. You might log all SQL queries that a system does and the time they take for routine metrics, even though this was originally designed for debugging purposes.)

by cks at April 21, 2014 06:40 AM

April 20, 2014

Chris Siebenmann

A heresy about memorable passwords

In the wake of Heartbleed, we've been writing some password guidelines at work. A large part of the discussion in them is about how to create memorable passwords. In the process of all of this, I realized that I have a heresy about memorable passwords. I'll put it this way:

Memorability is unimportant for any password you use all the time, because you're going to memorize it no matter what it is.

I will tell you a secret: I don't know what my Unix passwords are. Oh, I can type them and I do so often, but I don't know exactly what they are any more. If for some reason I had to recover what one of them was in order to write it out, the fastest way to do so would be to sit down in front of a computer and type it in. Give me just a pen and paper and I'm not sure I could actually do it. My fingers and reflexes know them far better than my conscious mind.

If you pick a new password purely at random with absolutely no scheme involved, you'll probably have to write it down on a piece of paper and keep referring to that piece of paper for a while, perhaps a week or so. After the week I'm pretty confident that you'll be able to shred the piece of paper without any risk at all, except perhaps if you go on vacation for a month and have it fall out of your mind. Even then I wouldn't be surprised if you could type it by reflex when you come back. The truth is that people are very good at pushing repetitive things down into reflex actions, things that we do automatically without much conscious thought. My guess is that short, simple things can remain in conscious memory (this is at least my experience with some things I deal with); longer and more complex things, like a ten character password that involves your hands flying all over the keyboard, those go down into reflexes.

Thus, where memorable passwords really matter is not passwords you use frequently but passwords you use infrequently (and which you're not so worried about that you've seared into your mind anyways).

(Of course, in the real world people may not type their important passwords very often. I try not to think about that very often.)

PS: This neglects threat models entirely, which is a giant morass. But for what it's worth I think we still need to worry about password guessing attacks and so reasonably complex passwords are worth it.
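For the 'purely at random with no scheme' case discussed above, a minimal sketch of generating such a password with Python's secrets module might look like this. The ten character length mirrors the example in the text; the alphabet and the function name are my own assumptions, not a recommendation from our guidelines.

```python
import secrets
import string

# Assumed alphabet: letters, digits and ASCII punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length=10, alphabet=ALPHABET):
    """Pick each character independently with a cryptographic RNG."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(random_password())
```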

by cks at April 20, 2014 06:12 AM

April 19, 2014

Hard drive accessing every second with new Ubuntu install

The annoying pulse

After installing the new Ubuntu (Xubuntu) 14.04, I rebooted the machine to the desktop. I noticed that the hard drive was being accessed every second consistently. Just pulse...pulse...pulse. It was driving me nuts, as I had no clue what could be causing it. Since this issue was IO related the easiest way to solve this was to install and run iotop.

Installing iotop and finding the perpetrator

# Install iotop
sudo apt-get install iotop

# Run IOTop with -o so we only see processes doing IO
iotop -o

# Output looked similar to this

Total DISK READ:       0.00 B/s | Total DISK WRITE:         2.00 M/s
 1    be/4 root     0.00 B/s    2.00 M/s   0.00 %     3.00 % [ext4lazyinit]

What the heck is ext4lazyinit?

Now I know anything in brackets is a kernel thread, and ext4 is the file system I used (Ubuntu uses) to format the partitions. Time to go see what this kernel thread does. After some digging I found the Git commit log for this ext4 file system feature.

Who wants a file system to be lazy?

In the old (or not so old) days of ext2 and ext3, when you formatted the file system you had to wait while it was created. Most of that waiting was for the inode tables to be created. Make an ext3 file system and watch how long it takes to make the inode tables; the bigger the drive, the longer it takes. We are talking 10 or more minutes on drives of 2TB and above. Ted Ts'o, the ext4 maintainer, helped cut this time down by allocating some of the inode tables during creation and then allowing the rest of them to be made on first mount. When the file system is mounted for the first time, the kernel sees that this work was marked to be done and creates a background kernel thread to finish creating the inode tables. This way your install time is cut down dramatically because you don't need to wait for the file system inode tables to be created. They are just happily created in the background as you work.

When will this bloody pulsing stop?

So how long will this take to finish? I'm not too sure, as I left my machine on all night so it could finish. It was done by the morning, so I can say less than 12 hours on a 1TB drive. Don't fret, though: it will eventually finish, and rest assured there is nothing wrong with your hard drive or the install. This likely happens on older Ubuntu installs and any other distros that use ext4 as well.

by at April 19, 2014 08:19 PM

Steve Kemp's Blog

I was beaten to the punch, but felt nothing

A while back I mentioned github-backed DNS hosting.

Turns out does that already, and there is an interesting writeup on the design of something similar, from the same authors in 2009.

Fun to read.

In other news applying for jobs is a painful annoyance.

Should anybody wish to employ an Edinburgh-based system administrator, with a good Debian record, then please do shout at me. Remote work is an option, as is a local office, if you're nearby.

Now I need to go hide from the sun, lest I get burned again...

Good news? Going on holiday to Helsinki in a week or so, for Vappu. Anybody local who wants me should feel free to grab me, via the appropriate channels.

April 19, 2014 07:03 PM

Joe Topjian

Building an LXC Server - Ubuntu 14.04 Edition


This article is a basic step-by-step HOWTO to create a server capable of hosting LXC-based containers.

Prerequisites and Dependencies

This server will be using Ubuntu 14.04. As 14.04 has just been released, some steps might change in the future.

apt Update

First, make sure all of the base packages are up to date:

$ sudo apt-get update
$ sudo apt-get dist-upgrade

Open vSwitch

The previous version of this article advocated Open vSwitch. I have since stopped using OVS as I’ve been able to configure Linux Bridge with the exact same features by using newer kernels.


The newer LXC builds support ZFS as a backing store. This means that deduplication, compression, and snapshotting can all be taken advantage of. To install ZFS on Ubuntu 14.04, do:

$ sudo apt-add-repository ppa:zfs-native/daily
$ sudo apt-get update
$ sudo apt-get install ubuntu-zfs

Install LXC

Ubuntu 14.04 provides LXC 1.0.3, which is the latest version. I’m not sure if Ubuntu 14.04 will continue providing up-to-date versions of LXC, given it being an LTS release. If it falls behind, it might be beneficial to switch to the ubuntu-lxc/daily ppa.

To install LXC, just do:

$ sudo apt-get install lxc

Configure LXC

Back to ZFS

By default, LXC will look for a zpool titled lxc. This article uses a pool named tank instead and points LXC at it in the configuration below:

$ sudo zpool create -f tank /dev/vdc

Make sure deduplication and compression are turned on:

$ sudo zfs set dedup=on tank
$ sudo zfs set compression=on tank

LXC can use ZFS’s native snapshot features. To make sure you can see snapshots when running zfs list, do the following:

$ sudo zpool set listsnapshots=on tank

To configure LXC to use ZFS as the backing store and set the default LXC path, add the following to /etc/lxc/lxc.conf:

lxc.lxcpath = /tank/lxc/containers
lxc.bdev.zfs.root = tank/lxc/containers

Ensure /tank/lxc/containers, or whichever path you choose, exists:

$ sudo zfs create tank/lxc
$ sudo zfs create tank/lxc/containers

Using LXC

Creating a Container

Create the first container by doing:

$ sudo lxc-create -t ubuntu -n test1 -B zfs -- -S /root/.ssh/

When the command has finished, you’ll see that LXC has created a new ZFS partition:

$ sudo zfs list
$ df -h

Testing ZFS Deduplication

You can see the ZFS dedup stat by doing:

$ sudo zpool list
tank  99.5G   186M  99.3G     0%  1.01x  ONLINE  -

With that number in mind, create a second container:

$ sudo lxc-create -t ubuntu -n test2 -B zfs -- -S /root/.ssh/

When the command has finished, review the ZFS stat:

$ sudo zpool list
tank  99.5G   187M  99.3G     0%  2.02x  ONLINE  -

The dedup ratio doubled. This effectively means that no new disk space was consumed when the new container was created!
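A rough way to check that claim is to multiply the allocated space by the dedup ratio, which approximates how much data the containers reference logically. This sketch assumes the third column of the zpool list output above is allocated space and the sixth is the dedup ratio; the function name is just for illustration.

```python
def logical_mb(allocated_mb, dedup_ratio):
    """Approximate logical (pre-dedup) data as allocated space times ratio."""
    return allocated_mb * dedup_ratio


before = logical_mb(186, 1.01)  # one container
after = logical_mb(187, 2.02)   # two containers

print(f"logical data before: {before:.0f}M, after: {after:.0f}M")
print(f"physical growth: {187 - 186}M")
```

Under those assumptions the logical data roughly doubles (about 188M to about 378M) while the physical allocation grows by only 1M.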

Port Forwarding

By default, LXC uses the veth networking mode for containers. This is the most robust networking mode. Other modes exist and I highly recommend this article for a detailed look at them.

veth mode can be thought of as a form of NAT and the LXC server is now acting as a NAT’d gateway for all of the containers running on the server. If you want the containers to be accessible from the public internet, you will need to do some port forwarding.


I have put together a small script called lxc-nat that will configure port forwarding based on entries made in /etc/lxc/lxc-nat.conf.

For example, if you have Apache running in a container called www, create the following entry: -> www:80

Or if you want to access www via SSH: -> www:22


This article showed the steps used to configure a server to host LXC-based containers on a ZFS storage backend.

April 19, 2014 06:00 AM

Chris Siebenmann

Cross-system NFS locking and unlocking is not necessarily fast

If you're faced with a problem of coordinating reads and writes on an NFS filesystem between several machines, you may be tempted to use NFS locking to communicate between process A (on machine 1) and process B (on machine 2). The attraction of this is that all they have to do is contend for a write lock on a particular file; you don't have to write network communication code and then configure A and B to find each other.

The good news is that this works, in that cross system NFS locking and unlocking actually works right (at least most of the time). The bad news is that this doesn't necessarily work fast. In practice, it can take a fairly significant amount of time for process B on machine 2 to find out that process A on machine 1 has unlocked the coordination file, time that can be measured in tens of seconds. In short, NFS locking works but it can require patience and this makes it not necessarily the best option in cases like this.

(The corollary of this is that when you're testing this part of NFS locking to see if it actually works you need to wait for quite a while before declaring things a failure. Based on my experiences I'd wait at least a minute before declaring an NFS lock to be 'stuck'. Implications for impatient programs with lock timeouts are left as an exercise for the reader.)
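As a minimal sketch of what a patient lock attempt could look like, here is a Python helper that polls for an exclusive fcntl-style lock until a timeout expires. The function names, the 90 second default and the one second poll interval are my own assumptions, chosen only to be comfortably over the 'at least a minute' above; you would want to tune and test them in your own NFS environment.

```python
import errno
import fcntl
import os
import tempfile
import time


def acquire_lock(path, timeout=90.0, poll_interval=1.0):
    """Poll for an exclusive POSIX lock on path until timeout expires.

    Returns the open file object holding the lock, or None on timeout.
    """
    f = open(path, "a+")
    deadline = time.monotonic() + timeout
    while True:
        try:
            fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return f
        except OSError as e:
            # EACCES/EAGAIN mean someone else holds the lock; keep waiting.
            if e.errno not in (errno.EACCES, errno.EAGAIN):
                f.close()
                raise
        if time.monotonic() >= deadline:
            f.close()
            return None
        time.sleep(poll_interval)


def release_lock(f):
    """Drop the lock and close the file."""
    fcntl.lockf(f, fcntl.LOCK_UN)
    f.close()


if __name__ == "__main__":
    # Local demonstration on a temporary file; on NFS the path would be
    # the shared coordination file.
    fd, demo_path = tempfile.mkstemp()
    os.close(fd)
    lock = acquire_lock(demo_path, timeout=5.0, poll_interval=0.5)
    print("got lock" if lock else "timed out")
    if lock:
        release_lock(lock)
    os.remove(demo_path)
```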

I don't know if acquiring an NFS lock on a file after a delay normally causes your machine's kernel to flush cached information about the file. In an ideal world it would, but NFS implementations are often not ideal worlds and the NFS locking protocol is a sidecar thing that's not necessarily closely integrated with the NFS client. Certainly I wouldn't count on NFS locking to flush cached information on, say, the directory that the locked file is in.

In short: you want to test this stuff if you need it.

PS: Possibly this is obvious but when I started testing NFS locking to make sure it worked in our environment I was a little bit surprised by how slow it could be in cross-client cases.

by cks at April 19, 2014 04:38 AM

April 18, 2014


Using sysdig to Troubleshoot like a boss

If you haven't seen it yet there is a new troubleshooting tool out called sysdig. It's been touted as strace meets tcpdump and well, it seems like it is living up to the hype. I would actually rather compare sysdig to SystemTap meets tcpdump, as it has the command line syntax of tcpdump but the power of SystemTap.

In this article I am going to cover some basic and cool examples for sysdig. For a more complete list you can look over the sysdig wiki. However, it seems that even the official sysdig documentation is only scratching the surface of what can be done with sysdig.


In this article we will be installing sysdig on Ubuntu using apt-get. If you are running an rpm based distribution you can find details on installing via yum on sysdig's wiki.

Setting up the apt repository

To install sysdig via apt we will need to set up the apt repository maintained by Draios, the company behind sysdig. We can do this by running the following curl commands.

# curl -s | apt-key add -  
# curl -s -o /etc/apt/sources.list.d/draios.list

The first command above will download the Draios gpg key and add it to apt's key repository. The second will download an apt sources file from Draios and place it into the /etc/apt/sources.list.d/ directory.

Update apt's indexes

Once the sources list and gpg key are installed we will need to re-sync apt's package indexes. This can be done by running apt-get update.

# apt-get update

Kernel headers package

The sysdig utility requires the kernel headers package. Before installing sysdig we will need to validate that the kernel headers package is installed.

Check if kernel headers is installed

The system that I am using for this example already had the kernel headers package installed. To validate whether it is installed on your system you can use the dpkg command.

    # dpkg --list | grep header
    ii  linux-generic                                  amd64        Complete Generic Linux kernel and headers
    ii  linux-headers-3.11.0-12             3.11.0-12.19                     all          Header files related to Linux kernel version 3.11.0
    ii  linux-headers-3.11.0-12-generic     3.11.0-12.19                     amd64        Linux kernel headers for version 3.11.0 on 64 bit x86 SMP
    ii  linux-headers-generic                          amd64        Generic Linux kernel headers

It is important to note that the kernel headers package must be for the specific kernel version your system is running. In the output above you can see the linux-generic package is version and the headers package is for If you have multiple kernels installed you can validate which version your system is running with the uname command.

# uname -r

Installing the kernel headers package

To install the headers package for this specific kernel you can use apt-get. Keep in mind, you must specify the kernel version listed from uname -r.

# apt-get install linux-headers-<kernel version>


# apt-get install linux-headers-3.11.0-12-generic

Installing sysdig

Now that the apt repository is set up and we have the required dependencies, we can install the sysdig command.

# apt-get install sysdig

Using sysdig

Basic Usage

The syntax for sysdig is similar to tcpdump, in particular the saving and reading of trace files. All of sysdig's output can be saved to a file and read later, just like tcpdump. This is useful if you are running a process or experiencing an issue and want to dig through the information later.

Writing trace files

To write a file we can use the -w flag with sysdig and specify the file name.


# sysdig -w <output file>


# sysdig -w tracefile.dump

Like tcpdump the sysdig command can be stopped with CTRL+C.

Reading trace files

Once you have written the trace file you will need to use sysdig to read the file. This can be accomplished with the -r flag.


# sysdig -r <output file>


    # sysdig -r tracefile.dump
    1 23:44:57.964150879 0 <NA> (7) > switch next=6200(sysdig) 
    2 23:44:57.966700100 0 rsyslogd (358) < read res=414 data=<6>[ 3785.473354] sysdig_probe: starting capture.<6>[ 3785.473523] sysdig_probe: 
    3 23:44:57.966707800 0 rsyslogd (358) > gettimeofday 
    4 23:44:57.966708216 0 rsyslogd (358) < gettimeofday 
    5 23:44:57.966717424 0 rsyslogd (358) > futex addr=13892708 op=133(FUTEX_PRIVATE_FLAG|FUTEX_WAKE_OP) val=1 
    6 23:44:57.966721656 0 rsyslogd (358) < futex res=1 
    7 23:44:57.966724081 0 rsyslogd (358) > gettimeofday 
    8 23:44:57.966724305 0 rsyslogd (358) < gettimeofday 
    9 23:44:57.966726254 0 rsyslogd (358) > gettimeofday 
    10 23:44:57.966726456 0 rsyslogd (358) < gettimeofday

Output in ASCII

By default sysdig saves the files in binary; however, you can use the -A flag to have sysdig output in ASCII.


# sysdig -A


# sysdig -A > /var/tmp/out.txt
# cat /var/tmp/out.txt
1 22:26:15.076829633 0 <NA> (7) > switch next=11920(sysdig)

The above example will redirect the output to a file in plain text. This can be helpful if you want to save and review the data on a system that doesn't have sysdig installed.
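If you do end up reviewing a plain text dump on a machine without sysdig, a small script can stand in for basic filtering. The following Python sketch parses lines in the layout shown in the examples above (event number, time, CPU, process name, thread ID in parentheses, direction, event type, arguments) and keeps only events for one process name. The regular expression and the helper names are my own assumptions based on that sample output, and it assumes process names contain no spaces.

```python
import re
from collections import namedtuple

Event = namedtuple("Event", "num time cpu proc tid direction type args")

# Layout assumed from the sample output in this article.
LINE_RE = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(\d+)\s+(\S+)\s+\((\d+)\)\s+([<>])\s+(\S+)\s*(.*)$"
)


def parse_line(line):
    """Return an Event for one text line, or None if it doesn't match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    num, when, cpu, proc, tid, direction, etype, args = m.groups()
    return Event(int(num), when, int(cpu), proc, int(tid),
                 direction, etype, args.strip())


def events_for(lines, proc):
    """Yield only the events that belong to the named process."""
    for line in lines:
        event = parse_line(line)
        if event is not None and event.proc == proc:
            yield event


sample = [
    "1 23:44:57.964150879 0 <NA> (7) > switch next=6200(sysdig) ",
    "3 23:44:57.966707800 0 rsyslogd (358) > gettimeofday ",
    "6 23:44:57.966721656 0 rsyslogd (358) < futex res=1 ",
]
for event in events_for(sample, "rsyslogd"):
    print(event.num, event.direction, event.type, event.args)
```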

sysdig filters

Much like tcpdump the sysdig command has filters that allow you to filter the output to specific information. You can find a list of available filters by running sysdig with the -l flag.


    # sysdig -l

    Field Class: fd

    fd.num            the unique number identifying the file descriptor.
    fd.type           type of FD. Can be 'file', 'ipv4', 'ipv6', 'unix', 'pipe', 'e
                      vent', 'signalfd', 'eventpoll', 'inotify' or 'signalfd'.
    fd.typechar       type of FD as a single character. Can be 'f' for file, 4 for 
                      IPv4 socket, 6 for IPv6 socket, 'u' for unix socket, p for pi
                      pe, 'e' for eventfd, 's' for signalfd, 'l' for eventpoll, 'i'
                       for inotify, 'o' for uknown.           FD full name. If the fd is a file, this field contains the fu
                      ll path. If the FD is a socket, this field contain the connec
                      tion tuple.
<truncated output>

Filter examples

Capturing a specific process

You can use the "" filter to capture all of the sysdig events for a specific process. In the example below I am filtering on any process named sshd.


    # sysdig -r tracefile.dump
    530 23:45:02.804469114 0 sshd (917) < select res=1 
    531 23:45:02.804476093 0 sshd (917) > rt_sigprocmask 
    532 23:45:02.804478942 0 sshd (917) < rt_sigprocmask 
    533 23:45:02.804479542 0 sshd (917) > rt_sigprocmask 
    534 23:45:02.804479767 0 sshd (917) < rt_sigprocmask 
    535 23:45:02.804487255 0 sshd (917) > read fd=3(<4t>> size=16384

Capturing all processes that open a specific file

The filter is used to filter events for a specific file name. This can be useful to see what processes are reading or writing a specific file or socket.


# sysdig
14 11:13:30.982445884 0 rsyslogd (357) < read res=414 data=<6>[  582.136312] sysdig_probe: starting capture.<6>[  582.136472] sysdig_probe:

Capturing all processes that open a specific filesystem

You can also use comparison operators with filters such as contains, =, !=, <=, >=, < and >.


    # sysdig contains /etc
    8675 11:16:18.424407754 0 apache2 (1287) < open fd=13(<f>/etc/apache2/.htpasswd) name=/etc/apache2/.htpasswd flags=1(O_RDONLY) mode=0 
    8678 11:16:18.424422599 0 apache2 (1287) > fstat fd=13(<f>/etc/apache2/.htpasswd) 
    8679 11:16:18.424423601 0 apache2 (1287) < fstat res=0 
    8680 11:16:18.424427497 0 apache2 (1287) > read fd=13(<f>/etc/apache2/.htpasswd) size=4096 
    8683 11:16:18.424606422 0 apache2 (1287) < read res=44 data=admin:$apr1$OXXed8Rc$rbXNhN/VqLCP.ojKu1aUN1. 
    8684 11:16:18.424623679 0 apache2 (1287) > close fd=13(<f>/etc/apache2/.htpasswd) 
    8685 11:16:18.424625424 0 apache2 (1287) < close res=0 
    9702 11:16:21.285934861 0 apache2 (1287) < open fd=13(<f>/etc/apache2/.htpasswd) name=/etc/apache2/.htpasswd flags=1(O_RDONLY) mode=0 
    9703 11:16:21.285936317 0 apache2 (1287) > fstat fd=13(<f>/etc/apache2/.htpasswd) 
    9704 11:16:21.285937024 0 apache2 (1287) < fstat res=0

As you can see from the above examples, filters can be used both when reading from a file and on the live event stream.


Earlier I compared sysdig to SystemTap, and chisels are why I made that comparison. Tools like SystemTap have their own tool-specific scripting language that allows you to extend their functionality. In sysdig these extensions are called chisels, and they are written in Lua, which is a common programming language. I personally think the choice to use Lua was a good one, as it makes extending sysdig easy for newcomers.

List available chisels

To list the available chisels you can use the -cl flag with sysdig.


    # sysdig -cl

    Category: CPU Usage
    topprocs_cpu    Top processes by CPU usage

    Category: I/O
    echo_fds        Print the data read and written by processes.
    fdbytes_by      I/O bytes, aggregated by an arbitrary filter field
    fdcount_by      FD count, aggregated by an arbitrary filter field
    iobytes         Sum of I/O bytes on any type of FD
    iobytes_file    Sum of file I/O bytes
    stderr          Print stderr of processes
    stdin           Print stdin of processes
    stdout          Print stdout of processes
    <truncated output>

The list is fairly long even though sysdig is still pretty new, and since sysdig is on GitHub you can easily contribute and extend sysdig with your own chisels.

Display chisel information

While the list command gives a small description of the chisels you can display more information using the -i flag with the chisel name.


    # sysdig -i bottlenecks

    Category: Performance
    bottlenecks     Slowest system calls

    Use the -i flag to get detailed information about a specific chisel

    Lists the 10 system calls that took the longest to return dur
    ing the capture interval.


Running a chisel

To run a chisel you can run sysdig with the -c flag and specify the chisel name.


    # sysdig -c topprocs_net
    Bytes     Process
    296B      sshd

Running a chisel with filters

Even with chisels you can still use filters to run chisels against specific events.

Capturing all network traffic from a specific process

The below example shows using the echo_fds chisel against the processes named apache2.

# sysdig -A -c echo_fds
------ Read 444B from>

GET /wp-admin/install.php HTTP/1.1
Connection: keep-alive
Cache-Control: max-age=0
Authorization: Basic YWRtaW46ZUNCM3lyZmRRcg==
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.152 Safari/537.36
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8

Capturing network traffic exchanged between a specific ip

We can also use the echo_fds chisel to show all network traffic for a single IP using the fd.cip filter.

# sysdig -A -c echo_fds fd.cip=
------ Write 1.92KB to>

HTTP/1.1 200 OK
Date: Thu, 17 Apr 2014 03:11:33 GMT
Server: Apache
X-Powered-By: PHP/5.5.3-1ubuntu2.3
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 1698
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=utf-8

Originally Posted on Go To Article

by Benjamin Cane at April 18, 2014 09:30 AM

Chris Siebenmann

What modern filesystems need from volume management

One of the things said about modern filesystems like btrfs and ZFS is that their volume management functionality is a layering violation; this view holds that filesystems should stick to filesystem stuff and volume managers should stick to that. For the moment let's not open that can of worms and just talk about what (theoretical) modern filesystems need from an underlying volume management layer.

Arguably the crucial defining aspect of modern filesystems like ZFS and btrfs is a focus on resilience against disk problems. A modern filesystem no longer trusts disks not to have silent errors; instead it checksums everything so that it can at least detect data faults and it often tries to create some internal resilience by duplicating metadata or at least spreading it around (copy on write is also common, partly because it gives resilience a boost).

In order to make checksums useful for healing data instead of just simply detecting when it's been corrupted, a modern filesystem needs an additional operation from any underlying volume management layer. Since the filesystem can actually identify the correct block from a number of copies, it needs to be able to get all copies or variations of a set of data blocks from the underlying volume manager (and then be able to tell the volume manager which is the correct copy). In mirroring this is straightforward; in RAID 5 and RAID 6 it gets a little more complex. This 'all variants' operation will be used both during regular reads if a corrupt block is detected and during a full verification check where the filesystem will deliberately read every copy to check that they're all intact.

(I'm not sure what the right primitive operation here should be for RAID 5 and RAID 6. On RAID 5 you basically need the ability to try all possible reconstructions of a stripe in order to see which one generates the correct block checksum. Things get even more convoluted if the filesystem level block that you're checksumming spans multiple stripes.)
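To make the RAID 5 case concrete, here is a toy Python sketch of the 'try all possible reconstructions' idea. It assumes a single XOR parity chunk per stripe, a filesystem block that spans exactly one stripe, and SHA-256 as the checksum; all of those, and the function names, are illustrative choices of mine rather than how any real volume manager or filesystem does it.

```python
import hashlib
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def checksum(data):
    return hashlib.sha256(data).digest()


def recover_stripe(data_chunks, parity, expected):
    """Return (data, suspect_index) for the first variant matching expected.

    suspect_index is None if the data as read is already correct.
    Returns None if no single-chunk reconstruction matches.
    """
    chunks = list(data_chunks)
    if checksum(b"".join(chunks)) == expected:
        return b"".join(chunks), None
    for bad in range(len(chunks)):
        # Assume chunk 'bad' is corrupt and rebuild it from everything else.
        others = [c for i, c in enumerate(chunks) if i != bad]
        rebuilt = xor_blocks(others + [parity])
        candidate = b"".join(chunks[:bad] + [rebuilt] + chunks[bad + 1:])
        if checksum(candidate) == expected:
            return candidate, bad
    return None


good = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(good)
expected = checksum(b"".join(good))

damaged = [good[0], b"BxBB", good[2]]  # silent corruption on one device
result = recover_stripe(damaged, parity, expected)
print(result)
```

In this toy case the checksum singles out the second chunk as the corrupt one, which is the information the filesystem would then want to pass back down to the volume manager.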

Modern filesystems generally also want some way of saying 'put A and B on different devices or redundancy clusters' in situations where they're dealing with stripes of things. This enables them to create multiple copies of (important) metadata on different devices for even more protection against read errors. This is not as crucial if the volume manager is already providing redundancy.

This level of volume manager support is a minimum level, as it still leaves a modern filesystem with the RAID-5+ rewrite hole and a potentially inefficient resynchronization process. But it gets you the really important stuff, namely redundancy that will actually help you against disk corruption.

by cks at April 18, 2014 06:18 AM

April 17, 2014

Racker Hacker

DevOps and enterprise inertia

As I wait in the airport to fly back home from this year’s Red Hat Summit, I’m thinking back over the many conversations I had over breakfast, over lunch, and during the events. One common theme that kept cropping up was around bringing DevOps to the enterprise. I stumbled upon Mathias Meyer’s post, The Developer is Dead, Long Live the Developer, and I was inspired to write my own.

Before I go any further, here’s my definition of DevOps: it’s a mindset shift where everyone is responsible for the success of the customer experience. The success (and failure) of the project rests on everyone involved. If it goes well, everyone celebrates and looks for ways to highlight what worked well. If it fails, everyone gets involved to bring it back on track. Doing this correctly means that your usage of “us” and “them” should decrease sharply.

The issue at hand

One of the conference attendees told me that he and his technical colleagues are curious about trying DevOps but their organization isn’t set up in a way to make it work. On top of that, very few members of the teams knew about the concept of continuous delivery and only one or two people knew about tools that are commonly used to practice it.

I dug deeper and discovered that they have outages just like any other company and they treat outages as an operations problem primarily.  Operations teams don’t get much sleep and they get frustrated with poorly written code that is difficult to deploy, upgrade, and maintain.  Feedback loops with the development teams are relatively non-existent since the development teams report into a different portion of the business.  His manager knows that something needs to change but his manager wasn’t sure how to change it.

His company certainly isn’t unique.  My advice for him was to start a three step process:

Step 1: Start a conversation around responsibility.

Leaders need to understand that the customer experience is key and that experience depends on much more than just uptime. This applies to products and systems that support internal users within your company and those that support your external customers.

Imagine if you called for pizza delivery and received a pizza without any cheese. You drive back to the pizza place to show the manager the partial pizza you received. The manager turns to the employees and they point to the person assigned to putting toppings on the pizza. They might say: “It’s his fault, I did my part and put it in the oven.” The delivery driver might say: “Hey, I did what I was supposed to and I delivered the pizza. It’s not my fault.”

All this time, you, the customer, are stuck holding a half made pizza. Your experience is awful.

Looking back, the person who put the pizza in the oven should have asked why it was only partially made. The delivery driver should have asked about it when it was going into the box. Most important of all, the manager should have turned to the employees and put the responsibility on all of them to make it right.

Step 2: Foster collaboration via cross-training.

Once responsibility is shared, everyone within the group needs some knowledge of what other members of the group do. This is most obvious with developers and operations teams. Operations teams need to understand what the applications do and where their weak points are. Developers need to understand resource constraints and how to deploy their software. They don’t need to become experts but they need to know enough overlapping knowledge to build a strong, healthy feedback loop.

This cross-training must include product managers, project managers, and leaders. Feedback loops between these groups will only be successful if they can speak some of the language of the other groups.

Step 3: Don’t force tooling.

Use the tools that make the most sense to the groups that need to use them. Just because a particular software tool helps another company collaborate or deploy software more reliably doesn’t mean it will have a positive impact on your company.

Watch out for the “sunk cost” fallacy as well. Neal Ford talked about this during a talk at the Red Hat Summit and how it can really stunt the growth of a high performing team.


The big takeaway from this post is that making the mindset shift is the first and most critical step if you want to use the DevOps model in a large organization. The first results you’ll see will be in morale and camaraderie. That builds momentum faster than anything else and will carry teams into the idea of shared responsibility and ownership.

DevOps and enterprise inertia is a post from: Major Hayden's blog.

by Major Hayden at April 17, 2014 05:46 PM

Chris Siebenmann

Partly getting around NFS's concurrent write problem

In a comment on my entry about NFS's problem with concurrent writes, a commentator asked this very good question:

So if A writes a file to an NFS directory and B needs to read it "immediately" as the file appears, is the only workaround to use low values of actimeo? Or should A and B be communicating directly with some simple mechanism instead of setting, say, actimeo=1?

(Let's assume that we've got 'close to open' consistency to start with, where A fully writes the file before B processes it.)

If I was faced with this problem and I had a free hand with A and B, I would make A create the file with some non-repeating name and then send an explicit message to B with 'look at file <X>' (using eg a TCP connection between the two). A should probably fsync() the file before it sends this message to make sure that the file's on the server. The goal of this approach is to avoid B's kernel having any cached information about whether or not file <X> might exist (or what the contents of the directory are). With no cached information, B's kernel must go ask the NFS fileserver and thus get accurate information back. I'd want to test this with my actual NFS server and client just to be sure (actual NFS implementations can be endlessly crazy) but I'd expect it to work reliably.

Note that it's important to not reuse filenames. If A ever reuses a filename, B's kernel may have stale information about the old version of the file cached; at the best this will get B a stale filehandle error and at the worst B will read old information from the old version of the file.
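
A minimal sketch of this scheme in Python, assuming A and B already share a connected socket and both see the same NFS directory. The "msg-" naming scheme and the function names publish_file and receive_and_read are hypothetical, and this has not been tested against a real NFS server.

```python
import os
import uuid


def publish_file(directory, data, conn):
    """A's side: write under a never-reused name, fsync, then notify B."""
    name = "msg-" + uuid.uuid4().hex  # random name, so never reused
    path = os.path.join(directory, name)
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # get the data to the server before telling B
    conn.sendall(name.encode("ascii") + b"\n")
    return name


def receive_and_read(directory, conn):
    """B's side: wait for a name, then open that file for the first time."""
    name = conn.makefile("rb").readline().strip().decode("ascii")
    with open(os.path.join(directory, name), "rb") as f:
        return f.read()
```

Because B has never heard of the name before the message arrives, its kernel has nothing cached about it and has to ask the fileserver.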

If you can't communicate between A and B directly and B operates by scanning the directory to look for new files, you have a moderate caching problem. B's kernel will normally cache information about the contents of the directory for a while and this caching can delay B noticing that there is a new file in the directory. Your only option is to force B's kernel to cache as little as possible. Note that if B is scanning it will presumably only be scanning, say, once a second and so there's always going to be at least a little processing lag (and this processing lag would happen even if A and B were on the same machine); if you really want immediately, you need A to explicitly poke B in some way no matter what.
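
For the scanning case, B's loop might look something like the following sketch. The names new_files and watch are hypothetical, and it assumes A only makes complete files visible. Note that nothing in this code affects kernel caching; how stale the directory listing can be is still governed by mount options such as the actimeo setting from the question.

```python
import os
import time


def new_files(directory, seen):
    """Return names not seen on earlier scans (sorted), and remember them."""
    fresh = sorted(set(os.listdir(directory)) - seen)
    seen.update(fresh)
    return fresh


def watch(directory, handle, interval=1.0):
    """Scan forever, calling handle(path) once per newly appeared file."""
    seen = set()
    while True:
        for name in new_files(directory, seen):
            handle(os.path.join(directory, name))
        time.sleep(interval)  # this is the built-in processing lag
```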

(I don't think it matters what A's kernel caches about the directory, unless there's communication that runs the other way such as B removing files when it's done with them and A needing to know about this.)

Disclaimer: this is partly theoretical because I've never been trapped in this situation myself. The closest I've come is safely updating files that are read over NFS. See also.

by cks at April 17, 2014 04:11 AM

RISKS Digest

April 16, 2014

The Lone Sysadmin

The Eternal Wait For Vendor Software Updates

There’s been a fair amount of commentary & impatience from IT staff as we wait for vendors to patch their products for the OpenSSL Heartbleed vulnerability. Why don’t they hurry up? They’ve had 10 days now, what’s taking so long? How big of a deal is it to change a few libraries?

Perhaps, to understand this, we need to consider how software development works.

The Software Development Life Cycle

Software Development Life Cycle Image courtesy of the Wikimedia Commons.

To understand why vendors take a while to do their thing we need to understand how they work. In short, there are a few different phases they work through when designing a new system or responding to bug reports.

Requirement Analysis is where someone figures out precisely what the customer wants and what the constraints are, like budget. It’s a lot of back & forth between stakeholders, end users, and the project staff. In the case of a bug report, like “OMFG OPENSSL LEAKING DATA INTERNET HOLY CRAP” the requirements are often fairly clear. Bugs aren’t always clear, though, which is why you sometimes get a lot of questions from support guys.

Design is where the technical details of implementation show up. The project team takes the customer requirements and turns them into a technical design. In the case of a bug the team figures out how to fix the problem without breaking other stuff. That’s sometimes a real art. Read bugs filed against the kernel in Red Hat’s Bugzilla if you want to see guys try very hard to fix problems without breaking other things.

Implementation is where someone sits down and codes whatever was designed, or implements the agreed-upon fix.

The testing phase can be a variety of things. For new code it’s often full system testing, integration testing, and end-user acceptance testing. But if this is a bug, the testing is often Quality Assurance. Basically a QA team is trying to make sure that whoever coded a fix didn’t introduce more problems along the way. If they find a problem, called a regression, they work with the Engineering team to get it resolved before it ships.

Evolution is basically just deploying what was built. For software vendors there’s a release cycle, and then the process starts again.

So what? Why can’t they just fix the OpenSSL problem?

Git Branching Model Image borrowed from Maescool’s Git Branching Model Tutorial.

The problem is that in an organization with a lot of coders, a sudden need for an unplanned release really messes with a lot of things, short-circuiting the requirements, design, and implementation phases and wreaking havoc in testing.

Using this fine graphic I’ve borrowed from a Git developer we can get an idea of how this happens. In this case there’s a “master” branch of the code that customer releases are done from. Feeding that, there’s a branch called “release” that is likely owned by the QA guys. When the developers think they’re ready for a release they merge “develop” up into “release” and QA tests it. If it is good it moves on to “master.”

Developers who are adding features and fixing bugs create their own branches (“feature/xxx” etc.) where they can work, and then merge into “develop.” At each level there are usually senior coders and project managers acting as gatekeepers, doing review and managing the flow of updates. On big code bases there are sometimes hundreds of branches open at any given time.

So now imagine that you’re a company like VMware, and you’ve just done a big software release, like VMware vSphere 5.5 Update 1, that has huge new functionality in it (VSAN).[0] There’s a lot of coding activity against your code base because you’re fixing new bugs that are coming in. You’re probably also adding features, and you’re doing all this against multiple major versions of the product. You might have had a plan for a maintenance release in a couple of months, but suddenly this OpenSSL thing pops up. It’s such a basic system library that it affects everything, so everybody will need to get involved at some level.

On top of that, the QA team is in hell because it isn’t just the OpenSSL fix that needs testing. A ton of other stuff was checked in, and is in the queue to be released. But all that needs testing, first. And if they find a regression they might not even be able to jettison the problem code, because it’ll be intertwined with other code in the version control system. So they need to sort it out, and test more, and sort more out, and test again, until it works like it should. The best way out is through, but the particular OpenSSL fix can’t get released until everything else is ready.

This all takes time, to communicate and resolve problems and coordinate hundreds of people. We need to give them that time. While the problem is urgent, we don’t really want software developers doing poor work because they’re burnt out. We also don’t want QA to miss steps or burn out, either, because this is code that we need to work in our production environments. Everybody is going to run this code, because they have to. If something is wrong it’ll create a nightmare for customers and support, bad publicity, and ill will.

So let’s not complain about the pace of vendor-supplied software updates appearing, without at least recognizing our hypocrisy. Let’s encourage them to fix the problem correctly, doing solid QA and remediation so the problem doesn’t get worse. Cut them some slack for a few more days while we remember that this is why we have mitigating controls, and defense-in-depth. Because sometimes one of the controls fails, for an uncomfortably long time, and it’s completely out of our control.


[0] This is 100% speculative; while I have experience with development teams, I have no insight into VMware or IBM or any of the other companies I’m waiting for patches from.

This post was written by Bob Plankers for The Lone Sysadmin - Virtualization, System Administration, and Technology. Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License and copyrighted © 2005-2013. All rights reserved.

by Bob Plankers at April 16, 2014 06:00 PM

Chris Siebenmann

Where I feel that btrfs went wrong

I recently finished reading this LWN series on btrfs, which was the most in-depth exposure to the details of using btrfs that I've had so far. While I'm sure that LWN intended the series to make people enthused about btrfs, I came away with a rather different reaction; I've wound up feeling that btrfs has made a significant misstep along its way that's resulted in a number of design mistakes. To explain why I feel this way I need to contrast it with ZFS.

Btrfs and ZFS are each both volume managers and filesystems merged together. One of the fundamental interface differences between them is that ZFS has decided that it is a volume manager first and a filesystem second, while btrfs has decided that it is a filesystem first and a volume manager second. This is what I see as btrfs's core mistake.

(Overall I've been left with the strong impression that btrfs basically considers volume management to be icky and tries to have as little to do with it as possible. If correct, this is a terrible mistake.)

Since it's a volume manager first, ZFS places volume management front and center in operation. Before you do anything ZFS-related, you need to create a ZFS volume (which ZFS calls a pool); only once this is done do you really start dealing with ZFS filesystems. ZFS even puts the two jobs in two different commands (zpool for pool management, zfs for filesystem management). Because it's firmly made this split, ZFS is free to have filesystem level things such as df present a logical, filesystem based view of things like free space and device usage. If you want the actual physical details you go to the volume management commands.

Because btrfs puts the filesystem first it wedges volume creation in as a side effect of filesystem creation, not a separate activity, and then it carries a series of lies and uselessly physical details through to filesystem level operations like df. Consider the discussion of what df shows for a RAID1 btrfs filesystem here, which has both a lie (that the filesystem uses only a single physical device) and a needlessly physical view (of the physical block usage and space free on a RAID 1 mirror pair). That btrfs refuses to expose itself as a first class volume manager and pretends that you're dealing with real devices forces it into utterly awkward things like mounting a multi-device btrfs filesystem with 'mount /dev/adevice /mnt'.

I think that this also leads to the asinine design decision that subvolumes have magic flat numeric IDs instead of useful names. Something that's willing to admit it's a volume manager, such as LVM or ZFS, has a name for the volume and can then hang sub-names off that name in a sensible way, even if where those sub-objects appear in the filesystem hierarchy (and under what names) gets shuffled around. But btrfs has no name for the volume to start with and there you go (the filesystem-volume has a mount point, but that's a different thing).

All of this really matters for how easily you can manage and keep track of things. df on ZFS filesystems does not lie to me; it tells me where the filesystem comes from (what pool and what object path within the pool), how much logical space the filesystem is using (more or less), and roughly how much more I can write to it. Since they have full names, ZFS objects such as snapshots can be more or less self documenting if you name them well. With an object hierarchy, ZFS has a natural way to inherit various things from parent object to sub-objects. And so on.

Btrfs's 'I am not a volume manager' approach also leads it to drastically limit the physical shape of a btrfs RAID array in a way that is actually painfully limiting. In ZFS, a pool stripes its data over a number of vdevs and each vdev can be any RAID type with any number of devices. Because ZFS allows multi-way mirrors this creates a straightforward way to create a three-way or four-way RAID 10 array; you just make all of the vdevs be three or four way mirrors. You can also change the mirror count on the fly, which is handy for all sorts of operations. In btrfs, the shape 'raid10' is a top level property of the overall btrfs 'filesystem' and, well, that's all you get. There is no easy place to put in multi-way mirroring; because of btrfs's model of not being a volume manager it would require changes in any number of places.

(And while I'm here, that btrfs requires you to specify both your data and your metadata RAID levels is crazy and gives people a great way to accidentally blow their own foot off.)

As a side note, I believe that btrfs's lack of allocation guarantees in a raid10 setup makes it impossible to create a btrfs filesystem split evenly across two controllers that is guaranteed to survive the loss of one entire controller. In ZFS this is trivial because of the explicit structure of vdevs in the pool.
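
To illustrate why the explicit vdev structure makes this easy to reason about, here is a toy model in Python. It is only a sketch: a pool is modelled as a list of mirror vdevs, each a list of (controller, disk) pairs, and the controller and disk names are made up. It does not call or imitate any real ZFS interface.

```python
def survives_controller_loss(vdevs, lost_controller):
    """True if every mirror vdev keeps at least one disk on another controller."""
    return all(
        any(controller != lost_controller for controller, _disk in mirror)
        for mirror in vdevs
    )


# Each mirror pairs one disk on controller c0 with one on controller c1.
split_pool = [
    [("c0", "disk0"), ("c1", "disk0")],
    [("c0", "disk1"), ("c1", "disk1")],
]

# Same disks, but each mirror lives entirely on one controller.
unsplit_pool = [
    [("c0", "disk0"), ("c0", "disk1")],
    [("c1", "disk0"), ("c1", "disk1")],
]
```

With split_pool, losing either controller leaves every mirror with a surviving side. With unsplit_pool, losing one controller takes out a whole vdev and so the pool. The guarantee comes from being able to say exactly which devices form each mirror.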

PS: ZFS is too permissive in how you can assemble vdevs, because there is almost no point of a pool with, say, a mirror vdev plus a RAID-6 vdev. That configuration is all but guaranteed to be a mistake in some way.

by cks at April 16, 2014 05:28 AM

April 15, 2014

The Tech Teapot

Stack Overflow Driven Development

The rise of Stack Overflow has certainly changed how many programmers go about their trade.

I have recently been learning some new client side web skills because I need them for a new project. I have noticed that the way I go about learning is quite different from the way I used to learn pre-web.

I used to have a standard technique. I’d go through back issues of magazines I’d bought (I used to have hundreds of back issues) and read any articles related to the new technology. Then I’d purchase a book about the topic, read it and start a simple starter project. Whilst doing the starter project, I’d likely pick up a couple of extra books and skim them to find techniques I needed for the project. This method worked pretty well; I’d be working idiomatically, without a manual, in anywhere from a month to three months.

Using the old method, if I got stuck on something, I’d have to figure it out on my own. I remember it took three days to get a simple window to display when I was learning Windows programming in 1991. Without the internet, there was nobody you could ask when you got stuck. If you didn’t own the reference materials you needed, then you were stuck.

Fast forward twenty years and things are rather different. For starters, I don’t have a bunch of magazines sitting around. I don’t even read tech magazines any more, either in print or digitally. None of my favourite magazines survived the transition to digital.

Now when I want to learn a new tech, I head to Wikipedia first to get a basic idea. Then I start trawling google for simple tutorials. I then read one of the new generation of short introductory books on my Kindle.

I then start my project safe in the knowledge that google will always be there. And, of course, google returns an awful lot of Stack Overflow pages. Whilst I would have felt very uncomfortable starting a project without a full grasp of a technology twenty years ago, now I think it would be odd not to. The main purpose of the initial reading is to get a basic understanding of the technology and, most importantly, the vocabulary. You can’t search properly if you don’t know what to search for.

Using my new approach, I’ve cut my learning time from one to three months down to one to three weeks.

The main downside to my approach is that, at the beginning at least, I may not write idiomatic code. But, whilst that is a problem, software is very malleable and you can always re-write parts later on if the project is a success. The biggest challenge now seems to be getting to the point when you know a project has legs as quickly as possible. Fully understanding a tech before starting a project just delays the start and I doubt you’ll get that time back later in increased productivity.

Of course, by far the quickest approach is to use a tech stack you already know. Unfortunately, in my case that wasn’t possible because I don’t know a suitable client side tech. It is a testament to the designers of Angular.js, SignalR and NancyFX that I have found it pretty easy to get started. I wish everything was so well designed.

The post Stack Overflow Driven Development appeared first on Openxtra Tech Teapot.

by Jack Hughes at April 15, 2014 12:53 PM

Chris Siebenmann

Chasing SSL certificate chains to build a chain file

Suppose that you have some shiny new SSL certificates for some reason. These new certificates need a chain of intermediate certificates in order to work with everything, but for some reason you don't have the right set. In ideal circumstances you'll be able to easily find the right intermediate certificates on your SSL CA's website and won't need the rest of this entry.

Okay, let's assume that your SSL CA's website is an unhelpful swamp pit. Fortunately all is not lost, because these days at least some SSL certificates come with the information needed to find the intermediate certificates. First we need to dump out our certificate, following my OpenSSL basics:

openssl x509 -text -noout -in WHAT.crt

This will print out a bunch of information. If you're in luck (or possibly always), down at the bottom there will be an 'Authority Information Access' section with a 'CA Issuers - URI' bit. That is the URL of the next certificate up the chain, so we fetch it:

wget <SOME-URL>.crt

(In case it's not obvious: for this purpose you don't have to worry if this URL is being fetched over HTTP instead of HTTPS. Either your certificate is signed by this public key or it isn't.)

Generally or perhaps always this will not be a plain text file like your certificate is, but instead a binary blob. The plain text format is called PEM; your fetched binary blob of a certificate is probably in the binary DER encoding. To convert from DER to PEM we do:

openssl x509 -inform DER -in <WGOT-FILE>.crt -outform PEM -out intermediate-01.crt

Now you can inspect intermediate-01.crt in the same way to see if it needs a further intermediate certificate; if it does, iterate this process. When you have a suitable collection of PEM format intermediate certificates, simply concatenate them together in order (from the first you fetched to the last, per here) to create your chain file.
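
The DER to PEM conversion and the final concatenation can also be sketched in Python using the standard ssl module, for cases where scripting it is more convenient than running openssl by hand. The function names der_to_pem and build_chain are hypothetical. Note that ssl.DER_cert_to_PEM_cert only re-encodes the bytes; it does not check that they are a valid certificate.

```python
import ssl


def der_to_pem(der_bytes):
    """Convert one binary DER certificate into PEM text."""
    return ssl.DER_cert_to_PEM_cert(der_bytes)


def build_chain(pem_certs):
    """Concatenate PEM certificates in the order given.

    Pass them in the order they were fetched, first intermediate first.
    """
    return "".join(
        pem if pem.endswith("\n") else pem + "\n" for pem in pem_certs
    )
```

Each fetched blob would be read in binary mode, passed through der_to_pem, and the results handed to build_chain to produce the contents of the chain file.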

PS: The Qualys SSL Server Test is a good way to see how correct your certificate chain is. If it reports that it had to download any certificates, your chain of intermediate certificates is not complete. Similarly it may report that some entries in your chain are not necessary, although in practice this rarely hurts.

Sidebar: Browsers and certificate chains

As you might guess, some but not all browsers appear to use this embedded intermediate certificate URL to automatically fetch any necessary intermediate certificates during certificate validation (as mentioned eg here). Relatedly, browsers will probably not tell you about unnecessary intermediate certificates they received from your website. The upshot of this can be an HTTPS website that works in some browsers but fails in others, and in the failing browser it may appear that you sent no additional certificates as part of a certificate chain. Always test with a tool that will tell you the low-level details.

(Doing otherwise can cause a great deal of head scratching and frustration. Don't ask how I came to know this.)

by cks at April 15, 2014 02:03 AM

April 14, 2014

Steve Kemp's Blog

Is lumail a stepping stone?

I'm pondering a rewrite of my console-based mail-client.

While it is "popular" it is not popular.

I suspect "console-based" is the killer.

I like console, and I ssh to a remote server to use it, but having different front-ends would be neat.

In the world of mailpipe, etc, is there room for a graphic console client? Possibly.

The limiting factor would be the lack of POP3/IMAP.

Reworking things such that there is a daemon to which a GUI, or a console client, could connect seems simple. The hard part would obviously be working the IPC and writing the GUI. Any toolkit selected would rule out 40% of the audience.

In other news I'm stalling on replying to emails. Irony.

April 14, 2014 11:21 PM

Everything Sysadmin

Time Management training at SpiceWorld Austin, 2014

I'll be doing a time management class at SpiceWorld.

Read about my talk and the conference at their website.

If you register, use code "LIMONCELLI20" to save 20%.

See you there!

April 14, 2014 03:00 PM

Interview with LOPSA-East Keynote: Vish Ishaya

Vish Ishaya will be giving the opening keynote at LOPSA-East this year. I caught up with him to talk about his keynote, OpenStack, and how he got his start in tech. The conference is May 2-3, 2014 in New Brunswick, NJ. If you haven't registered, do it now!

Tom Limoncelli: Tell us about your keynote. What should people expect / expect to learn?

Vish Ishaya: The keynote will be about OpenStack as well as the unique challenges of running a cloud in the datacenter. Cloud development methodologies mean different approaches to problems. These approaches bring with them a new set of concerns. By the end of the session people should understand where OpenStack came from, know why businesses are clamoring for it, and have strategies for bringing it into the datacenter effectively.

TL: How did you get started in tech?

VI: I started coding in 7th Grade, when I saw someone "doing machine language" on a computer at school (He was programming in QBasic). I started copying programs from books and I was hooked.

TL: If an attendee wanted to learn OpenStack, what's the smallest installation they can build to be able to experiment? How quickly could they go from bare metal to a working demo?

VI: The easiest way to get started experimenting with OpenStack is to run DevStack on a base Ubuntu or Fedora OS. It works on a single node and is generally running in just a few minutes.

TL: What are the early-adopters using OpenStack for? What do you see the next tier of customers using it for?

VI: OpenStack is a cloud toolkit, so the early-adopters are building clouds. These tend to be service providers and large enterprises. The next tier of customers are smaller businesses that just want access to a private cloud. These are the ones that are already solving interesting business problems using public clouds and want that same flexibility on their own infrastructure.

TL: Suppose a company had a big investment in AWS and wanted to bring it in-house and on-premise. What is the compatibility overlap between OpenStack and AWS?

VI: We've spent quite a bit of time analyzing this at Nebula, because it is a big use-case for our customers. It really depends on what features in AWS one is using. If just the basics are being used, the transition is very easy. If you're using a bunch of the more esoteric services, finding an open source analog can be tricky.

TL: OpenStack was founded by Rackspace Hosting and NASA. Does OpenStack run well in zero-G environments? Would you go into space if NASA needed an OpenStack deployment on the moon?

VI: When I was working on the Nebula project at NASA (where the OpenStack compute project gestated), everyone always asked if I had been to space. I haven't yet, but I would surely volunteer.

Thanks to Vish for taking the time to do this interview! See you at LOPSA-East!

April 14, 2014 02:41 PM

Chris Siebenmann

My reactions to Python's warnings module

A commentator on my entry on the warnings problem pointed out the existence of the warnings module as a possible solution to my issue. I've now played around with it and I don't think it fits my needs here, for two somewhat related reasons.

The first reason is that it simply makes me nervous to use or even take over the same infrastructure that Python itself uses for things like deprecation warnings. Warnings produced about Python code and warnings that my code produces are completely separate things and I don't like mingling them together, partly because they have significantly different needs.

The second reason is that the default formatting that the warnings module uses is completely wrong for the 'warnings produced from my program' case. I want my program warnings to produce standard Unix format (warning) messages and to, for example, not include the Python code snippet that generated them. Based on playing around with the warnings module briefly it's fairly clear that I would have to significantly reformat standard warnings to do what I want. At that point I'm not getting much out of the warnings module itself.

All of this is a sign of a fundamental decision in the warnings module: the warnings module is only designed to produce warnings about Python code. This core design purpose is reflected in many ways throughout the module, such as in the various sorts of filtering it offers and how you can't actually change the output format as far as I can see. I think that this makes it a bad fit for anything except that core purpose.
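
For reference, the module's documentation does describe warnings.showwarning as a function you may replace with your own callable, which is one way to get at the output. A minimal sketch of what that could look like for Unix-style messages follows; the program name "myprog" and the names unix_showwarning and install are hypothetical. It still routes program warnings through the same machinery as Python's own warnings, so it does nothing about the first objection above.

```python
import sys
import warnings

PROG = "myprog"  # hypothetical program name


def unix_showwarning(message, category, filename, lineno, file=None, line=None):
    """Print 'prog: warning: text' and drop the filename, line and snippet."""
    stream = file if file is not None else sys.stderr
    stream.write("%s: warning: %s\n" % (PROG, message))


def install():
    """Replace the module-level hook that the warnings module calls."""
    warnings.showwarning = unix_showwarning
```

After install(), a call such as warnings.warn("disk nearly full") would be written in the short format, subject to the usual warning filters (by default a given warning is only shown once per location).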

In short, if I want to log warnings I'm better off using general logging and general log filtering to control what warnings get printed. What features I want there are another entry.

by cks at April 14, 2014 05:20 AM

April 13, 2014

Security Monkey

Amazing Write-Up on BillGates Botnet - With Monitoring Tools Source!

Just stumbled upon this amazing write-up by ValdikSS on not only his discovery of the "BillGates" botnet, but of some source code he's developed that yo

April 13, 2014 09:32 PM

Chris Siebenmann

A problem: handling warnings generated at low levels in your code

Python has a well honed approach for handling errors that happen at a low level in your code; you raise a specific exception and let it bubble up through your program. There's even a pattern for adding more context as you go up through the call stack, where you catch the exception, add more context to it (through one of various ways), and then propagate the exception onwards.
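
As a concrete illustration of the catch, add context, and propagate pattern, here is a minimal sketch. It assumes Python 3 (for the "raise ... from" syntax), and the names ConfigError, parse_port and load_port are made up for the example.

```python
class ConfigError(Exception):
    """Higher-level error that carries context about where things failed."""


def parse_port(text):
    """Low-level routine: knows nothing about files, just raises ValueError."""
    return int(text)


def load_port(filename, text):
    """Mid-level routine: adds the filename, then propagates the error."""
    try:
        return parse_port(text)
    except ValueError as e:
        raise ConfigError("%s: bad port: %s" % (filename, e)) from e
```

The low-level ValueError stays attached to the new exception as its __cause__, so nothing is lost as the error moves up the stack.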

(You can also use things like phase tracking to make error messages more specific. And you may want to catch and re-raise exceptions for other reasons, such as wrapping foreign exceptions.)

All of this is great when it's an error. But what about warnings? I recently ran into a case where I wanted to 'raise' (in the abstract) a warning at a very low level in my code, and that left me completely stymied about what the best way to do it was. The disconnect between errors and warnings is that in most cases errors immediately stop further processing while warnings don't, so you can't deal with warnings by raising an exception; you need to somehow both 'raise' the warning and continue further processing.

I can think of several ways of handling this, all of which I've sort of used in code in the past:

  • Explicitly return warnings as part of the function's output. This is the most straightforward but also sprays warnings through your APIs, which can be a problem if you realize that you've found a need to add warnings to existing code.

  • Have functions accumulate warnings on some global or relatively global object (perhaps hidden through 'record a warning' function calls). Then at the end of processing, high-level code will go through the accumulated warnings and do whatever is desired with them.

  • Log the warnings immediately through a general logging system that you're using for all program messages (ranging from simple to very complex). This has the benefit that both warnings and errors will be produced in the correct order.
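The first two approaches might look something like the following sketch. Everything here (parse_size, record_warning, drain_warnings, the size-suffix rule) is made up for illustration; the third approach depends on whatever logging system you use.

```python
# Approach 1: return warnings explicitly as part of the function's output.
def parse_size(text):
    warns = []
    if text.endswith("k"):
        warns.append("lowercase 'k' suffix treated as 'K'")
        text = text[:-1] + "K"
    mult = 1024 if text.endswith("K") else 1
    digits = text.rstrip("K")
    return int(digits) * mult, warns


# Approach 2: accumulate warnings on a relatively global object, hidden
# behind a 'record a warning' function call.
_pending = []


def record_warning(msg):
    _pending.append(msg)


def drain_warnings():
    """Return and clear the accumulated warnings (for high-level code)."""
    out = list(_pending)
    del _pending[:]
    return out


def parse_size_recording(text):
    # Same low-level work, but callers only see the value.
    value, warns = parse_size(text)
    for w in warns:
        record_warning(w)
    return value
```

The first version changes the return type, so every caller has to change. The second keeps the API stable, at the cost of hidden shared state that high-level code must remember to drain.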

The second and third approaches have the problem that it's hard for intermediate layers to add context to warning messages; they'll wind up wanting or needing to pass the context down to the low level routines that generate the warnings. The third approach can have the general options problem when it comes to controlling what warnings are and aren't produced, or you can try to control this by having the high level code configure the logging system to discard some messages.

I don't have any answers here, but I can't help thinking that I'm missing a way of doing this that would make it all easy. Probably logging is the best general approach for this and I should just give in, learn a Python logging system, and use it for everything in the future.
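For what it's worth, a rough sketch of the logging approach using the standard library's logging module (Python 3 here) could look like this. The logger names and the read_entry() and run() functions are hypothetical, and the messages go to an in-memory buffer only to keep the example self-contained; real code would more likely log to sys.stderr.

```python
import io
import logging

log = logging.getLogger("myapp.lowlevel")  # hypothetical logger name


def read_entry(line):
    # Low-level code: emit the warning and carry on processing.
    if line.endswith(" "):
        log.warning("trailing whitespace in %r", line)
    return line.strip()


def run(lines, level=logging.WARNING):
    # High-level code decides where messages go and which are discarded.
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    top = logging.getLogger("myapp")
    top.addHandler(handler)
    top.setLevel(level)
    try:
        results = [read_entry(l) for l in lines]
    finally:
        top.removeHandler(handler)
    return results, stream.getvalue()
```

Calling run() with level=logging.ERROR makes the high-level code discard the warning entirely. This is the 'configure the logging system' form of control mentioned above, and it still leaves the intermediate layers with no easy way to add their own context.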

(In the incident that sparked this entry, I wound up punting and just printing out a message with sys.stderr.write() because I wasn't in a mood to significantly restructure the code just because I now wanted to emit a warning.)

by cks at April 13, 2014 06:15 AM

Pragmatic reactions to a possible SSL private key compromise

In light of the fact that the OpenSSL 'heartbleed' issue may have resulted in someone getting a copy of your private keys, there are at least three possible reactions that people and organizations can take:

  • Do an explicit certificate revocation through your SSL CA and get a new certificate, paying whatever extra certificate revocation cost the CA requires for this (some do it for free, some normally charge extra).

  • Simply get new SSL certificates from whatever certificate vendor you prefer or can deal with and switch to them. Don't bother to explicitly revoke your old keys.

  • Don't revoke or replace SSL keys at all, based on an assessment that the actual risk that your keys were compromised is very low.

These are listed in declining order of theoretical goodness and also possibly declining order of cost.

Obviously the completely cautious approach is to assume that your private keys have been compromised and also that you should explicitly revoke them so that people might be protected from an attacker trying man in the middle attacks with your old certificates and private keys (if revocation actually works this time). The pragmatic issue is that this course of action probably costs the most money (if it doesn't, well, then there's no problem). If your organization has a lot riding on the security of your SSL certificates (in terms of money or other things) then this extra expense is easy to justify, and in many places the actual cost is small or trivial compared to other budget items.

But, as they say. There are places where this is not so true, where the extra cost of certificate revocations will to some degree hurt or require a fight to get. Given that certificate revocation may not actually do much in practice, there is a real question of whether you're actually getting anything worthwhile for your money (especially since you're probably doing this as merely a precaution against potential key compromise). If certificate revocation is an almost certainly pointless expense that's going to hurt, the pragmatics push people away from paying for it and towards one of the other two alternatives.

(If you want more depressing reading on browser revocation checking, see Adam Langley (via).)

Getting new certificates is the intermediate caution option (especially if you believe that certificate revocation is ineffective in practice), since it closes off future risks that you can actually do something about yourself. But it still probably costs you some money (how much money depends on how many certificates you have or need).

Doing nothing with your SSL keys is the cheapest and easiest approach and is therefore very attractive for people on a budget, and there are a number of arguments towards a low risk assessment (or at least away from a high one). People will say that this position is obviously stupid, which is itself obviously stupid; all security is a question of risk versus cost and thus requires an assessment of both risk and cost. If people feel that the pragmatic risk is low (and at this point we do not have evidence that it isn't for a random SSL site) or cannot convince decision makers that it is not low and the cost is perceived as high, well, there you go. Regardless of what you think, the resulting decision is rational.

(Note that there is at least one Certificate Authority that offers SSL certificates for free but normally charges a not insignificant cost for revoking and reissuing certificates, which can swing the various costs involved. When certificates are free it's easy to wind up with a lot of them to either revoke or replace.)

In fact, as a late-breaking update as I write this, Neel Mehta (the person who found the bug) has said that private key exposure is unlikely, although of course unlikely is nowhere near the same thing as 'impossible'. See also Thomas Ptacek's followup comment.
Update: But see Tomas Rzepka's success report on FreeBSD for bad news.

Update April 12: It's now clear from the results of the CloudFlare challenge and other testing by people that SSL private keys can definitely be extracted from servers that are vulnerable to Heartbleed.
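As a rough first check of whether a given machine is even in the affected range, you can look at the OpenSSL version that Python itself is linked against. The affected releases are OpenSSL 1.0.1 through 1.0.1f (1.0.1g has the fix). This is only a heuristic sketch: the function name is made up, distributions often backport the fix without changing the version, and it says nothing about other OpenSSL copies on the system or whether keys were actually exposed.

```python
import ssl


def heartbleed_version_range(info):
    """True if an OpenSSL version tuple falls in 1.0.1 through 1.0.1f.

    'info' has the shape of ssl.OPENSSL_VERSION_INFO, where the fourth
    field encodes the patch letter (0 = none, 1 = 'a', ..., 6 = 'f').
    """
    major, minor, fix, patch = info[:4]
    return (major, minor, fix) == (1, 0, 1) and patch <= 6


# Report on the OpenSSL library this Python interpreter uses.
print(ssl.OPENSSL_VERSION)
print(heartbleed_version_range(ssl.OPENSSL_VERSION_INFO))
```

A False result here is reassuring only for this one library; a True result means you need to check whether your vendor's build was patched.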

My prediction is that pragmatics are going to push quite a lot of people towards at least the second option and probably the third. Sure, if revoking and reissuing certificates is free a lot of people will take advantage of it (assuming that the message reaches them, which I would not count on), but if it costs money there will be a lot of pragmatic pressure towards cheap options.

(Remember the real purpose of SSL certificates.)

Sidebar: Paths to high cost perceptions

Some people are busy saying that the cost of new SSL certificates is low (or sometimes free), so why not get new ones? There are at least three answers:

  • The use of SSL is for a hobby thing or personal project and the person involved doesn't feel like spending any more money on it than they already have or are.

  • There are a significant number of SSL certificates involved, for example for semi-internal hosts, and there's no clear justification for replacing only a few of their keys (except 'to save money', and if that's the justification you save even more money by not replacing any of them).

  • The people who must authorize the money will be called on to defend the expense in front of higher powers or to prioritize it against other costs in a fixed budget or both.

These answers can combine with each other.

by cks at April 13, 2014 12:31 AM

April 12, 2014

Geek and Artist - Tech

(something something) Big Data!

I recently wrote about how I’d historically been using Pig for some daily and some ad-hoc data analysis, and how I’d found Hive to be a much friendlier tool for my purposes. As I mentioned then, I’m not a data analyst by any stretch of the imagination, but have occasional need to use these kinds of tools to get my job done. The title of this post (while originally a placeholder for something more accurate) is a representation of the feeling I have for these topics – only a vague idea of what is going on, but I know it has to do with Big Data (proper noun).

Since writing that post, and after attempting and failing to find a simple way of introducing Hive usage at work (it’s yet another tool and set of data representations to maintain and support), I’ve also been doing a bit of reading on comparable tools, and frankly Hive only scratches the surface. There is a lot of competition among tools offering a mostly SQL-compliant interface (and this blog post from Cloudera sums up the issue very well). SQL as an interface to big data operations is desirable for the same reasons I found it useful, but it also brings performance expectations that traditional MapReduce-style jobs are not suited to, since they tend to have completion times in the tens of minutes to hours rather than seconds.

Cloudera’s Impala and a few other competitors in this problem space are attempting to address this problem by combining large-scale data processing that is traditionally MapReduce’s strong-point, with very low latencies when generating results. Just a few seconds is not unusual. I haven’t investigated any of these in-depth, but I feel as a sometimes-user of Hadoop via Pig and Hive it is just as important to be abreast of these technologies as the “power users”, so that when we do have occasion to need such data analysis, it can be done with as low a barrier to entry as possible and with the maximum performance.
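To make the appeal of a SQL interface a little more concrete, here is a toy comparison in Python. It uses the standard library's sqlite3 purely as a stand-in for a SQL front end (it is not Hive or Impala), and the sample rows and function names are made up. The same aggregation is written once as a query and once as explicit map, shuffle and reduce steps.

```python
import sqlite3
from collections import defaultdict

# Made-up sample data: (request path, HTTP status).
rows = [("/index", 200), ("/index", 500), ("/about", 200), ("/index", 200)]


def count_sql(rows):
    # Declarative: say what you want and let the engine plan it.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE hits (path TEXT, status INTEGER)")
    db.executemany("INSERT INTO hits VALUES (?, ?)", rows)
    cur = db.execute(
        "SELECT path, COUNT(*) FROM hits "
        "WHERE status = 200 GROUP BY path ORDER BY path"
    )
    result = cur.fetchall()
    db.close()
    return result


def count_mapreduce(rows):
    # map: emit (key, 1) for each row of interest
    mapped = [(path, 1) for path, status in rows if status == 200]
    # shuffle: group the emitted values by key
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    # reduce: combine each key's values into one result
    return sorted((key, sum(values)) for key, values in groups.items())
```

Both functions produce the same counts. The SQL version is the one an occasional user can write without thinking about how the work is distributed, which is the reason these tools are attractive.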


Spark

Spark is now an Apache project but originated in the AMPLab at UC Berkeley. My impression is that it is fairly similar to Apache Hadoop – its own parallel-computing cluster, with which you interact via native language APIs (in this case Java, Scala or Python). I’m sure it offers superior performance to Hadoop’s batch processing model, but unless you are already heavily integrating from these languages with Hadoop libraries it doesn’t offer a drastically different method of interaction.

On the other hand, there are already components built on top of the Spark framework which do allow this, for example Shark (also from Berkeley). Shark even offers HiveQL compatibility, so if you are already using Hive there is a clear upgrade path. I haven’t tried it, but it sounds promising, although being outside of the Cloudera distribution and not having first-class support on Amazon EMR makes it slightly harder to get at (guides are available, though).


Impala

As already suggested, Impala was the first alternative I discovered and also is incorporated in Cloudera’s CDH distribution and available on Amazon EMR, which makes it more tempting to me for use both inside and outside of EMR. It supports ANSI SQL-92 rather than HiveQL, but coming from Pig or other non-SQL tools this may not matter to you.


Presto

Developed by Facebook, Presto can either use HDFS data without any additional metadata, or work with the Hive metadata store using a plugin. For that reason I see it as somewhat closer to Impala, although, just like Shark/Spark, it lacks the wider support in MapReduce deployments like CDH and Amazon EMR.

AWS Redshift

Not really an open source tool like the others above, but deserves a mention as it really fits in the same category. If you want to just get something up and running immediately, this is probably the easiest option.


I haven’t even begun to scratch the surface of tooling available in this part of the Big Data space, and the above are only the easiest to find amongst further open source and commercial varieties. Personally I am looking forward to the next occasion I have to analyse some data where I can really pit some of these solutions against each other and find the most efficient and easy framework for my ad-hoc data analysis needs.

by oliver at April 12, 2014 08:32 PM

Chris Siebenmann

The relationship between SSH, SSL, and the Heartbleed bug

I will lead with the summary: since the Heartbleed bug is a bug in OpenSSL's implementation of a part of the TLS protocol, no version or implementation of SSH is affected by Heartbleed because the SSH protocol is not built on top of TLS.

So, there are four things involved here:

  • SSL aka TLS is the underlying network encryption protocol used for HTTPS and a bunch of other SSL/TLS things. Heartbleed is an error in implementing the 'TLS heartbeat' protocol extension to the TLS protocol. A number of other secure protocols are built partially or completely on top of TLS, such as OpenVPN.

  • SSH is the protocol used for, well, SSH connections. It's completely separate from TLS and is not layered on top of it in any way. However, TLS and SSH both use a common set of cryptography primitives such as Diffie-Hellman key exchange, AES, and SHA1.

    (Anyone sane who's designing a secure protocol reuses these primitives instead of trying to invent their own.)

  • OpenSSL is an implementation of SSL/TLS in the form of a large cryptography library. It also exports a whole bunch of functions and so on that do various cryptography primitives and other lower-level operations that are useful for things doing cryptography in general.

  • OpenSSH is one implementation of the SSH protocol. It uses various functions exported by OpenSSL for a lot of cryptography related things such as generating randomness, but it doesn't use the SSL/TLS portions of OpenSSL because SSH (the protocol) doesn't involve TLS (the protocol).

Low level flaws in OpenSSL such as Debian breaking its randomness can affect OpenSSH when OpenSSH uses something that's affected by the low level flaw. In the case of the Debian issue, OpenSSH gets its random numbers from OpenSSL and so was affected in a number of ways.

High level flaws in OpenSSL's implementation of TLS itself will never affect OpenSSH because OpenSSH simply doesn't use those bits of OpenSSL. For instance, if OpenSSL turns out to have an SSL certificate verification bug (which happened recently with other SSL implementations) it won't affect OpenSSH's SSH user and host key verification.

As a corollary, OpenSSH (and all SSH implementations) aren't directly affected by TLS protocol attacks such as BEAST or Lucky Thirteen, although people may be able to develop similar attacks against SSH using the same general principles.

by cks at April 12, 2014 03:44 AM

April 11, 2014

RISKS Digest

Everything Sysadmin

Replace Kathleen Sebelius with a sysadmin!

Scientists complain that there are only 2 scientists in Congress and how difficult they find it to explain basic science to their peers. What about system administrators? How many people in Congress or on the president's cabinet have ever had the root or administrator password to systems that other people depend on?

Health and Human Services Secretary Kathleen Sebelius announced her resignation, and the media coverage has been a mix: some claim she's leaving in disgrace after the failed ACA website launch, while others counter that she stuck it out until it was a success, which redeems her.

The truth is, folks, how many of you have launched a website and had it work perfectly the first day? Zero. Either you've never been faced with such a task, or you have and it didn't go well. Very few people can say they've launched a big site and had it be perfect the first day.

Let me quote from a draft of the new book I'm working on with Strata and Christine ("The Practice of Cloud Administration", due out this autumn):

[Some companies] declare that all outages are unacceptable and only accept perfection. Any time there is an outage, therefore, it must be someone's fault and that person, being imperfect, is fired. By repeating this process eventually the company will only employ perfect people. While this is laughable, impossible, and unrealistic it is the methodology we have observed in many organizations. Perfect people don't exist, yet organizations often adopt strategies that assume they do.

Firing someone "to prove a point" makes for exciting press coverage but terrible IT. Quoting Allspaw, "an engineer who thinks they're going to be reprimanded are disincentivized to give the details necessary to get an understanding of the mechanism, pathology, and operation of the failure. This lack of understanding of how the accident occurred all but guarantees that it will repeat. If not with the original engineer, another one in the future." (link)

HHS wasn't following the modern IT practices (DevOps) that Google, Facebook, and other companies use to have successful launches. However, most companies today aren't either. The government is slower to adopt new practices and this is one area where that bites us all.

All the problems the site had were classic "old world IT thinking" leading to cascading failures that happen in business all the time. One of the major goals of DevOps is to eliminate this kind of problem.

Could you imagine a CEO today that didn't know what accounting is? No. They might not be experts at it, but at least they know it exists and why it is important. Can you imagine a CEO that doesn't understand what DevOps is and why small batches, blameless postmortems, and continuous delivery are important? Yes... but not for long.

Obama did the right thing by not accepting her resignation until the system was up and running. It would have been disruptive and delayed the entire process. It would have also disincentivized engineers and managers to do the right thing in the future. [Yesterday I saw a quote from Obama where he basically paraphrased Allspaw's quote but I can't find it again. Links anyone?]

Healthcare is 5% "medical services" and 95% information management. Anyone in the industry can tell you that.

The next HHS Secretary needs to be a sysadmin. A DevOps-trained operations expert.

What government official has learned the most about doing IT right in the last year? Probably Sebelius. It's a shame she's leaving.

You can read about how DevOps techniques and getting rid of a lot of "old world IT thinking" saved the Obamacare website in this article at the Time Magazine website. (Login required.)

April 11, 2014 03:00 PM