Planet SysAdmin

August 01, 2015


License Agreements and Changes Are Coming

The OpenSSL license is unusual and idiosyncratic. It reflects views from when its predecessor, SSLeay, started twenty years ago. As a further complication, the original authors were hired by RSA in 1998, and the code forked into two versions: OpenSSL and RSA BSAFE SSL-C. (See Wikipedia for discussion.) I don’t want to get into any specific details, and I certainly don’t know them all.

Things have evolved since then, and open source is an important part of the landscape – the Internet could not exist without it. (There are good reasons why Microsoft is a founding member of the Core Infrastructure Initiative (CII).)

Our plan is to update the license to the Apache License version 2.0 (APLv2.0). We are in consultation with various corporate partners, the CII, and the legal experts at the Software Freedom Law Center. In other words, we have a great deal of expertise and interest at our fingertips.

But in order to do this, we need to do two things:

  1. Stop making it worse
  2. Clean up the backlog

To stop making it worse, we will soon require almost every contributor to have a signed Contributor License Agreement (CLA) on file.

A CLA is important to ensure that we have the rights to distribute the code. It is a lightweight agreement, signed by the copyright holder, that grants us the rights to redistribute your contribution as part of OpenSSL. Note that our CLA does not transfer copyright to us, nor does it limit any of your rights.

There will probably be some exceptions, like if your change is a simple or obvious patch. We’re not lawyers, we don’t want to be lawyers, and we don’t want to be in the business of writing legal opinions or counting how many lawyers can dance on the head of a pin. If this kind of thing does interest you, you might find this article from OSS Watch in August 2012 worth reading: [].

We have two versions of the CLA available: one for individuals and one for corporations. At this point, every member of the OpenSSL dev team has signed the ICLA, and most of our employers have signed the CCLA.

If your employer sponsors work on OpenSSL as part of your job, then it probably makes sense to get the CCLA signed. Both CLAs are basically the Apache CLA, with just the obvious editorial changes.

You can find the CLAs here:

If you or your employer has made code contributions to OpenSSL, or you are planning on doing so in the future, please download, sign, scan, and email the CCLA to us. The contact information is on the CLA. If your employer has any experience with open source, the CCLA should be very straightforward. For individuals, stay tuned as we set up a minimal-hassle submission process.

We’re not yet able to announce more details on the license change. There is a lot of grunt work needed to clean up the backlog and untangle all the years of work from the time when nobody paid much attention to this sort of detail. But times are different, we all care, and we’re going to do the right thing. It will just take some time, and we appreciate your patience.

August 01, 2015 09:00 AM

July 31, 2015

Everything Sysadmin

Happy SysAdmin Day! (July 31)

I hope you are fully appreciated today and every day.

For more info about SysAdmin Day, visit

If you are in the NYC area, please come to SysDrink's SysAd Day event tonight at 6pm at The Gingerman in mid-town Manhattan. There will be an open bar. This year's event is sponsored by Digital Ocean.

July 31, 2015 12:30 PM

Chris Siebenmann

Ubuntu once again fails at a good kernel security update announcement

Ubuntu just sent out USN-2700-1, a 14.04 LTS announcement about a kernel update for CVE-2015-3290, CVE-2015-3291, and CVE-2015-5157. People with good memories may at this point remember USN-2688-1, a 14.04 LTS announcement about a kernel update for CVE-2015-3290, CVE-2015-1333, CVE-2015-3291, and CVE-2015-5157. Gosh, that's a familiar list of CVEs, and it sort of looks like the 'repeated CVEs' thing Ubuntu has done before. If you already applied the USN-2688-1 kernel and rebooted everything, it certainly sounds like you can skip USN-2700-1.

That would be a mistake. What Ubuntu is not bothering to mention in USN-2700-1 is that the 64-bit x86 kernels from USN-2688-1 had a bad bug. In that kernel, if a 32-bit program forks and then execs a 64-bit program the 64-bit program segfaults on startup; for example, a 32-bit shell will be unable to run any 64-bit programs (which will be most of them). This bug is the sole reason USN-2700-1 was issued (literally).

The USN-2700-1 text should come with a prominent notification to the effect of 'the previous update introduced a serious bug on 64-bit systems; we are re-issuing corrected kernels without this problem'. Ubuntu has put such notices on updates in the past so the idea is not foreign to them; they just didn't bother doing it this time around. As a result, people who may be affected by this newly introduced kernel bug do not necessarily know that this is their problem and they should update to the USN-2700-1 kernel to fix it.

(At best they may start doing a launchpad bug search and find the bug report. But I don't think it's necessarily all that likely, because the bug's title is not particularly accurate about what it actually is; 'Segfault in while starting Steam after upgrade to 3.13.0-59.98' does not point clearly to a 32-bit on 64-bit issue. It doesn't even mention 'on 64-bit platforms' in the description.)

Kernel update notices matter because people use them to decide whether or not to go through the hassle of a system reboot. If a notice is misleading, this goes wrong; people don't update and reboot when they really should. When there are bugs in a kernel update, as there were here, not telling people about them causes them to try to troubleshoot a buggy system without realizing that there is a simple solution.

(Lucky people noticed failures on the USN-2688-1 kernel right away, and so were able to attribute them to the just-done kernel update. But unlucky people will only run into this once in a while, when they run a rare lingering 32-bit program that does this, and so they may not immediately realize that it was due to a kernel update that might now be a week or two in the past.)

(See also a previous Ubuntu kernel update failure, from 2011.)

by cks at July 31, 2015 04:41 AM

League of Professional System Administrators

SysAdminDay Contest Winners

LOPSA is pleased to announce the winners of its SysAdminDay contest, sponsored by Ansible, SiliconMechanics, Opengear, and Druva.  And the winners are:
Maya Karp won a $100 Apple gift card and T-shirt from Opengear for her entry:
"I work with the systems teams at DreamWorks Animation. Being a SysAdmin, for me, is about conquering my own insecurities. I come from an artist background with no formal education in computer science at all. I always felt unsure around computers, much less Linux. I became a SysAdmin to build my confidence and grow. I now have an Ubuntu media server at home and run all three OSes.
Truly, I believe the difference between a job and a career as a SysAdmin is attitude. I come in curious every day to learn something new. In this field, the change far outpaces anyone who is not voraciously learning. My career goal is continually raising the challenge and curiosity bar as I progress – being a
SysAdmin has been an awesome way to stay curious and challenged!
There once was a shell within Linux
Whose returns were feeling like gimmicks
Whether up arrow or tab
It was all becoming drab
Every day, every minute
Was just another init
Until Red Hat saw the symptom
And changed over to system ... D"

read more

by ski at July 31, 2015 03:23 AM

July 30, 2015

Errata Security

A quick review of the BIND9 code

BIND9 is the oldest and most popular DNS server. Today, a DoS vulnerability was announced that would crash the server with a simple crafted query. I could use my "masscan" tool to blanket the Internet with those packets and crash all publicly facing BIND9 DNS servers in about an hour. A single vuln doesn't mean much, but if you look at the recent BIND9 vulns, you see a pattern forming. BIND9 has lots of problems -- problems that critical infrastructure software should not have.

Its biggest problem is that it has too many features. It attempts to implement every possible DNS feature known to man, few of which are needed on publicly facing servers. Today's bug was in the rarely used "TKEY" feature, for example. DNS servers exposed to the public should have the minimum number of features -- a server priding itself on having the maximum number of features is automatically disqualified.

Another problem is that DNS itself has some outdated design issues. The control-plane and data-plane need to be separate. This bug is in the control-plane code, but it's exploited from the data-plane. (Data-plane is queries from the Internet looking up names, control-plane is zones updates, key distribution, and configuration). The control-plane should be on a separate network adapter, separate network address, and separate port numbers. These should be hidden from the public, and protected by a firewall.

DNS should have hidden masters, servers with lots of rich functionality, such as automatic DNSSEC zone signing. It should have lightweight exposed slaves, with just enough code to answer queries on the data-plane, and keep synchronized with the master on the control-plane.
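
That split can be sketched in BIND's own configuration language. This is a hedged illustration rather than a recommended production config: the addresses and zone name are invented, and the exact option set should be checked against the BIND documentation.

```
// Exposed slave: data-plane only, synced from a hidden master.
options {
    recursion no;                 // answer authoritative queries, nothing else
    allow-transfer { none; };     // no zone transfers to the public
    allow-notify { 192.0.2.1; };  // accept NOTIFY only from the hidden master
};

zone "example.com" {
    type slave;
    masters { 192.0.2.1; };       // control-plane: pull zones from the hidden master
    file "slaves/example.com.db";
};
```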

But what this post is really about is looking at BIND9's code. It's nicer than the OpenSSL code and some other open-source projects, but there do appear to be some issues. The bug was in the "dns_message_findname()" function. The function header looks like:

dns_message_findname(dns_message_t *msg, dns_section_t section,
    dns_name_t *target, dns_rdatatype_t type,
    dns_rdatatype_t covers, dns_name_t **name,
    dns_rdataset_t **rdataset);

The thing you should notice here is that none of the parameters are declared const, even though all but one of them should be. A quick grep shows that lack of const correctness is pretty common throughout the BIND9 source code. Every quality guide in the world strongly suggests const correctness -- that it's lacking here hints at larger problems.
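
For illustration, a const-correct version of that header might look like the following sketch. This is one plausible reading, not the actual BIND9 declaration: the message and the name being searched for are only read, while the two double-pointer parameters are genuine out-parameters and stay non-const.

```c
dns_message_findname(const dns_message_t *msg, dns_section_t section,
    const dns_name_t *target, dns_rdatatype_t type,
    dns_rdatatype_t covers, dns_name_t **name,
    dns_rdataset_t **rdataset);
```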

The bug was an assertion failure on the "name" parameter in the code above. An assertion is supposed to double-check internal consistency of data, to catch bugs early. But in this case, there was no bug being caught -- it was the assertion itself that was the problem. The programmers were confused by the difference between in, out, and in/out parameters. You assert on the expected values of the in and in/out parameters, but not on write-only out parameters. Since the function doesn't read them, their value is immaterial. If the function wants an out parameter to be NULL, it can just set it to NULL itself -- demanding that the caller do this is just bad.
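
The convention can be illustrated with a hypothetical lookup function (not BIND9 code): assert on what the function reads, and simply initialize what it only writes.

```c
#include <assert.h>
#include <stddef.h>

/* 'table' is an in parameter; 'out' points at a write-only out parameter. */
static int find_key(int *table, size_t n, int key, int **out) {
    assert(table != NULL);   /* in: we read it, so check it                  */
    assert(out != NULL);     /* the pointer itself is read, to write through */
    *out = NULL;             /* out: don't assert *out == NULL on entry --
                                just set it ourselves                        */
    for (size_t i = 0; i < n; i++) {
        if (table[i] == key) {
            *out = &table[i];
            return 1;
        }
    }
    return 0;
}
```

A caller that passes an uninitialized pointer for the result still works, which is exactly the property the BIND9 assertion broke.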

By the way, assertions are normally enabled only for testing, not for production code. That's because they can introduce bugs (as in this case) and have performance costs. However, in the long run, aggressive double-checking leads to more reliable code, so I'm a fan of such aggressive checking. Quickly glancing at the recent BIND9 vulns, though, it appears many of them are caused by assertions failing. This may be good, meaning that the code was going to crash (or get exploited) anyway, and the assertion caught it early. Or it may be bad, with the assertion being the bug itself, or at least the user would have been happier without the assertion triggering (because of a memory leak, for example). If the latter is the case, then it sounds like people should just turn off the assertions when building BIND9 (it's a single command-line switch).

Last year, ISC (the organization that maintains BIND9) finished up their BIND10 project, which was to be a rewrite of the code. This was a fiasco, of course. Rewrites of large software projects are doomed to failure. The only path forward for BIND is the current code-base. This means refactoring and cleaning up technical debt on a regular basis, such as fixing the const correctness problem. It means arbitrarily deciding to drop support for 1990s-era computers when necessary. If the architecture needs to change (such as separating the data-plane from the control-plane), it can be done within the current code-base -- just create a solid regression test, then go wild on the changes, relying on the regression test to maintain quality.

Lastly, I want to comment on the speed of BIND9. It's dog slow -- the slowest of all the DNS servers. That's a problem, firstly, because slow servers should not be exposed to DDoS attacks on the Internet. It's a problem, secondly, because slow servers should not be written in dangerous languages like C/C++. These languages should only be used when speed is critical. If your code isn't fast anyway, then you should be using safe languages like C#, Java, or JavaScript. A DNS server written in these languages is unlikely to be any slower than BIND9.


The point I'm trying to make here is that BIND9 should not be exposed to the public. It has code problems that should be unacceptable in this day and age of cybersecurity. Even if it were written perfectly, it has far too many features to be trustworthy. Its feature-richness makes it a great hidden master; it's just that all those features get in the way of it being a simple authoritative slave server, or a simple resolver. They shouldn't rewrite it from scratch, but if they did, they should choose a safe language and not C/C++.

Example #2: strcpy()

BIND9 has 245 instances of the horribly unsafe strcpy() function, spread through 94 files. This is unacceptable -- yet another piece of technical debt they need to fix. It needs to be replaced with the strcpy_s() function.

In the file lwresutil.c is an example of flawed thinking around strcpy(). It's not an exploitable bug, at least not yet, but it's still flawed.

unsigned int target_length;

target_length = strlen(name);
if (target_length >= sizeof(target_name))
        return (LWRES_R_FAILURE);
strcpy(target_name, name); /* strcpy is safe */
The problem is the use of unsigned int for the length. On a 64-bit machine, an unsigned int is only 32 bits, but string lengths can be longer than a 32-bit value can hold. Thus, a 4-billion-byte name would cause the integer to overflow and the length check to pass when it should fail. I don't think you can get any name longer than 256 bytes through this code path, so it's likely not vulnerable now, but the "4 billion bytes of data" problem is pretty common in other code, and frequently exploitable in practice.

The comment /* strcpy is safe */ is no more accurate than those emails that claim "Checked by anti-virus".

Modern code should never use strcpy(), at all, under any circumstances, not even in unit-test code where it doesn't matter. It's easy to manage a project by simply grepping for "strcpy(" and requiring that nothing turns up; it's hard to manage a project with just a few strcpy()s. It's like being a little bit pregnant.
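
For comparison, a bounded copy that avoids both problems might look like this sketch (not a drop-in replacement for the BIND9 helper): size_t for lengths, and an explicit failure path instead of a bare strcpy().

```c
#include <string.h>

/* Copy src into dst, failing loudly if it will not fit. */
static int copy_name(char *dst, size_t dstsize, const char *src) {
    size_t len = strlen(src);     /* size_t, not unsigned int */
    if (dstsize == 0 || len >= dstsize)
        return -1;                /* caller must handle the failure */
    memcpy(dst, src, len + 1);    /* +1 copies the terminating NUL */
    return 0;
}
```

A 4-billion-byte name simply fails the check here, because len never loses its high bits.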

by Robert Graham at July 30, 2015 11:47 AM

Steve Kemp's Blog

The differences in Finland start at home.

So we're in Finland, and the differences start out immediately.

We're renting a flat, in building ten, on a street. You'd think "10 Streetname" was a single building, but no. It is a pair of buildings: 10A, and 10B.

Both of the buildings have 12 flats in them, with 10A having 1-12, and 10B having 13-24.

There's a keypad at the main entrance, which I assumed was to let you press a button and talk to the people inside "Hello I'm the postmaster", but no. There is no intercom system, instead you type in a magic number and the door opens.

The magic number? Sounds like you want to keep that secret, since it lets people into the common-area? No. Everybody has it. The postman, the cleaners, the DHL delivery man, and all the ex-tenants. We invited somebody over recently and gave it out in advance so that they could knock on our flat-door.

Talking of cleaners: In the UK I lived in a flat and once a fortnight somebody would come and sweep the stair-well, since we didn't ever agree to do it ourselves. Here somebody turns up every day, be it to cut the grass, polish the hand-rail, clean the glass on the front-door, or mop the floors of the common area. Sounds awesome. But they cut the grass, right outside our window, at 7:30AM. On the dot. (Or use a leaf-blower, or something equally noisy.)

All this communal-care is paid for by the building-association, of which all flat-owners own shares. Sounds like something we see in England, or even like America's idea of a Home-Owners-Association. (In Scotland you own your own flat, you don't own shares of an entity which owns the complete building. I guess there are pros and cons to both approaches.)

Moving onwards, other things are often the same, but the differences, when you spot them, are odd. I'm struggling to think of them right now; somebody woke me up by cutting our grass for the second time this week (!)

Anyway I'm registered now with the Finnish government, and have a citizen-number, which will be useful, I've got an appointment booked to register with the police - which is something I had to do as a foreigner within the first three months - and today I've got an appointment with a local bank so that I can have a euro-bank-account.

Happily I did find a gym to join, the owner came over one Sunday to give me a tiny-tour, and then gave me a list of other gyms to try if his wasn't good enough - which was a nice touch - I joined a couple of days later, his gym is awesome.

(I'm getting paid in UK-pounds, to a UK-bank, so right now I'm getting local money by transferring to my wife's account here, but I want to do that to my own, and open a shared account for paying for rent, electricity, internet, water, etc.)

My flat back home is still not rented, because the nice property management company lost my keys. Yeah, you can't make that up, can you? With a bit of luck the second set of keys I mailed them will arrive soon and the damn thing can be occupied; while I'm not relying on that income, I do wish to have it.

July 30, 2015 09:09 AM

Chris Siebenmann

My workflow for testing Github pull requests

Every so often a Github-based project I'm following has a pending pull request that might solve a bug or otherwise deal with something I care about, and it needs some testing by people like me. The simple case is when I am not carrying any local changes; it is adequately covered by part of Github's Checking out pull requests locally (skip to the bit where they talk about 'git fetch'). A more elaborate version is:

git fetch origin pull/<ID>/head:origin/pr/<ID>
git checkout pr/<ID>

That creates a proper remote branch and then a local branch that tracks it, so I can add any local changes to the PR that I turn out to need and then keep track of them relative to the upstream pull request. If the upstream PR is rebased, well, I assume I get to delete my remote and then re-fetch it and probably do other magic. I'll cross that bridge when I reach it.

The not so simple case is when I am carrying local changes on top of the upstream master. In the fully elaborate case I actually have two repos, the first being a pure upstream tracker and the second being a 'build' repo that pulls from the first repo and carries my local changes. I need to apply some of my local changes on top of the pull request while skipping others (in this case, because some of them are workarounds for the problem the pull request is supposed to solve), and I want to do all of this work on a branch so that I can cleanly revert back to 'all of my changes on top of the real upstream master'.

The workflow I've cobbled together for this is:

  • Add the Github master repo if I haven't already done so:
    git remote add github

  • Edit .git/config to add a new 'fetch =' line so that we can also fetch pull requests from the github remote, where they will get mapped to the remote branches github/pr/NNN. This will look like:
    [remote "github"]
       fetch = +refs/pull/*/head:refs/remotes/github/pr/*

    (This comes from here.)

  • Pull down all of the pull requests with 'git fetch github'.

    I think an alternate to configuring and fetching all pull requests is the limited version I did in the simple case (changing origin to github in both occurrences), but I haven't tested this. At the point that I have to do this complicated dance I'm in a 'swatting things with a hammer' mode, so pulling down all PRs seems perfectly fine. I may regret this later.

  • Create a branch from master that will be where I build and test the pull request (plus my local changes):
    git checkout -b pr-NNN

    It's vitally important that this branch start from master and thus already contain my local changes.

  • Do an interactive rebase relative to the upstream pull request:
    git rebase -i github/pr/NNN

    This incorporates the pull request's changes 'below' my local changes to master, and with -i I can drop conflicting or unneeded local changes. Effectively it is much like what happens when you do a regular 'git pull --rebase' on master; the changes in github/pr/NNN are being treated as upstream changes and we're rebasing my local changes on top of them.

  • Set the upstream of the pr-NNN branch to the actual Github pull request branch:
    git branch -u github/pr/NNN

    This makes 'git status' report things like 'Your branch is ahead of ... by X commits', where X is the number of local commits I've added.
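
Under stated assumptions, the whole dance can be exercised end to end with throwaway local repositories standing in for GitHub. GitHub publishes each pull request under refs/pull/NNN/head, which git update-ref imitates below; everything else is the workflow above, just compressed.

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)

# "Upstream" repo: one base commit, plus a ref mimicking a GitHub PR.
git init -q "$tmp/upstream"
cd "$tmp/upstream"
git config user.email you@example.com && git config user.name you
echo base > file && git add file && git commit -qm base
git checkout -qb pr-work
echo fix >> file && git commit -qam "pr fix"
git update-ref refs/pull/1/head pr-work   # how GitHub exposes PRs
git checkout -q -

# "Build" repo: clone, then carry a local change on top of master.
git clone -q "$tmp/upstream" "$tmp/build"
cd "$tmp/build"
git config user.email you@example.com && git config user.name you
echo local > local.txt && git add local.txt && git commit -qm "local change"

# The .git/config trick: map PR refs to remote-tracking branches.
git config --add remote.origin.fetch '+refs/pull/*/head:refs/remotes/origin/pr/*'
git fetch -q origin

# Branch from master so local changes come along, rebase onto the PR,
# then track the PR branch.
git checkout -qb pr-1
git rebase -q origin/pr/1
git branch -u origin/pr/1

git log --oneline   # the local change now sits on top of the PR's commits
```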

If the pull request is refreshed, my current guess is that I will have to fully discard my local pr-NNN branch and restart from fetching the new PR and branching off master. I'll undoubtedly find out at some point.

Initially I thought I should be able to use a sufficiently clever invocation of 'git rebase' to copy some of my local commits from master on to a new branch that was based on the Github pull request. With work I could get the rebasing to work right; however, it always wound up with me on (and changing) the master branch, which is not what I wanted. Based on this very helpful page on what 'git rebase' is really doing, what I want is apparently impossible without explicitly making a new branch first (and that new branch must already include my local changes so they're what gets rebased, which is why we have to branch from master).

This is probably not the optimal way to do this, but having hacked my way through today's git adventure game I'm going to stop now. Feel free to tell me how to improve this in comments.

(This is the kind of thing I write down partly to understand it and partly because I would hate to have to derive it again, and I'm sure I'll need it in the future.)

Sidebar: Why I use two repos in the elaborate case

In the complex case I want to both monitor changes in the Github master repo and have strong control over what I incorporate into my builds. My approach is to routinely do 'git pull' in the pure tracking repo and read 'git log' for new changes. When it's time to actually build, I 'git pull' (with rebasing) from the tracking repo into the build repo and then proceed. Since I'm pulling from the tracking repo, not the upstream, I know exactly what changes I'm going to get in my build repo and I'll never be surprised by a just-added upstream change.

In theory I'm sure I could do this in a single repo with various tricks, but doing it in two repos is much easier for me to keep straight and reliable.

by cks at July 30, 2015 03:10 AM

July 29, 2015

Sarah Allen

visual management: whiteboard culture

Redgate‘s whiteboard culture takes visual management to a new level. Last week, when visiting the UK, I spent two days in Cambridge co-working with Business of Software‘s Mark Littlewood and team who share space with Redgate. Everywhere you look there’s a whiteboard filled with sticky notes, printouts and handwritten index cards with lines and labels drawn in colored tape and marker.

whiteboard with grid of colored sticky notes with column labels: stories, could do, should do, must do , in progress

Most agile teams apply this practice. Following the mantra of “Make it Visible,” we seek to publish our ideas and document our process with artifacts that facilitate communication. The agile practice of software creation has its roots in Toyota’s just-in-time production system where Kanban boards were first developed back in the 1940s.

Mark Wightman gave me a tour through their vibrant, open offices, providing an unexpected glimpse into their culture. At first glance these displays at Redgate looked familiar, but these boards were not just a nimble construction of team process, they were an expression of team values and identity. From my perspective as an outsider, they were tangible evidence of a culture that embraced individual creativity and independence, alongside knowledge sharing and cross-team alignment. It’s a hard balance to strike. As companies scale, processes need to be standardized for people and systems to interoperate. It’s impractical to manage a business when everyone is doing their own thing.

whiteboard with speech bubble along the top and lots of sticky notes Customer quotes appear in speech bubbles along the top and bottom of the whiteboard, printouts show metrics and other reports, along with classic story cards arrayed kanban-style.

Redgate has some standardized reports: a one-pager that every team provides up the management chain, but the CEO also does whiteboard tours, visiting a few teams every week to see what they are working on and talk through their latest accomplishments and challenges with the help of these artifacts on their walls. The standard parts of the report help the management team act quickly on information needed to run the business, and these dynamic, diverse expressions make it so new ideas and new challenges can be seen quickly. Humans process huge amounts of information quickly and we are amazing pattern detection engines. When I remarked at how the Redgate culture clearly valued individual creativity, Mark reframed this as the way that they achieve continuous improvement. For teams to adapt to change, variation must not just be allowed, it must be celebrated.

continuous integration lights in vertical strips, team headlines as printouts and rows of sticky notes with checkmarks Vertical strips of lights show status of builds and tests from a continuous integration system. Each team updates “headlines” weekly, using that space to introduce new team members and include success stories along with whimsical imagery. Colored sticky notes and arrows provide a quick short-hand status update.

On the flight back to SFO, I caught up on some 18F work and appreciated our virtual tools of github, waffle, slack and google docs that make it possible to work remotely in the UK or at 35,000 feet with full access to all of our team processes and artifacts. Yet these colorful images of whiteboards, printouts and sticky notes keep replaying through my mind’s eye and I wonder if there is some way to capture that vibrant and flexible communication in our virtual world of work.

Key takeaways:
– individual expression fosters continuous improvement
– tangible representation of how you work streamlines communication
– communicate status inside the team = status outside the team
– fluid form supports evolution, adaptation, innovation

Many thanks to Redgate who let me experience their world for just a little while. Note: if you want to work like this, I noticed that they are hiring in Cambridge, UK and Pasadena, USA.

More whiteboards:

glass wall with grid of stick notes and large 2015 year planner in top-right corner

sales funnel in sticky notes on whiteboard

whiteboard divided into large 2x3 grid with printouts and sticky notes arranged differently in each

whiteboard with sticky notes

The post visual management: whiteboard culture appeared first on the evolving ultrasaurus.

by sarah at July 29, 2015 01:01 PM

Sean's IT Blog

VMworld 2015 Conference Tips

VMworld is only a month away, and as every seasoned conference veteran knows, there are a number of tips that will make the experience better.

I’ve put together my favorite tips for VMworld, and I’ve grouped the tips into categories based on


  • If you haven’t already done so, sign up for Twitter.  The #VMworld hashtag is one of the primary communications channels for conference attendees.  You can also follow sessions by using the session number as a hashtag.
  • Add the “Unofficial VMworld Bloggers List” to your RSS feed. (Note: Link does not work.  Will update when I have an updated  link.)  Bloggers will recap sessions they find interesting or provide analysis of the keynote contents.
  • Be sure to sign up for VMunderground – the best community-led event of the weekend.  This year, it will be at the City View right across from the Moscone Center.
  • Talk to your vendors and tell them you’re going.  Chances are they will have a booth.  They may also have a customer appreciation event that you can score tickets to. 
  • Ask your local VMUG leaders if they know anyone from your area that is going, and find a time to do a local meetup sometime during the conference.

Packing and Travel

  • You’ll be walking around a lot, so clean out your backpack before you leave.  Chances are, you won’t need that Cisco rollover cable or a bitdriver set during the conference.  VMware also provides a notebook, so you can leave yours at home.  If you plan to use the conference backpack, then there is no point in bringing along items that you will have to repack to bring home.
  • The average high temperature for late August is 70 degrees Fahrenheit.  The lows are around 55 degrees.  Pack accordingly, and don’t forget to bring a light coat as it can get a little chilly at night. 
  • Leave a little room in your suitcase to bring stuff back.  The VMworld conference kit usually includes a backpack of some sort, and T-shirts are a popular vendor giveaway.  FedEx and UPS both have retail locations on Kearny Street if you would prefer to ship stuff instead of bringing it on the plane.
  • If you forget anything, there is a Target in the Metreon, which is right across the street from Moscone.


  • This is another area where travelling light is key as you will have to carry it around for most of the day. 
  • Bring extra batteries or battery packs for your devices.  You will be on your phone, tablet, and laptop all day, and you may not get a chance to recharge them during the day, so it is wise to have a power bank, such as this 13000 mAh battery pack, to keep your mobile devices topped off.
  • Don’t forget to bring your chargers.  There is an Apple store nearby for Apple products, but finding chargers for other laptop brands might be difficult.  Target should have chargers for other mobile devices.

General Conference Info

  • You will be standing and walking a lot.  Comfortable, broken-in shoes are a must.  If you plan to buy new shoes, buy them soon and start breaking them in so you avoid blisters and other foot injuries.
  • It may also be a good idea to start going for a walk a couple of times a week just to prepare.  If you don’t, you may be a little sore after the first day or two. 
  • Drink water frequently.  The VMworld Welcome Kit includes a water bottle, and there are water coolers around Moscone that you can use to fill it.


  • Plan your schedule early as popular sessions may fill up quickly.
  • If a session is full, add yourself to the waiting list.
  • If you are interested in two sessions that overlap, book one and add the other to your Interest list.  The schedule changes frequently, and you may get an opportunity to register for both.
  • Every session is recorded, so don’t worry if you miss one.  You can always download and watch the video later.
  • Register for Group Discussions.  They’re a good opportunity to talk to your peers and experts from VMware.  These sessions are small, so register early if you want to guarantee a spot.

Solutions Exchange

  • Block off time in your schedule to peruse the Solutions Exchange.  You want to spend time here talking to vendors.
  • Make it a point to stop by the booths of vendors you already do business with.  This is a good time to talk to your vendors’ experts and get your questions answered.

Places to Eat Near Moscone

San Francisco has a lot of good food.  The city is especially noted for its Asian cuisine, which is sadly not reflected in the list below. 

  • Oasis Grill – Located across the street from Moscone West.  Serves gyros and Mediterranean food
  • Super Duper Burger – Located on the north side of Yerba Buena Gardens, this place is known for burgers, garlic fries (made with fresh garlic), and soft-serve ice cream
  • Mel’s Drive In – 50’s style diner near Moscone with excellent breakfast.
  • Thirsty Bear – Located on Howard just east of Third Street.  Known for craft beer and Spanish-style cuisine.
  • The Melt – A few blocks away from Moscone at the corner of Minna and New Montgomery, this is a grilled cheese and soup place.
  • Boudin – A local chain with a few locations around Moscone.  Known for their sourdough bread.
  • Ghirardelli – The famed chocolatier.  The location at Market and Montgomery is also an ice cream shop that is open late.
  • Lori’s Diner – Located at Powell and Sutter, this 50’s style diner is open 24/7 so you can get some post-party breakfast.

And finally – Sushirrito on New Montgomery.  It’s a sushi burrito.  Need I say more??

Other Stuff

  • Go out and enjoy yourself.
  • There is an online registry of events for the week.  Go out and socialize with your peers.
  • Even if you don’t know who the bands are, go to the VMworld party.  The last time it was at AT&T Park, it was a blast.
  • If packed events aren’t your thing, go explore the city.  Although a lot of museums close by 6:00 PM, restaurants on Fisherman’s Wharf and Ghirardelli’s are open late. 


  • San Francisco has a large indigent population.  Be careful when handling money in public or using ATMs.  (Note: This is a general safety tip for any large American city.  You don’t want to make yourself a victim or be “aggressively panhandled.”)
  • San Francisco has some rough areas, and it’s not hard to get lost and wander into them.  Many of the conference hotels are near Union Square, and it is very easy to take a wrong turn and end up in the Tenderloin District. 

by seanpmassey at July 29, 2015 01:00 PM

Debian Administration

DKIM-signing outgoing mail with exim4

Several systems have been designed over the years to prevent mail spoofing; the two most prominent are DKIM and SPF. Here we're going to document how to set up DKIM signing of outgoing mail with Debian's default mail transfer agent, exim4.

by Steve at July 29, 2015 06:28 AM

Chris Siebenmann

A cynical view on needing SSDs in all your machines in the future

Let's start with my tweets:

@thatcks: Dear Firefox Nightly: doing ten+ minutes of high disk IO on startup before you even start showing me my restored session is absurd.
@thatcks: Clearly the day is coming when using a SSD is going be not merely useful but essential to get modern programs to perform decently.

I didn't say this just because programs are going to want to do more and more disk IO over time. Instead, I said it because of a traditional developer behavior, namely that developers mostly assess how fast their work is based on how it runs on their machines and developer machines are generally very beefy ones. At this point it's extremely likely that most developer machines have decently fast SSDs (and for good reason), which means that it's actually going to be hard for developers to notice they've written code that basically assumes a SSD and only runs acceptably on it (either in general or when some moderate corner case triggers).

SSDs exacerbate this problem by being not just fast in general but especially hugely faster at random IO than traditional hard drives. If you accidentally write something that is random IO heavy (or becomes so under some circumstances, perhaps as you scale the size of the database up) but only run it on a SSD based system, you might not really notice. Run that same thing on a HD based one (with a large database) and it will grind to a halt for ten minutes.

(Today I don't think we have profiling tools for disk IO the way we do for CPU usage by code, so even if a developer wanted to check for this their only option is to find a machine with a HD and try things out. Perhaps part of the solution will be an 'act like a HD' emulation layer for software testing that does things like slowing down random IO. Of course it's much more likely that people will just say 'buy SSDs and stop bugging us', especially in a few years.)

by cks at July 29, 2015 05:21 AM

July 28, 2015


Beyond Reformatting: More Code Cleanup

The OpenSSL source doesn’t look the same as it did a year ago. Matt posted about the big code reformatting. In this post I want to review some of the other changes – these rarely affect features, but are more involved than “just” whitespace.

Previously, we’d accept platform patches from just about anyone. This led to some really hard-to-follow pre-processor guards:

    #if !defined(TERMIO) && !defined(TERMIOS) \
        && !defined(OPENSSL_SYS_VMS) && !defined(OPENSSL_SYS_MSDOS) \
        && !defined(OPENSSL_SYS_MACINTOSH_CLASSIC) && \

Can anyone reasonably tell when that applies? And there were many that were just plain silly:

    #if 1 /* new with OpenSSL 0.9.7 */

    #ifdef undef

So we did a couple of things.

Unsupported platforms

First, we’re no longer supporting every known platform; a platform now has to be one that someone on the team has direct access to, or one for which we have extensive support from the vendor.

Back in the fall we kicked off a discussion on the openssl-dev mailing list about removing unsupported platforms. Based on feedback, NetWare is still supported, and people are contacting some vendors to get them to step up and help support their platforms. But the following are now gone:

  • Sony NEWS4
  • BEOS and BEOS_R5
  • NeXT
  • MPE/iX
  • Sinix/ReliantUNIX RM400
  • DGUX
  • NCR
  • Tandem
  • Cray
  • Win16

OpenSSL never supported 16-bit (neither did SSLeay, really), and we just never made that explicit.

Live/dead code

Almost every instance of #if 0 or #if 1 had the appropriate code removed. There are a couple of places where they remain, because the disabled code serves as useful documentation. For example, this block in ssl/s3_lib.c:

    #if 0
        /*
         * Do not set the compare functions, because this may lead to a
         * reordering by "id". We want to keep the original ordering. We may pay
         * a price in performance during sk_SSL_CIPHER_find(), but would have to
         * pay with the price of sk_SSL_CIPHER_dup().
         */
        sk_SSL_CIPHER_set_cmp_func(srvr, ssl_cipher_ptr_id_cmp);
        sk_SSL_CIPHER_set_cmp_func(clnt, ssl_cipher_ptr_id_cmp);
    #endif

The one place they do remain is in the crypto/bn component. That’s just because I’m scared to touch it. :) That directory accounts for about one-third of all remaining instances. Compared to 1.0.2, which had 343 instances in 165 files, we now have 43 instances in 31 files, and I hope we can reduce that even more.

ifdef control

Over time, some feature-control #ifdef’s evolved into inconsistencies. We cleaned them up, merging OPENSSL_NO_RIPEMD160 and OPENSSL_NO_RIPEMD into OPENSSL_NO_RMD160. Similarly, OPENSSL_NO_FP_API was merged into OPENSSL_NO_STDIO. (We’ll soon make that buildable again, for embedded platforms.)

Also in this area, configuration used OPENSSL_SYSNAME_xxx, which was internally mapped to OPENSSL_SYS_xxx. Removing that mapping, in conjunction with removing some old platforms, made parts of the internal header file e_os.h much simpler, but there is still room for much improvement there.

The biggest change was removing about one-third of the nearly 100 options that we used to support. Many of these had suffered severe bit-rot and no longer worked. The largest part was removing the ability to build OpenSSL without other parts of the library: BIO, BUFFER, EVP, LHASH, HASH_COMP, LOCKING, OBJECT, and STACK are no longer optional, and OPENSSL_NO_BIO has no effect (and will hopefully generate a warning). We also removed support for the broken SHA0 and DSS0, the CBCM mode of DES (invented by IBM and not used anywhere), and the maintenance of our own free lists via OPENSSL_NO_BUF_FREELISTS. That last change earned us some flak that was erroneously sent our way; the free lists were clearly a holdover from when the major supported platforms did not have efficient malloc/free implementations.

Dynamic memory cleanup

We did a big cleanup on how we use various memory functions. For example:

    a = (foo *)malloc(n * sizeof(foo));

had a needless cast, and would quietly break if the type of a changed. We now do this consistently:

    a = malloc(n * sizeof(*a));

Similar sizeof changes were done for memset and memcpy calls.

We also stopped checking for NULL before calling free. That was huge, requiring over a dozen commits. It also reduced code complexity, and got us a few extra percentage points on our test coverage. :)


The OpenSSL code base has a long and storied history. With these changes, the reformatting, and others, I think we’re making good on our commitment for the future.

July 28, 2015 01:20 PM

Everything Sysadmin

LISA Conversations premieres on Tuesday!

Yes, I've started a video podcast that has a homework assignment built-in. Watch a famous talk from a past LISA conference (that's the homework) then watch Tom and Lee interview the speaker. What's new since the talk? Were their predictions validated? Come find out!

Watch it live or catch the recorded version later.

The first episode will be recorded live Tuesday July 28, 2015 at 1:30pm PDT. This month's guest will be Todd Underwood, who will discuss his talk from LISA '13 titled, Post-Ops: A Non-Surgical Tale of Software, Fragility, and Reliability.

For links to the broadcast and other info, go here:

July 28, 2015 01:20 AM

July 27, 2015

Racker Hacker

Very slow ssh logins on Fedora 22

I’ve recently set up a Fedora 22 firewall/router at home (more on that later) and I noticed that remote ssh logins were extremely slow. In addition, sudo commands seemed to stall out for the same amount of time (about 25-30 seconds).

I’ve done all the basic troubleshooting already:

  • Switch to UseDNS no in /etc/ssh/sshd_config
  • Set GSSAPIAuthentication no in /etc/ssh/sshd_config
  • Tested DNS resolution
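For reference, the first two steps amount to a fragment like this in /etc/ssh/sshd_config (standard OpenSSH options; restart sshd after editing):

```
# /etc/ssh/sshd_config -- the two settings tried above
UseDNS no
GSSAPIAuthentication no
```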

These lines kept cropping up in my system journal when I tried to access the server using ssh:

dbus[4865]: [system] Failed to activate service 'org.freedesktop.login1': timed out
sshd[7391]: pam_systemd(sshd:session): Failed to create session: Activation of org.freedesktop.login1 timed out
sshd[7388]: pam_systemd(sshd:session): Failed to create session: Activation of org.freedesktop.login1 timed out

The process list on the server looked fine. I could see dbus-daemon and systemd-logind processes and they were in good states. However, it looked like dbus-daemon had restarted at some point and systemd-logind had not been restarted since then. I crossed my fingers and bounced systemd-logind:

systemctl restart systemd-logind

Success! Logins via ssh and escalations with sudo worked instantly.


by Major Hayden at July 27, 2015 12:09 PM

July 26, 2015

Running TSS/8 on the DEC PiDP-8/i and SIMH

In this guide I'll show you how to run the TSS/8 operating system on the PiDP replica by Oscar Vermeulen, and on SIMH on any other computer. I'll also cover a few basic commands like the editor, user management and system information. TSS-8 was a little time-sharing operating system released in 1968 and requires a minimum of 12K words of memory and a swapping device; on a 24K word machine, it supports up to 17 users. Each user gets a virtual 4K PDP-8; many of the utilities users ran on these virtual machines were only slightly modified versions of utilities from the Disk Monitor System or paper-tape environments. Internally, TSS-8 consists of RMON, the resident monitor, DMON, the disk monitor (file system), and KMON, the keyboard monitor (command shell). BASIC was well supported, while restricted (4K) versions of FORTRAN D and Algol were available.

July 26, 2015 12:00 AM

July 25, 2015

Steve Kemp's Blog

We're in Finland now.

So we've recently spent our first week together in Helsinki, Finland.

Mostly this has been stress-free, but there are always oddities about living in new places, and moving to Europe didn't minimize them.

For the moment I'll gloss over the differences and instead document the computer problem I had. Our previous shared-desktop system had a pair of drives configured using software RAID. I pulled one of the drives to use in a smaller-cased system (smaller so it was easier to ship).

Only one drive of a pair being present makes mdadm scream, via email, once per day, with reports of failure.

The output of cat /proc/mdstat looked like this:

md2 : active raid1 sdb6[0] [LVM-storage-area]
      1903576896 blocks super 1.2 2 near-copies [2/1] [_U]
md1 : active raid10 sdb5[1] [/root]
      48794112 blocks super 1.2 2 near-copies [2/1] [_U]
md0 : active raid1 sdb1[0]  [/boot]
      975296 blocks super 1.2 2 near-copies [2/1] [_U]

See the "_" there? That's the missing drive. I couldn't remove the drive as it wasn't present on-disk, so this failed:

mdadm --fail   /dev/md0 /dev/sda1
mdadm --remove /dev/md0 /dev/sda1
# repeat for md1, md2.

Similarly removing all "detached" drives failed, so the only thing to do was to mess around re-creating the arrays with a single drive:

lvchange -a n shelob-vol
mdadm --stop /dev/md2
mdadm --create /dev/md2 --level=1 --raid-devices=1 /dev/sdb6 --force

I did that on the LVM-storage area, and the /boot partition, but "/" is still to be updated. I'll use knoppix/similar to do it next week. That'll give me a "RAID" system which won't alert every day.

Thanks to the joys of re-creation the UUIDs of the devices changed, so /etc/mdadm/mdadm.conf needed updating. I realized that too late, when grub failed to show the menu because it didn't find its own UUID. Handy recipe for the future:

set prefix=(md/0)/grub/
insmod linux
linux (md/0)/vmlinuz-3.16.0-0.bpo.4-amd64 root=/dev/md1
initrd (md/0)//boot/initrd.img-3.16.0-0.bpo.4-amd64

July 25, 2015 02:00 AM


OSCON 2015

Following the Community Leadership Summit (CLS), which I wrote about here, I spent a couple of days at OSCON.

Monday kicked off with Jono Bacon’s community leadership workshop. I attended one of these a couple years ago, so it was really interesting to see how his advice has evolved with the changes in tooling and the progress that communities in tech and beyond have made. I took a lot of notes, but everything I wanted to say here has been summarized by others in a series of great posts.

…hopefully no one else went to Powell’s to pick up the recommended books; I cleared the shelves of a couple of them.

That afternoon Jono joined David Planella of the Community Team at Canonical and Michael Hall, Laura Czajkowski and I of the Ubuntu Community Council to look through our CLS notes and come up with some talking points to discuss with the rest of the Ubuntu community regarding everything from in person events (stronger centralized support of regional Ubucons needed?) to learning what inspires people about the active Ubuntu phone community and how we can make them feel more included in the broader community (and helping them become leaders!). There was also some interesting discussion around the Open Source projects managed by Canonical and expectations for community members with regard to where they can get involved. There are some projects where part time, community contributors are wanted and welcome, and others where it’s simply not realistic due to a variety of factors, from the desire for in-person collaboration (a lot of design and UI stuff) to the new projects with an exceptionally fast pace of development that makes it harder for part time contributors (right now I’m thinking anything related to Snappy). There are improvements that Canonical can make so that even these projects are more welcoming, but adjusting expectations about where contributions are most needed and wanted would be valuable to me. I’m looking forward to discussing these topics and more with the broader Ubuntu community.

Laura, David, Michael, Lyz

Monday night we invited members of the Oregon LoCo out and had an Out of Towners Dinner at Altabira City Tavern, the restaurant on top of the Hotel Eastlund where several of us were staying. Unfortunately the local Kubuntu folks had already cleared out of town for Akademy in Spain, but we were able to meet up with long-time Ubuntu member Dan Trevino, who used to be part of the Florida LoCo with Michael, and who I last saw at Google I/O last year. I enjoyed great food and company.

I wasn’t speaking at OSCON this year, so I attended with an Expo pass and after an amazing breakfast at Mother’s Bistro in downtown Portland with Laura, David and Michael (…and another quick stop at Powell’s), I spent Tuesday afternoon hanging out with various friends who were also attending OSCON. When 5PM rolled around the actual expo hall itself opened, and surprised me with how massive and expensive some of the company booths had become. My last OSCON was in 2013 and I don’t remember the expo hall being quite so extravagant. We’ve sure come a long way.

Still, my favorite part of the expo hall is always the non-profit/open source project/organization area where the more grass-roots tables are. I was able to chat with several people who are really passionate about what they do. As a former Linux Users Group organizer and someone who still does a lot of open source work for free as a hobby, these are my people.

Wednesday was my last morning at OSCON. I did another walk around the expo hall and chatted with several people. I also went by the HP booth and got a picture of myself… with myself. I remain very happy that HP continues to support my career in a way that allows me to work on really interesting open source infrastructure stuff and to travel the world to tell people about it.

My flight took me home Wednesday afternoon and with that my OSCON adventure for 2015 came to a close!

More OSCON and general Portland photos here:

July 25, 2015 12:27 AM

July 24, 2015

Everything Sysadmin

Save on The Practice of Cloud System Administration

Pearson / is running a promotion through August 11th on many open-source related books, including Volume 2, The Practice of Cloud System Administration. Use discount code OPEN2015 during checkout and receive 35% off any one book, or 45% off two or more books.

See the website for details.

July 24, 2015 03:40 PM

That grumpy BSD guy

The OpenSSH Bug That Wasn't

Much has been written about a purported OpenSSH vulnerability. On closer inspection, the reports actually got most of their facts wrong. Read on for the full story.

It all started with a blog post dated July 16, 2015, titled OpenSSH keyboard-interactive authentication brute force vulnerability (MaxAuthTries bypass), where the TL;DR is that it's possible to get an almost infinite number of tries at authentication -- good for bruteforce password guessing, for example -- if you only tickle the OpenSSH server just so.

This sounded interesting and scary enough that I wanted to try it out myself. The blog quite helpfully supplies a one-liner that you can cut and paste to your own command line to check whether the systems you have within reach are indeed vulnerable.

Here's a transcript of running those tests on the machines I happened to try (Disclaimer: The recorded sessions here are from a second try, a few days after the first). First, my home gateway, running a recent OpenBSD 5.8-beta:

[Fri Jul 24 14:58:31] peter@elke:~$ ssh -lrazz -oKbdInteractiveDevices=`perl -e 'print "pam," x 10000'`
Host key fingerprint is SHA256:maeVFpNMibnkcwPSmjV4QBXfz5J97XLta6e2CrzsAYQ
+---[ECDSA 256]---+
| .o=o+.. |
| o.X+.o |
| EO.+* . |
| o.+oo+ = .|
| So=.o + o|
| B . o.|
| . + . +|
| . + .=.|
| .+ .o+++|
+----[SHA256]-----+'s password:
Permission denied, please try again.'s password:
Permission denied, please try again.'s password:
Permission denied (publickey,password,keyboard-interactive).
[Fri Jul 24 16:53:06] peter@elke:~$

Here, razz is a non-existent user, and as we can see we get exactly three password prompts before the connection is shut down. Now we know that an essentially untweaked SSH server configuration on a recent OpenBSD does not behave as described in the article.

But what about earlier OpenBSD releases? I have one box I should have upgraded a while ago (to my enduring shame, it's still on 5.3-stable, but don't tell anyone). So here's the same thing pointed at that box:

[Fri Jul 24 16:53:06] peter@elke:~$ ssh -lrazz -oKbdInteractiveDevices=`perl -e 'print "pam," x 10000'` delilah         
Host key fingerprint is SHA256:AO8rn6Va9+b3+7gdVUxby5zWQFaLnkIA6wcEsOVHukA
+---[ECDSA 256]---+
| Eoo.+.. .+.+|
| . +o+ . . .++B|
| o oo+ . . O+|
| ..+.. . . o .|
| ...S. . o .|
| .. . .|
| . o o . |
| + = .. . o.|
| ..+ oo. .=+o|

razz@delilah's password:
Permission denied, please try again.
razz@delilah's password:
Permission denied, please try again.
razz@delilah's password:
Permission denied (publickey,password,keyboard-interactive).
[Fri Jul 24 16:59:37] peter@elke:~$

razz is not a valid user here either, and even the old OpenSSH version here shuts down the connection after three failures. I don't have any OpenBSD boxes with older versions than this anywhere I know about, and we can be reasonably confident that at least close to default configurations on OpenBSD are not vulnerable.

But several of the articles hint that OpenSSH on Linux is vulnerable. I do have a few CentOS boxes within reach. I'll repeat the test on one of these, then:

[Fri Jul 24 17:05:13] peter@elke:~$ ssh -lrazz -oKbdInteractiveDevices=`perl -e 'print "pam," x 10000'` nms
Host key fingerprint is SHA256:fdFxpvSDLq3W9D1d8U6RzuYQcd0WzAmIFfJAzcIkD8I
+---[RSA 2048]----+
| .. o+==ooo*oB|
| E. ++++ o+X=|
| ....oo*.|
| . o.+ =|
| S ...= ++|
| .= =o+|
| o . ++|
| . .|
| |

razz@nms's password:
Permission denied, please try again.
razz@nms's password:
Permission denied, please try again.
razz@nms's password:
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).

nms is a CentOS 6.6 box, and the result, as we can see, is pretty much identical to what we saw on the OpenBSD machines. So far we haven't seen anything like what the blogger kingcope led us to expect.

But looking back to the original article, he seems to have tested only on FreeBSD. I have some FreeBSD boxes within reach too, and rosalita runs a recent FreeBSD 10.1, freshly upgraded via freebsd-update. Here's what our experiment looks like pointed at rosalita:

[Fri Jul 24 17:15:03] peter@elke:~$ ssh -lrazz -oKbdInteractiveDevices=`perl -e 'print "pam," x 10000'` rosalita           
Host key fingerprint is SHA256:Ig6F8Au3f0KYNrzuc5qRrpZgY4Q/tz0bJrS0NZMxp1g
+---[ECDSA 256]---+
|. |
| o . |
|o + . E . |
|.= * o * |
|..X * B S |
|.=o@.= + |
|+ *oBo+ |
| =.oo=o. |
|oo*+ .o |

Password for razz@rosalita:
Password for razz@rosalita:
Password for razz@rosalita:
Password for razz@rosalita:
Password for razz@rosalita:
Password for razz@rosalita:
Password for razz@rosalita:
Password for razz@rosalita:
Password for razz@rosalita:
Password for razz@rosalita:

Bingo! We have finally seen the reported vulnerability in action on a live system. I pressed Ctrl-C to abort after ten tries; my heart wasn't quite in it for the full ten thousand. So far, it looks like this behavior is specific to FreeBSD, but of course it is conceivable that other systems ship with their sshds configured in a similar way.

After a bit of back and forth and reading articles elsewhere, it seems that only OpenSSH servers that are set up to use PAM for authentication, and with a very specific (non-default on OpenBSD and most other places) setup, are in fact vulnerable. Even though there is a patch available which tightens up the code a bit in the PAM-specific parts, OpenBSD users don't actually need to apply it, one big reason being that OpenBSD does not use PAM for its authentication.

The question also came up in a thread on OpenBSD-misc, titled Alleged OpenSSH bug, where several OpenBSD developers commented. Do read the whole thread, but as we've already seen, it's easy to test whether your systems behave as described in the original blog post as well as this one.

And as OpenBSD developer Marc Espie says in his message,

Not surprisingly, as the patch clearly shows, the problem is right smack in the middle of USE_PAM code.

I wouldn't call that an OpenSSH bug. I would call it a systemic design flaw in PAM. As usual. LOTS of security holes in authentication systems stem from PAM. Why ? Because that stuff is over designed. Difficult to configure. Gives you MORE than you need to hang yourself several times over. It's been that way for as long as I can remember.

As they say, do read the whole thing. TL;DR this time around is: OpenBSD is not vulnerable, and on the systems that are, changing the configuration to close this particular bruteforcing opportunity is trivial. As is checking the facts before writing up a story. (And as several correspondents have reminded me already -- switching your sshd to keys only authentication will let you sleep better at night.)
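For the record, on an affected PAM-using system the trivial change is along these lines in sshd_config (a sketch using standard OpenSSH options, not necessarily the exact fix any particular vendor ships):

```
# /etc/ssh/sshd_config
ChallengeResponseAuthentication no   # closes the keyboard-interactive/PAM path
MaxAuthTries 3
# Or go keys-only, as suggested above:
PasswordAuthentication no
```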

by Peter N. M. Hansteen ( at July 24, 2015 03:28 PM

OpenVAS: Creating credentials is very slow [FIXED]

When creating new credentials in OpenVAS (6, 7 and 8), it takes a very long time to store them.

The problem is that the credentials are stored encrypted, and OpenVAS (probably) has to generate a PGP key. This requires a lot of random entropy, which is generally not abundantly available on a virtual machine. The solution is to install haveged:

sudo apt-get install haveged

Haveged will securely seed the random pool which will make a lot of random entropy available, even if you have no keyboard, mouse and soundcard attached. Ideal for VPSes.

by admin at July 24, 2015 10:50 AM

July 23, 2015

Running Adventure on the DEC PDP-8 with SIMH

In this guide I'll show you how to run the classic Colossal Cave Adventure game on a PDP-8, emulated by SIMH. The PDP-8 was a 12-bit minicomputer introduced in 1965 by DEC, the Digital Equipment Corporation. We will install and set up SIMH with an RK05 disk image running OS/8, use FORTRAN on OS/8 to load ADVENTURE, and then use our brains to play the game. As a bonus, I'll also show you how to edit files using EDIT and give a brief tour of the OS/8 system.

July 23, 2015 12:00 AM

July 22, 2015

Errata Security

Infosec's inability to quantify risk

Infosec isn't a real profession. Among the things missing is proper "risk analysis". Instead of quantifying risk, we treat it as an absolute. Risk is binary, either there is risk or there isn't. We respond to risk emotionally rather than rationally, claiming all risk needs to be removed. This is why nobody listens to us. Business leaders quantify and prioritize risk, but we don't, so our useless advice is ignored.

An example of this is the car hacking stunt by Charlie Miller and Chris Valasek, where they turned off the engine at freeway speeds. This has led to an outcry of criticism in our community from people who haven't quantified the risk. Any rational measure of the risk of that stunt shows that it's pretty small -- while the benefits are very large.

In college, I owned a poorly maintained VW bug that would occasionally lose power on the freeway, such as from an electrical connection falling off from vibration. I caused more risk by not maintaining my car than these security researchers did.

Indeed, cars losing power on the freeway is a rather common occurrence. We often see cars on the side of the road. Few accidents are caused by such cars. Sure, they add risk, but so do people abruptly changing lanes.

No human is a perfect driver. Every time we get into our cars, instead of cycling or taking public transportation, we add risk to those around us. The majority of those criticizing this hacking stunt have caused more risk to other drivers this last year by commuting to work. They cause this risk not for some high ideal of improving infosec, but merely for personal convenience. Infosec is legendary for its hypocrisy; this is just one more example.

Google, Tesla, and other companies are creating "self driving cars". Self-driving cars will always struggle to cope with unpredictable human drivers, and will occasionally cause accidents. However, in the long run, self-driving cars will be vastly safer. To reach that point, we need to quantify risk. We need to be able to show that for every life lost due to self-driving cars, two have been saved because they are inherently safer. But here's the thing, if we use the immature risk analysis from the infosec "profession", we'll always point to the one life lost, and never quantify the two lives saved. Using infosec risk analysis, safer self-driving cars will never happen.

In hindsight, it's obvious to everyone that Valasek and Miller went too far. Renting a track for a few hours costs less than the plane ticket for the journalist to come out and visit them. Infosec is like a pride of lions, that'll leap and devour one of their members when they show a sign of weakness. This minor mistake is weakness, so many in infosec have jumped on the pair, reveling in righteous rage. But any rational quantification of the risks show that the mistake is minor, compared to the huge benefit of their research. I, for one, praise these two, and hope they continue their research -- knowing full well that they'll likely continue to make other sorts of minor mistakes in the future.

by Robert Graham ( at July 22, 2015 05:21 PM

The Tech Teapot

Your business model is not a software feature

I created a product a few years ago, and whilst it is doing fine on new sales, it is really bad at monetising the existing customer base. The reason it is doing so badly at monetising our existing customers is that I assumed the business model could be plugged in later, like any other software feature.

I was 100% wrong.

Why didn’t I build the business model in from the start? Patience. Or rather my lack of patience.

The software took quite a while to write and I was very keen to get it out of the door as quickly as possible. I got to the stage that I was sick of the sight of the software and just wanted it finished. There is nothing wrong with wanting your project finished. But your project cannot be done if the business model isn’t baked in.

When I originally created the product, I tried to create the simplest product possible that nevertheless delivered value to the customer.

I put two things off from the first version of the software. One was the ability to notify customers when a new version of the software is available. The second was the ability to renew the software subscription after the free period had elapsed.

I thought that I could plug in the business model at a later stage, just like a new feature. Turns out I was wrong. Whilst I could retrofit it now, an awful lot of the value has been lost. Perhaps most of the value. The vast majority of existing customers will never know about the new software and so will never upgrade.

One of the nice things about developing for the various app stores is that you don’t need to build the business model. Somebody else has done that for you.

by Jack Hughes at July 22, 2015 11:21 AM

Standalone Sysadmin

So…containers. Why? How? What? Start here if you haven’t.

I tweeted a link today about running Ceph inside of Docker, something that I would like to give a shot (mostly because I want to learn Docker better, and I’ve never played with Ceph, and it has a lot of interesting stuff going on).

I got to thinking about it, and realized that I haven’t written much about Docker, or even containers in general.

Containers are definitely the new hotness. Kubernetes just released 1.0 today, Docker has taken the world by storm, and here I am, still impressed by my Vagrant-fu and thinking that digital watches are a pretty neat idea. What happened? Containers? But, what about virtualization? Aren’t containers virtualization? Sort of? I mean, what is a container, anyway?

Let’s start there before we get in too deep, alright?

UNIX (and by extension, Linux) has, for a long time, had a pretty cool command called 'chroot'. The chroot command allows you to point at an arbitrary directory and say “I want that directory to be the root (/) now”. This is useful if you have a particular process or user that you want to cordon off from the rest of the system, for example.

This is a really big advantage over virtual machines in several ways. First, it’s not very resource intensive at all. You aren’t emulating any hardware, you aren’t spinning up an entire new kernel, you’re just moving execution over to another environment (executing another shell, say), plus any services that the new environment needs to have running. It’s also very fast, for the same reason. A VM may take minutes to spin up completely, but a lightweight chroot can be done in seconds.

It’s actually pretty easy to build a workable chroot environment. You just need to install all of the things that need to exist for a system to work properly. There’s a good instruction set out there, but it’s a bit outdated (from 2010), so here’s a quick set of updated instructions.

Just go to Digital Ocean and spin up a quick CentOS 7 box (512MB is fine) and follow along:

# rpm --rebuilddb --root=/var/tmp/chroot/
# cd /var/tmp/chroot/
# wget
# rpm -i --root=/var/tmp/chroot --nodeps ./centos-release-7-1.1503.el7.centos.2.8.x86_64.rpm
# yum --installroot=/var/tmp/chroot install -y rpm-build yum
# cp /var/tmp/chroot/etc/skel/.??* /var/tmp/chroot/root/

At this point, there’s a pretty functional CentOS install in /var/tmp/chroot/ which you can see if you do an ‘ls’ on it.

Let’s check just to make sure we are where we think we are:

# pwd

Now do the magic:

# chroot /var/tmp/chroot /bin/bash -l

and voila:

# pwd

How much storage is this costing us? Not too much, all things considered:

[root@dockertesting tmp]# du -hs
436M .

Not everything is going to be functional, which you’ll quickly discover. Linux expects things like /proc/ and /sys/ to be around, and when they aren’t, it gets grumpy. The obvious problem is that if you extend /proc/ and /sys/ into the chrooted environment, you expose the outside OS to the container…probably not what you were going for.

It turns out to be very hard to secure a straight chroot. BSD fixed this in the kernel by using jails, which go a lot farther in isolating the instance, but for a long time, there wasn’t an equivalent in Linux, which made escaping from a chroot jail something of a hobby for some people. Fortunately for our purposes, new tools and techniques were developed for Linux that went a long way to fixing this. Unfortunately, a lot of the old tools that you and I grew up with aren’t future compatible and are going to have to go away.

Enter the concept of cgroups. Cgroups are in-kernel walls that can be erected to limit resources for a group of processes. They also allow for prioritization, accounting, and control of processes, and they paved the way for a lot of other features that containers take advantage of, like namespaces. Where cgroups limit what a group of processes can use, namespaces limit what they can see: each container gets its own network interface, say, but doesn’t have the ability to spy on its neighbors running on the same machine, and each container can have its own PID 1, which isn’t possible with old chroot environments.
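You can actually poke at namespaces on any reasonably modern Linux box without any container tooling at all: the kernel exposes each process's namespaces as symlinks under /proc/&lt;pid&gt;/ns. A quick sketch:

```shell
# Every process already belongs to a set of namespaces; a container is,
# at its core, a process whose namespace links point somewhere different
# from everyone else's. List the namespaces of the current shell:
ls /proc/$$/ns
# On a recent kernel you'll see entries like:
#   ipc  mnt  net  pid  user  uts  (and cgroup on newer kernels)
```

Two processes in the same container show identical link targets here; processes in different containers don’t.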

You can see that with containers, there are a lot of benefits, and thanks to modern kernel architecture, we lose a lot of the drawbacks we used to have. This is the reason that containers are so hot right now. I can spin up hundreds of docker containers in the time that it takes my fastest Vagrant image to boot.

Docker. I keep saying Docker. What’s Docker?

Well, it’s one tool for managing containers. Remember all of the stuff we went through above to get the chroot environment set up? We had to make a directory, force-install an RPM that could then tell yum what OS to install, then we had to actually have yum install the OS, and then we had to set up root’s skel so that we had aliases and all of that stuff. And after all of that work, we didn’t even have a machine that did anything. Wouldn’t it be nice if there was a way to just say the equivalent of “Hey! Make me a new machine!”? Enter docker.

Docker is a tool to manage containers. It manages the images, the instances of them, and overall, does a lot for you. It’s pretty easy to get started, too. In the CentOS machine you spun up above, just run

# yum install docker -y
# service docker start

Docker will install, and then it’ll start up the docker daemon that runs in the background, keeping tabs on instances.

At this point, you should be able to run Docker. Test it by running the hello-world instance:

[root@dockertesting ~]# docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from
a8219747be10: Pull complete
91c95931e552: Already exists
The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security.
Digest: sha256:aa03e5d0d5553b4c3473e89c8619cf79df368babd18681cf5daeb82aab55838d
Status: Downloaded newer image for
Usage of loopback devices is strongly discouraged for production use. Either use `--storage-opt dm.thinpooldev` or use `--storage-opt dm.no_warn_on_loop_devices=true` to suppress this warning.
Hello from Docker.
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the “hello-world” image from the Docker Hub.
(Assuming it was not already locally available.)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash

For more examples and ideas, visit:

The message is pretty self-explanatory, but you can see how Docker understood that you asked for something it didn’t immediately know about, searched the repository at the Docker Hub to find it, pulled it down, and ran it.
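Images like hello-world don’t appear out of nowhere, either: somebody describes them in a Dockerfile and pushes the built result to the Hub. As a rough sketch (the package and tag names here are just illustrative), a Dockerfile for a web server based on the same CentOS we’ve been using might look like:

```dockerfile
# Start from an existing base image on the Docker Hub
FROM centos:7

# Bake the packages we need into the image
RUN yum install -y httpd

# What to run when a container is started from this image
CMD ["/usr/sbin/httpd", "-DFOREGROUND"]
```

Running `docker build -t myhttpd .` in the directory containing that file produces a local image you can `docker run`, exactly like the ones pulled from the Hub.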

There’s a wide world of Docker out there (start with the Docker docs, for instance, then maybe read Nathan LeClaire’s blog, among others).

And Docker is just the beginning. Docker’s backend uses something called “libcontainer” (actually now or soon to be runc, of the Open Container format), but it migrated to that from LXC, another set of tools and API for manipulating the kernel to create containers. And then there’s Kubernetes, which you can get started with on about a dozen platforms pretty easily.

Just remember to shut down and destroy your Digital Ocean droplet when you’re done, and in all likelihood, you’ve spent less than a quarter to have a fast machine with lots of bandwidth to serve as a really convenient learning lab. This is why I love Digital Ocean for stuff like this!

In wrapping this up, I feel like I should add this: Don’t get overwhelmed. There’s a TON of new stuff out there, and there’s new stuff coming up constantly. Don’t feel like you have to keep up with all of it, because that’s not possible to do while maintaining the rest of your life’s balance. Pick something, and learn how it works. Don’t worry about the rest of it – once you’ve learned something, you can use that as a springboard to understanding how the next thing works. They all sort of follow the same pattern, and they’re all working toward the same goal of providing rapidly available, short-lived service instances. They use different formats, backends, and so on, but it’s not important to master all of them, and feeling overwhelmed is a normal response to a situation like this.

Just take it one step at a time and try to have fun with it. Learning should be enjoyable, so don’t let it get you down. Comment below and let me know what you think of Docker (or Kubernetes, or whatever your favorite new tech is).

by Matt Simmons at July 22, 2015 10:32 AM

July 21, 2015

LZone - Sysadmin

PHP preg_replace() Examples

This post gives some simple examples for using regular expressions with preg_replace() in PHP scripts.

1. Syntax of preg_replace

The full syntax is:

mixed preg_replace ( mixed $pattern , mixed $replacement , mixed $subject [, int $limit = -1 [, int &$count ]] )

2. Simple Replacing with preg_replace()

$result = preg_replace('/abc/', 'def', $string);   # Replace all 'abc' with 'def'
$result = preg_replace('/abc/i', 'def', $string);  # Replace with case insensitive matching
$result = preg_replace('/\s+/', '', $string);      # Strip all whitespace

3. Advanced Usage of preg_replace()

Multiple replacements:

$result = preg_replace(
    array('/pattern1/', '/pattern2/'),
    array('replace1', 'replace2'),
    $string);

Replacement Back References:

$result = preg_replace('/abc(def)hij/', '\\1', $string);
$result = preg_replace('/abc(def)hij/', '$1', $string);
$result = preg_replace('/abc(def)hij/', '${1}', $string);

Do only a finite number of replacements:

# Perform a maximum of 5 replacements
$result = preg_replace('/abc/', 'def', $string, 5);

Multi-line replacements:

# Strip a <tag>...</tag> pair, even when it spans multiple lines;
# the 's' modifier makes '.' match newlines as well
$result = preg_replace('#<tag>.*?</tag>#s', '', $string);

July 21, 2015 10:19 PM


Going Too Far to Prove a Point

I just read Hackers Remotely Kill a Jeep on the Highway - With Me in It by Andy Greenberg. It includes the following:

"I was driving 70 mph on the edge of downtown St. Louis when the exploit began to take hold...

To better simulate the experience of driving a vehicle while it’s being hijacked by an invisible, virtual force, Miller and Valasek refused to tell me ahead of time what kinds of attacks they planned to launch from Miller’s laptop in his house 10 miles west. Instead, they merely assured me that they wouldn’t do anything life-threatening. Then they told me to drive the Jeep onto the highway. “Remember, Andy,” Miller had said through my iPhone’s speaker just before I pulled onto the I-40 on-ramp, “no matter what happens, don’t panic.”

As the two hackers remotely toyed with the air-conditioning, radio, and windshield wipers, I mentally congratulated myself on my courage under pressure. That’s when they cut the transmission.

Immediately my accelerator stopped working. As I frantically pressed the pedal and watched the RPMs climb, the Jeep lost half its speed, then slowed to a crawl. This occurred just as I reached a long overpass, with no shoulder to offer an escape. The experiment had ceased to be fun.

At that point, the interstate began to slope upward, so the Jeep lost more momentum and barely crept forward. Cars lined up behind my bumper before passing me, honking. I could see an 18-wheeler approaching in my rearview mirror. I hoped its driver saw me, too, and could tell I was paralyzed on the highway.

“You’re doomed!” Valasek shouted, but I couldn’t make out his heckling over the blast of the radio, now pumping Kanye West. The semi loomed in the mirror, bearing down on my immobilized Jeep.

I followed Miller’s advice: I didn’t panic. I did, however, drop any semblance of bravery, grab my iPhone with a clammy fist, and beg the hackers to make it stop...

After narrowly averting death by semi-trailer, I managed to roll the lame Jeep down an exit ramp, re-engaged the transmission by turning the ignition off and on, and found an empty lot where I could safely continue the experiment." (emphasis added)

I had two reactions to this article:

1. It is horrifying that hackers can remotely take control of a vehicle. The auto industry has a lot of work to do. It's unfortunate that it takes private research and media attention to force a patch (which has now been published). Hopefully a combination of Congressional attention, product safety laws, and customer pressure will improve the security of the auto industry before lives and property are affected.

2. It is also horrifying to conduct a hacking "experiment" on I-40, with vehicles driving at 60 or more MPH, carrying passengers. It's not funny to put lives at risk, whether they are volunteers like the driver/author or other people on the highway.

Believing it is ok reflects the same juvenile thinking that motivated another "researcher," Chris Roberts, to apparently "experiment" with live airplanes, as reported by Wired and other news outlets.

Hackers are not entitled to jeopardize the lives of innocent people in order to make a point. They can prove their discoveries without putting others, who have not consented to be guinea pigs, at risk.

It would be a tragedy if the first death by physical-digital convergence occurs because a "security researcher" is "experimenting" in order to demonstrate a proof of concept.

by Richard Bejtlich at July 21, 2015 01:45 PM


Community Leadership Summit 2015

My Saturday kicked off with the Community Leadership Summit (CLS) here in Portland, Oregon.

CLS sign

Jono Bacon opened the event by talking about the growth of communities in the past several years as internet-connected communities of all kinds are springing up worldwide. Though this near-OSCON CLS is open source project heavy, he talked about communities that range from the Maker movement to political revolutions. While we work to develop best practices for all kinds of communities, it was nice to hear one of his key thoughts as we move forward in community building: “Community is not an extension of the Marketing department.”

The day continued with a series of plenaries, which were 15 minutes long and touched upon topics like empathy, authenticity and vulnerability in community management roles. The talks wrapped up with a Facilitation 101 talk to give tips on how to run the unconference sessions. We then did the session proposals and scheduling that would pick up after lunch.

CLS schedule

As mentioned in my earlier post we had some discussion points from our experiences in the Ubuntu community that we wanted to get feedback on from the broader leadership community so we proposed 4 sessions that lasted the afternoon.

Lack of new generation of leaders

The root of this session came from our current struggle in the Ubuntu community to find leaders, from those who wish to sit on councils and boards to leaders for the LoCo teams. In addition to several people who expressed similar problems in their own communities, there was some fantastic feedback from folks who attended, including:

  • Some folks don’t see themselves as “leaders”, so using that word can be intimidating; if you find this is the case, shift to different types of titles that do more to describe the role being taken.
  • Document tasks that you do as a leader and slowly hand them off to people in your community to build a supportive group of people who know the ins and outs and can take a leadership role in the future.
  • Evaluate your community every few years to determine whether your leadership structure still makes sense, and make changes with every generation of community leaders if needed (and it often is!).
  • If you’re seeking to get more contributions from people who are employed to do open source, you may need to engage their managers to prioritize appropriately. Also, make sure credit is given to companies who are paying employees to contribute.
  • Set a clear set of responsibilities and expectations for leadership positions so people understand the role, commitment level and expectations of them.
  • Actively promote people who are doing good work, whether by expressing thanks on social media, in blog posts and whatever other communications methods you employ, as well as inviting them to speak at other events, fund them to attend events and directly engage them. This will all serve to build satisfaction and their social capital in the community.
  • Casual mentorship of aspiring leaders who you can hand over projects for them to take over once they’ve begun to grow and understand the steps required.

Making lasting friendships that are bigger than the project

This was an interesting session, proposed because many of us found that we built strong relationships with people early on in Ubuntu, but have noticed fewer of those developing in the past few years. Many of us have friendships which have lasted even as people left the project, or left the tech industry entirely; for us, Ubuntu wasn’t just an open source project, we were all building lasting relationships.

Recommendations included:

  • In person events are hugely valuable to this (what we used to get from Ubuntu Developer Summits). Empower local communities to host major events.
  • Find a way to have discussions that are not directly related to the project with your fellow project members, including creating a space where there’s a weekly topic, giving a space to share accomplishments, and perhaps not lumping it all together (some new off-topic threads on Discourse?)
  • Provide a space to have check-ins with members of and teams in your community, how is life going? Do you have the resources you need?
  • Remember that tangential interests are what bring people together on a personal level and seek to facilitate that

There was also some interesting discussion around handling contributors whose behavior has become disruptive (often due to personal things that have come up in their life), from making sure a Code of Conduct is in place to set expectations for behavior to approaching people directly to check in to make sure they’re doing all right and to discuss the change in their behavior.

Declining Community Participation

We proposed this session because we’ve seen a decline in community participation since before the Ubuntu Developer Summits ceased. We spent some time framing this problem in the space it’s in, with many Linux distributions and “core” components seeing similar decline and disinterest in involvement. It was also noted that when a project works well, people are less inclined to help because they don’t need to fix things, which may certainly be the case with a product like the Ubuntu server. In this vein, it was noted that 10 years ago the contributor to user ratio was much higher, since many people who used it got involved in order to file bugs and collaborate to fix things.

Some of the recommendations that came out of this session:

  • Host contests and special events to showcase new technologies to get people excited about involvement (made me think of Xubuntu testing with XMir, we had a lot of people testing it because it was an interesting new thing!)
  • In one company, the co-founder set a community expectation for companies who were making money from the product to give back 5% in development (or community management, or community support).
  • Put a new spin on having your code reviewed: it’s constructive criticism from programmers with a high level of expertise, you’re getting training while they chime in on reviews. Note that the community must have a solid code review community that knows how to help people and be kind to them in reviews.
  • Look at bright spots in your community and recreate them: Where has the community grown? (Ubuntu Phone) How can you bring excitement there to other parts of your project? Who are your existing contributors in the areas where you’ve seen a decline and how can you find more contributors like them?
  • Share stories about how your existing members got involved so that new contributors see a solid on-ramp for themselves, and know that everyone started somewhere.
  • Make sure you have clear, well-defined on-ramps for various parts of your project, it was noted that Mozilla does a very good job with this (Ubuntu does use Mozilla’s Asknot, but it’s hard to find!).

Barriers related to single-vendor control and development of a project

This session came about because of the obvious control that Canonical has in the direction of the Ubuntu project. We sought to find advice from other communities where there was single-vendor control. Perhaps unfortunately, the session trended heavily toward Ubuntu specifically, but we were able to get some feedback from other communities and how they handle decisions made in an ecosystem with both paid and volunteer contributors:

  • Decisions should happen in a public, organized space (not just an IRC log, Google Hangout or in person discussion, even if these things are made public). Some communities have used: Github repo, mailing list threads, Request For Comment system to gather feedback and discuss it.
  • Provide a space where community members can submit proposals that the development community can take seriously (we used to have a site for this, but it wound down over the years and became less valuable).
  • Make sure the company counts contributions as real, tangible things that should be considered for monetary value (non-profits already do this for their volunteers).
  • Make sure the company understands the motivation of community members so they don’t accidentally undermine this.
  • Evaluate expectations in the community, are there some things the company won’t budge on? Are they honest about this and do they make this clear before community members make an investment? Ambiguity hurts the community.

I’m really excited to have further discussions in the Ubuntu community about how these insights can help us. Once I’m home I’ll be able to collect my thoughts and take ideas, and perhaps even action items, to the ubuntu-community-team mailing list (which everyone is welcome to participate in).

This first day concluded with a feedback session for the summit itself, which brought up some great points. On to day two!

As with day one, we began the day with a series of plenaries. The first was presented by Richard Millington, who talked about ten “social psychology hacks” you can use to increase participation in your community. These included “priming” (using existing associations to encourage certain feelings), making sure you craft the story of your community, designing community rituals to make people feel included, and using existing contributors to gain more through referrals.

It was then time for Laura Czajkowski’s talk about “Making your Marketing team happy”. My biggest take-away from this one was that not only has she learned to use the tools the marketing team uses, but she now attends their meetings so she can stay informed of their projects and chime in when a suggestion has been made that may cause disruption (or worse!) in the community.

Henrik Ingo then gave a talk analyzing the governance types of many open source projects. He found that all of the “extra large” projects, developer- and commit-wise, were run by a foundation, and that there seemed to be a limit to how big single-vendor-controlled projects could get. I had suspected this was the case, but it was wonderful to have his data to back up my suspicions.

Finally, Gina Likins of Red Hat spoke about her work to get universities and open source projects working together. She began her talk by explaining how few college Computer Science majors are familiar with open source, and suggested that a kind of “dating site” be created to match up open source projects with professors looking to get their students involved. Brilliant! I attended her related session later in the afternoon.

My afternoon was spent first by joining Gina and others to talk about relationships between university professors and open source communities. Her team runs a project devoted to this, and it turns out I subscribed to their mailing list some time ago. She outlined several goals, from getting students familiar with open source tooling (IRC, mailing lists, revision control, bug trackers) all the way up to more active roles directly in open source projects where the students are submitting patches. I’m really excited to see where this goes and hope I can someday participate in working with some students beyond the direct mentoring through internships that I’m doing now.

Aside from substantial “hallway track” time where I got to catch up with some old friends and meet some people, I went to a session on having open and close-knit communities where people talked about various things, from reaching out to people when they disappear, the importance of conduct standards (and swift enforcement), and going out of your way to participate in discussions kicked off by newcomers in order to make them feel included. The last session I went to shared tips for organizing local communities, and drew from the off-line community organizing that has happened in the past. Suggestions for increasing participation for your group included cross-promotion of groups (either through sharing announcements or doing some joint meetups), not letting volunteers burn out/feel taken for granted and making sure you’re not tolerating poisonous people in your community.

The Community Leadership Summit concluded with a Question and Answer session. Many people really liked the format, keeping the morning pretty much confined to the set presentations and setting up the schedule, allowing us to take a 90 minute lunch (off-site) and come back to spend the whole afternoon in sessions. In all, I was really pleased with the event, kudos to all the organizers!

July 21, 2015 05:10 AM

Adnans Sysadmin/Dev Blog

No Choice

This is why I don't like using web services. As a user, I'm given no choice in what I want. On Twitter, it used to be OK to set a wallpaper for your home timeline; now it isn't. For a free web service, I suppose it's acceptable for them to make that choice for you.

Soon, with automatic updates in Windows 10 delivering new features, I'd expect something like this to happen with desktop software too. Is that acceptable?

by Adnan Wasim at July 21, 2015 01:26 AM

Errata Security

My BIS/Wassenaar comment

This is my comment I submitted to the BIS on their Wassenaar rules:


I created the first “intrusion prevention system”, as well as many tools and much cybersecurity research over the last 20 years. I would not have done so had these rules been in place. The cost and dangers would have been too high. If you do not roll back the existing language, I will be forced to do something else.

After two months of reading your FAQ and consulting with lawyers and export experts, the cybersecurity industry still hasn’t figured out precisely what your rules mean. The language is so open-ended that it appears to control everything. My latest project is a simple “DNS server”, a piece of software wholly unrelated to cybersecurity. Yet, since hackers exploit “DNS” for malware command-and-control, it appears to be covered by your rules: it can be used for both the distribution and control of malware. This isn’t my intent; it’s just a consequence of how “DNS” works. I haven’t decided whether to make this tool open-source yet, so traveling to foreign countries with the code on my laptop appears to be a felony violation of export controls.

Of course you don’t intend to criminalize this behavior, but that isn’t the point. The point is that the rules are so vague that it becomes impossible for anybody to know exactly what is prohibited. We therefore have to take the conservative approach. As we’ve seen with other vague laws, such as the CFAA, enforcement is arbitrary and discriminatory. None of us would have believed that downloading files published on a public website would be illegal until a member of the community was convicted under the CFAA for doing it. None of us wants to be a similar test case for export controls. The current BIS rules are so open-ended that they would have a powerful chilling effect on our industry.

The solution, though, isn’t to clarify the rules, but to roll them back. You can’t clarify the difference between good/bad software because there is no difference between offensive and defensive tools -- just the people who use them. The best way to secure your network is to attack it yourself. For example, my “masscan” tool quickly scans large networks for vulnerabilities like “Heartbleed”. Defenders use it to quickly find vulnerable systems, to patch them. But hackers also use my tool to find vulnerable systems to hack them. There is no solution that stops bad governments from buying “intrusion” or “surveillance” software that doesn’t also stop their victims from buying software to protect themselves. Export controls on offensive software means export controls on defensive software. Export controls mean the Sudanese and Ethiopian people can no longer defend themselves from their own governments.

Wassenaar was intended to stop “proliferation” and “destabilization”, yet intrusion/surveillance software is neither of those. Human rights activists have hijacked the arrangement for their own purposes. This is a good purpose, of course, since these regimes are evil. It’s just that Wassenaar is the wrong way to do this, with a disproportionate impact on legitimate industry, while at the same time, hurting the very people it’s designed to help. Likewise, your own interpretation of Wassenaar seems to have been hijacked by the intelligence community in the United States for their own purposes to control “0days”.

Rather than the current open-ended and vague interpretation of the Wassenaar changes, you must do the opposite and create the narrowest of interpretations. Better yet, you need to go back and renegotiate the rules with the other Wassenaar members, as software is not a legitimate target of Wassenaar control. Computer code is not a weapon; if you make it one, you’ll destroy America’s standing in the world. On a personal note, if you don’t drastically narrow this, my research and development will change. Either I will stay in this country and do something else, or I will move out of this country (despite being a fervent patriot).

Robert Graham
Creator of BlackICE, sidejacking, and masscan.
Frequent speaker at cybersecurity conferences.

by Robert Graham at July 21, 2015 12:47 AM

July 20, 2015

Cryptography Engineering

A history of backdoors

The past several months have seen an almost eerie re-awakening of the 'exceptional access' debate -- also known as 'Crypto Wars'. For those just joining the debate, the TL;DR is that law enforcement wants software manufacturers to build wiretapping mechanisms into modern encrypted messaging systems. Software manufacturers, including Google and Apple, aren't very thrilled with that.

The funny thing about this debate is that we've had it before. It happened during the 1990s with the discussion around the Clipper chip, and the outcome was not spectacular for the pro-'access' side. But not everyone agrees.

Take, for example, former NSA general counsel Stewart Baker, who has his own reading of history:
A good example is the media’s distorted history of NSA’s 1994 Clipper chip. That chip embodied the Clinton administration’s proposal for strong encryption that “escrowed” the encryption keys to allow government access with a warrant. ... The Clipper chip and its key escrow mechanism were heavily scrutinized by hostile technologists, and one, Matthew Blaze, discovered that it was possible with considerable effort to use the encryption offered by the chip while bypassing the mechanism that escrowed the key and thus guaranteed government access. ... In any event, nothing about Matt Blaze’s paper questioned the security being offered by the chip, as his paper candidly admitted.
The press has largely ignored Blaze’s caveat.  It doesn’t fit the anti-FBI narrative, which is that government access always creates new security holes. I don’t think it’s an accident that no one talks these days about what Matt Blaze actually found except to say that he discovered “security flaws” in Clipper.  This formulation allows the reader to (falsely) assume that Blaze’s research shows that government access always undermines security. 
It's not clear why Mr. Baker is focusing on Clipper, rather than the much more recent train wreck of NSA's 'export-grade crypto' access proposals. It's possible that Baker just isn't that familiar with the issue. Indeed, it's the almost proud absence of technological expertise on the pro-'government access' side that has made this debate so worrying.

But before we get to the more recent history, we should clarify a few things. Yes: the fact that Clipper -- a multi-million dollar, NSA designed technology -- emerged with fundamental flaws in its design is a big deal. It matters regardless of whether the exploit led to plaintext recovery or merely allowed criminals to misuse the technology in ways they weren't supposed to.

But Clipper is hardly the end of the story. In fact, Clipper is only one of several examples of 'government access' mechanisms that failed and blew back on us catastrophically. Failures have occurred as recently as this year, with the FREAK and LogJam attacks on TLS resulting in vulnerabilities that affected nearly 1/3 of secure websites -- including (embarrassingly) the FBI and NSA themselves. And these did undermine security.

With Mr. Baker's post as inspiration, I'm going to spend the rest of this post talking about how real-world government access proposals have fared in practice -- and how the actual record is worse than any technologist could have imagined at the time.

The Clipper chip

Image: Travis Goodspeed (CC BY 2.0, via Wikimedia Commons)
Clipper is the most famous of government access proposals. The chip was promoted as a ubiquitous hardware solution for voice encryption in the early 1990s -- coincidentally, right on the eve of a massive revolution in software-based encryption and network voice communications. In simple terms, this meant that technologically Clipper was already a bit of a dinosaur by the time it was proposed. 

Clipper was designed by the NSA, with key pieces of its design kept secret and hidden within tamper-resistant hardware. One major secret was the design of the Skipjack block cipher it used for encryption. All of this secrecy made it hard to evaluate the design, but the secrecy wasn’t simply the result of paranoia. Its purpose was to inhibit the development of unsanctioned Clipper-compatible devices that bypass Clipper's primary selling point — an overt law enforcement backdoor.

The backdoor worked as follows. Each Clipper chip shipped with a unique identifier and unit key that was programmed by blowing fuses during manufacture. Upon negotiating a session key with another Clipper, the chip would transmit a 128-bit Law Enforcement Access Field (LEAF) that contained an encrypted version of the ID and session key, wrapped using the device's unit key. The government maintained a copy of each device's unit key, split and stored at two different sites.

To protect the government’s enormous investment in hardware and secret algorithms, the Clipper designers also incorporated an authentication mechanism consisting of a further 16-bit checksum on the two components of the LEAF, further encrypted using a family key shared between all devices. This prevented a user from tampering with or destroying the LEAF checksum as it transited the wire -- any other compatible Clipper could decrypt and verify the checksum, then refuse the connection if it was invalid.

A simple way to visualize the Clipper design is to present it as three legs of a tripod, (badly) illustrated as follows:

The standout feature of Clipper's design is its essential fragility. If one leg of the tripod fails, the entire construction tumbles down around it. For example: if the algorithms and family keys became public, then any bad actor could build a software emulator that produced apparently valid but useless LEAFs. If the tamper resistance failed, the family key and algorithm designs would leak out. And most critically: if the LEAF checksum failed to protect against on-the-wire modification, then all the rest would be a waste of money and time. Criminals could hack legitimate Clippers to interoperate without fear of interception.

In other words, everything had to work, or nothing made any sense at all. Moreover, since most of the design was secret, users were forced to trust in its security. One high-profile engineering failure would tend to undermine that confidence.

Which brings us to Matt Blaze's results. In a famous 1994 paper, Blaze looked specifically at the LEAF authentication mechanism, and outlined several techniques for bypassing it on real Clipper prototypes. These ranged from the 'collaborative' -- the sender omits the LEAF from its transmission, and the receiver reflects its own LEAF back into its device -- to the 'unidirectional', where a sender simply generates random garbage LEAFs until it finds one with a valid checksum. With only a 16-bit checksum, the latter technique requires on average 65,536 attempts, and the sender's own device can be used as an oracle to check each candidate. Blaze was able to implement a system that did this in minutes -- and potentially in seconds, with parallelization.
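
To get a feel for how small a 16-bit search space is, here is a toy simulation (my own sketch; checksum16 is a stand-in hash, not Clipper's actual encrypted checksum, and the real attack used the device itself as the oracle): it counts how many random candidate LEAFs a sender must generate before one passes a 16-bit check.

```python
import random

# Toy model only: checksum16 stands in for Clipper's encrypted 16-bit LEAF
# checksum, which an attacker could verify via their own device (the oracle).
def checksum16(leaf):
    # Knuth multiplicative hash, truncated to 16 bits.
    return (leaf * 2654435761) & 0xFFFF

def forge_leaf(target, rng):
    """Generate random 128-bit LEAF candidates until one has the target checksum."""
    attempts = 0
    while True:
        attempts += 1
        candidate = rng.getrandbits(128)
        if checksum16(candidate) == target:
            return candidate, attempts

rng = random.Random(1994)
leaf, attempts = forge_leaf(0x1234, rng)
print("forged a valid-looking LEAF after", attempts, "attempts")
# Expected attempts: 2**16 = 65,536 -- minutes of work in 1994, trivial today.
```

The point is not the particular hash, but that a 16-bit verifier is brute-forceable by any patient sender.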

That was essentially the ballgame for Clipper.

And now we can meditate on both the accuracy and utter irrelevance of Mr. Baker’s point. It’s true that Blaze's findings didn't break the confidentiality of Clipper conversations, nor were the techniques themselves terribly practical. But none of that mattered. 

What did matter were the implications for the Clipper system as a whole. The flaws in authentication illustrated that the designers and implementers of Clipper had made elementary mistakes that fundamentally undermined the purpose of all those other, expensive design components. Without the confidence of users or law enforcement, there was no reason for Clipper to exist.

SSL/TLS Export ciphersuites: FREAK and LogJam

This would be the end of story if Clipper was the only 'government access' proposal to run off the road due to bad design and unintended consequences. Mr. Baker and Matt Blaze could call it a draw and go their separate ways. But of course, the story doesn't end with Clipper.

Mr. Baker doesn't mention this in his article, but we're still living with a much more pertinent example of a 'government access' system that failed catastrophically. Unlike Clipper, this failure really did have a devastating impact on the security of real encrypted connections. Indeed, it renders web browsing sessions completely transparent to a moderately clever attacker. Even worse, it affected hundreds of thousands of websites as recently as 2015.

The flaws I'm referring to stem from the U.S. government's pre-2000 promotion of 'export'-grade cryptography in the SSL and TLS protocols, which are used to secure web traffic and email all over the world. In order to export cryptography outside of the United States, the U.S. government required that web browsers and servers incorporate deliberately weakened ciphers that were (presumably) within the NSA's ability to access.

Unsurprisingly, while the export regulations were largely abandoned as a bad job in the late 1990s, the ciphersuites themselves live on in modern TLS implementations because that's what happens when you inter a broken thing into a widely-used standard. 

For the most part these weakened ciphers lay abandoned and ignored (but still active on many web servers) until this year, when researchers showed that it was possible to downgrade normal TLS connections to use export-grade ciphers. Ciphers that are, at this point, so weak that they can be broken in seconds on a single personal computer.
Logjam is still unpatched in Chrome/MacOS as of the date of this post.
At the high watermark in March of this year, more than one out of three websites were vulnerable to either FREAK or LogJam downgrade attacks. This included banks, e-commerce sites, and yes -- the NSA website and FBI tip reporting line. Hope you didn't care much about that last one.

Now you could argue that the export requirements weren't designed to facilitate law enforcement access. But that's just shifting the blame from one government agency to another. Worse, it invites us to consider the notion that the FBI is going to get cryptography right when the NSA didn't. This is not a conceptual framework you want to hang your policies on.


This may sound disingenuous, but the truth is that I sympathize with Mr. Baker. It's frustrating that we're so bad at building security systems in this day and age. It's maddening that we can't engineer crypto reliably even when we're trying our very best.

But that's the world we live in. It's a world where we know our code is broken, and a world where a single stupid Heartbleed or Shellshock can burn off millions of dollars in a few hours. These bugs exist, and not just the ones I listed. They exist right now as new flaws that we haven't discovered yet. Sooner or later maybe I'll get to write about them.

The idea of deliberately engineering weakened crypto is, quite frankly, terrifying to experts. It gives us the willies. We're not just afraid to try it. We have seen it tried -- in the examples I list above, and in still others -- and it's just failed terribly.

by Matthew Green ( at July 20, 2015 06:32 PM

July 19, 2015

Racker Hacker

Restoring wireless and Bluetooth state after reboot in Fedora 22

My upgrade to Fedora 22 on the ThinkPad X1 Carbon 3rd gen was fairly uneventful and the hiccups were minor. One of the more annoying items that I’ve been struggling with for quite some time is how to boot up with the wireless LAN and Bluetooth disabled by default. Restoring wireless and Bluetooth state between reboots is normally handled quite well in Fedora.

In Fedora 21, NetworkManager saved my settings between reboots. For example, if I shut down with wifi off and Bluetooth on, the laptop would boot up later with wifi off and Bluetooth on. This wasn’t working well in Fedora 22: both the wifi and Bluetooth were always enabled by default.

Digging into rfkill

I remembered rfkill and began testing out some commands. It detected that I had disabled both devices via NetworkManager (soft):

$ rfkill list
0: tpacpi_bluetooth_sw: Bluetooth
    Soft blocked: yes
    Hard blocked: no
2: phy0: Wireless LAN
    Soft blocked: yes
    Hard blocked: no

It looked like systemd had some hooks already configured to manage rfkill via the systemd-rfkill service. However, something strange happened when I tried to start the service:

# systemctl start systemd-rfkill@0
Failed to start systemd-rfkill@0.service: Unit systemd-rfkill@0.service is masked.

Well, that’s certainly weird. While looking into why it’s masked, I found an empty file in /etc/systemd:

# ls -al /etc/systemd/system/systemd-rfkill@.service 
-rwxr-xr-x. 1 root root 0 May 11 16:36 /etc/systemd/system/systemd-rfkill@.service

I don’t remember making that file. Did something else put it there?

# rpm -qf /etc/systemd/system/systemd-rfkill@.service

Ah, tlp!

Configuring tlp

I looked in tlp’s configuration file in /etc/default/tlp and found a few helpful configuration items:

# Restore radio device state (Bluetooth, WiFi, WWAN) from previous shutdown
# on system startup: 0=disable, 1=enable.
# Hint: the parameters DEVICES_TO_DISABLE/ENABLE_ON_STARTUP/SHUTDOWN
#   are ignored when this is enabled!
#RESTORE_DEVICE_STATE_ON_STARTUP=0
# Radio devices to disable on startup: bluetooth, wifi, wwan.
# Separate multiple devices with spaces.
#DEVICES_TO_DISABLE_ON_STARTUP="bluetooth wifi wwan"
# Radio devices to enable on startup: bluetooth, wifi, wwan.
# Separate multiple devices with spaces.
#DEVICES_TO_ENABLE_ON_STARTUP="bluetooth wifi wwan"
# Radio devices to disable on shutdown: bluetooth, wifi, wwan
# (workaround for devices that are blocking shutdown).
#DEVICES_TO_DISABLE_ON_SHUTDOWN="bluetooth wifi wwan"
# Radio devices to enable on shutdown: bluetooth, wifi, wwan
# (to prevent other operating systems from missing radios).
#DEVICES_TO_ENABLE_ON_SHUTDOWN="bluetooth wifi wwan"
# Radio devices to enable on AC: bluetooth, wifi, wwan
#DEVICES_TO_ENABLE_ON_AC="bluetooth wifi wwan"
# Radio devices to disable on battery: bluetooth, wifi, wwan
#DEVICES_TO_DISABLE_ON_BAT="bluetooth wifi wwan"
# Radio devices to disable on battery when not in use (not connected):
# bluetooth, wifi, wwan
#DEVICES_TO_DISABLE_ON_BAT_NOT_IN_USE="bluetooth wifi wwan"

So tlp’s default configuration doesn’t restore device state and it masked systemd’s rfkill service. I adjusted one line in tlp’s configuration and rebooted:

DEVICES_TO_DISABLE_ON_STARTUP="bluetooth wifi wwan"

After the reboot, both the wifi and Bluetooth functionality were shut off! That’s exactly what I needed.

Extra credit

Thanks to a coworker, I was able to make a NetworkManager script to automatically shut off the wireless LAN whenever I connected to a network via ethernet. This is typically what I do when coming back from an in-person meeting to my desk (where I have ethernet connectivity).

If you want the same automation, just drop this script into /etc/NetworkManager/dispatcher.d/ and make it executable:

#!/bin/bash
export LC_ALL=C

enable_disable_wifi ()
{
        result=$(nmcli dev | grep "ethernet" | grep -w "connected")
        if [ -n "$result" ]; then
                nmcli radio wifi off
        fi
}

if [ "$2" = "up" ]; then
        enable_disable_wifi
fi

Unplug the ethernet connection, start wifi, and then plug the ethernet connection back in. Once NetworkManager fully connects (DHCP lease obtained, connectivity check passes), the wireless LAN should shut off automatically.

The post Restoring wireless and Bluetooth state after reboot in Fedora 22 appeared first on

by Major Hayden at July 19, 2015 10:14 PM

July 18, 2015

Racker Hacker

Making things more super with supernova 2.0

I started supernova a little over three years ago with the idea of making it easier to use novaclient. Three years and a few downloads later, it manages multiple different OpenStack clients, like nova, glance, and trove along with some handy features for users who manage a large number of environments.

What’s new?

With some help from some friends who are much better at writing Python than I am (thanks Paul, Matt and Jason), I restructured supernova to make it more testable. The big, awkward SuperNova class was dropped and there are fewer circular imports. In addition, I migrated the CLI management components to use the click module. It’s now compatible with Python versions 2.6, 2.7, 3.3 and 3.4.

The overall functionality hasn’t changed much, but there’s a new option to specify a custom supernova configuration that sits in a non-standard location or with a filename other than .supernova. Simply use the -c flag:

supernova -c ~/work/.supernova dfw list
supernova -c ~/personal/supernova-config-v1 staging list
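
For reference, the file passed with -c uses the same INI layout as a default .supernova: one section per environment, with each key handed to the underlying client as an environment variable. A hypothetical two-environment sketch (all names, URLs, and credentials here are made up):

```ini
[dfw]
OS_AUTH_URL = https://identity.example.com/v2.0/
OS_USERNAME = myuser
OS_PASSWORD = not-a-real-password
OS_TENANT_NAME = my-project
OS_REGION_NAME = DFW

[staging]
OS_AUTH_URL = https://identity.staging.example.com/v2.0/
OS_USERNAME = myuser
OS_PASSWORD = not-a-real-password
OS_TENANT_NAME = my-staging-project
OS_REGION_NAME = STAGING
```

With a file like this, "supernova dfw list" runs nova list against whatever cloud the [dfw] section describes.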

The testing is done with Travis-CI and code coverage is checked with Codecov. Pull requests will automatically be checked with unit tests and I’ll do my best to urge committers to keep test coverage at 100%.

Updating supernova

Version 2.0.0 is already on PyPI, so an upgrade using pip is quite easy:

pip install -U supernova

The post Making things more super with supernova 2.0 appeared first on

by Major Hayden at July 18, 2015 05:42 PM

July 17, 2015

Evaggelos Balaskas

Timers in systemd

It’s time to see an example on timers in systemd.

Before we start, let’s clarify some things.

systemd’s timers are units (the simplest form of systemd files) that describe when and if a service unit should or must run, based on real time or relative time.

A real time example is similar to a cron entry, and it’s mostly declared with the OnCalendar= setting in the timer unit.
You can read what values the OnCalendar= setting can take, here.
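
Since the OnCalendar= syntax comes up a lot, a few illustrative forms (standard systemd calendar expressions; these examples are mine, not from the post):

```ini
[Timer]
# Shorthand: at the top of every hour
OnCalendar=hourly
# Every day at 02:30
OnCalendar=*-*-* 02:30:00
# Weekdays at 09:00 (weekday range syntax on reasonably recent systemd)
OnCalendar=Mon..Fri 09:00
```

Multiple OnCalendar= lines may appear in one timer; the unit fires whenever any of them matches.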

A relative time example is something like: run this service ten minutes after boot, ordered Before some other unit, and with a dependency on that unit.


To find out what timers have been declared on your system, you can run the below command:

$ systemctl list-timers


ok, let’s start an example.

I will use /usr/local/bin as the directory for my custom scripts, as this directory is on the PATH environment variable and I can run the scripts from everywhere.
Our systemd unit files must exist under /etc/systemd/system/, as this directory is for the user to create their own units.


Part One: The Script

I want to sync my emails from one system to another. The basic script is something like this:

# /usr/bin/rsync -rax /backup/Maildir/ -e ssh homepc:/mnt/backup/Maildir/ &> /dev/null

Create a shell script with your command:

# vim /usr/local/bin/backup.maildir

/usr/bin/rsync -rax /backup/Maildir/ -e ssh homepc:/mnt/backup/Maildir/ &> /dev/null

and make it executable:

# chmod +x /usr/local/bin/backup.maildir

You can run this script, once or twice to see if everything goes as planned.

Part Two: The Service

Now it’s time to create a systemd service unit:

# vim /etc/systemd/system/rsync.maildir.service

[Unit]
Description=Backup Maildir Service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/backup.maildir


Part Three: The Timer

Now it is time to create the systemd timer unit:

# vim /etc/systemd/system/rsync.maildir.timer

We have to decide when we want the service to run.
eg. Every hour, but starting 15 minutes after boot.

[Unit]
Description=rsync.maildir, Runs every hour

[Timer]
OnBootSec=15min
OnUnitActiveSec=1h
Unit=rsync.maildir.service

[Install]
WantedBy=timers.target



voila !

Part Four: Enabling it

Be aware, we haven’t finished yet.

Check that systemd can identify these files:

# systemctl list-unit-files | grep maildir
rsync.maildir.service                      static
rsync.maildir.timer                        disabled

We can run the systemd service by hand:

# systemctl start rsync.maildir.service

If you change the rsync options to -ravx -P you can see the output via

# journalctl -f &
# systemctl start rsync.maildir.service

With & I put the journalctl to the background. Type fg to bring the process to foreground.

finally we need to start & enable (so that runs after reboot) the timer:

# systemctl start rsync.maildir.timer

# systemctl enable rsync.maildir.timer
Created symlink from /etc/systemd/system/timers.target.wants/rsync.maildir.timer to /etc/systemd/system/rsync.maildir.timer.

after that:

# systemctl list-timers  | grep maildir
Fri 2015-07-17 18:42:30 UTC  59min left Fri 2015-07-17 17:42:30 UTC  40s ago rsync.maildir.timer          rsync.maildir.service

# systemctl list-unit-files | grep maildir
rsync.maildir.service                      static
rsync.maildir.timer                        enabled

To all the systemd haters: I KNOW, it’s one line in crontab !

Tag(s): systemd, timer

July 17, 2015 06:52 PM

July 16, 2015



Ubuntu at the upcoming Community Leadership Summit

This weekend I have the opportunity to attend the Community Leadership Summit. While there, I’ll be able to take advantage of an opportunity that’s rare now: meeting up with my fellow Ubuntu Community Council members Laura Czajkowski and Michael Hall, along with David Planella of the community team at Canonical. At the Community Council meeting today, I was able to work with David to narrow down a few topics that impact us, that we think would also interest other communities, and that we’ll propose for discussion at CLS:

  1. Declining participation
  2. Community cohesion
  3. Barriers related to [the perception of] company-driven control and development
  4. Lack of a new generation of leaders

As an unconference, we’ll be submitting these ideas for discussion, and we’ll see how many of them gain the interest of enough people to warrant a session.


Community Leadership Summit 2015

Since we’ll all be together, we also managed to arrange some time together on Monday afternoon and Tuesday to talk about how these challenges impact Ubuntu specifically and get to any of the topics mentioned above that weren’t selected for discussion at CLS itself. By the end of this in person gathering we hope to have some action items, or at least some solidified talking points and ideas to bring to the ubuntu-community-team mailing list. I’ll also be doing a follow-up blog post where I share some of my takeaways.

What I need from you:

If you’re attending CLS join us for the discussions! If you just happen to be in the area for OSCON in general, feel free to reach out to me (email: to have a chat while I’m in town. I fly home Wednesday afternoon.

If you can’t attend CLS but are interested in these discussions, chime in on the ubuntu-community-team thread or send a message to the Community Council at community-council at with your feedback and we’ll work to incorporate it into the sessions. You’re also welcome to contact me directly and I’ll pass things along (anonymously if you’d like, just let me know).

Finally, a reminder that this time together is not a panacea. These are complicated concerns in our community that will not be solved over a weekend and a few members of the Ubuntu Community Council won’t be able to solve them alone. Like many of you, I’m a volunteer who cares about the Ubuntu community and am doing my best to find the best way forward. Please keep this in mind as you bring concerns to us. We’re all on the same team here.

July 16, 2015 06:59 PM

Alen Krmelj

Network Weathermap – Network Map Tool

Recently I came across a neat program written in PHP, called Network Weathermap ( It helps you with drawing a network map and shows live data on it. I’d never seen it before, although it’s been around for quite a while. I decided to post a tutorial of it because there are still people out […]

by Alen Krmelj at July 16, 2015 05:38 PM

July 14, 2015

Find all services using libssl to restart after an OpenSSL update

When you update OpenSSL, software that currently has the SSL libraries loaded in memory does not automatically load the updated libraries. A full system reboot resolves that problem, but sometimes that is not possible. This command shows you all the software that has loaded the libraries, allowing you to restart only those services. If you don't restart or reload after an update, the software might still be vulnerable to issues that the update fixed.

July 14, 2015 12:00 AM

July 13, 2015

SSH ChrootDirectory / sftponly not working [FIXED]

I was trying to set up a jail for SSH on Ubuntu 14.04, but it didn't seem to work. The user I was trying to jail using ChrootDirectory could log in with SFTP, but could still see everything. Turns out there were a few issues that were causing this. The summary is:

  • All directories to the ChrootDirectory path must be owned by root and must not have world or group writability permissions.
  • Ubuntu 14.04 sysv init and upstart scripts don't actually restart SSH, so changing the config file doesn't take effect.
  • The "Match User XXXX" or "Match Group XXXX" configuration section must be placed at the end of the sshd.config file.
  • Also don't forget to make your user a member of the sftponly group if you're using "Match Group sftponly".

All paths to the jail must have correct ownerships and permissions

All directories in the path to the jail must be owned by root. So if you configure the jail as:

ChrootDirectory /home/backup/jail

Then /home, /home/backup/ and /home/backup/jail must be owned by root:<usergroup>:

chown root:root /home
chown root:backup /home/backup
chown root:backup /home/backup/jail

Permissions on at least the home directory and the jail directory must not include world-writability or group-writability:

chmod 750 /home/backup
chmod 750 /home/backup/jail

Ubuntu's SSH init script sucks

Ubuntu's SSH init scripts (both sysv init and upstart) suck. They don't actually restart SSH (notice the PID):

# netstat -pant | grep LISTEN | grep sshd
tcp   0   0*    LISTEN   13838/sshd     
# /etc/init.d/ssh restart
[root@eek]~# netstat -pant | grep LISTEN | grep sshd
tcp   0   0*    LISTEN   13838/sshd      

The PID never changes! SSH isn't actually being restarted! The bug has been reported here:

To restart it you should use the "service" command, but even then it might not actually restart:

# service ssh restart
ssh stop/waiting
ssh start/running
[root@eek]~# netstat -pant | grep LISTEN | grep sshd
tcp    0   0*   LISTEN   13838/sshd

This generally happens because you've got an error in your SSH configuration file. Naturally they don't bother telling you as much, and the log file also shows nothing.

The Match section in the SSHd configuration must be placed at the end of the file

When I finally figured out that SSH wasn't being restarted, I tried starting it by hand. You might run into the following error:

# sshd -d
sshd re-exec requires execution with an absolute path

You should execute it with the full path because SSHd will start new sshd processes for each connection, so it needs to know where it lives:

# /usr/sbin/sshd

Now I finally found out the real problem:

# /usr/sbin/sshd
/etc/ssh/sshd_config line 94: Directive 'UsePAM' is not allowed within a Match block

My config looked like this:

Match User obnam
    ChrootDirectory /home/obnam/jail
    X11Forwarding no
    AllowTcpForwarding no
    ForceCommand internal-sftp
UsePAM yes
UseDNS no

Apparently SSH is too stupid to realize the Match section is indented and thinks it runs until the end of the file. The answer here is to move the section to the end of the file:

UsePAM yes
UseDNS no
Match User obnam
    ChrootDirectory /home/obnam/jail
    X11Forwarding no
    AllowTcpForwarding no
    ForceCommand internal-sftp

This will fix the problem and sftponly should work now.

by admin at July 13, 2015 08:22 AM

The Tech Teapot

Scam of the week

Last week we had to fend off a scam attempt. The scam worked like this:

  1. Order a product from a company and arrange to pay by bank transfer;
  2. Manually pay a counterfeit cheque into the company account. The cheque has no chance of clearing. Rather cleverly, the amount of the payment was 10x higher than the amount due. So it looks like a simple transcription error;
  3. Ask for a refund for the difference to be sent by bank transfer to another account;
  4. Profit!
  5. Disappear without trace.

The only thing that saved us from falling for the scam was the fact that we know that cheques show up in our account before the money has actually cleared. So the money is sort of in your account, but you can’t spend it.

Be careful out there folks, there are some odd sorts out there.

Update 2015/07/13: The scammer sent us an email over the weekend asking where his money is :)

by Jack Hughes at July 13, 2015 08:00 AM

July 10, 2015

The Tech Teapot

Why lone software projects fail

I have an awful lot of failed software projects. Most programmers do. It comes with the territory.

Most of the failures have been the result of running out of steam one way or another. Your early enthusiasm slowly wanes until the mere thought of carrying on makes you feel a little sick.

It is easy to forget that programming is an intensely psychological activity. Your attitude is central to the success or otherwise of your project.

Feature Creep

In my experience, I fail most often when I don’t maintain feature discipline. At the beginning of a project, you define the requirements for the project: the minimal set of features that are useful. You then start implementing those requirements. If those requirements stay basically the same whilst the project is being developed, you’ve greatly improved your chances of delivering the project successfully. If, however, new requirements keep being added during development, then your chances of success greatly diminish.

Requirements are likely to change; in the beginning of a project you’ll be coming up with loads of new ideas. How can you control this process so it doesn’t knock the project off course?

Gather Requirements

You may think that requirements are only for large projects. You’d be wrong. Requirements are important whatever the size of project.

One of the things that can kill a project is the feeling that it is never going to end. You’re working away hour after hour, day after day and don’t feel like you’re getting any closer to the finish line.

That is why requirements are so important. If you nail the requirement down early on, ideally before you’ve written any code, you will have a much better understanding of what you need to deliver.

You will also avoid the feeling of being lost.

If you have any ideas for enhancements during the project, document the requirement and schedule it into a future version of the project. Then return to implementing the first set of requirements.

Your requirements need to be quite specific. Vague requirements like “Implement X” are really of no use. Be very specific. Outline precisely what is to be implemented and what data is required, with examples. The examples can then form the basis for your tests. The requirements should specify the very minimum you need.

Your first version does not have to be released; you can iterate over a number of versions before you release your project.

Tools Can Help

There are loads of web apps out there to help you organise a software project. Many implement a variant of the agile board.

If you have a team of people, then using some kind of agile board is a great idea. For one person teams I think they add a little too much ceremony. If you find yourself spending more time fiddling with your agile board than actually working, then you’re over complicating matters.

I’ve found the very minimal GitHub Issue tracker to be ideal. You can’t spend a long time configuring it because there’s very little to configure. You can create issues, label them and add your requirements in the issue text. You can organise your requirements into milestones. When you’ve completed all of the issues attached to a milestone, you’re done. Move on to the next milestone.

Dyna Issue Tracker

Dyna project issue tracker.

Make issues that take a small amount of time to complete. A single issue that takes many months to implement will make progress hard to gauge. But taking a large task and breaking it down into lots of smaller issues begins to give you some momentum. You’ll feel like you’re achieving something when you are continually marking issues as closed. If you feel like you’re making progress, then you’ll have momentum and are more likely to finish.

Realistic Expectations

Related to requirements: you need to be sure that what you are building can be built in a time scale you are comfortable with. You are one programmer; the project you take on needs to be doable in a reasonable amount of time. I’d say that one year is the very outside of what I could cope with for a lone project. If I can’t program the project in one year at the very outside, then I don’t take it on.

In addition, if the project requires resources you don’t have, then either you need to have a plan to get hold of those resources or don’t start the project. A vague “I’ll sort that out sometime near the end of the project” will likely just mean the project is never finished. Resources like graphic designers or UX people don’t just pop out of thin air. Wishful thinking won’t help.


Perfectionism

The temptation to create something perfect runs pretty deep in a lot of programmers. Perfectionism is very destructive because you can spend an inordinate amount of time ceaselessly tweaking something that, in the end, really isn’t very important. Worse, on lone projects, there is nobody to stop you disappearing down the perfectionism rabbit hole.

Every time you change something that already works you need to be questioning whether anybody will care about the change. Does the user really care about your obsessively edited product icon? The answer is almost always no. That is not to say that you need to put up with mediocrity. But, there is a big difference between excellence and perfection. Not understanding the difference between those two things causes an awful lot of projects to fail.

Getting Stuck

There aren’t many projects where you know how to implement every feature at the very beginning. At some point during the project you are going to be coding something that you don’t know how to implement.

Everybody gets stuck at some point. Getting stuck is extra difficult as a lone developer, not least because you don’t have any colleagues to bounce ideas off or to provide technical help.

How can you ensure that getting stuck does not kill your project?

You can ask questions on Q&A sites like Stack Overflow, but some questions don’t lend themselves to Q&A sites.

Let’s say you’ve exhausted the usual Q&A sites and you’re still stuck. What do you do now?

The simplest strategy is to isolate the problem. You need to focus solely on the problem itself, not on how you can integrate the solution into your project. You need to be able to try new ideas as quickly as possible without going to the trouble of fitting them into your existing code.

Most programmers end up with a disk full of small projects used to solve technical problems. A useful side effect is that they make asking good questions on Q&A sites a lot easier too.

If you have doubts about a particular technical area at the beginning of a project, it can help to tackle that right at the start. Then if you do get stuck you will at least be brimming with enthusiasm and therefore be less likely to give up.

No Version Control

Version control systems give you a lot of benefits whether working on your own or not. The main benefit they give is the ability to experiment, and more importantly, go back to a state before the experiment started.

When I first started programming, source control was not commonly used by PC programmers, and a lot of projects were destroyed by experiments gone bad. I’d have an idea for a change, and then hours or days into it I’d want to abandon the changes but couldn’t. Either I somehow got the project back to the state I wanted it in, or I’d abandon it. There is nothing worse than getting into a change and knowing you can’t go back.

There are no excuses for not using source control now. There are loads of providers, most with free plans, and the tooling is great too. You don’t even have to learn the command line anymore; Visual Studio and lots of other IDEs have source control tools baked right in.
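That ability to experiment and then back out is worth spelling out. Here is a minimal sketch using git in a throwaway directory (the file name, branch name, and identity settings are made up for illustration):

```shell
# A throwaway repository to demonstrate the experiment-and-rollback workflow.
cd "$(mktemp -d)"
git init -q
git config user.email "dev@example.com"   # placeholder identity for the demo
git config user.name "Demo Dev"

echo "stable code" > app.txt
git add app.txt
git commit -q -m "known-good state"

# Try a risky experiment on its own branch...
git checkout -q -b risky-experiment
echo "half-finished rewrite" > app.txt
git commit -q -am "experiment gone bad"

# ...then abandon it and return to the state before the experiment started.
git checkout -q -
git branch -D risky-experiment >/dev/null
cat app.txt   # prints "stable code" again
```

The same pattern works in any modern version control system; the point is that the known-good state is always one command away.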


It is easy to forget how much psychology comes into play whilst programming on your own. Momentum, the feeling that you are making progress, is absolutely key. If you lose momentum, you are much more likely to give up.

Developing software is an intensely psychological activity. Developing software on your own doubly so.

Update 2015/07/13: Added the “No Version Control” section.

by Jack Hughes at July 10, 2015 11:50 AM

July 08, 2015

Michael Biven

The Important, the Urgent and remembering it’s the people that Matter

Large scale outages at major service providers aren’t anything new. In 2007 both Rackspace and the 365 Main data center in San Francisco suffered major outages that took down part of what made up Web 2.0. Amazon Web Services had major outages in 2011, 2013, and again last week, each causing outages for several services.

Our businesses depend on the utility of AWS, and we either unknowingly or willingly accept the risks of doing so. But if you’re in only one region, could you run your service out of multiple ones, or use two different IaaS providers? Are you so reliant on them that you couldn’t recreate your environment on a different provider to offer basic services to your customers?

Maybe you’ve been dealing with the urgent tasks and missing the important ones? I know I still get bogged down responding to issues that seem urgent at the time.

What is important is seldom urgent and what is urgent is seldom important.

– Dwight D. Eisenhower

That quote from Eisenhower shows how he approached this with a straightforward method. Called the “Eisenhower Decision Principle”, it let him triage his tasks and focus on the important stuff. That simple quote and method should be all you need when sorting out your own priorities.

It’s also called the “Eisenhower Decision Matrix”, because you start off with a square divided into four quadrants as shown below. They’re labeled Important & Urgent, Important & Not Urgent, Not Important & Urgent, and Not Important & Not Urgent. You place each task in the appropriate quadrant, and then it should be easy to organize them.

Important & Urgent

These are items that require your immediate attention. Think of outages and deadlines.

Important & Not Urgent

These are the items that you want to focus on. These should be the bigger issues that need thought and planning; things that will help prevent items from filling up the “Important & Urgent” quadrant.

Not Important & Urgent

Now we’re getting to stuff that should be handed off to someone else. These are things that need attention, but don’t get us to our goals. Examples could be security patches that need to be rolled out or auditing the key length of ssh keys used in your environment.

Not Important & Not Urgent

These are the distractions, the stuff that won’t help you towards your goals and isn’t critical. This quadrant should become a black hole, taking away the things that waste your time and effort.

Even with such a system, being able to delegate urgent tasks to someone else while you work on the important ones may be a problem for a small organization or team. At some point you need enough people that tasks can be delegated out when needed. If you don’t, your people won’t be able to address the important stuff.
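If it helps to make the triage concrete, the quadrants above can be sketched as a toy shell classifier (the labels and sample tasks are made up for illustration):

```shell
# Classify tasks into Eisenhower quadrants.
# Input lines: "<important yes/no> <urgent yes/no> <task description>"
classify() {
  while read -r important urgent task; do
    case "$important,$urgent" in
      yes,yes) echo "DO NOW: $task" ;;      # Important & Urgent
      yes,no)  echo "SCHEDULE: $task" ;;    # Important & Not Urgent
      no,yes)  echo "DELEGATE: $task" ;;    # Not Important & Urgent
      no,no)   echo "DROP: $task" ;;        # Not Important & Not Urgent
    esac
  done
}

classify <<'EOF'
yes yes production outage
yes no capacity planning
no yes routine security patching
no no bikeshed the team logo
EOF
# DO NOW: production outage
# SCHEDULE: capacity planning
# DELEGATE: routine security patching
# DROP: bikeshed the team logo
```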

People, Ideas, and Things in that Order

– Colonel John R. Boyd, USAF

In the fire service there is the “Quint Concept”. This is the idea that you combine a fire engine with a fire truck (also called a ladder). The result is a vehicle that can carry fire fighters, a tank of water, a pump to move the water, hose to both draw water and to put water on the fire, and a large truck mounted ladder.

For the bean counters this gives the benefit of reducing the number of people needed in the fire department. Fewer people to pay and less equipment to maintain.

A quick lesson in fire operations. The fire fighters all have the same training, but they perform different duties at an emergency depending on whether they’re on an engine or a truck. Over time they build up an expertise in those duties.

The result of combining the companies is a reduction in the number of people and apparatus needed. They almost never add more quints; it’s always a reduction to save money. You now have fewer people to deal with the same number of emergencies, and you lose the benefits of having specialists, because you’ve made generalists out of them.

This creates a more dangerous environment for the fire fighters themselves as well as the general public. Unfortunately, budgets seem to always trump safety no matter the application.

Looking back you could say the powers that be focused on the urgent (budgetary concerns) rather than the important (public safety). Technology advanced to the point where the engine and truck could merge into the quint, but no matter what new technology comes along, it still requires having the people to do the job.

It’s the people that create the solutions to the problems we’re facing. Technology doesn’t do this for us. Without the right people, and the right number of them, you cannot expect to get past the urgent and on to the important.

July 08, 2015 10:07 AM

July 07, 2015


Goodbye, discoveryd. Hello again, mDNSResponder.


Once again, Apple has made a change in how DNS is handled in Mac OS X. Originally, Yosemite (10.10) had replaced mDNSResponder with discoveryd. This meant that all of those who had made a change to force mDNSResponder to always append search domains to DNS lookups had to make a new change to accommodate discoveryd.

Lo and behold, with 10.10.4, discoveryd is out and mDNSResponder is in. Below is the new process you'll need to follow to get your DNS lookups working as expected.

  1. Before you do anything, make sure you have updated to at least OS X 10.10.4.
  2. You will need to edit the mDNSResponder launch daemon plist in /System/Library/LaunchDaemons/. Add <string>-AlwaysAppendSearchDomains</string> to the list of strings in the ProgramArguments <array>. Please note that the argument has only a single leading dash.
  3. Restart mDNSResponder to see your changes take effect.
    $ sudo launchctl unload -w /System/Library/LaunchDaemons/
    $ sudo launchctl load -w /System/Library/LaunchDaemons/
  4. Profit!
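For reference, the edited ProgramArguments array ends up looking something like the fragment below. Treat this as illustrative rather than the literal file contents; the exact binary path and neighboring keys may differ between OS X releases.

```xml
<key>ProgramArguments</key>
<array>
    <string>/usr/sbin/mDNSResponder</string>
    <string>-AlwaysAppendSearchDomains</string>
</array>
```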

by Scott Hebert at July 07, 2015 06:56 PM

July 04, 2015

Sarah Allen

the stealthy openness of the us digital services

Fast Company’s recent article “Inside Obama’s Stealth Startup” provides a nice overview of how industry experts have been steadily joining forces to transform how the United States government is using technology to provide services to its people. One of the key elements of this strategy is open data and open source — there’s little or no stealth in this “startup.”

One of my proudest moments after I joined 18F was when we announced our open source policy. Developing in the open creates an unprecedented level of transparency and offers new potential to engage members of the public in the operation of our democracy.

Before that time, most projects from the Presidential Innovation Fellows and the new 18F team were open source, but each project required specific sign off by agency leaders for it to be open. Creating a policy dramatically streamlined this sign-off process. Working in the open saves time and money:

  • streamlines communication
  • increases code reuse
  • reduces vendor lock-in

In 2013, the Open Data Executive Order set the stage for this work. By making open data the default expectation, it lets thousands of civil servants provide open data as part of their process, without needing to get permission for each individual data set to be published.

It’s great to see industry press starting to take notice of this transformation happening inside the US Government. We’re really just getting started. If you want to read more, check out the 18F blog.

The post the stealthy openness of the us digital services appeared first on the evolving ultrasaurus.

by sarah at July 04, 2015 05:58 PM

July 03, 2015

Michael Biven

The Concepts of John Boyd are much more than just the OODA Loop

Ever since John Boyd’s OODA Loop was introduced to the web community at Velocity, we’ve had an oversimplified and incomplete view of his ideas. One that has been reinforced by reusing a version of the same basic diagram that he never drew.

Each time that the OODA Loop has been mentioned in discussions, blog posts, books, and presentations we continue to reinforce a misunderstanding of what Boyd was after. I’m guilty of this as well. It wasn’t until this past weekend that I found out there was more to his ideas than just a basic logical process.

John used to say if his work is going to become dogma, if it’s going to stop you from thinking, then right now run out and burn it.

– Chet Richards

First, Boyd never drew anything that resembled what we think of as the OODA Loop. His view was more complex. It requires orienting yourself to a situation and having the adaptability to adjust to any changes in it. Reality isn’t a static thing. It’s fluid, it’s sly, and his loop accommodates that.

The parts we’ve been missing from his work:

  • having a common culture or training

  • analyzing and synthesizing information to prevent us from using “the same mental recipes over and over again”

  • a high level of trust between leaders and subordinates

  • implicit rather than explicit authority to give subordinates the freedom to achieve the outcomes asked of them

Without a common outlook superiors cannot give subordinates freedom-of-action and maintain coherency of ongoing action.

A common outlook possessed by “a body of officers” represents a unifying theme that can be used to simultaneously encourage subordinate initiative yet realize superior intent.

– Colonel John R. Boyd, USAF

One key part we’ve been missing is that the basic loop includes an amount of inertia or friction that has to be overcome to take any action. To address this, Boyd describes increasing the tempo by moving from Cheng (expected/passive) to Ch’i (unexpected/active) more quickly than the basic loop allows.

He calls this “Asymmetric Fast Transients”. In what I believe is the only diagram that he drew for the OODA Loop there is a way to immediately jump from Observe to Act. He called this Implicit Guidance and Control (IG&C). By keeping orientation closer to reality and having a common culture, organizations can quickly respond to situations they recognize as they come up.

The actions taken in the IG&C are the repertoire of an organization. These are the actions that make up the muscle memory of a team, because they are repeatable and predictable.

But there is a danger in becoming stale in both your cultural view and your repertoire. While it is important to have everyone on the same page, it is equally important to promote people with unusual or unconventional thinking (like Boyd himself). This helps push back against everyone thinking the same way and increasing the confirmation bias in a team over time.

And at no point should our repertoire become nothing more than a runbook that we flip through to react to events. Instead we continue to use it when we can while at the same time thinking and considering new methods to add to it or old ones that need to be removed.

This leads us to the analysis and synthesis process that should be taking place, which allows us to work through events that cannot sidestep the process by using the IG&C. Instead we go through the basic OODA Loop (there are others, like Plan-Do-Check-Act), letting us engineer new possibilities into our repertoire while at the same time improving our orientation to what is happening.

Boyd said there are three ways to insult him. The first is to call him an analyst, because you’re telling him that he is a halfwit and that he has half a brain. The second is to call him an expert, because you’re then saying he has it all figured out and can’t learn anything new. The third is to call him an analytical expert. Now that’s saying that not only is he a halfwit, but he thinks he has it all figured out.

Don’t try to assume that something is wrong because it doesn’t fit into your orientation

– Colonel John R. Boyd, USAF

His admiration of the Toyota Production System (TPS) can be seen directly in his lectures. By flipping the process from a top-down approach to a bottom-up one, Toyota was able to create a chain of customers and providers within its own assembly line. The customer tells the provider what they want and when, resulting in what we now call Just-In-Time (JIT) manufacturing. This reduced production time and the amount of inventory needing to be on hand, and allowed different cars to be made on the same assembly line.

It’s interesting to consider new tools like Mesos as a way to apply TPS concepts to web operations: one assembly line, or in this case one infrastructure, able to retool the resources available to make different services available quicker than before.

During the twenty years that he was actively thinking, tinkering and sharing his ideas about these concepts he was always adjusting his own orientation to what he learned. In one lecture he mentions that instead of saying no, he wants to be listening instead. To have himself in receiving mode to learn from the other person instead of letting his “own orientation drown out the other view.”

The question you have to ask yourself, is what is the purpose of the question in the first place?

– Colonel John R. Boyd, USAF

Don’t be afraid to ask the dumb questions and if you don’t understand the answer then ask again. If you don’t you can’t really orient yourself to this new view.

Boyd drew from many different sources and his own experience to come to his conclusions: from the 2nd Law of Thermodynamics, Werner Heisenberg’s Uncertainty Principle, and Kurt Gödel’s two Incompleteness Theorems, to Taiichi Ohno and Shigeo Shingo of Toyota. The point is that he never stopped learning, and he seemed willing to keep asking the dumb questions and to adjust his orientation when the facts supported it.

If your knowledge of Boyd extends only to the OODA Loop, you can find all of his writings available online (links below). It’s also worth reading Chet Richards and Chuck Spinney, who both worked with Boyd. If you’re involved with the infrastructure or architecture decisions for any web service, the Toyota Production System is perfectly applicable.

John R. Boyd’s A Discourse on Winning and Losing


Conceptual Spiral

Abstract of the Discourse and Conceptual Spiral

Destruction and Creation

Patterns of Conflict

The Strategic Game of ? and ?

Organic Design for Command and Control

The Essence of Winning and Losing


Aerial Attack Study

New Conception for Air-to-Air Combat

Fast Transients

All sourced from Defense and the National Interest, with the exception of “New Conception for Air-to-Air Combat”, which was sourced from Chet Richards. You can also find updated and edited versions of Boyd’s work on Richards’ site. Both he and Spinney list several writings on Boyd’s work, including some of their own.

For an introduction into the Toyota Production System and why going slower can make better products (read the last article for that one).

Taiichi Ohno (1988), Toyota Production System: Beyond Large-Scale Production

Taiichi Ohno (2007), Workplace Management

Shigeo Shingo (1989), A Study of the Toyota Production System

Allen Ward, Jeffrey K. Liker, John J. Cristiano and Durward K. Sobek II, “The Second Toyota Paradox: How Delaying Decisions Can Make Better Cars Faster”, MIT Sloan Management Review, April 15, 1995

July 03, 2015 10:07 AM

July 02, 2015

League of Professional System Administrators

Election Results for 2015

I'm pleased to announce our new board members! Our independent monitor Andrew Hume compiled the results of the election and the report. Welcome our new and returning board members:

read more

by warner at July 02, 2015 04:14 PM


The culture-shift of moving to public cloud

Public Cloud (AWS, Azure, etc.) is a very different thing from on-prem infrastructure. The low orbit view of the sector is that this is entirely intentional: create a new way of doing things to enable businesses to focus on what they're good at. A lot of senior executives get that message and embrace it... until it comes time to integrate this new way with the way things have always been done. Then we get some problems.

The view from 1000ft is much different than the one from 250 miles up.

From my point of view, there are two big ways that integrating public cloud will cause culture problems.

  • Black-box infrastructure.
  • Completely different cost-model.

I've already spoken on the second point so I won't spend much time on it here. In brief: AWS costing makes you pay for what you use every month with no way to defer it for a quarter or two, which is completely not the on-prem cost model.

Black-box infrastructure

You don't know how it works.

You don't know for sure that it's being run by competent professionals who have good working habits.

You don't know for sure if they have sufficient controls in place to keep your data absolutely out of the hands of anyone but you, nosy employees included. SOC reports help, but still.

You may not get console access to your instances.

You're not big enough to warrant the white-glove treatment of a service contract that addresses your specific needs, or one that accepts any kind of penalties for non-delivery of service.

They'll turn your account off if you defer payment for a couple of months.

The SLA they offer on the service is all you're going to get. If you need more than that... well, you'll have to figure out how to re-engineer your own software to deal with that kind of failure.

Your monitoring system doesn't know how to handle the public cloud monitoring end-points.

These are all business items that you've taken for granted in running your own datacenter, or contracting for datacenter services with another company. Service levels aren't really negotiable, which throws some enterprises. You can't simply mandate higher redundancies in certain must-always-be-up single-system services; you have to re-engineer them to be multi-system or live with the risk. As any cloud integrator will tell you if asked, public cloud requires some changes to how you think about infrastructure, and that includes how you ensure it behaves the way you need it to.

Having worked for a managed services provider and a SaaS site, I've heard of the ways companies try to lever contracts, as well as lazy payment of bills. If you're big enough (AWS) you can afford to lose customers by being strict about on-time payment for services. Companies that habitually defer payment on bills for a month or two in order to game quarterly results will describe such services as 'unfriendly to my business'. Companies that expect to get into protracted SLA negotiations will find not nearly enough wiggle room, and the lack of penalties for SLA failures to be contrary to internal best practices. These are abuses that can be leveled at startup and mid-size businesses quite effectively, but not so much at the big public cloud providers.

It really does require a new way of thinking about infrastructure, at all levels. From finance, to SLAs, to application engineering, and to staffing. That's a big hill to climb.

by SysAdmin1138 at July 02, 2015 02:08 PM

League of Professional System Administrators

Elections are over - results coming soon

As several folks have said, the elections are over.  As soon as the results are verified they will be posted here and in the LOPSAgram.  Thanks to all the people on the LC who worked hard to make this election possible, to the candidates that stood for election, and to all the people who voted.

by ski at July 02, 2015 04:43 AM

Steve Kemp's Blog

My new fitness challenge

So recently I posted on twitter about a sudden gain in strength:

To put that more into context I should give a few more details. In the past I've been using an assisted pull-up machine, which offers a counterweight to make such things easier.

When I started the exercise I assumed I couldn't do it for real, so I used the machine and set it on 150lb. Over a few weeks I got as far as being able to use it with only 80lb. (Which means I was lifting my entire body-weight minus 80lb. With the assisted-pullup machine smaller numbers are best!)

One evening I was walking to the cinema with my wife and told her I thought I'd be getting close to doing one real pull-up soon, which sounds a little silly, but I guess is pretty common for men who are almost 40, as I am. As it happens there was some climbing equipment nearby, so I said "Here, see how close I am", and I proceeded to do 1.5 pull-ups. (The second one was bad, and didn't count, as I got 90% of the way "up".)

Having had that success I knew I could do "almost two", and I set a goal for the next gym visit: 3 x 3-pullups. I did that. Then I did two more for fun on the way out (couldn't quite manage a complete set.)

So that's the story of how I went from doing 1.5 pull-ups to doing 11 in less than a week. These days I can easily do 3x3, but struggle with more. It'll come, slowly.

So pull-up vs. chin-up? This just relates to which way you place your hands: palm facing you (chin-up) or palm away from you (pull-up).

There are some technical details, but chin-ups are easier and more bicep-centric.

Anyway too much writing. My next challenge is the one-armed pushup. However long it takes, and I think it will take a while, that's what I'm working toward.

July 02, 2015 02:00 AM

July 01, 2015

toolsmith: Malware Analysis with REMnux Docker Containers

Docker runs on Ubuntu, Mac OS X, and Windows

ISSA Journal’s theme of the month is “Malware and what to do with it”. This invites so many possible smart-alecky responses, including where you can stick it, means by which to smoke it, and a variety of other abuses for the plethora of malware authors whose handiwork we so enjoy each and every day of our security professional lives. But alas, that won’t get us further than a few chuckles, so I’ll just share the best summary response I’ve read to date, courtesy of @infosecjerk, and move on.
“Security is easy:
1)      Don't install malicious software.
2)      Don't click bad stuff.
3)      Only trust pretty women you don't know.
4)      Do what Gartner says.”
Wait, now I’m not sure there’s even a reason to continue here. :-)

One of the true benefits of being a SANS Internet Storm Center Handler is working with top-notch security industry experts, and one such person is Lenny Zeltser. I’ve enjoyed Lenny’s work for many years; if you’ve taken SANS training you’ve either heard of or attended his GIAC Reverse Engineering Malware course and likely learned a great deal. You’re hopefully also aware of Lenny’s Linux toolkit for reverse-engineering and analyzing malware, REMnux. I covered REMnux in September 2010, but it, and the landscape, have evolved so much in the five years since. Be sure to grab the latest OVA and revisit it if you haven’t utilized it lately. Rather than revisit REMnux specifically this month, I’ll draw your attention to a really slick way to analyze malware with Docker, using the malware-analysis-related REMnux project Docker containers that Lenny has created. Lenny expressed that he is personally interested in packaging malware analysis apps as containers because it gives him the opportunity to learn about container technologies and understand how they might be related to his work, customers and hobbies. Lenny is packaging tools that are “useful in a malware analysis lab, that like-minded security professionals who work with malware or forensics might also find an interesting starting point for experimenting with containers and assessing their applicability to other contexts.”
Docker can be utilized on Ubuntu, Mac OS X, and Windows; I ran it on the SANS SIFT 3.0 virtual machine distribution, as well as my Mac Mini. The advantage of Docker containers, per the What Is Docker page, is simple to understand. First, “Docker allows you to package an application with all of its dependencies into a standardized unit for software development.” Everything you need therefore resides in a container: “Containers have similar resource isolation and allocation benefits as virtual machines but a different architectural approach allows them to be much more portable and efficient.” The Docker Engine is just that, the source from whom all container blessings flow. It utilizes Linux-specific kernel features, so to run it on Windows and Mac OS X it will install VirtualBox and boot2docker to create a Linux VM for the containers. Windows Server is soon adding direct support for Docker with Windows Server Containers. In the meantime, if you’re going to go to this extent, rather than just run natively on Linux, you might as well treat yourself to Kitematic, the desktop GUI for Docker. Read up on Docker before proceeding if you aren’t already well informed. Most importantly, read Security Risks and Benefits of Docker Application Containers.
Lenny mentioned that he is not planning to use containers as the architecture for the REMnux distro, stating that “This distribution has lots of useful tools installed directly on the REMnux host alongside the OS. It's fine to run most tools this way. However, I like the idea of being able to run some applications as separate containers, which is certainly possible using Docker on top of a system running the REMnux distro.” As an example, he struggled to set up Maltrieve and JSDetox directly on REMnux without introducing dependencies and settings that might break other tools but “running these applications as Docker containers allows people to have access to these handy utilities without worrying about such issues.” Lenny started the Docker image repository under the REMnux project umbrella to provide people with “the opportunity to conveniently use the tools available via the REMnux Docker repository even if they are not running REMnux.”
Before we dig in to REMnux Docker containers, I wanted to treat you to a very cool idea I’ve implemented after reading it on the SANS Digital Forensics and Incident Response Blog as posted by Lenny. He describes methods to install REMnux on a SIFT workstation, or SIFT on a REMnux workstation. I opted for the former because Docker runs really cleanly and natively on SIFT as it is Ubuntu 14.04 x64 under the hood. Installing REMnux on SIFT is as easy as wget --quiet -O - | sudo bash, then wait a bit. The script will update APT repositories (yes, we’re talking about malware analysis but no, not that APT) and install all the REMnux packages. When finished you’ll have all the power of SIFT and REMnux on one glorious workstation. By the way, if you want to use the full REMnux distribution as your Docker host, Docker is already fully installed.

Docker setup

After you’ve squared away your preferred distribution, be sure to run sudo apt-get update && sudo apt-get upgrade, then run sudo apt-get install
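As a sketch of that setup on Ubuntu 14.04 (which SIFT 3.0 is based on), the Docker package to install is docker.io; the shorter name docker was already taken by an unrelated utility. Assuming that package name:

```shell
# Refresh package lists and bring the system current.
sudo apt-get update && sudo apt-get upgrade -y

# Install Docker; on Ubuntu 14.04 it ships as the docker.io package.
sudo apt-get install -y docker.io

# Sanity check: the client and daemon should both answer.
sudo docker version
```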

REMnux Docker Containers

Included in the REMnux container collection as of this writing you will find the V8 JavaScript engine, the Thug low-interaction honeyclient, the Viper binary analysis framework, Rekall and Volatility memory forensic frameworks, the JSDetox JavaScript analysis tool, the Radare2 reverse engineering framework, the Pescanner static malware analysis tool, the MASTIFF static analysis framework, and the Maltrieve malware samples downloader. This may well give you everything you possibly need as a great start for malware reverse engineering and analysis in one collection of Docker containers. I won’t discuss the Rekall or Volatility containers as toolsmith readers should already be intimately familiar with, and happily using, those tools. But it is mighty convenient to know you can spin them up via Docker.
The first time you run a Docker container it will be automatically pulled down from the Docker Hub if you don’t already have a local copy. All the REMnux containers reside there. You can, as I did, start with @kylemaxwell’s wicked good Maltrieve by executing sudo docker run --rm -it remnux/maltrieve bash. Once the container is downloaded and ready, exit and rerun it with sudo docker run --rm -it -v ~/samples:/home/sansforensics/samples remnux/maltrieve bash after you build a samples directory in your home directory. Important note: the -v parameter defines a shared directory that the container and the supporting host can both access and utilize. Liken it to Shared Folders in VMware. Be sure to run sudo chmod a+xwr against it so it’s world readable/writeable. When all is said and done you should be dropped to a nonroot prompt (a good thing); simply run maltrieve -d /home/sansforensics/samples/ -l /home/sansforensics/samples/maltrieve.log and wait as it populates malware samples to your samples directory, as seen in Figure 1, from the likes of Malc0de, Malware Domain List, Malware URLs, VX Vault, URLquery, CleanMX, and ZeusTracker.

Figure 1 – Maltrieve completes its downloads, 780 delicious samples ready for REMnux
So nice to have a current local collection. The above mentioned sources update regularly so you can keep your sample farm fresh. You can also define your preferred DUMPDIR and log directories in maltrieve.cfg for ease of use.

Next up, a look at the REMnux MASTIFF container. “MASTIFF is a static analysis framework that automates the process of extracting key characteristics from a number of different file formats”, from @SecShoggoth. I ran it as follows: sudo docker run --dns=my.dns.server.ip --rm -it -v ~/samples:/home/sansforensics/samples remnux/mastiff bash. You may want or need to replace --dns=my.dns.server.ip with your preferred DNS server if you don’t want to use the default; I found this ensured name resolution for me from inside the container. MASTIFF can call the VirusTotal API and submit malware if you configure it to do so in mastiff.conf, and it will fail if DNS isn’t working properly. You need to edit mastiff.conf via vi with your API key and enable submit=yes. Also note that, when invoked with the --rm parameter, the container is ephemeral and all customization will disappear once the container exits. You can invoke the container differently to save the customization and the state.
You may also want to point the log_dir directive at your shared samples directory so the results are written outside the container.
You can then run MASTIFF against /your/working/directory/samplename with your preferred settings, and the result should resemble Figure 2.

Figure 2 – Successful REMnux MASTIFF run
All of the results can be found in /workdir/log under a folder named for each sample analyzed. Checking the Yara results in yara.txt will inform you that, while the payload is a PE32, it exhibits malicious document attributes per Didier Stevens’s (another brilliant Internet Storm Center handler) maldoc rules, as seen in Figure 3.

Figure 3 – Yara results indicating malicious document attributes
The peinfo-full and peinfo-quick results will provide further details, indicators, and behaviors necessary to complete your analysis.

Our last example is the REMnux JSDetox container. Per its website, courtesy of @sven_t, JSDetox “is a tool to support the manual analysis of malicious Javascript code.” Running it is as simple as sudo docker run --rm -p 3000:3000 remnux/jsdetox, then pointing your browser to http://localhost:3000 on your container host system. One of my favorite obfuscated malicious JavaScript examples is seen in its raw, hidden ugliness in Figure 4.

Figure 4 – Obfuscated malicious JavaScript
Feed said script to JSDetox under the Code Analysis tab, run Analyze, choose the Execution tab, then Show Code, and you’ll quickly learn that the obfuscated code serves up a malicious script from a domain flagged by major browsers as distributing malware and acting as a redirector. The results are evident in Figure 5.

Figure 5 – JSDetox results
All the malware analysis horsepower you can imagine, in the convenience of Docker containers, running on top of SIFT with a full REMnux install too. Way to go, Lenny, my journey is complete.

In Conclusion

Lenny’s plans for the future include maintaining and enhancing the REMnux distro with the help of the Debian package repository he set up for this purpose, with Docker and containers part of his design. Independently, he will continue to build and catalog Docker containers for useful malware analysis tools so they can be utilized with or without the REMnux distro. I am certain this is the best possible way for you readers to immerse yourselves in both Docker technology and some of the best of the REMnux collection at the same time. Enjoy!
Ping me via email or Twitter if you have questions (russ at holisticinfosec dot org or @holisticinfosec).
Cheers…until next month.


Thanks again to Lenny Zeltser, @lennyzeltser, for years of REMnux, and these Docker containers.

by Russ McRee at July 01, 2015 10:13 PM

Anton Chuvakin - Security Warrior

Monthly Blog Round-Up – June 2015

Here is my next monthly "Security Warrior" blog round-up of top 5 popular posts/topics this month:
  1. “Why No Open Source SIEM, EVER?” contains some of my SIEM thinking from 2009. Is it relevant now? Well, you be the judge.  The current popularity of open source log search tools, BTW, does not break the logic of that post. Succeeding with SIEM requires a lot of work, whether you paid for the software or not. That – and developing a SIEM is much harder than most people think.  [278 pageviews]
  2. “Top 10 Criteria for a SIEM?” came from one of the last projects I did when running my SIEM consulting firm in 2009-2011 (for my recent work on evaluating SIEM, see this document) [198 pageviews]
  3. “Simple Log Review Checklist Released!” is often at the top of this list – the checklist is still a very useful tool for many people. “On Free Log Management Tools” is a companion to the checklist (updated version) [114 pageviews]
  4. My classic PCI DSS Log Review series is always popular! The series of 18 posts covers a comprehensive log review approach (OK for PCI DSS 3.0 as well), useful for building log review processes and procedures, whether regulatory or not. It is also described in more detail in our Log Management book and mentioned in our PCI book (just out in its 4th edition!) [100+ pageviews to the main tag]
  5. On Choosing SIEM” is about the least wrong way of choosing a SIEM tool – as well as why the right way is so unpopular. [60 pageviews out of a total of 4941 pageviews to all blog pages]
In addition, I’d like to draw your attention to a few recent posts from my Gartner blog:

Current research on cloud security monitoring:
Past research on security analytics:
Miscellaneous fun posts:

(see all my published Gartner research here)
Also see my past monthly and annual “Top Popular Blog Posts” – 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014.
Disclaimer: most content at SecurityWarrior blog was written before I joined Gartner on Aug 1, 2011 and is solely my personal view at the time of writing. For my current security blogging, go here.

Previous post in this endless series:

by Anton Chuvakin at July 01, 2015 03:09 PM

LZone - Sysadmin

Screen tmux Cheat Sheet

Here is a side by side comparison of screen and tmux commands and hotkeys.
Function            Screen                                 tmux
Start instance      screen / screen -S <name>              tmux
Attach to instance  screen -r <name> / screen -x <name>    tmux attach
List instances      screen -ls / screen -ls <user name>/   tmux ls
New Window          ^a c                                   ^b c
Switch Window       ^a n / ^a p                            ^b n / ^b p
List Windows        ^a "                                   ^b w
Name Window         ^a A                                   ^b ,
Split Horizontal    ^a S                                   ^b "
Split Vertical      ^a |                                   ^b %
Switch Pane         ^a Tab                                 ^b o
Kill Pane           ^a x                                   ^b x
Paging              -                                      ^b PgUp / ^b PgDown
Scrolling Mode      ^a [                                   ^b [

July 01, 2015 02:19 PM

The damage of one second

Update: According to the AWS status page, the incident was a problem related to BGP route leaking. AWS does not hint at a leap second related incident as originally suggested by this post!

Tonight we had another leap second, and not without suffering at the same time. At the end of the post you can find two screenshots of outages suggested by user reports. The screenshots were taken shortly after midnight UTC, and you can easily spot the sites with problems by the distinct peak at the right side of the graph.

AWS Outage

What many of the affected sites have in common: they are hosted at AWS, which had some problems.

[RESOLVED] Internet connectivity issues

Between 5:25 PM and 6:07 PM PDT we experienced an Internet connectivity issue with a provider outside of our network which affected traffic from some end-user networks. The issue has been resolved and the service is operating normally.

The root cause of this issue was an external Internet service provider incorrectly accepting a set of routes for some AWS addresses from a third-party who inadvertently advertised these routes. Providers should normally reject these routes by policy, but in this case the routes were accepted and propagated to other ISPs affecting some end-user’s ability to access AWS resources. Once we identified the provider and third-party network, we took action to route traffic around this incorrect routing configuration. We have worked with this external Internet service provider to ensure that this does not reoccur.

Incident Details

Graphs from

Note that those graphs indicate user-reported issues:

July 01, 2015 04:20 AM

June 30, 2015


My Security Strategy: The "Third Way"

Over the last two weeks I listened to and watched all of the hearings related to the OPM breach. During the exchanges between the witnesses and legislators, I noticed several themes. One presented the situation facing OPM (and other Federal agencies) as confronting the following choice:

You can either 1) "secure your network," which is very difficult and going to "take years," due to "years of insufficient investment," or 2) suffer intrusions and breaches, which is what happened to OPM.

This struck me as an odd dichotomy. The reasoning appeared to be that because OPM did not make "sufficient investment" in security, a breach was the result.

In other words, if OPM had "sufficiently invested" in security, they would not have suffered a breach.

I do not see the situation in this way, for two main reasons.

First, there is a difference between an "intrusion" and a "breach." An intrusion is unauthorized access to a computing resource. A breach is the theft, alteration, or destruction of that computing resource, following an intrusion.

It therefore follows that one can suffer an intrusion, but not suffer a breach.

One can avoid a breach following an intrusion if the security team can stop the adversary before he accomplishes his mission.

Second, there is no point at which any network is "secure," i.e., intrusion-proof. It is more likely one could operate a breach-proof network, but that is not completely attainable, either.

Still, the most effective strategy is a combination of preventing as many intrusions as possible, complemented by an aggressive detection and response operation that improves the chances of avoiding a breach, or at least minimizes the impact of a breach.

This is why I call "detection and response" the "third way" strategy. The first way, "secure your network" by making it "intrusion-proof," is not possible. The second way, suffer intrusions and breaches, is not acceptable. Therefore, organizations should implement a third way strategy that stops as many intrusions as possible, but detects and responds to those intrusions that do occur, prior to their progression to breach status.

by Richard Bejtlich at June 30, 2015 07:23 PM

My Prediction for Top Gun 2 Plot

We've known for about a year that Tom Cruise is returning to his iconic "Maverick" role from Top Gun, and that drone warfare would be involved. A few days ago we heard a few more details in this Collider story:

[Producer David Ellison]: There is an amazing role for Maverick in the movie and there is no Top Gun without Maverick, and it is going to be Maverick playing Maverick. It is I don’t think what people are going to expect, and we are very, very hopeful that we get to make the movie very soon. But like all things, it all comes down to the script, and Justin is writing as we speak.

[Interviewer]: You’re gonna do what a lot of sequels have been doing now which is incorporate real use of time from the first one to now.

ELLISON and DANA GOLDBERG: Absolutely...

ELLISON:  As everyone knows with Tom, he is 100% going to want to be in those airplanes shooting it practically. When you look at the world of dogfighting, what’s interesting about it is that it’s not a world that exists to the same degree when the original movie came out. This world has not been explored. It is very much a world we live in today where it’s drone technology and fifth generation fighters are really what the United States Navy is calling the last man-made fighter that we’re actually going to produce so it’s really exploring the end of an era of dogfighting and fighter pilots and what that culture is today are all fun things that we’re gonna get to dive into in this movie.

What could the plot involve?

First, who is the adversary? You can't have dogfighting without a foe. Consider the leading candidates:

  • Russia: Maybe. Nobody is fond of what President Putin is doing in Ukraine.
  • Iran: Possible, but Hollywood types are close to the Democrats, and they will not likely want to upset Iran if Secretary Kerry secures a nuclear deal.
  • China: No way. Studios want to release movies in China, and despite the possibility of aerial conflict in the East or South China Seas, no studio is going to make China the bad guy. In fact, the studio will want to promote China as a good guy to please that audience.
  • North Korea: No way. Prior to "The Interview," this was a possibility. Not anymore!
My money is on an Islamic terrorist group, either unnamed, or possibly Islamic State. They don't have an air force, you say? This is where the drone angle comes into play.

Here is my prediction for the Top Gun 2 plot.

Oil tankers are trying to pass through the Gulf of Aden, or maybe the Strait of Hormuz, carrying their precious cargo. Suddenly a swarm of small, yet armed, drones attack and destroy the convoy, setting the oil ablaze in a commercial and environmental disaster. The stock market suffers a huge drop and gas prices skyrocket.

The US Fifth Fleet, and its Chinese counterpart, performing counter-piracy duties nearby, rush to rescue the survivors. They set up joint patrols to guard other commercial sea traffic. Later the Islamic group sends another swarm of drones to attack the American and Chinese ships. This time the enemy includes some sort of electronic warfare-capable drones that jam US and Chinese GPS, communications, and computer equipment. (I'm seeing a modern "Battlestar Galactica" theme here.) American and Chinese pilots die, and their ships are heavily damaged. (By the way, this is Hollywood, not real life.)

The US Navy realizes that its "net-centric," "technologically superior" force can't compete with this new era of warfare. Cue the similarities with the pre-Fighter Weapons School, early Vietnam situation described in the first scenes at Miramar in the original movie. (Remember, a 12-1 kill ratio in Korea, 3-1 in early Vietnam due to reliance on missiles and atrophied dogfighting skills, back to 12-1 in Vietnam after Top Gun training?)

The US Navy decides it needs to bring back someone who thinks unconventionally in order to counter the drone threat and resume commercial traffic in the Gulf. They find Maverick, barely hanging on to a job teaching at a civilian flight school. His personal life is a mess, and he was kicked out of the Navy during the first Gulf War in 1991 for breaking too many rules. Now the Navy wants him to teach a new generation of pilots how to fight once their "net-centric crutches" disappear.

You know what happens next. Maverick returns to the Navy as a contractor. Top Gun is now the Naval Strike and Air Warfare Center (NSAWC) at NAS Fallon, Nevada. The Navy retired his beloved F-14 in 2006, so there is a choice to be made about what aircraft awaits him in Nevada. I see three possibilities:

1) The Navy resurrects the F-14 because it's "not vulnerable" to the drone electronic warfare. This would be cool, but they aren't going to be able to fly American F-14s due to their retirement. CGI maybe?

2) The Navy flies the new F-35, because it's new and cool. However, the Navy will probably not have any to fly. CGI again?

3) The Navy flies the F-18. This is most likely, because producers could film live operations as they did in the 1980s.

Beyond the aircraft issues, I expect themes involving relevance as one ages, re-integration with military culture, and possibly friction between members of the joint US-China task force created to counter the Islamic threat.

In the end, thanks to the ingenuity of Maverick's teaching and tactics, the Americans and Chinese prevail over the Islamic forces. It might require Maverick to make the ultimate sacrifice, showing he's learned that warfare is a team sport, and that he really misses Goose. The Chinese name their next aircraft carrier the "Pete Mitchell" in honor of Maverick's sacrifice. (Forget calling it the "Maverick" -- too much rebellion for the CCP.)

I'm looking forward to this movie.

by Richard Bejtlich at June 30, 2015 03:01 PM

Sean's IT Blog

The Approaching Backup (Hyper)Convergence #VFD5

When we talk about convergence in IT, it usually means bringing things together to make them easier to manage and use.  Network convergence in the data center brings together your storage and IP stacks, while hyperconvergence brings together compute and storage in a platform that can easily scale as new capacity is needed.

One area where we haven’t seen a lot of convergence is the backup industry.  One new startup, fresh out of stealth mode, aims to change that by bringing together backup storage, compute, and virtualization backup software in a scalable and easy to use package.

I had the opportunity to hear from Rubrik, a new player in the backup space, at Virtualization Field Day 5.   My coworker, and fellow VFD5 delegate, Eric Shanks, has also written his thoughts on Rubrik.

Note: All travel and incidental expenses for attending Virtualization Field Day 5 were paid for by Gestalt IT.  This was the only compensation provided, and it did not influence the content of this post.

One of the challenges of architecting backup solutions for IT environments is that you need to bring together a number of disparate pieces, often from different vendors, and try to make them function as one.  Even if multiple components are from the same vendor, they’re often not integrated in a way to make them easy to deploy.

Rubrik’s goal is to be a “Time Machine for private cloud” and to make backup so simple that you can have the appliance racked and starting backups within 15 minutes.  Their product, which hit general availability in May, combines backup software, storage, and hardware in a package that is easy to deploy, use, and scale.

They front this with an HTML5 interface and advanced search capabilities for virtual machines and files within the virtual machine file system.  Thanks to a local metadata cache, this works across both locally stored data and data that has been aged out to the cloud.

Because they control the hardware and software for the entire platform, Rubrik is able to engineer everything for the best performance.  They utilize flash in each node to store backup metadata as well as ingest the inbound data streams to deduplicate and compress data.

Rubrik uses SLAs to determine how often virtual machines are protected and how long that data is saved.  Over time, that data can be aged out to Amazon S3.  They do not currently support replication to another Rubrik appliance in another location, but that is on the roadmap.

Although there are a lot of cool features in Rubrik, it is a version 1.0 product.  It is missing some things that more mature products have, such as application-level item recovery and role-based access control.  They only support vSphere in this release.  However, the vendor has committed to adding many more features, and support for additional hypervisors, in future releases.

You can watch the introduction and technical deep dive for the Rubrik presentation on Youtube.  The links are below.

If you want to see a hands-on review of Rubrik, you can read Brian Suhr’s unboxing post here.

Rubrik has brought an innovative and exciting product to market, and I look forward to seeing more from them in the future.

by seanpmassey at June 30, 2015 01:00 PM

Standalone Sysadmin

Are you monitoring your switchports the right way?

Graphite might be the best thing I’ve rolled out here in my position at CCIS.

One of our graduate students has been working on a really interesting paper for a while. I can’t go into details, because he’s going to publish before too long, but he has been making good use of my network diagrams. Since he has a lot riding on the accuracy of the data, he’s been asking me very specific questions about how the data was obtained, and how the graphs are produced, and so on.

One of the questions he asked me had to do with a bandwidth graph, much like this one:

His question revolved around the actual amount of traffic each datapoint represented. I explained briefly that we were looking at Megabytes per second, and he asked for clarification – specifically, whether each point was the sum total of data sent per second between updates, or whether it was the average bandwidth used over the interval.

We did some calculations and decided that if each point were, in fact, the total number of bytes received since the previous data point, it would mean my network had basically no traffic, and I know that not to be the case. But still, these things need to be verified, so I dug in and re-traced the entire path that the metrics take.

These metrics are coming from a pair of Cisco Nexus Switches via SNMP. The data being pulled is a per-interface ifInOctets and ifOutOctets. As you can see from the linked pages, each of those are 32 bit counters, with “The total number of octets transmitted [in|out] of the interface, including framing characters”.

Practically speaking, what this gives you is an ever-increasing number. The idea behind this counter is that you query it, and receive a number of bytes (say, 100). This indicates that at the time you queried it, the interface has sent (in the case of ifOutOctets) 100 bytes. If you query it again ten seconds later, and you get 150, then you know that in the intervening ten seconds, the interface has sent 50 bytes, and since you queried it ten seconds apart, you determine that the interface has transmitted 5 bytes per second.

Having the counter work like this means that, in theory, you don’t have to worry about how frequently you query it. You could query it tomorrow, and if it went from 100 to 100000000, you would be able to figure out how many seconds had passed since you last asked, divide the byte difference by that interval, and figure out the average bytes per second that way. Granted, the resolution on those stats isn’t stellar at that frequency, but it would still be a number.
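That delta-and-divide arithmetic, including the counter wrap discussed next, can be sketched as a small shell function (a hypothetical helper for illustration, not part of any monitoring tool):

```shell
# Hypothetical helper: compute average bytes/sec from two successive readings
# of a 32-bit SNMP octet counter such as ifOutOctets, accounting for the
# wrap at 2^32.
rate() {
  prev=$1; cur=$2; interval=$3
  if [ "$cur" -ge "$prev" ]; then
    delta=$((cur - prev))
  else
    # the counter hit 2^32-1 and wrapped back around through zero
    delta=$((4294967296 - prev + cur))
  fi
  echo $((delta / interval))
}

rate 100 150 10           # prints 5, matching the example above
rate 4294967196 100 10    # wrapped: (100 + 100) / 10 = 20
```

Note that the wraparound branch only gives the right answer if the counter wrapped at most once between polls, which is exactly why polling frequency matters.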

Incidentally, you might wonder, “wait, didn’t you say it was 32 bits? That’s not huge. How big can it get?”. The answer is found in RFC 1155: Counter

This application-wide type represents a non-negative integer which monotonically increases until it reaches a maximum value, when it wraps around and starts increasing again from zero. This memo specifies a maximum value of 2^32-1 (4294967295 decimal) for counters.

In other words, 4.29 gigabytes (or just over 34 gigabits). It turns out that this is actually kind of an important facet to the whole “monitoring bandwidth” thing, because in our modern networks, switch interfaces are routinely 1Gb/s, often 10Gb/s, and sometimes even more. If our standard network interfaces can transfer one gigabit per second, then a fully utilized network interface can roll over an entire counter in about 35 seconds. If we’re only querying that interface once a minute, then we’re potentially losing a lot of data. Consider, then, a 10Gb/s interface. Are you pulling metrics more often than once every 4 seconds? If not, you may be losing data.

Fortunately, there’s an easy fix. Instead of ifInOctets and ifOutOctets, query ifHCInOctets and ifHCOutOctets.  They are 64 bit counters, and only roll over once every 18 exabytes. Even with a 100% utilized 100Gb/s interface, you’ll still only roll over a counter every 46 years or so.

I made this change to my collectd configuration as soon as I figured out what I was doing wrong, and fortunately, none of my metrics jumped, so I’m going to say I got lucky. Don’t be me – start out doing it the right way, and save yourself confusion and embarrassment later.  Use 64-bit counters from the start!

(Also, there are the equivalent HC versions for all of the other interface counters you’re interested in, like the UCast, Multicast, and broadcast packet stats – make sure to use the 64-bit version of all of them).

Thanks, I hope I managed to help someone!

by Matt Simmons at June 30, 2015 09:14 AM

June 29, 2015

Sean's IT Blog

GPUs Should Be Optional for VDI

Note: I disabled comments on my blog in 2014 because of spammers. Please comment on this discussion on Twitter using the #VDIGPU hashtag.

Brian Madden recently published a blog post arguing that GPU should not be considered optional for VDI.  This post stemmed from a conversation that he had with Dane Young on his podcast about a BriForum 2015 London session.

Dane’s statement that kicked off this discussion was:
”I’m trying to convince people that GPUs should not be optional for VDI.”

The arguments that were laid out in Brian’s blog post were:

1. You don’t think of buying a desktop without a GPU
2. They’re not as expensive as people think

I think these are poor arguments for adopting a technology.  GPUs are not required for general purpose VDI, and they should only be used when the use case calls for it.  There are a couple of reasons why:

1. It doesn’t solve user experience issues: User experience is a big issue in VDI environments, and many of the complaints from users have to do with their experience.  From what I have seen, a good majority of those issues have resulted from a) IT doing a poor job of setting expectations, b) storage issues, and/or c) network issues.

Installing GPUs in virtual environments will not resolve any of those issues, and the best practices are to turn off or disable graphics intensive options like Aero to reduce the bandwidth used on wide-area network links.

Some modern applications, like Microsoft Office and Internet Explorer, will offload some processing to the GPU.  The software GPU in vSphere can easily handle these requirements with some additional CPU overhead.  CPU overhead, however, is rarely the bottleneck in VDI environments, so you’re not taking a huge performance hit by not having a dedicated hardware GPU.

2. It has serious impacts on consolidation ratios and user densities: There are three ways to do hardware graphics acceleration for virtual machines running on vSphere with discrete GPUs.

(Note: These methods only apply to VMware vSphere. Hyper-V and XenServer have their own methods of sharing GPUs that may be similar to this.)

  • Pass-Thru (vDGA): The physical GPU is passed directly through to the virtual machines on a 1 GPU:1 Virtual Desktop basis.  Density is limited to the number of GPUs installed on the host. The VM cannot be moved to another host unless the GPU is removed. The only video cards currently supported for this method are high-end NVIDIA Quadro and GRID cards.
  • Shared Virtual Graphics (vSGA): VMs share access to GPU resources through a driver that is installed at the host level, and the GPU is abstracted away from the VM. The software GPU driver is used, and the hypervisor-level driver acts as an interface to the physical GPU.  Density depends on configuration…and math is involved due to the allocated video memory being split between the host’s and GPU’s RAM. vSGA is the only 3D graphics type that can be vMotioned to another host while the VM is running, even if that host does not have a physical GPU installed. This method supports NVIDIA GRID cards along with select Quadro, AMD FirePro, and Intel HD graphics cards.
  • vGPU: VMs share access to an NVIDIA GRID card.  A manager application is installed that controls the profiles and schedules access to GPU resources.  Profiles are assigned to virtual desktops that control resource allocation and number of virtual desktops that can utilize the card. A Shared PCI device is added to VMs that need to access the GPU, and VMs may not be live-migrated to a new host while running. VMs may not start up if there are no GPU resources available to use.

Figure 1: NVIDIA GRID Profiles and User Densities

There is a hard limit to the number of users that you can place on a host when you give every desktop access to a GPU, so it would require additional hosts to meet the needs of the VDI environment.  That also means that hardware could be sitting idle and not used to its optimal capacity because the GPU becomes the bottleneck.

The alternative is to try and load up servers with a large number of GPUs, but there are limits to the number of GPUs that a server can hold.  This is usually determined by the number of available PCIe x16 slots and available power, and the standard 2U rackmount server can usually only handle two cards.   This means I would still need to take on additional expenses to give all users a virtual desktop with some GPU support.

Either way, you are taking on unnecessary additional costs.

There are few use cases that currently benefit from 3D acceleration.  Those cases, such as CAD or medical imaging, often have other requirements that make high user consolidation ratios unlikely and are replacing expensive, high-end workstations.

Do I Need GPUs?

So do I need a GPU?  The answer to that question, like any other design question, is “It Depends.”

It greatly depends on your use case, and the decision to deploy GPUs will be determined by the applications in your use case.  Some of the applications where a GPU will be required are:

  • CAD and BIM
  • Medical Imaging
  • 3D Modeling
  • Computer Animation
  • Graphic Design

You’ll notice that these are all higher-end applications where 3D graphics are a core requirement.

But what about Office, Internet Explorer, and other basic apps?  Yes, more applications are offloading some things to the GPU, but these are often minor things to improve UI performance.  They can also be disabled, and the user usually won’t notice any performance difference.

Even if they aren’t disabled, the software GPU can handle these elements.  There would be some additional CPU overhead, but as I said above, VDI environments are usually constrained by memory and have enough available CPU capacity to accommodate this.

But My Desktop Has a GPU…

So let’s wrap up by addressing the point that all business computers have GPUs and how that should be a justification for putting GPUs in the servers that host VDI environments.

It is true that all desktops and laptops come with some form of a GPU.  But there is a very good reason for this. Business desktops and laptops are designed to be general purpose computers that can handle a wide-range of use cases and needs.  The GPUs in these computers are usually integrated Intel graphics cards, and they lack the capabilities and horsepower of the professional grade NVIDIA and AMD products used in VDI environments. 

Virtual desktops are not general purpose computers.  They should be tailored to their use case and the applications that will be running in them.  Most users only need a few core applications, and if they do not require that GPU, it should not be there.

It’s also worth noting that adding NVIDIA GRID cards to servers is a non-trivial task.  Servers require special factory configurations to support GPUs that need to be certified by the graphics manufacturer.  There are two reasons for this – GPUs often draw more than the 75W that a PCIe x16 slot can provide and are passively cooled, requiring additional fans.  Aside from one vendor on Amazon, these cards can only be acquired from OEM vendors as part of the server build.

The argument that GPUs should be required for VDI will make much more sense when hypervisors have support for mid-range GPUs from multiple vendors. Until that happens, adding GPUs to your virtual desktops is a decision that needs to be made carefully, and it needs to fit your intended use cases.  While there are many use cases where they are required or would add significant value, there are also many use cases where they would add unneeded constraints and costs to the environment. 

by seanpmassey at June 29, 2015 01:26 PM

June 25, 2015

The Geekess

I won Red Hat’s Women in Open Source Award!

At Red Hat Summit, I was presented with the first ever Women in Open Source Award.  I’m really honored to be recognized for both my technical contributions, and my efforts to make open source communities a better place.

For the past two years, I’ve worked as a coordinator for Outreachy, a program providing paid internships in open source to women (cis and trans), trans men, genderqueer people, and all participants of the Ascend Project.  I truly believe that newcomers to open source thrive when they’re provided mentorship, a supportive community, and good documentation.  When newcomers build relationships with their mentors and present their work at conferences, it leads to job opportunities working in open source.

That’s why I’m donating the $2,500 stipend for the Women in Open Source Award to Outreachy.  It may go towards internships, travel funding, or even paying consultants to advise us as we expand the program to include other underrepresented minorities.  There’s a saying in the activist community, “Nothing about us without us.”  We want to make sure that people of color are involved with the effort to expand Outreachy, and it’s unfair to ask those people to perform free labor when they’re already paid less than their white coworkers, and they may even be penalized for promoting diversity.

I urge people to donate to Outreachy, so we can get more Outreachy interns to conferences, and expand our internships to bring more underrepresented minorities into open source.  Any donation amount helps, and it’s tax deductible!

Sarah Sharp wins Women in Open Source Award!

by sarah at June 25, 2015 04:00 PM

June 22, 2015

Keep your home dir in Git with a detached working directory

Many posts have been written on putting your homedir in git. Nearly everyone uses a different method of doing so. I've found the method I'm about to describe in this blog post to work the best for me. I've been using it for more than a year now, and it hasn't failed me yet. My method was put together from different sources all over the web, long since gone or untraceable, so I'm documenting my setup here.

The features

So, what makes my method better than the rest? What makes it better than the multitude of pre-made tools out there? The answer is: it depends. I've simply found that this method suits me personally because:

  • It's simple to implement, simple to understand and simple to use.
  • It gets out of your way. It doesn't mess with repositories deeper in your home directory, or with any tools looking for a .git directory. In fact, your home directory won't be a git repository at all.
  • It's simple to see what's changed since you last committed. It's a little harder to see new files not yet in your repository, because by default everything is ignored unless you specifically add it.
  • No special tools required, other than Git itself. A tiny alias in your .profile takes care of all of it.
  • No fiddling with symlinks and other nonsense.

How does it work?

It's simple. We create what is called a "detached working tree". In a normal git repository, you've got your .git dir, which is basically your repository database. When you perform a checkout, the directory containing this .git dir is populated with files from the git database. This is problematic when you want to keep your home directory in Git, since many tools (including git itself) will scan upwards in the directory tree in order to find a .git dir. This creates crazy scenarios, such as Vim's CtrlP plugin trying to scan your entire home directory for file completions. Not cool. A detached working tree means your .git dir lives somewhere else entirely; only the actual checkout lives in your home dir. This means no more nasty .git directory in your home.

An alias 'dgit' is added to your .profile that wraps the git command. It understands this detached working tree and lets you use git as you normally would. The dgit alias looks like this:

alias dgit='git --git-dir ~/.dotfiles/.git --work-tree=$HOME'

Simple enough, isn't it? We simply tell git that our working tree doesn't reside in the same directory as the .git dir (~/.dotfiles), but rather in our home directory. We set the git-dir so git will always know where our actual git repository resides. Otherwise it would scan up from the current directory you're in and never find the .git dir, which is the whole point of this exercise.

Setting it up

Create a directory to hold your git database (the .git dir):

$ mkdir ~/.dotfiles/
$ cd ~/.dotfiles/
~/.dotfiles$ git init .

Create a .gitignore file that will ignore everything. You can be more conservative here and only ignore things you don't want in git. I like to pick and choose exactly which things I'll add, so I ignore everything by default and add files explicitly later.

~/.dotfiles$ echo "*" > .gitignore
~/.dotfiles$ git add -f .gitignore 
~/.dotfiles$ git commit -m "gitignore"

Now we've got a repository set up for our files. It's out of the way of our home directory, so the .git directory won't cause any conflicts with other repositories in your home directory. Here comes the magic part that lets us use this repository to keep our home directory in. Add the dgit alias to your .bashrc or .profile, whichever you prefer:

~/.dotfiles$ echo "alias dgit='git --git-dir ~/.dotfiles/.git --work-tree=\$HOME'" >> ~/.bashrc

You'll have to log out and in again, or just copy-paste the alias definition into your current shell. We can now check the repository out in our home directory with the dgit command:

~/.dotfiles$ cd ~
$ dgit reset --hard
HEAD is now at 642d86f gitignore

Now the repository is checked out in our home directory, and it's ready to have stuff added to it. The dgit reset --hard command might seem spooky (and I do suggest you make a backup before running it), but since we're ignoring everything, it'll work just fine.
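
If you do want that backup first, a quick tarball of your home directory is enough. The exact tar invocation here is my own suggestion, not part of the setup itself:

```shell
# Archive your home directory, skipping the repository database we just made
$ tar czf /tmp/home-backup.tar.gz --exclude='./.dotfiles' -C ~ .
```

If the reset clobbers something you cared about, you can pull it back out of /tmp/home-backup.tar.gz.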

Using it

Everything we do now, we do with the dgit command instead of normal git. In case you forget to use dgit, it simply won't work, so don't worry about that.

A dgit status shows nothing, since we've gitignored everything:

$ dgit status
On branch master
nothing to commit, working directory clean

We add things by overriding the ignore with -f:

$ dgit add -f .profile 
$ dgit commit -m "Added .profile"
[master f437f9f] Added .profile
 1 file changed, 22 insertions(+)
 create mode 100644 .profile
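
Once a file is tracked, later changes to it show up like in any normal repository, despite the catch-all gitignore. A hypothetical edit to the .profile we just added, for illustration:

```shell
$ echo 'export EDITOR=vim' >> ~/.profile
$ dgit status --short
 M .profile
$ dgit commit -am "Set EDITOR in .profile"
```

Only brand-new files need the -f; modifications to tracked files behave as usual.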

We can push our configuration files to a remote repository:

$ dgit remote add origin ssh://
$ dgit push origin master
 * [new branch]      master -> master

And easily deploy them to a new machine:

$ ssh someothermachine
$ git clone ssh:// ./.dotfiles
$ alias dgit='git --git-dir ~/.dotfiles/.git --work-tree=$HOME'
$ dgit reset --hard
HEAD is now at f437f9f Added .profile

Please note that any files that exist in your home directory will be overwritten by the files from your repository if they're present.


This DIY method of keeping your home directory in Git should be easy to understand. Although there are tools out there that are easier to use, this method requires nothing to be installed other than Git itself. As I stated in the introduction, I've been using it for more than a year, and I've found it to be the best way of keeping my home directory in Git.

by admin at June 22, 2015 01:30 PM

Warren Guy

Deploy a Tor hidden service to Heroku in under a minute

Getting a Tor hidden service running doesn't have to be hard. I've just published an example Sinatra application demonstrating how to deploy a hidden service to Heroku (or Dokku, etc) in just a few lines. The app uses my ruby-hidden-service library with the multi and apt Heroku buildpacks to install and configure Tor. A deployed example is running at

Here are the complete steps required to deploy the sample app:

Read full post

June 22, 2015 01:09 PM