Planet SysAdmin

August 27, 2016

Errata Security

Notes on that StJude/MuddyWaters/MedSec thing

I thought I'd write up some notes on the StJude/MedSec/MuddyWaters affair. Some references: [1] [2] [3] [4].

The story so far

tl;dr: hackers drop 0day on medical device company hoping to profit by shorting their stock

St Jude Medical (STJ) is one of the largest providers of pacemakers (a.k.a. cardiac devices) in the country, around $2.5 billion in revenue, which accounts for about half their business. They provide "smart" pacemakers with an on-board computer that talks via radio waves to a nearby monitor that records the functioning of the device (and health data). That monitor, "Merlin@Home", then talks back up to St Jude (via phone lines, 3G cell phone, or wifi). Pretty much all pacemakers work that way (my father's does, although his is from a different vendor).

MedSec is a bunch of cybersecurity researchers (white-hat hackers) who have been investigating medical devices. In theory, their primary business is to sell their services to medical device companies, to help companies secure their devices. Their CEO is Justine Bone, a long-time white-hat hacker. Despite Muddy Waters garbling the research, there's no reason to doubt that there's quality research underlying all this.

Muddy Waters is an investment company known for investigating companies, finding problems like accounting fraud, and profiting by shorting the stock of misbehaving companies.

Apparently, MedSec did a survey of many pacemaker manufacturers, chose the one with the most cybersecurity problems, and went to Muddy Waters with their findings, asking for a share of the profits Muddy Waters got from shorting the stock.

Muddy Waters published their findings in [1] above. St Jude published their response in [2] above. They are both highly dishonest. I point that out because people want to discuss the ethics of using 0day to short stock when we should talk about the ethics of lying.

"Why you should sell the stock" [finance issues]

In this section, I try to briefly summarize Muddy Waters's argument for why St Jude's stock will drop. I'm not an expert in this area (though I do a fair amount of investing), but their arguments do seem flimsy to me.

Muddy Waters's argument is that these pacemakers are half of St Jude's business, and that fixing them will first require recalling them all, then take another two years to fix, during which time St Jude can't be selling pacemakers. Much of the Muddy Waters paper is taken up explaining this, citing similar medical cases, and so on.

If true, and if the cybersecurity claims hold up, then yes, this would be good reason to short the stock. However, I suspect the claims aren't true -- Muddy Waters is simply trying to scare people about long-term consequences so that it can profit in the short term.

@selenakyle on Twitter suggests this interesting document [4] about market solutions to vuln disclosure, if you are interested in this angle of things.

Update from @lippard: Abbott Labs agreed in April to buy St Jude at $85 a share (when St Jude's stock was $60/share). Presumably, for this Muddy Waters attack on St Jude's stock price to profit from anything more than a really short-term stock drop (like dumping their short position today), Muddy Waters would have to believe this effort will cause Abbott Labs to walk away from the deal. Normally, there are penalties for doing so, but material things like massive vulnerabilities in a product should allow Abbott Labs to walk away without penalties.

The 0day being dropped

Well, they didn't actually drop 0day as such, just claims that 0day exists -- that it's been "demonstrated". Reading through their document a few times, I've created a list of the 0day they found, to the granularity one would expect from CVE numbers (CVE is a program, run by MITRE for the Department of Homeland Security, that assigns standard reference numbers to discovered vulnerabilities).

The first two, which can kill somebody, are the salient ones. The others are more normal cybersecurity issues, and may be of concern because they can leak HIPAA-protected info.

CVE-2016-xxxx: Pacemaker can be crashed, leading to death
From a reasonable distance (under 50 feet), an attacker pounding the pacemaker with malformed packets over several hours (either from an SDR or a hacked version of the Merlin@Home monitor) can crash it. Sometimes such crashes will brick the device; other times they put it into a state that may kill the patient by zapping the heart too quickly.

CVE-2016-xxxx: Pacemaker power can be drained, leading to death
Within a reasonable distance (under 50 feet) over several days, the pacemaker's power can slowly be drained at the rate of 3% per hour. While the user will receive a warning from their Merlin@Home monitoring device that the battery is getting low, it's possible the battery may be fully depleted before they can get to a doctor for a replacement. A non-functioning pacemaker may lead to death.
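For scale, a back-of-the-envelope sketch. It assumes the report's 3%/hour figure holds and treats the drain as constant; the starting charge and the 8-hours-a-night scenario are illustrative assumptions, not measurements:

```python
# Battery-drain arithmetic, assuming the claimed sustained drain rate of
# 3% of capacity per hour (a figure from the report, not my measurement).

def hours_to_depletion(start_pct: float, drain_pct_per_hour: float) -> float:
    """Hours until the battery hits 0% at a constant drain rate."""
    return start_pct / drain_pct_per_hour

# A full battery under continuous attack:
print(hours_to_depletion(100, 3))  # ~33 hours, i.e. under a day and a half

# If the attacker is only in range ~8 hours a night (say, via a compromised
# bedside monitor), the drain stretches out over several nights instead,
# which matches the report's "over several days" wording:
print(hours_to_depletion(100, 3) / 8)  # ~4 nights
```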

CVE-2016-xxxx: Pacemaker uses unauthenticated/unencrypted RF protocol
The above two items are possible because there is no encryption nor authentication in the wireless protocol, allowing any evildoer access to the pacemaker device or the monitoring device.
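To illustrate what's missing, here is a minimal sketch of per-packet authentication -- purely hypothetical, in no way St Jude's actual protocol. With something like this, the device could reject commands from arbitrary radios; key provisioning is the genuinely hard part and is not shown:

```python
# Hypothetical sketch: authenticate each radio packet with an HMAC tag so
# the device drops anything not produced by a holder of the shared key.
# The key, message names, and provisioning story are all assumptions here.
import hmac
import hashlib
import os

KEY = os.urandom(32)  # imagine this provisioned per-device at manufacture

def seal(command: bytes) -> bytes:
    """Append a 32-byte HMAC-SHA256 tag to an outgoing command."""
    tag = hmac.new(KEY, command, hashlib.sha256).digest()
    return command + tag

def accept(packet: bytes) -> bytes:
    """Verify the tag; reject unauthenticated packets."""
    command, tag = packet[:-32], packet[-32:]
    expected = hmac.new(KEY, command, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("unauthenticated packet dropped")
    return command

pkt = seal(b"telemetry-request")
assert accept(pkt) == b"telemetry-request"
```

Authentication alone doesn't stop eavesdropping on health data (that needs encryption too), but it's the piece that would block the crash and battery-drain attacks above.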

CVE-2016-xxxx: Merlin@Home contained hard-coded credentials and SSH keys
The password to connect to the St Jude network is the same for all devices, and thus easily reverse engineered.

CVE-2016-xxxx: local proximity wand not required
It's unclear in the report, but it seems that most other products require a wand in local proximity (inches) in order to enable communication with the pacemaker. This seems like a requirement -- otherwise, even with authentication, remote RF would be able to drain the device in the person's chest.

So these are, as far as I can tell, the explicit bugs they outline. Unfortunately, none are described in detail. I don't see enough detail for any of these to actually be assigned a CVE number. I'm being generous here, describing them as such and giving them the benefit of the doubt; there's enough weasel language in the report to make me doubt all of them. Though, if the first two prove not to be reproducible, there will be a great defamation case, so I presume those two are true.

The movie/TV plot scenarios

So if you wanted to use this as a realistic TV/movie plot, here are two of them.

#1 You (the executive of the acquiring company) are meeting with the CEO and executives of a smaller company you want to buy. It's a family concern, and the CEO really doesn't want to sell. But you know his/her children want to sell. Therefore, during the meeting, you pull out your notebook and an SDR device and put it on the conference room table. You start running the exploit to crash that CEO's pacemaker. It crashes; the CEO grabs his/her chest and gets carted off to the hospital. The children continue negotiations, selling off the company.

#2 You are a hacker in Russia going after a target. After many phishing attempts, you finally break into the home desktop computer. From that computer, you branch out and connect to the Merlin@Home devices through the hard-coded password. You then run an exploit from the device, using that device's own radio, to slowly drain the battery from the pacemaker, day after day, while the target sleeps. You patch the software so it no longer warns the user that the battery is getting low. The battery dies, and a few days later while the victim is digging a ditch, s/he falls over dead from heart failure.

The Muddy Waters document is crap

There are many ethical issues here, but the first should be the dishonesty and spin of the Muddy Waters research report.

The report is clearly designed to scare other investors to drop St Jude stock price in the short term so that Muddy Waters can profit. It's not designed to withstand long term scrutiny. It's full of misleading details and outright lies.

For example, it keeps stressing how shockingly bad the security vulnerabilities are, such as saying:
We find STJ Cardiac Devices’ vulnerabilities orders of magnitude more worrying than the medical device hacks that have been publicly discussed in the past. 
This is factually untrue. St Jude's problems are no worse than the 2013 issue where doctors disabled the RF capabilities of Dick Cheney's pacemaker in response to disclosures. They are no worse than that insulin pump hack. Bad cybersecurity is the norm for medical devices. St Jude may be among the worst, but not by an order of magnitude.

The term "orders of magnitude" is math, by the way, and means "at least 100 times worse". As an expert, I claim these problems are not even one order of magnitude (10 times worse). I challenge MedSec's experts to stand behind the claim that these vulnerabilities are at least 100 times worse than other public medical device hacks.

In many places, the language is wishy-washy. Consider this quote:
Despite having no background in cybersecurity, Muddy Waters has been able to replicate in-house key exploits that help to enable these attacks
The semantic content of this is nil. It says they weren't able to replicate the attacks themselves. They don't have sufficient background in cybersecurity to understand what they replicated.

Such language is pervasive throughout the document, things that aren't technically lies, but which aren't true, either.

Also pervasive throughout the document, repeatedly interjected for no reason in the middle of text, are statements like this, repeatedly stressing why you should sell the stock:
Regardless, we have little doubt that STJ is about to enter a period of protracted litigation over these products. Should these trials reach verdicts, we expect the courts will hold that STJ has been grossly negligent in its product design. (We estimate awards could total $6.4 billion.)
I point this out because Muddy Waters obviously doesn't feel the content of the document stands on its own, so that you can make this conclusion yourself. It instead feels the need to repeat this message over and over on every page.

Muddy Waters' violation of Kerckhoffs's Principle

One of the most important principles of cybersecurity is Kerckhoffs's Principle: that more openness is better. Or, phrased another way, that trying to achieve security through obscurity is bad.

The Muddy Waters document attempts to violate this principle. Besides the individual vulnerabilities, it makes the claim that St Jude's cybersecurity is inherently bad because it's open: it uses off-the-shelf chips, standard software (like Linux), and standard protocols. St Jude does nothing to hide or obfuscate these things.

Everyone in cybersecurity would agree this is good. Muddy Waters claims this is bad.

For example, some of their quotes:
One competitor went as far as developing a highly proprietary embedded OS, which is quite costly and rarely seen
In contrast, the other manufacturers have proprietary RF chips developed specifically for their protocols
Again, as a cybersecurity expert in this case, I challenge MedSec to publicly defend Muddy Waters on these claims.

Medical device manufacturers should do the opposite of what Muddy Waters claims. I'll explain why.

Either your system is secure or it isn't. If it's secure, then making the details public won't hurt you. If it's insecure, then making the details obscure won't help you: hackers are far more adept at reverse engineering than you can possibly understand. Making things obscure, though, does stop helpful hackers (i.e. cybersecurity consultants you hire) from making your system secure, since it's hard figuring out the details.

Said another way: your adversaries (such as me) hate seeing open systems that are obviously secure. We love seeing obscure systems, because we know you couldn't possibly have validated their security.

The point is this: Muddy Waters is trying to profit from the public's misconception about cybersecurity, namely that obscurity is good. The actual principle is that obscurity is bad.

St Jude's response was no better

In response to the Muddy Waters document, St Jude published this document [2]. It's equally full of lies -- the sort that may deserve a shareholder lawsuit. (I see lawsuits galore over this.) It says the following:
We have examined the allegations made by Capital and MedSec on August 25, 2016 regarding the safety and security of our pacemakers and defibrillators, and while we would have preferred the opportunity to review a detailed account of the information, based on available information, we conclude that the report is false and misleading.
If that's true, if they can prove this in court, then that will mean they could win millions in a defamation lawsuit against Muddy Waters, and millions more for stock manipulation.

But it's almost certainly not true. Without authentication/encryption, the fact that hackers can crash/drain a pacemaker is pretty obvious, especially since (as claimed by Muddy Waters) they've successfully done it. Specifically, the picture on page 17 of the 34-page Muddy Waters document is a smoking gun of a pacemaker misbehaving.

The rest of their document contains weasel-word denials that may be technically true, but which have no meaning.
St. Jude Medical stands behind the security and safety of our devices as confirmed by independent third parties and supported through our regulatory submissions. 
Our software has been evaluated and assessed by several independent organizations and researchers including Deloitte and Optiv.
In 2015, we successfully completed an upgrade to the ISO 27001:2013 certification.
These are all myths of the cybersecurity industry. Conformance with security standards, such as ISO 27001:2013, has absolutely zero bearing on whether you are secure. Having some consultants/white-hat claim your product is secure doesn't mean other white-hat hackers won't find an insecurity.

Indeed, having been assessed by Deloitte is a good indicator that something is wrong. It's not that they are incompetent (they've got some smart people working for them), but ultimately the way the security market works is that you demand such auditors find reasons to believe your product is secure, not that they keep hunting until they find something insecure. That's why outsiders, like MedSec, are better: they strive to find why your product is insecure. The bigger the enemy, the more resources they'll put into finding a problem.

It's like after you get a hair cut, your enemies and your friends will have different opinions on your new look. Enemies are more honest.

The most obvious lie from the St Jude response is the following:
The report claimed that the battery could be depleted at a 50-foot range. This is not possible since once the device is implanted into a patient, wireless communication has an approximate 7-foot range. This brings into question the entire testing methodology that has been used as the basis for the Muddy Waters Capital and MedSec report.
That's not how wireless works. With directional antennas and amplifiers, 7 feet easily becomes 50 feet or more. Even without that, something designed for reliable operation at 7 feet often still works, less reliably, at 50 feet. There's no hard cutoff at 7 feet inside of which it works and outside of which it doesn't.
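The arithmetic is simple. This sketch uses free-space numbers and ignores body attenuation, which would only change how much antenna gain the attacker needs:

```python
import math

def extra_path_loss_db(d_near: float, d_far: float) -> float:
    """Additional free-space path loss (in dB) going from d_near to d_far.
    Free-space loss grows as 20*log10(distance), independent of frequency."""
    return 20 * math.log10(d_far / d_near)

# From the 7-foot design range to a 50-foot attack range:
print(round(extra_path_loss_db(7, 50), 1))  # ~17.1 dB
```

A ~17 dB deficit is easily recovered with an off-the-shelf directional antenna plus a modest amplifier, which is why a 7-foot design range doesn't cap an attacker at 7 feet.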

That St Jude deliberately lies here brings into question their entire rebuttal. (see what I did there?)


Ethics

First let's discuss the ethics of lying, using weasel words, and being deliberately misleading. Both St Jude and Muddy Waters do this, and it's ethically wrong. I point this out for readers who are uninterested in it and want to get to that other ethical issue. Clear violations of ethics that we all agree on interest nobody -- but they ought to. We should be lambasting Muddy Waters for their clear ethical violations, not the unclear one.

So let's get to the ethical issue everyone wants to discuss:
Is it ethical to profit from shorting stock while dropping 0day?
Let's discuss some of the issues.

There's no insider trading. Some people wonder if there are insider trading issues. There aren't. While it's true that Muddy Waters knew some secrets that nobody else knew, as long as they weren't insider secrets, it's not insider trading. In other words, only insiders know about a key customer contract recently won or lost, but vulnerabilities researched by outsiders are still outside knowledge.

Watching a CEO walk into the building of a competitor is still outsider knowledge -- you can trade on the likely merger, even though insider employees cannot.

Dropping 0day might kill/harm people. That may be true, but it's never an ethical reason not to drop it, because it's not this one event in isolation. If companies knew ethical researchers would never drop an 0day, then they'd never patch it. It's like the government's warrantless surveillance of American citizens: the courts won't let us challenge it, because we can't prove it exists, and we can't prove it exists, because the courts allow it to be kept secret, because revealing the surveillance would harm national intelligence. The possibility that harm may happen shouldn't stop the right thing from happening.

In other words, in the long run, dropping this 0day doesn't necessarily harm people -- and thus profiting on it is not an ethical issue. We need incentives to find vulns. This moves the debate from an ethical one to more of a factual debate about the long-term/short-term risk from vuln disclosure.

As MedSec points out, St Jude has already proven itself an untrustworthy consumer of vulnerability disclosures. When that happens, dropping 0day becomes ethically permissible "responsible disclosure". Indeed, that St Jude then lied about it in their response justifies the dropping of the 0day ex post facto.

No 0day was actually dropped here. In this case, what was dropped was claims of 0day. This may be good or bad, depending on your arguments. It's good that the vendor will have some extra time to fix the problems before hackers can start exploiting them. It's bad because we can't properly evaluate the true impact of the 0day unless we get more detail -- allowing Muddy Waters to exaggerate and mislead people in order to move the stock more than is warranted.

In other words, the lack of actual 0day here is the problem -- actual 0day would've been better.

This 0day is not necessarily harmful. Okay, it is harmful, but it requires close proximity. It's not as if the hacker can reach out from across the world and kill everyone (barring my movie-plot section above). If you are within 50 feet of somebody, it's easier to shoot, stab, or poison them.

Shorting on bad news is common. Before we address the issue whether this is unethical for cybersecurity researchers, we should first address the ethics for anybody doing this. Muddy Waters already does this by investigating companies for fraudulent accounting practice, then shorting the stock while revealing the fraud.

Yes, it's bad that Muddy Waters profits on the misfortunes of others, but it's others who are doing fraud -- who deserve it. [Snide capitalism trigger warning] To claim this is unethical means you are a typical socialist who believes the State should defend companies, even those who do illegal things, in order to stop illegitimate/windfall profits. Supporting the ethics of this means you are a capitalist, who believes companies should succeed or fail on their own merits -- which means bad companies need to fail, and investors in those companies should lose money.

Yes, this is bad for cybersec research. There is constant tension between cybersecurity researchers doing "responsible" (sic) research and companies lobbying Congress to pass laws against it. We saw this recently when Detroit lobbied for DMCA (copyright) rules to bar security research, and when the DMCA regulators gave us an exemption. MedSec's action means all medical device manufacturers will now lobby Congress for rules to stop MedSec -- and the rest of us security researchers. The lack of public research means medical devices will continue to be flawed, which is worse for everyone.

Personally, I don't care about this argument. How others might respond badly to my actions is not an ethical constraint on my actions. It's like speech: that others may be triggered into lobbying for anti-speech laws is still not a constraint on what ethics allow me to say.

There were no lies or betrayal in the research. For me, "ethics" is usually a problem of lying, cheating, theft, and betrayal. As long as these things don't happen, then it's ethically okay. If MedSec had been hired by St Jude, had promised to keep things private, and then later disclosed them, then we'd have an ethical problem. Or consider this: frequently clients ask me to lie or omit things in pentest reports. It's an ethical quagmire. The quick answer, by the way, is "can you make that request in writing?". The long answer is "no". It's ethically permissible to omit minor things or do minor rewording, but not when it impinges on my credibility.

A life is worth about $10 million. Most people agree that "you can't put a value on a human life", and that those who do are evil. The opposite is true. Should we spend more on airplane safety, breast cancer research, or the military budget to fight ISIS? Each can be measured in the number of lives saved. Should we spend more on breast cancer research, which affects people in their 30s, or solving heart disease, which affects people in their 70s? All these decisions mean putting value on human life, and sometimes putting different values on different human lives. Whether or not you think it's ethical, it's the way the world works.

Thus, we can measure this disclosure of 0day in terms of potential value of life lost, vs. potential value of life saved.
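As a toy illustration of that framing -- every number here is hypothetical, chosen only to show the arithmetic, not an estimate of the actual risks:

```python
# Toy expected-value framing of vuln disclosure. All inputs are made-up
# illustrations; only the ~$10M statistical value of life comes from the text.
VALUE_OF_LIFE = 10_000_000  # USD

def net_value(lives_risked: float, p_harm: float,
              lives_saved: float, p_fix: float) -> float:
    """Expected dollar value of disclosing: lives saved by forcing a fix,
    minus lives put at risk by publishing, both probability-weighted."""
    return (lives_saved * p_fix - lives_risked * p_harm) * VALUE_OF_LIFE

# e.g. disclosure risks 5 lives at 1% each, but makes a fix likely enough
# to save 100 lives at 10% probability:
print(net_value(5, 0.01, 100, 0.10))  # positive => disclosure nets out good
```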

Is this market manipulation? This is more of a legal question than an ethical one, but people are discussing it. If the data is true, then it's not "manipulation" -- only if it's false. As documented in this post, there's good reason to doubt the complete truth of what Muddy Waters claims. I suspect it'll cost Muddy Waters more in legal fees in the long run than they could possibly hope to gain in the short run. I recommend investment companies stick to areas of their own expertise (accounting fraud) instead of branching out into things like cyber where they really don't grasp things.

This is again bad for security research. Frankly, we aren't a trusted community, because we claim the "sky is falling" too often, and are proven wrong. If this is proven to be market manipulation, as the stock recovers back to its former level and the scary stories of mass product recalls fail to emerge, we'll be blamed yet again for being wrong. That hurts our credibility.

On the other hand, if any of the scary things Muddy Waters claims actually come to pass, then maybe people will start heeding our warnings.

Ethics conclusion: I'm a die-hard troll, so therefore I'm going to vigorously defend the idea of shorting stock while dropping 0day. (Most of you appear to think it's unethical -- I therefore must disagree with you).  But I'm also a capitalist. This case creates an incentive to drop harmful 0days -- but it creates an even greater incentive for device manufacturers not to have 0days to begin with. Thus, despite being a dishonest troll, I do sincerely support the ethics of this.


Conclusion

The two 0days are about crashing the device (killing the patient sooner) or draining the battery (killing them later). Both attacks require hours (if not days) in close proximity to the target. If you can get into the local network (such as through phishing), you might be able to hack the Merlin@Home monitor, which is in close proximity to the target for hours every night.

Muddy Waters thinks the security problems are severe enough that it'll destroy St Jude's $2.5 billion pacemaker business. The argument is flimsy. St Jude's retort is equally flimsy.

My prediction: a year from now we'll see little change in St Jude's pacemaker earnings, though there may be some one-time costs cleaning stuff up. This will stop the shenanigans of future 0day+shorting, even when it's valid, because nobody will believe researchers.

by Robert Graham ( at August 27, 2016 05:09 PM

Chris Siebenmann

Why ZFS L2ARC is probably not a good fit for our setup

Back when I wrote up our second generation of ZFS fileservers, I mentioned in an aside that we expected to eventually add some L2ARCs to our pools. Since then, I've started to think that L2ARCs are not a good fit for our particular and somewhat peculiar setup. The simple problem is that, as far as I know, there is no such thing as an L2ARC that's shared between pools.

The easy and popular way to think about L2ARC is right there in its name; it's a second level cache for the in-RAM ARC. As with traditional kernel buffer caches, the in-RAM ARC is a shared, global cache for all ZFS pools on your system, where space is allocated to active data regardless of where it comes from. If you have multiple pools and one pool is very hot, data from it can wind up taking up most of the ARC; when the pool cools down and others start being active, the ARC shifts to caching them instead.

L2ARC doesn't behave like this, because a given L2ARC device can't be shared between pools. You don't and can't have a global L2ARC, with X GB of fast SSD space that simply holds the overflow from the ARC regardless of where that overflow came from. Instead you must decide up front how much of your total L2ARC space each pool will get (and I believe that how much L2ARC space you have in total has an impact on how much RAM will get used for L2ARC metadata). A hot pool cannot expand its L2ARC usage beyond what you gave it, and a cool pool cannot donate some of its unneeded space to a hot pool.
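The difference is easy to see with a toy cache model. This is purely illustrative; real ARC/L2ARC replacement is far more sophisticated than plain LRU, and the sizes are made up:

```python
# Toy model: one hot pool with a 100-block working set, served by either a
# shared global cache or a static half-sized partition (the L2ARC situation).
from collections import OrderedDict

class LRU:
    def __init__(self, cap: int):
        self.cap, self.d = cap, OrderedDict()
    def access(self, key) -> bool:
        """Touch a block; return True on a cache hit."""
        hit = key in self.d
        if hit:
            self.d.move_to_end(key)
        self.d[key] = True
        if len(self.d) > self.cap:
            self.d.popitem(last=False)  # evict least-recently-used
        return hit

# Global cache: the hot pool can use all 100 slots.
glob = LRU(100)
for blk in range(100):
    glob.access(("hot", blk))          # warm up
print(sum(glob.access(("hot", b)) for b in range(100)))  # 100: all hits

# Static split: the hot pool is confined to its 50 slots.
part = LRU(50)
for blk in range(100):
    part.access(("hot", blk))
print(sum(part.access(("hot", b)) for b in range(100)))  # 0: LRU thrashes
```

The second case is the per-pool L2ARC problem in miniature: the hot pool's working set no longer fits the slice it was given up front, and the idle pools' slices can't help.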

My impression is that many people operate ZFS servers where there are only one or two active pools (plus maybe the system pool). For these people, a per-pool L2ARC is effectively the same or almost the same as a global L2ARC. We are not in this situation. For administrative reasons we split different people and groups into different pools, which means that each of our three main fileservers has nine or ten ZFS pools.

As far as I can see, adding a decent sized L2ARC for each pool would rapidly put us over the normal recommendations for total L2ARC size (we have only 64 GB of RAM on each fileserver). Adding a small enough L2ARC to each pool to keep total L2ARC size down is likely to result in L2ARCs that are more decorative than meaningful. And splitting the difference probably doesn't really help either side; we might well wind up with excessive RAM use for L2ARC metadata and L2ARCs that are too small to be really useful. If we're going to spend money here at all, it would probably be more sensible and useful to add more RAM.
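To put rough numbers on it: each block cached in L2ARC needs a header kept in RAM. The ~180 bytes per record used below is a commonly cited historical figure and varies by ZFS version, so treat it, along with the pool count and record size, as assumptions:

```python
# Estimate RAM consumed by in-core L2ARC headers, assuming ~180 bytes of
# header per cached record (an oft-quoted historical figure; it varies
# across ZFS versions).

def l2arc_header_ram_gb(l2arc_gb: float, avg_record_kb: float,
                        header_bytes: int = 180) -> float:
    records = l2arc_gb * 1024 * 1024 / avg_record_kb
    return records * header_bytes / 1024 ** 3

# Ten pools with a 100 GB L2ARC each, dominated by small 8 KB records:
print(round(l2arc_header_ram_gb(10 * 100, 8), 1))  # ~22 GB of headers
```

On a 64 GB fileserver that's over a third of RAM gone to cache metadata alone, which is why "just buy more RAM" can come out ahead of adding SSDs.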

(RAM costs more than SSDs, but it's automatically global and thus balanced dynamically between pools. Whatever hot data needs it just gets it, no tuning required.)

All of this leads me to the conclusion that L2ARC is probably not a good fit for a situation like ours where you have a bunch of pools and your activity (and desire for L2 caching) is spread relatively evenly over them all. You can maybe make it work and it might improve things a bit, but the effort to performance increase ratio doesn't seem likely to be all that favorable.

(This old entry of mine has some information on L2ARC metadata memory requirements, although I don't know if things have changed since 2013.)

by cks at August 27, 2016 03:26 AM

August 26, 2016

Podcast: Application Security, Cryptography & PHP


I just uploaded the newest SysCast episode, it's available for your podcast listening pleasure.

In the latest episode I talk to Scott Arciszewski about all things security: from the OWASP top 10 to cache timing attacks, SQL injection and local/remote file inclusion. We also talk about his secure CMS called Airship, which takes a different approach to over-the-air updates.

Go have a listen: SysCast -- 6 -- Application Security & Cryptography with Scott Arciszewski.


by Mattias Geniar at August 26, 2016 10:37 AM

Chris Siebenmann

The single editor myth(ology)

One of the perniciously popular stories that you'll hear people saying and repeating all around the tech world is that you imprint on your first serious editor (or the first editor you master) and you can never really shift to another one. If you start out with GNU Emacs it's absurd to think that you can really move to vim, and vice versa, and so on. Among other things this leads people at the start of their career to worry about what editor to learn and to try to figure out which one they can stay with for a long time.

This story is garbage.

The reality is that editors are tools, just like many other things you use (such as programming languages). Yes, they're complex tools that you spend a lot of time with, but they're still tools. And you can and will learn and use many tools over the course of your career, editors included. It is both absurd and limiting to believe that you can't (or won't) shift tools repeatedly over the course of your career, or to insist that you can't possibly do so. Computing is far too fluid for that (and I say this as someone with an extremely durable environment).

(I lump IDEs in with editors here.)

Also, it's not as if editors and especially the environments that surround them are going to stand still over the multiple decades of your career. Let's take GNU Emacs as an example. The core behavior of GNU Emacs may be thirty or more years old, but the packages and add-ons around it that make up a high quality editing environment have changed, are changing, and will continue changing and evolving in the future. The high quality GNU Emacs environment of 1996 was quite different from the high quality GNU Emacs environment of 2016, and the 2026 version is likely to be different again. Much as with browsers and addons, this effectively creates a different editor over time, albeit one with broad similarities to its past self.

(One obvious thing driving the changes in sophisticated editor environments is the change in everything else in computing. There was no git or Mercurial in 1996, for example, so there were no packages to smoothly integrate with them.)

So all of these 'single editor' myths are wrong. Your first editor won't be your entire world (although it will inevitably shape your early views of editing and what you like); you can change your main editor over time (and you probably will); it's perfectly possible to be fluent in multiple editors and editing environments at once; and it's even possible to like different editors for different things instead of trying to do everything in one.

There are sensible reasons to focus on a single editor (or a few) instead of flitting around from editor to editor at the drop of a hat, but these are the same reasons that you might focus your programming efforts on a few languages instead of chasing after every interesting seeming one that catches your eye. And if you're so inclined, you might as well try out editors just as you try out programming languages, either to see if you like them or simply to expand your ideas on what's possible and useful.

(I am not so inclined in either editors or programming languages. With editors it generally feels as if there is very little point in exploring a new one, partly because most of my needs are modest and already pretty well met by editors that I already know. But I know that there are people out there who delight in taking new editors out for test drives.)

by cks at August 26, 2016 03:36 AM

August 25, 2016

Errata Security

Notes on the Apple/NSO Trident 0days

I thought I'd write up some comments on today's news of the NSO malware using 0days to infect human rights activists' phones. For full reference, you want to read the Citizen Lab report and the Lookout report.

Press: it's news to you, it's not news to us

I'm seeing breathless news articles appear. I dread the next time I talk to my mom, when she's going to ask about it (including "were you involved"). I suppose it is new to those outside the cybersec community, but for those of us insiders, it's not particularly newsworthy. It's just more government malware going after activists. It's just one more set of 0days.

I point this out in case the press wants to contact me for some awesome-sounding quote about how exciting/important this is. I'll have the opposite quote.

Don't panic: all patches fix 0days

We should pay attention to context: all patches (for iPhone, Windows, etc.) fix 0days that hackers can use to break into devices. Normally these 0days are discovered by the company itself or by outside researchers intending to fix (and not exploit) the problem. What's different here is that where most 0days are just a theoretical danger, these 0days are an actual danger -- currently being exploited by the NSO Group's products. Thus, there's maybe a bit more urgency in this patch compared to other patches.

Don't panic: NSA/Chinese/Russians using secret 0days anyway

It's almost certain the NSA, the Chinese, and the Russians have similar 0days. That means applying this patch makes you safe from the NSO Group (for a while, until they find new 0days), but it's unlikely this patch makes you safe from the others.

Of course it's multiple 0days

Some people are marveling at how the attack includes three 0days. That's been the norm for browser exploits for a decade now. There are sandboxes and ASLR protections to get through. There's privilege escalation to get into the kernel. And then there's persistence. How far you get in solving one or more of these problems with a single 0day depends upon luck.

It's actually four 0days

While it wasn't given a CVE number, there was a fourth 0day: the persistence mechanism, which uses the JavaScriptCore binary to run a JavaScript text file. The JavaScriptCore program appears to be only a tool for developers and is not needed for the functioning of the phone. It appears that the iOS 9.3.5 patch disables it. While technically it's not a coding "bug", it's still a design bug. 0days solving the persistence problem (where the malware/implant keeps running after the phone is rebooted) are worth over a hundred thousand dollars all on their own.

That about wraps it up for VEP

VEP is the Vulnerability Equities Process that's supposed to, but doesn't, manage how the government uses the 0days it acquires.

Agitators like the EFF have been fighting against the NSA's acquisition and use of 0days, as if this makes us all less secure. What today's incident shows is that acquisition/use of 0days will be widespread around the world, regardless of what the NSA does. It'd be nice to get more transparency about what the NSA is doing through the VEP process, but the reality is the EFF is never going to get anything close to what it's agitating for.

That about wraps it up for Wassenaar

Wassenaar is an international arms control "treaty". Left-wing agitators convinced the Wassenaar folks to add 0days and malware to the treaty -- with horrific results. There is essentially no difference between bad code and good code, only how it's used, so the Wassenaar extensions have essentially outlawed all good code and security research.

Some agitators are convinced Wassenaar can still be fixed (it can't). Israel, where the NSO Group is based, is not a member of Wassenaar, and thus whatever limitations Wassenaar could come up with would not stop the NSO.

Some have pointed out that Israel frequently adopts Wassenaar rules anyway, but in that case the company would simply move somewhere else, such as Singapore.

The point is that 0day development is intensely international. There are great 0day researchers throughout the non-Wassenaar countries. It's not like precision tooling for aluminum cylinders (for nuclear enrichment) that can only be made in an industrialized country. Some of the best 0day researchers come from backwards countries, growing up with only an Internet connection.


The victim in this case, Ahmed Mansoor, has apparently been hacked many times before, including with HackingTeam's malware and FinFisher malware -- notorious commercial products used by evil governments to hack into dissidents' computers.

Obviously, he'll be hacked again. He's a gold mine for researchers in this area. The NSA, anti-virus companies, Apple jailbreak companies, and the like should be jumping over themselves offering this guy a phone. One way this would work is giving him a new phone every 6 months in exchange for the previous phone to analyze.

Apple, of course, should head the list of companies doing this, providing "activist phones" to activists with their own secret monitoring tools installed so that they can regularly check whether some new malware/implant has been installed.

iPhones are still better, suck it Android

Despite the fact that everybody and their mother is buying iPhone 0days to hack phones, it's still the most secure phone. Androids are open to any old hacker -- iPhones are open only to nation-state hackers.

Use Signal, use Tor

I didn't see Signal on the list of apps the malware tapped into. There's no particular reason for this, other than that NSO hasn't gotten around to it yet. But I thought I'd point out how, yet again, Signal wins.

SMS vs. MitM

Some have pointed to SMS as the exploit vector, which is what gave Citizen Lab the evidence that the phone had been hacked.

It's a Safari exploit, so getting the user to visit a web page is required. This can be done over SMS, over email, over Twitter, or over any other messaging service the user uses. Presumably, SMS was chosen because users are more paranoid about links in phishing emails than they are about SMS messages.

However, the way it should be done is with man-in-the-middle (MitM) tools in the infrastructure. Such a tool would wait until the victim visited any webpage via Safari, then magically append the exploit to the page. As Snowden showed, this is apparently how the NSA does it, which is probably why they haven't gotten caught yet after exploiting iPhones for years.

The UAE (the government almost certainly trying to hack Mansoor's phone) in theory has the control over its infrastructure to conduct such a hack. We've already caught other governments doing similar things (like Tunisia). My guess is they were just lazy, and wanted to do it the way that was easiest for them.

by Robert Graham ( at August 25, 2016 10:16 PM

Cryptography Engineering

Attack of the week: 64-bit ciphers in TLS

A few months ago it was starting to seem like you couldn't go a week without a new attack on TLS. In that context, this summer has been a blessed relief. Sadly, it looks like our vacation is over, and it's time to go back to school.

Today brings the news that Karthikeyan Bhargavan and Gaëtan Leurent out of INRIA have a new paper that demonstrates a practical attack on legacy ciphersuites in TLS (it's called "Sweet32", website here). What they show is that ciphersuites that use 64-bit blocklength ciphers -- notably 3DES -- are vulnerable to plaintext recovery attacks that work even if the attacker cannot recover the encryption key.

While the principles behind this attack are well known, there's always a difference between attacks in principle and attacks in practice. What this paper shows is that we really need to start paying attention to the practice.

So what's the matter with 64-bit block ciphers?
source: Wikipedia

Block ciphers are one of the most widely-used cryptographic primitives. As the name implies, these are schemes designed to encipher data in blocks, rather than a single bit at a time.

The two main parameters that define a block cipher are its block size (the number of bits it processes in one go), and its key size. The two parameters need not be related. So for example, DES has a 56-bit key and a 64-bit block. Whereas 3DES (which is built from DES) can use up to a 168-bit key and yet still has the same 64-bit block. More recent ciphers have opted for both larger blocks and larger keys.

When it comes to the security provided by a block cipher, the most important parameter is generally the key size. A cipher like DES, with its tiny 56-bit key, is trivially vulnerable to brute force attacks that attempt decryption with every possible key (often using specialized hardware). A cipher like AES or 3DES is generally not vulnerable to this sort of attack, since the keys are much longer.

However, as they say: key size is not everything. Sometimes the block size matters too.

You see, in practice, we often need to encrypt messages that are longer than a single block. We also tend to want our encryption to be randomized. To accomplish this, most protocols use a block cipher in a scheme called a mode of operation. The most popular mode used in TLS is CBC mode. Encryption in CBC looks like this:

(source: wikipedia)
The nice thing about CBC is that (leaving aside authentication issues) it can be proven (semantically) secure if we make various assumptions about the security of the underlying block cipher. Yet these security proofs have one important requirement. Namely, the attacker must not receive too much data encrypted with a single key.

The reason for this can be illustrated via the following simple attack.

Imagine that an honest encryptor is encrypting a bunch of messages using CBC mode. Following the diagram above, this involves selecting a random Initialization Vector (IV) of size equal to the block size of the cipher, then XORing the IV with the first plaintext block (P), and enciphering the result (P ⊕ IV). The IV is sent (in the clear) along with the ciphertext.

Most of the time, the resulting ciphertext block will be unique -- that is, it won't match any previous ciphertext block that an attacker may have seen. However, if the encryptor processes enough messages, sooner or later the attacker will see a collision. That is, it will see a ciphertext block that is the same as some previous ciphertext block. Since the cipher is deterministic, this means the cipher's input (P ⊕ IV) must be identical to the cipher's previous input (P' ⊕ IV') that created the previous block.

In other words, we have (P ⊕ IV) = (P' ⊕ IV'), which can be rearranged as (P ⊕ P') = (IV ⊕ IV'). Since the IVs are random and known to the attacker, the attacker has (with high probability) learned the XOR of two (unknown) plaintexts!

What can you do with the XOR of two unknown plaintexts? Well, if you happen to know one of those two plaintext blocks -- as you might if you were able to choose some of the plaintexts the encryptor was processing -- then you can easily recover the other plaintext. Alternatively, there are known techniques that can sometimes recover useful data even when you don't know both blocks.
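The attacker's side of this arithmetic is worth seeing concretely. Below is a toy Python sketch (a hash stands in for the block cipher -- it isn't invertible, but determinism is all the attack needs -- and the values are contrived so the collision happens immediately):

```python
import hashlib

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

# Toy stand-in for a 64-bit block cipher: deterministic and key-dependent.
def encipher(key, block):
    return hashlib.sha256(key + block).digest()[:8]

key = b"secretk!"
# Two encryptions whose cipher inputs (P xor IV) happen to collide.
p1, iv1 = b"ATTACKAT", bytes(8)
p2 = b"DAWN OK?"
iv2 = xor(p2, xor(p1, iv1))      # forces (p2 ^ iv2) == (p1 ^ iv1)

c1 = encipher(key, xor(p1, iv1))
c2 = encipher(key, xor(p2, iv2))
assert c1 == c2                  # the attacker observes this collision

# Knowing p2 (a chosen/known plaintext) and both public IVs, the
# attacker recovers p1 without ever touching the key:
recovered = xor(p2, xor(iv1, iv2))
print(recovered)                 # b'ATTACKAT'
```

In a real attack the collision arrives by luck after enough traffic rather than by construction, but the recovery step is exactly this XOR.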

The main lesson here is that this entire mess only occurs if the attacker sees a collision. And the probability of such a collision is entirely dependent on the size of the cipher block. Worse, thanks to the (non-intuitive) nature of the birthday bound, this happens much more quickly than you might think it would. Roughly speaking, if the cipher block is b bits long, then we should expect a collision after roughly 2^{b/2} encrypted blocks.

In the case of a 64-bit blocksize cipher like 3DES, this is somewhere in the vicinity of 2^32, or around 4 billion enciphered blocks.

(As a note, the collision does not really need to occur in the first block. Since all blocks in CBC are calculated in the same way, it could be a collision anywhere within the messages.)
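That birthday-bound arithmetic is easy to check with a few lines of Python (a back-of-the-envelope sketch, using 8 bytes per 64-bit block and 16 bytes per 128-bit block):

```python
def collision_expected_blocks(block_bits):
    # Birthday bound: a collision becomes likely after about 2^(b/2) blocks.
    return 2 ** (block_bits / 2)

blocks_64 = collision_expected_blocks(64)     # 3DES and other 64-bit ciphers
blocks_128 = collision_expected_blocks(128)   # AES

print(blocks_64)                              # 4294967296.0 (~4 billion)
# At 8 bytes per 64-bit block, that is about 32 GiB of ciphertext:
print(blocks_64 * 8 / 2**30)                  # 32.0
# A 128-bit block pushes the same bound out to ~256 EiB:
print(blocks_128 * 16 / 2**60)                # 256.0
```

The gap between 32 GiB and 256 EiB is the whole story of why block size matters.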

Whew. I thought this was a practical attack. 4 billion is a big number!

It's true that 4 billion blocks seems like an awfully large number. In a practical attack, the requirements would be even larger -- since the most efficient attack is for the attacker to know a lot of the plaintexts, in the hope that she will be able to recover one unknown plaintext when she learns the value (P ⊕ P').

However, it's worth keeping in mind that these traffic numbers aren't absurd for TLS. In practice, 4 billion 3DES blocks works out to 32GB of raw ciphertext. A lot to be sure, but not impossible. If, as the Sweet32 authors do, we assume that half of the plaintext blocks are known to the attacker, we'd need to increase the amount of ciphertext to about 64GB. This is a lot, but not impossible.

The Sweet32 authors take this one step further. They imagine that the ciphertext consists of many HTTPS connections, consisting of 512 bytes of plaintext, in each of which is embedded the same secret 8-byte cookie -- and the rest of the session plaintext is known. Calculating from these values, they obtain a requirement of approximately 256GB of ciphertext needed to recover the cookie with high probability.

That is really a lot.

But keep in mind that TLS connections are being used to encipher increasingly more data. Moreover, a single open browser frame running attacker-controlled Javascript can produce many gigabytes of ciphertext in a single hour. So these attacks are not outside of the realm of what we can run today, and presumably will be very feasible in the future.

How does the TLS attack work?

While the cryptographic community has largely been pushing TLS away from CBC ciphersuites, in favor of modern authenticated modes of operation, these modes still exist in TLS. And they exist not only for modern ciphers like AES, but are often available for older ciphersuites like 3DES as well. For example, here's a connection I just made to Google:

Of course, just because a server supports 3DES does not mean that it's vulnerable to this attack. In order for a particular connection to be vulnerable, both the client and server must satisfy three main requirements:
  1. The client and server must negotiate a 64-bit cipher. This is a relatively rare occurrence, but can happen in cases where one of the two sides is using an out-of-date client. For example, stock Windows XP* does not support any of the AES-based ciphersuites. Similarly, SSL3 connections may negotiate 3DES ciphersuites. 
  2. The server and client must support long-lived TLS sessions, i.e., encrypting a great deal of data with the same key. Unfortunately, most web browsers place no limit on the length of an HTTPS session if Keep-Alive is used, provided that the server allows the session. The Sweet32 authors scanned and discovered that many servers (including IIS) will allow sessions long enough to run their attack. Across the Internet, the percentage of vulnerable servers is small (less than 1%), but includes some important sites.
  (Sites vulnerable to the attack -- source: Sweet32 paper.)
  3. The client must encipher a great deal of known data, including a secret session cookie. This is generally achieved by running adversarial Javascript code in the browser, although it could be done using standard HTML as well. 
These caveats aside, the authors were able to run their attack using Firefox, sending at a rate of about 1500 connections per second. With a few optimizations, they were able to recover a 16-byte secret cookie in about 30 hours (a lucky result, given an expected 38 hour run time).

So what do we do now?

While this is not an earthshaking result, it's roughly comparable to previous results we've seen with legacy ciphers like RC4.

In short, while these are not the easiest attacks to run, it's a big problem that there even exist semi-practical attacks that succeed against the encryption used in standard encryption protocols. This is a problem that we should address, and papers like this one can make a big difference in doing that.


* Note that by "stock" Windows XP, I'm referring to Windows XP as it was originally sold. According to Stefan Kanthak, Microsoft added AES support to SChannel via a series of updates beginning on August 11, 2009. It's not clear when these became "automatic install". So if you haven't updated your XP in a long time, that's probably a bad thing.

by Matthew Green ( at August 25, 2016 03:19 PM

Errata Security

Another lesson in confirmation bias

The biggest problem with hacker attribution is the confirmation bias problem. Once you develop a theory, your mind shifts to distorting evidence trying to prove the theory. After a while, only your theory seems possible as one that can fit all your carefully selected evidence.

You can watch this happen in two recent blogposts [1] [2] by Krypt3ia attributing bitcoin payments to the Shadow Broker hackers as coming from the government (FBI, NSA, TAO). These posts are absolutely wrong. Nonetheless, the press has picked up on the story and run with it [*]. [Note: click on the pictures in this post to blow them up so you can see them better].

The Shadow Brokers published their bitcoin address (19BY2XCgbDe6WtTVbTyzM9eR3LYr6VitWK) asking for donations to release the rest of their tools. They've received 66 transactions so far, totaling 1.78 bitcoin, or roughly $1000 at today's exchange rate.

Bitcoin is not anonymous but pseudonymous. Bitcoin is a public ledger, with all transactions visible to everyone. Sometimes we can't tie addresses back to people, but sometimes we can. A lot of researchers have spent a lot of time on "taint analysis", trying to track down the real identities of evildoers. Thus, it seems plausible that we might be able to discover the identities of those making contributions to the Shadow Brokers.

The first of Krypt3ia's errant blogposts tries to use the Bitcoin taint analysis plugin within Maltego in order to do some analysis on the Shadow Broker address. What he found was links to the Silk Road address -- the address controlled by the FBI since they took down that darknet marketplace several years ago. Therefore, he created the theory that the government (FBI? NSA? TAO?) was up to some evil tricks, such as trying to fill the account with money so that they could then track where the money went in the public blockchain.

But he misinterpreted the links. (He was wrong.) There were no payments from the Silk Road accounts to the Shadow Broker account. Instead, there were people making payments to both accounts. As a prank.

To demonstrate how this prank works, I made my own transaction, in which I pay money to the Shadow Brokers (19BY2...), to Silk Road (1F1A...), and to a few other well-known accounts controlled by the government.

The point here is that anybody can do these shenanigans. That government-controlled addresses are involved means nothing. They are public, and anybody can send coin to them.

That blogpost points to yet more shenanigans, such as somebody "rickrolling", to confirm that TAO hackers were involved. What you see in the picture below is a series of transactions using bitcoin addresses containing the phrase "never gonna give you up", the title of Rick Astley's song (I underlined the words in red).

Far from the government being involved, somebody else took credit for the hack, with the Twitter handle @MalwareTechBlog. In a blogpost [*], he describes what he did. He then proves his identity by signing a message at the bottom of his post, using the same key (the 1never... address above) he used in his tricks. Below is a screenshot of how I verified (and how anybody can verify) the key.

Moreover, these pranks should be seen in context. Goofball shenanigans on the blockchain are really, really common. An example is the following transaction:

Notice the vanity bitcoin address transferring money to the Silk Road account. There is also a "Public Note" on this transaction, a feature unique to -- which recently removed the feature because it was so extensively abused.

Bitcoin also has a feature where 40 bytes of a message can be added to transactions. The first transaction sending bitcoins to both Shadow Brokers and Silk Road was this one. If you tell it to "show scripts", you see that it contains an email address for Cryptome, the biggest and oldest Internet leaks site (albeit not as notorious as Wikileaks).
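That message feature is Bitcoin's OP_RETURN output. A minimal sketch of how such a script is assembled (0x6a is the OP_RETURN opcode in Bitcoin's script language; the 40-byte cap follows the description above, and the message here is just the rickroll phrase for illustration):

```python
OP_RETURN = 0x6a

def op_return_script(message: bytes) -> bytes:
    # OP_RETURN followed by a single direct push: <len> <data>.
    if len(message) > 40:
        raise ValueError("standard OP_RETURN payloads were capped at 40 bytes")
    return bytes([OP_RETURN, len(message)]) + message

script = op_return_script(b"never gonna give you up")
print(script.hex())
```

Anything stuffed into such an output is permanently visible to anyone reading the blockchain, which is exactly why it gets used for pranks and calling cards.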

The point is this: shenanigans and pranks are common on the Internet. What we see with Shadow Brokers is normal trickery. If you are unfamiliar with Bitcoin culture, it may look like some extra special trickery just for Shadow Brokers, but it isn't.

After much criticism of why his first blogpost was wrong, Krypt3ia published a second. The point of the second was to lambaste his critics -- just because he jotted down some idle thoughts in a post doesn't make him responsible for journalists like ZDNet picking it up and passing it around as a story.

But he continues with the claim that there is somehow evidence of government involvement, even though his original claim of payments from Silk Road was wrong. As he says:
However, my contention still stands that there be some fuckery going on here with those wallet transactions by the looks of it and that the likely candidate would be the government
Krypt3ia then goes on to claim, about the Rick Astley trick:
So yeah, these accounts as far as I can tell so far without going and spending way to many fucking hours on bitcoin.ifo or some such site, were created to purposely rick roll and fuck with the ShadowBrokers. Now, they may be fractions of bitcoins but I ask you, who the fuck has bitcoin money to burn here? Any of you out there? I certainly don’t and the way it was done, so tongue in cheek kinda reminds me of the audacity of TAO…
Who has bitcoin money to burn? The answer is everyone. Krypt3ia obviously isn't paying attention to the value of the bitcoins here, which is pennies. Each transaction of 0.0001337 bitcoins is worth about 10 cents at current exchange rates, meaning this Rick Roll cost less than $1. It takes minutes to open an account (like at and use your credit card (or debit card) to buy $1 worth of bitcoin and carry out this prank.
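The arithmetic is easy to check (the exchange rate is my rough assumption for August 2016, and the transaction count is illustrative rather than read off the ledger):

```python
btc_per_tx = 0.0001337        # per-transaction amount from the ledger
usd_per_btc = 580.0           # assumed rough August 2016 exchange rate
num_txs = 7                   # illustrative count of dust transactions

per_tx_usd = btc_per_tx * usd_per_btc
total_usd = per_tx_usd * num_txs
print(round(per_tx_usd, 3))   # ~0.078 -> under a dime each
print(round(total_usd, 2))    # well under $1 total
```

Whatever the exact rate that week, the whole rickroll cost less than a cup of coffee.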

He goes on to say:
If you also look at the wallets that I have marked with the super cool “Invisible Man” logo, you can see how some of those were actually transfering money from wallet to wallet in sequence to then each post transactions to Shadow. Now what is that all about huh? More wallets acting together? As Velma would often say in Scooby Doo, JINKY’S! Something is going on there.
Well, no, it's normal bitcoin transactions. (I've made this mistake too -- learned about it, then forgot about it, then had to relearn it.) A Bitcoin transaction needs to consume all of the previous transaction outputs that it refers to. This invariably leaves some bitcoin left over, which has to be transferred back into the user's wallet. Thus, in my hijinx at the top of this post, you see that the address 1HFWw... receives most of the bitcoin. That was an address newly created by my wallet back in 2014 to receive the unspent portions of transactions. While it looks strange, it's perfectly normal.
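The mechanics he's tripping over can be sketched in a few lines (the amounts, fee, and change address below are hypothetical; real wallets also choose which coins to spend more cleverly):

```python
def build_transaction(utxos, payments, fee):
    """Spend whole previous outputs (UTXOs); whatever is left over
    after the payments and fee goes back to a fresh change address
    owned by the sender's own wallet."""
    total_in = sum(utxos)
    total_out = sum(amount for _, amount in payments)
    change = total_in - total_out - fee
    if change < 0:
        raise ValueError("insufficient funds")
    outputs = list(payments)
    if change > 0:
        outputs.append(("1ChangeAddr...", change))  # hypothetical address
    return outputs

# Pay tiny prank amounts to two addresses out of a single 0.01 BTC coin:
outputs = build_transaction(
    utxos=[0.01],
    payments=[("19BY2XCgbDe6...", 0.0001337), ("1F1Anon...", 0.0001337)],
    fee=0.0001,
)
print(outputs)  # most of the coin flows back to the sender's change address
```

So a chain of wallets "transferring money from wallet to wallet in sequence" is just what change outputs look like when someone makes several small payments in a row.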

It's easy to point out that Krypt3ia just doesn't understand much about bitcoin, and is getting excited by Maltego output he doesn't understand.

But the real issue is confirmation bias. He's developed a theory, and searches for confirmation of that theory. He says "there are connections that cannot be discounted", when in fact all the connections can easily be discounted with more research, with more knowledge. When he gets attacked, he becomes even more motivated to search for reasons why he's actually right. He's not motivated to be proven wrong.

And this is the case for most "attribution" in cybersec. We don't have smoking guns (such as bitcoin coming from the Silk Road account), and must make do with flimsy data (like here, bitcoin going to the Silk Road account). Sometimes our intuition is right, and this flimsy data does indeed point us to the hacker. In other cases, it leads us astray, as I've documented before on this blog. The less we understand something, the more it confirms our theory rather than confirming that we just don't understand. That "we just don't know" is rarely an acceptable answer.

I point this out because I'm always the skeptic when the government attributes attacks to North Korea, China, Russia, Iran, and so on. I've seen them be right sometimes, and I've seen them be absolutely wrong. And when they are wrong, it's easy to figure out why -- because of things like confirmation bias.

Maltego plugin showing my Bitcoin hijinx transaction from above

Creating vanity addresses, for rickrolling or other reasons

by Robert Graham ( at August 25, 2016 04:32 AM

Chris Siebenmann

Blindly trying to copy a web site these days is hazardous

The other day, someone pointed a piece of software called HTTrack at Wandering Thoughts. HTTrack is a piece of free software that makes offline copies of things, so I presume that this person for some reason wanted this. I don't think it went as they intended and wanted.

The basic numbers are there in the logs. Over the course of a bit over 18 hours, they made 72,393 requests and received just over 193 MBytes of data. Needless to say, Wandering Thoughts does not have that many actual content pages; at the moment there are a bit over 6400 pages that my sitemap generation code considers to be 'real', some of them with partially duplicated content. How did 6400 pages turn into 72,000? Through what I call 'virtual directories', where various sorts of range based and date based views and so on are layered on top of an underlying directory structure. These dynamic pages multiply like weeds.

(I'm reasonably sure that 72,000 URLs doesn't cover them all by now, although I could be wrong. The crawl does seem to have gotten every real page, so maybe it actually got absolutely everything.)

Dynamic views of things are not exactly uncommon in modern software, and that means that blindly trying to copy a web site is very hazardous to your bandwidth and disk space (and it is likely to irritate the target a lot). You can no longer point a simple crawler (HTTrack included) at a site or a URL hierarchy and say 'follow every link', because it's very likely that you're not going to achieve your goals. Even if you do get 'everything', you're going to wind up with a sprawling mess that has tons of duplicated content.

(Of course HTTrack doesn't respect nofollow, and it also lies in its User-Agent by claiming to be running on Windows 98. For these and other reasons, I've now set things up so that it will be refused service on future visits. In fact I'm in a sufficiently grumpy mood that anything claiming to still be using Windows 98 is now banned, at least temporarily. If people are going to lie in their User-Agent, please make it more plausible. In fact, according to the SSL Server Test, Windows 98 machines can't even establish a TLS connection to this server. Well, I'm assuming that based on the fact that Windows XP fails, as the SSL Server Test doesn't explicitly cover Windows 98.)

PS: DWiki and this host didn't even notice the load from the HTTrack copy. We found out about it more or less by coincidence; a university traffic monitoring system noticed a suspiciously high number of sessions from a single remote IP to the server and sent in a report.

by cks at August 25, 2016 02:19 AM

August 24, 2016


The SWEET32 Issue, CVE-2016-2183

Today, Karthik Bhargavan and Gaetan Leurent from Inria have unveiled a new attack on Triple-DES, SWEET32, Birthday attacks on 64-bit block ciphers in TLS and OpenVPN. It has been assigned CVE-2016-2183.

This post gives a bit of background and describes what OpenSSL is doing. For more details, see their website.

Because DES (and triple-DES) has only a 64-bit block size, birthday attacks are a real concern. With the ability to run Javascript in a browser, it is possible to send enough traffic to cause a collision, and then use that information to recover something like a session Cookie. Their experiments have been able to recover a cookie in under two days. More details are available at their website. But the take-away is this: triple-DES should now be considered as “bad” as RC4.

Triple-DES, which shows up as “DES-CBC3” in an OpenSSL cipher string, is still used on the Web, and major browsers are not yet willing to completely disable it.

If you run a server, you should disable triple-DES. This is generally a configuration issue. If you run an old server that doesn’t support any better ciphers than DES or RC4, you should upgrade.
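Disabling it is usually one line of configuration. As a sketch, here is how you can do it from Python's ssl module, which wraps OpenSSL (the cipher-string syntax is OpenSSL's; start from the library default and explicitly strike 3DES):

```python
import ssl

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
# "DEFAULT:!3DES" keeps the default cipher list but removes
# every suite built on triple-DES.
ctx.set_ciphers("DEFAULT:!3DES")

names = [c["name"] for c in ctx.get_ciphers()]
assert not any("3DES" in n or "DES-CBC3" in n for n in names)
print(f"{len(names)} ciphers enabled, none of them 3DES")
```

The same "!3DES" exclusion works in any server configuration that takes an OpenSSL cipher string.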

Within the OpenSSL team, we discussed how to classify this, using our security policy, and we decided to rate it LOW. This means that we just pushed the fix into our repositories. Here is what we did:

  • For 1.0.2 and 1.0.1, we removed the triple-DES ciphers from the “HIGH” keyword and put them into “MEDIUM.” Note that we did not remove them from the “DEFAULT” keyword.

  • For the 1.1.0 release, which we expect to release tomorrow, we will treat triple-DES just like we are treating RC4. It is not compiled by default; you have to use “enable-weak-ssl-ciphers” as a config option. Even when those ciphers are compiled, triple-DES is only in the “MEDIUM” keyword. In addition, because this is a new release, we also removed it from the “DEFAULT” keyword.

When you have a large installed base, it is hard to move forward in a way that will please everyone. Leaving triple-DES in “DEFAULT” for 1.0.x and removing it from 1.1.0 is admittedly a compromise. We hope the changes above make sense, and even if you disagree and you run a server, you can explicitly protect your users through configuration.

Finally, we would like to thank Karthik and Gaetan for reaching out to us, and working closely to coordinate our releases with their disclosure.

August 24, 2016 11:16 PM

Geek and Artist - Tech

Adding Meaning to Code

This is the product of only about 5 minutes worth of thought, so take it with a grain of salt. When it comes to how to write maintainable, understandable code, there are as many opinions out there as there are developers. Personally I favour simple, understandable, even “boring” method bodies that don’t try to be flashy or use fancy language features. Method and class names should clearly signal intent and what the thing is or does. And, code should (IMHO) include good comments.

This last part is probably the area where I've seen the most dissent. For some reason people hate writing comments, and think that the code should be "self-documenting". I've rarely, perhaps never, seen this in practice. The intent may have been for the code to be self-documenting, but it never actually worked out that way.

Recently (and this is related, I promise), I watched a lot of talks (one, in person) and read a lot about the Zalando engineering principles. They base their engineering organisation around three pillars of How, What and Why. I think the same thing can be said for how you should write code and document it:

class Widget
  def initialize
    @expires_at = + 86400
  end

  # Customer X was asking for the ability to expire     #  <--- Why
  # widgets, but some may not have an expiry date or
  # do not expire at all. This method handles these
  # edge cases safely.
  def is_expired?                                       #  <--- What
    !!@expires_at && > @expires_at          #  <--- How
  end
end
This very simple example shows what I mean (in Ruby, since it's flexible and lends itself well to artificial examples like this). The method body itself should convey the How of the equation. The method name itself should convey the intent of the method - What does this do? Ultimately, the How and What can probably never fully explain the history and reasoning for their own existence. Therefore I find it helpful to accompany these with the Why in a method comment to this effect (and a comment above the method could also be within the method, or distributed across the method - it's not really important).

You could argue that history and reasoning for having the method can be determined from version control history. This turns coding from what should be a straightforward exercise into some bizarre trip through the Wheel of Time novels, cross-referencing back to earlier volumes in order to try to find some obscure fact that may or may not actually exist, so that you can figure out the reference you are currently reading. Why make the future maintainer of your code go through that? Once again, it relies entirely on the original committer having left a comprehensive and thoughtful message that is also easy to find.

The other counter argument is that no comments are better than out of date or incorrect comments. Again, personally I haven't run into this (or at least, not nearly as frequently as comments missing completely). Usually it will be pretty obvious where the comment does not match up with the code, and in this (hopefully outlier) case you can then go version control diving to find out when they diverged. Assessing contents of the code itself is usually far easier than searching for an original comment on the first commit of that method, so it seems like this should be an easier exercise.

Writing understandable code (and let's face it, most of the code written in the world is probably doing menial things like checking if statements, manipulating strings or adding/removing items from arrays) and comments is less fun than hacking out stuff that just works when you are feeling inspired, so no wonder we've invented an assortment of excuses to avoid doing it. So if you are one of the few actually doing this, thank you.

by oliver at August 24, 2016 06:45 PM

Everything Sysadmin

Automating a CI environment for Android Apps

One of the things my team at StackOverflow does is maintain the CI/CD system which builds all the software we use and produce. This includes the Stack Exchange Android App.

Automating the CI/CD workflow for Android apps is a PITA. The process is full of trips and traps. Here are some notes I made recently.

First, [this is the paragraph where I explain why CI/CD is important. But I'm not going to write it because you should know how important it is already. Plus, Google definitely knows already. That is why the need to write this blog post is so frustrating.]

And therefore, there are two important things that vendors should provide that make CI/CD easy for developers:

  • Rule 1: Builds should work from the command line on a multi-user system.
    1. Builds must work from a script, with no UI popping up. A CI system only has stdin/stdout/stderr.
    2. Multiuser systems protect files that shouldn't be modified from being modified.
    3. The build process should not rely on the internet. If it must download a file during the build, then we can't do builds if that resource disappears.
  • Rule 2: The build environment should be reproducible in an automated fashion.
    1. We must be able to create the build environment on a VM, tear it down, and build it back up again. We might do this to create dozens (or hundreds) of build machines, or we might delete the build VM between builds.
    2. This process should not require user interaction.
    3. It should be possible to automate this process with a tool such as Puppet or Chef. The steps should be idempotent.
    4. This process should not rely on any external systems.
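The rules above boil down to "check state before acting". As a purely illustrative sketch (the JDK path is borrowed from Step 1 below; the helper and demo path are hypothetical), an idempotent step looks like this:

```shell
#!/bin/sh
# Hypothetical sketch of Rule 2: every step tests current state first, so the
# script can run repeatedly (by hand or from Puppet/Chef) without side effects.
set -eu

ensure_link() {  # create or fix a symlink only when it does not already match
    target=$1 link=$2
    [ "$(readlink "$link" 2>/dev/null || true)" = "$target" ] && return 0
    ln -sfn "$target" "$link"
}

demo=$(mktemp -u)                            # throwaway path for demonstration
ensure_link /usr/java/jdk1.8.0_102 "$demo"
ensure_link /usr/java/jdk1.8.0_102 "$demo"   # second run is a no-op
```

Running it twice leaves the system in the same state as running it once, which is exactly the property a twice-hourly configuration management run needs.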

Android builds can be done from the command line. However, the process itself updates files in the build area. Creating the build environment simply cannot be automated without repackaging all of the files (something I'm not willing to do).

Here are my notes from creating a CI/CD system using TeamCity (a commercial product comparable to Jenkins) for the StackOverflow mobile developers:

Step 1. Install Java 8

The manual way:

CentOS has no pre-packaged Oracle Java 8 package. Instead, you must download it and install it manually.

Method 1: Download it from the Oracle web site. Pick the latest release, 8uXXX where XXX is a release number. (Be sure to pick "Linux x64" and not "Linux x86").

Method 2: Use the above web site to figure out the URL, then use this code to automate the downloading: (H/T to this SO post)

# cd /root
# wget --no-cookies --no-check-certificate --header \
    "Cookie:; oraclelicense=accept-securebackup-cookie" \

Dear Oracle: I know you employ more lawyers than engineers, but FFS please just make it possible to download that package with a simple curl or wget. Oh, and the fact that the certificate is invalid means that if this did come to a lawsuit, people would just claim that a MITM attack forged their agreement to the license.

Install the package:

# yum localinstall jdk-8u102-linux-x64.rpm

...and make a symlink so that our CI system can specify JAVA8_HOME=/usr/java and not have to update every individual configuration.

# ln -sf /usr/java/jdk1.8.0_102 /usr/java/jdk

We could add this package to our YUM repo, but the benefit would be negligible, plus it's questionable whether the license permits this.

EVALUATION: This step violates Rule 2 above because the download process is manual. It would be better if Oracle provided a YUM repo. In the future I'll probably put it in our local YUM repo. I'm sure Oracle won't mind.

Step 2. Party like it's 2010.

The Android tools are compiled for 32-bit Linux. I'm not sure why. I presume it is because they want to be friendly to the few developers out there that still do their development on 32-bit Linux systems.

However, I have a few other theories: (a) The Android team has developed a time machine that lets them travel back to 2010, because I happen to know for a fact that Google moved to 64-bit Linux internally around 2011; they created teams of people to find and eliminate any 32-bit Linux hosts. Therefore the only way the Android team could actually still be developing on 32-bit Linux is if they have either hidden their machines from their employer, or they have a time machine. (b) There is no "b". I can't imagine any other reason, and I'm jealous of their time machine.

Therefore, we install some 32-bit libraries to gain backwards compatibility. We do this and pray that the other builds happening on this host won't get confused. Sigh. (This is one area where containers would be very useful.)

# yum install -y glibc.i686 zlib.i686 libstdc++.i686

EVALUATION: B-. Android should provide 64-bit binaries.

Step 3. Install the Android SDK

The SDK has a command-line installer. The URL is obscured, making it difficult to automate this download. However, you can find the current URL by reading this web page, then clicking on "Download Options", and then selecting Linux. The last time we did this, the URL was:

You can install this in 1 line:

cd /usr/java && tar xzpvf /path/to/android-sdk_r24.4.1-linux.tgz

EVALUATION: Violates Rule 2 because it is not in a format that can easily be automated. It would be better to have this in a YUM repo. In the future I'll probably put this tarfile into an RPM with an install script that untars the file.

Step 4. Install/update the SDK modules.

Props to the Android SDK team for making an installer that works from the command line. Sadly it is difficult to figure out which modules should be installed. Once you know the modules you need, specifying them on the command line is "fun"... which is my polite way of saying "ugly."

First I asked the developers which modules they need installed. They gave me a list, which was wrong. It wasn't their fault. There's no history of what got installed. There's no command that shows what is installed. So there was a lot of guess-work and back-and-forth. However, we finally figured out which modules were needed.

The command to list all modules is:

/usr/java/android-sdk/tools/android list sdk -a

The modules we happened to need are:

  1- Android SDK Tools, revision 25.1.7
  3- Android SDK Platform-tools, revision 24.0.1
  4- Android SDK Build-tools, revision 24.0.1
  6- Android SDK Build-tools, revision 23.0.3
  7- Android SDK Build-tools, revision 23.0.2
  9- Android SDK Build-tools, revision 23 (Obsolete)
 19- Android SDK Build-tools, revision 19.1
 29- SDK Platform Android 7.0, API 24, revision 2
 30- SDK Platform Android 6.0, API 23, revision 3
 39- SDK Platform Android 4.0, API 14, revision 4
141- Android Support Repository, revision 36
142- Android Support Library, revision 23.2.1 (Obsolete)
149- Google Repository, revision 32

If that list looks like it includes a lot of redundant items, you are right. I don't know why we need 5 versions of the build tools (one of which is marked "obsolete") and 3 versions of the SDK. However, I do know that if I remove any of those, our builds break.

You can install these with this command:

/usr/java/android-sdk/tools/android update sdk \
    --no-ui --all --filter 1,3,4,6,7,9,19,29,30,39,141,142,149

However there's a small problem with this. Those numbers might be different as new packages are added and removed from the repository.

Luckily there is a "name" for each module that (I hope) doesn't change. However the names aren't shown unless you specify the -e option:

# /usr/java/android-sdk/tools/android list sdk -a -e

The output looks like:

Packages available for installation or update: 154
id: 1 or "tools"
     Type: Tool
     Desc: Android SDK Tools, revision 25.1.7
id: 2 or "tools-preview"
     Type: Tool
     Desc: Android SDK Tools, revision 25.2.2 rc1

Therefore a command that will always install that set of modules would be:

/usr/java/android-sdk/tools/android update sdk --no-ui --all \
    --filter tools,platform-tools,build-tools-24.0.1,\

Feature request: The name assigned to each module should be listed in the regular listing (without the -e) or the normal listing should end with a note: "For details, add the -e flag."

EVALUATION: Great! (a) Thank you for the command-line tool. The docs could be a little bit better (I had to figure out the -e trick) but I got this to work. (b) Sadly, I can't automate this with Puppet/Chef because they have no way of knowing if a module is already installed, therefore I can't make an idempotent installer. Without that, the automation would blindly re-install the modules every time it runs, which is usually twice an hour. (c) I'd rather have these individual modules packaged as RPMs so I could just install the ones I need. (d) I'd appreciate a way to list which modules are installed. (e) update should not re-install modules that are already installed, unless a --force flag is given. What are we, barbarians?
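Until the installer grows that kind of behavior, idempotence can be faked from the outside. Here is a hedged, hypothetical sketch (the assumption, which you should verify against your SDK, is that each installed build-tools version shows up as a directory under `$SDK/build-tools/`); the actual install command is shown as a dry run:

```shell
#!/bin/sh
# Hypothetical guard for the module installer: skip the update when the
# build-tools version is already unpacked. The directory layout
# ($SDK/build-tools/<version>) is an assumption based on the r24-era SDK.
set -eu
SDK=${ANDROID_SDK:-/usr/java/android-sdk}

need_build_tools() {   # true (exit 0) if this build-tools version is missing
    [ ! -d "$SDK/build-tools/$1" ]
}

if need_build_tools 24.0.1; then
    # dry run; the real command would be run without the echo:
    echo "$SDK/tools/android update sdk --no-ui --all --filter build-tools-24.0.1"
fi
```

A Puppet/Chef resource could wrap the same check as its "only if" condition, which would stop the blind re-install on every run.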

Step 4: Install license agreements

The software won't run unless you've agreed to the license. According to Android's own website you do this by asking a developer to do it on their machine, then copy those files to the CI server. Yes. I laughed too.

EVALUATION: There's no way to automate this. In the future I will probably make a package out of these files so that we can install them on any CI machine. I'm taking suggestions on what I should call this package. I think android-sdk-lie-about-license-agreements.rpm might be a good name.
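For what it's worth, the package's install script would not need to do much. This is a hypothetical sketch: the assumption is that the accepted-license hash files (such as android-sdk-license) live in a licenses/ directory under the SDK, and the source path is wherever a developer's accepted copies were collected:

```shell
#!/bin/sh
# Hypothetical helper for shipping pre-accepted license files to CI hosts.
# Assumes the build checks hash files (e.g. android-sdk-license) under
# $SDK/licenses; the source dir is a copy taken from a developer machine.
set -eu

install_licenses() {   # usage: install_licenses SRC_DIR SDK_DIR
    src=$1 sdk=$2
    mkdir -p "$sdk/licenses"
    cp -a "$src"/. "$sdk/licenses"/
}
```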

Step 5: Fix stupidity.

At this point we thought we were done, but the app build process was still breaking. Sigh. I'll save you the long story, but basically we discovered that the build tools want to be able to write to /usr/java/android-sdk/extras.

It isn't clear if they need to be able to create files in that directory or write within the subdirectories. Fuck it. I don't have time for this shit. I just did:

chmod 0775 /usr/java/android-sdk/extras
chown $BUILDUSER /usr/java/android-sdk
chown -R $BUILDUSER /usr/java/android-sdk/extras

("$BUILDUSER" is the username that does the compiles. In our case it is teamcity because we use TeamCity.)

Maybe I'll use my copious spare time some day to figure out if the -R is needed. I mean... what sysadmin doesn't have tons of spare time to do science experiments like that? We're all just sitting around with nothing to do, right? In the meanwhile, -R works so I'm not touching it.

EVALUATION: OMG. Please fix this, Android folks! Builds should not modify themselves! At least document what needs to be writable!

Step 6: All done!

At this point the CI system started working.

Some of the steps I automated via Puppet, the rest I documented in a wiki page. In the future when we build additional CI hosts Puppet will do the easy stuff and we'll manually do the rest.

I don't like having manual steps but at our scale that is sufficient. At least the process is repeatable now. If I had to build dozens of machines, I'd wrap all of this up into RPMs and deploy them. However then the next time Android produces a new release, I'd have to do a lot of work wrapping the new files in an RPM, testing them, and so on. That's enough effort that it should be in a CI system. If you find that you need a CI system to build the CI system, you know your tools weren't designed with automation in mind.

Hopefully this blog post will help others going through this process.

If I have missed steps, or if I've missed ways of simplifying this, please post in the comments!

P.S. Dear Android team: I love you folks. I think Android is awesome and I love that you name your releases after desserts (though I was disappointed that "L" wasn't Limoncello.... but that's just me being selfish.). I hope you take my snark in good humor. I am a sysadmin who wants to support his developers as best he can, and fixing these problems with the Android SDK would really help. Then we can make the most awesome Android apps ever.... which is what we all want. Thanks!

by Tom Limoncelli at August 24, 2016 03:00 PM

Sean's IT Blog

What’s New in NVIDIA GRID August 2016

Over the last year, the great folks over at NVIDIA have been very busy.  Last year at this time, they announced the M6 and M60 cards, bringing the Maxwell architecture to GRID, adding support for blade server architectures, and introducing the software licensing model for the drivers.  In March, GRID 3.0 was announced, and it was a fix for the new licensing model.

Today, NVIDIA announced the August 2016 release of GRID.  This is the latest edition of the GRID software stack, and it coincides with the general availability of the high-density M10 card that supports up to 64 users.

So aside from the hardware, what’s new in this release?

The big addition to the GRID product line is monitoring.  In previous versions of GRID, there was a limited amount of performance data that any of the NVIDIA monitoring tools could see.  NVIDIA SMI, the hypervisor component, could only really report on the GPU core temperature and wattage, and the NVIDIA WMI counters on Windows VMs could only see framebuffer utilization.

The GRID software now exposes more performance metrics from the host and the guest VM level.  These metrics include discovery of the vGPU types currently in use on the physical card as well as utilization statistics for 3D, encode, and decode engines from the hypervisor and guest VM levels.  These stats can be viewed using the NVIDIA-SMI tool in the hypervisor or by using NVIDIA WMI in the guest OS.  This will enable 3rd-party monitoring tools, like Liquidware Stratusphere UX, to extract and analyze the performance data.  The NVIDIA SDK has been updated to provide API access to this data.

Monitoring was one of the missing pieces in the GRID stack, and the latest release addresses this.  It’s now possible to see how the GPU’s resources are being utilized and if the correct profiles are being utilized.

The latest GRID release supports the M6, M60, and M10 cards for customers who have an active software support contract with NVIDIA.  Unfortunately, the 1st-generation K1 and K2 cards are not supported.

by seanpmassey at August 24, 2016 02:00 PM

August 23, 2016

Racker Hacker

What’s Happening in OpenStack-Ansible (WHOA) – August 2016

Welcome to the third post in the series of What’s Happening in OpenStack-Ansible (WHOA) posts that I’m assembling each month. OpenStack-Ansible is a flexible framework for deploying enterprise-grade OpenStack clouds. In fact, I use OpenStack-Ansible to deploy the OpenStack cloud underneath the virtual machine that runs this blog!

My goal with these posts is to inform more people about what we’re doing in the OpenStack-Ansible community and bring on more contributors to the project.

There are plenty of updates since the last post from mid-July. We’ve had our Mid-cycle meeting and there are plenty of new improvements in flight.

New releases

The OpenStack-Ansible releases are announced on the OpenStack development mailing list. Here are the things you need to know:


The latest Liberty release, 12.2.1, contains lots of updates and fixes. There are plenty of neutron bug fixes included in the release along with upgrade improvements. Deployers also have the option to block all container restarts until they are ready to reboot containers during a maintenance window.


Mitaka is the latest stable release available and the latest version is 13.3.1. This release also brings in a bunch of neutron fixes and several “behind the scenes” fixes for OpenStack-Ansible.

Notable discussions

This section covers discussions from the OpenStack-Ansible weekly meetings, IRC channels, mailing lists, or in-person events.

Mid-cycle meeting

We had a great mid-cycle meeting at the Rackspace headquarters in San Antonio, Texas.

The meeting drew community members from various companies from all over the United States and the United Kingdom. We talked about the improvements we need to make in the remainder of the Newton cycle, including upgrades, documentation improvements, and new roles.

Here is a run-down of the biggest topics:

Install guide overhaul

The new install guide is quickly coming together and it’s much easier to follow for newcomers. There is a big need now for detailed technical reviews to ensure that the new content is clear and accurate!

Ansible 2.1

The decision was made to bump each of the role repositories to Ansible 2.1 to match the integrated repository. It was noted that Ansible 2.2 will bring some performance improvements once it is released.

Ubuntu 16.04 Xenial Support

This is a high priority for the remainder of the Newton cycle. The Xenial gate jobs will be switched to voting and Xenial failures will need to be dealt with before any additional patches will be merged.

Ubuntu 14.04 Trusty Support

Many of the upstream OpenStack projects are removing 14.04 support soon and OpenStack-Ansible will drop 14.04 support in the Ocata release.

Power CPU support

There’s already support for Power systems as hypervisors within OpenStack-Ansible, and IBM is testing mixed x86 and PPC environments now. However, we still need some way to test these mixed environments in the OpenStack gate tests. Two IBMers from the OpenStack-Ansible community are working with the infra team to find out how this can be done.

Inventory improvements

The inventory generation process for OpenStack-Ansible is getting more tests and better documentation. Generating inventory is a difficult process to understand, but it is critical for the project’s success.

Gnocchi / Telemetry improvements

We got an update on gnocchi/ceilometer and set some plans on how to go forward with the OpenStack services and the data storage challenges that go along with each.

Mailing list

The OpenStack-Ansible tag was fairly quiet on the OpenStack Development mailing list during the time frame of this report, but there were a few threads:


Michael Gugino wrote about deploying nova-lxd with OpenStack-Ansible. This is one of the newest features in OpenStack-Ansible.

I wrote one about an issue that was really difficult to track down. I had instances coming online with multiple network ports attached when I only asked for one port. It turned out to be a glance issue that caused a problem in nova.

Notable developments

This section covers some of the improvements coming to Newton, the upcoming OpenStack release.

Bug fixes

Services were logging to stderr and this caused some log messages to be logged multiple times on the same host. This ate up additional disk space and disk I/O performance. Topic

The xtrabackup utility causes crashes during certain situations when the compact option is used. Reviews


LBaaS v2 panels were added to Newton and Mitaka. Ironic and Magnum panels were added to Newton.


Plenty of work was merged towards improving the installation guide to make it more concise and easy to follow. Topic

Multi-Architecture support

Repo servers can now build Python wheels for multiple architectures. This allows for mixed control/data plane environments, such as x86 (Intel) control planes with PPC (Power8) hypervisors running PowerKVM or PowerVM. Review

Performance improvements

The repo servers now act as an apt repository cache, which helps improve the speed of deployments. This also helps with deployers who don’t have an active internet connection in their cloud.

The repo servers now only build the wheels and virtual environments necessary for the services which are actually being deployed. This reduces the wait required while the wheels and virtual environments are built, but it also has an added benefit of reducing the disk space consumed. Topic


The MariaDB restarts during upgrades are now handled more carefully to avoid disruptions from container restarts. Review

Deployers now have the option to choose if they want packages updated during a deployment or during an upgrade. There are per-role switches as well as a global switch that can be toggled. Topic


The goal of this newsletter is threefold:

  • Keep OpenStack-Ansible developers updated with new changes
  • Inform operators about new features, fixes, and long-term goals
  • Bring more people into the OpenStack-Ansible community to share their use
    cases, bugs, and code

Please let me know if you spot any errors, areas for improvement, or items that I missed altogether. I’m mhayden on Freenode IRC and you can find me on Twitter anytime.


by Major Hayden at August 23, 2016 08:35 PM

August 22, 2016

Evaggelos Balaskas

Read It Later or Read It Never ?



I really like this comic.
I try to read/learn something every day.


Sometimes, when I find an interesting article, I like to mark it for reading it later.


I use many forms of marking, like pin tabs, bookmarking, sending url via email, save the html page to a folder, save it to my wallabag instance, leave my browser open to this tab, send the URL QR to my phone etc etc etc.


Are all the above ways productive?


None … the time to read something is now!
I mean the first time you lay your eyes upon the article.


Not later, not when you have free time, now.


That’s the way it works with me. Perhaps with you something else is more productive.


I have a short attention span, and it is better for me to drop everything and read something carefully than to save it for later or some other time.


When I really have to save it for later, my preferable way is to save it to my wallabag instance. It’s perfect and you will love it.


I also have a kobo ebook (e-ink) reader. Not the android based.
From my wallabag I can save them to epub and export them to my kobo.


But I am lazy and I never do it.


My kobo reader has a pocket (getpocket) account.


So I’ve tried to save some articles, but pocket cannot always parse the content of an article properly. Not even wallabag works 100% of the time.


The superiority of wallabag (and self-hosted applications in general) is that when a parsing problem occurs I can fix it! I can open a pull request, and then EVERYBODY in the community will be able to read-this article from this content provider-later. I can’t do something like that with pocket or readability.


And then … there are ads !!! Lots of ads, Tons of ads !!!


There is a correct way to do ads and this is when you are not covering the article you want people to read!
There are a lot of wrong ways to do ads: inlining them in the text, placing them above the article, hiding some of the content, making people pay a fee, splitting an article across many small pages (you know that height in HTML is not a problem, right?), and then there are bandwidth issues.

When I am on my mobile, I DON’T want to pay extra for bandwidth I DIDN’T ask for and certainly do not care about!!!
If I read the article on my tiny mobile display, DO NOT COVER the article with huge ads whose X-close button I can not find because it doesn’t fit on my display!!!

So yes, there is a correct way to do ads and that is by respecting the reader and there is a wrong way to do ads.


Getting back to the article’s subject, below you will see six (6) ways to read an article on my desktop. Of course there are hundreds of ways, but these are the most common ones:


Article: The cyberpunk dystopia we were warned about is already here

Extra info:
windows width: 852
2 times zoom-out to view more text


01. Pocket
02. Original Post in Firefox 48.0.1
03. Wallabag
04. Reader View in Firefox
05. Chromium 52.0.2743.116
06. Midori 0.5.11 - WebKitGTK+ 2.4.11



I believe that Reader View in Firefox is the winner of this test. It is clean and it is focusing on the actual article.
Impressive !

Tag(s): wallabag

August 22, 2016 05:42 PM

Sean's IT Blog

Horizon 7.0 Part 7–Installing Composer

The last couple of posts have dealt with preparing the environment to install Horizon 7.0.  We’ve covered prerequisites, design considerations, preparing Active Directory, and even setting up the service accounts that will be used for accessing services and databases.

Now it’s time to actually install and configure the Horizon View components.  These tasks will be completed in the following order:

  • Install Horizon Composer
  • Install Horizon Connection Servers
  • Configure the Environment for the first time
  • Install and Configure Remote Access Components

One note that I want to point out is that the installation process for most components has not changed significantly from previous versions.  If you’ve installed Horizon 6.x, this process will look very familiar to you.

Before we can install Composer, we need to create an ODBC Data Source to connect to the Composer database.  The database and the account for accessing the database were created in Part 6.  Composer can be installed once the ODBC data source has been created.

Composer can either be installed on your vCenter Server or on a separate Windows Server.  The first option is only available if you are using the Windows version of vCenter.  This walkthrough assumes that Composer is being installed on a separate server.

Service Account

Part 6 covers the steps for creating the Composer service account that will be used to connect Composer to vCenter.  This account will require local administrator rights on the server prior to installing Composer.

Creating the ODBC Data Source

Unfortunately, the Composer installer does not create the ODBC Data Source as part of the Composer installation, and this is something that will need to be created by hand before Composer can be successfully installed.  The View Composer database doesn’t require any special settings in the ODBC setup, so this step is pretty easy.

The SQL Server Native Client is not bundled with the Composer installation.  Prior to configuring the ODBC Data Source, the SQL Server Native Client for your version of SQL Server will need to be installed.  The Native Client for common versions of SQL Server can be found at:

The SQL Server Native Client was discontinued after SQL Server 2012, and it was replaced with a SQL Server ODBC Driver.  I do not know if this driver is supported with Composer, and I do not have a SQL Server 2014 database server to test with.

Once the Native Client is installed, you can begin creating the ODBC Data Source.

Note: The ODBC DSN setup can be launched from within the installer, but I prefer to create the data source before starting the installer.  The steps for creating the data source are the same whether you launch the ODBC setup from the start menu or in the installer.

1. Go to Start –> Administrative Tools –> Data Sources (ODBC).  On Windows Server 2012 R2, go to Start –> All Programs –> ODBC Data Sources (64-bit)

2. Click on the System DSN tab.


3. Click Add.

4. Select the correct SQL Server Native Client and click Finish.  If your database is on SQL Server 2008 R2, the native client will be version 10.0, and if it is on SQL Server 2012 or later, the correct version of the native client is 11.0. This will launch the wizard that will guide you through setting up the data source.

5. When the Create a New Data Source wizard launches, you will need to enter a name for the data source, a description, and the name of the SQL Server that the database resides on.  If you have multiple instances on your SQL Server, it should be entered as ServerName\InstanceName.  Click next to continue.


6. Select SQL Server Authentication.  Enter your SQL Server username and password that you created above.  Click Next to continue.


7. Change the default database to the viewComposer database that you created in Part 6.  Click Next to continue.


8. Click Test Data Source to verify that your settings are correct.


9. If your database settings are correct, you will see the windows below.  If you do not see the TESTS COMPLETED SUCCESSFULLY, verify that you have entered the correct username and password and that your login has the appropriate permissions on the database object.  Click OK to return to the previous window.


10. Click OK to close the Data Source Administrator and return to the desktop.


Installing Horizon Composer

Once the database connection has been set up, Composer can be installed.  The steps for installing Composer are:

1.  Launch the Horizon 7 Composer installer.

2.  If .Net Framework 3.5 SP1 is not installed, you will be prompted to install the feature before continuing. Note: Windows Server 2012 R2 does not contain the binaries for the .Net 3.5 feature, and you need to choose an alternate source path before installing.  Please see this article from Microsoft.

3.  Click Next to continue.


4.  Accept the license agreement and click next.


5.  Select the destination folder where Composer will be installed.


6. Configure Composer to use the ODBC data source that you set up.  You will need to enter the data source name, SQL login, and password before continuing.


7. After the data source has been configured, you will need to select the port that Composer will use for communicating with the Horizon Connection Servers. 


8. Click Use an existing SSL certificate, and then click Choose.  Select the certificate and click OK.  Click Next.


Click Install to start the installation.


9. Once the installation is finished, you will be prompted to restart your computer.


So now that Composer is installed, what can we do with it?  Not much at the moment.  A connection server is required to configure and use Composer for linked clone desktops, and the next post in this series will cover how to install that Connection Server.

by seanpmassey at August 22, 2016 02:59 PM

cfengine-tap now on GitHub

Back from the holiday season, I have finally found the time to publish a small library on GitHub. It’s called cfengine-tap and can help you write TAP-compatible tests for your CFEngine policies.

TAP is the test anything protocol. It is a simple text format that test scripts can use to print out the results and test suites can consume. Originally born in the Perl world, it is now supported in many other languages.

Using this library it’s easier to write test suites for your CFEngine policies. Since it’s publicly available on GitHub and published under a GPL license, you are free to use it and welcome to contribute and make it better (please do).


Tagged: cfengine, Configuration management, DevOps, Github, TAP, testing

by bronto at August 22, 2016 01:00 PM

August 20, 2016

Feeding the Cloud

Replacing a failing RAID disk

Translation of the original English article at

Here is the process I followed to replace a failing RAID disk on a Debian machine.

Replacing the disk

After noticing that /dev/sdb had been kicked out of my RAID array, I used smartmontools to identify the serial number of the disk to remove:

smartctl -a /dev/sdb

Cette information en main, j'ai fermé l'ordinateur, retiré le disque défectueux et mis un nouveau disque vide à la place.

Initializing the new drive

After booting with the new blank drive in place, I copied the partition table over using parted.

First, I examined the partition table on the healthy drive:

$ parted /dev/sda
unit s

and created a new partition table on the replacement drive:

$ parted /dev/sdb
unit s
mktable gpt

Then I used the mkpart command for each of my 4 partitions, giving them all the same sizes as the matching partitions on /dev/sda.

Finally, I used the toggle 1 bios_grub command (boot partition) and the toggle X raid command (where X is the partition number) on all the RAID partitions, before checking with the print command that the two partition tables were now identical.

Resyncing/recreating the RAID arrays

To sync the data from the healthy drive (/dev/sda) to the replacement one (/dev/sdb), I ran the following commands on my RAID1 partitions:

mdadm /dev/md0 -a /dev/sdb2
mdadm /dev/md2 -a /dev/sdb4

and kept an eye on the sync status with:

watch -n 2 cat /proc/mdstat

To speed up the process, I used the following trick:

blockdev --setra 65536 "/dev/md0"
blockdev --setra 65536 "/dev/md2"
echo 300000 > /proc/sys/dev/raid/speed_limit_min
echo 1000000 > /proc/sys/dev/raid/speed_limit_max

Then I recreated my RAID0 swap partition like this:

mdadm /dev/md1 --create --level=0 --raid-devices=2 /dev/sda3 /dev/sdb3
mkswap /dev/md1

Because the swap partition is brand new (a RAID0 partition cannot be restored, it has to be re-created from scratch), I had to do two things:

  • replace the swap UUID in /etc/fstab with the UUID printed by the mkswap command (or run the blkid command and take the UUID of /dev/md1)
  • replace the UUID of /dev/md1 in /etc/mdadm/mdadm.conf with the one reported for /dev/md1 by the mdadm --detail --scan command
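The fstab half of those UUID replacements can be scripted. Here is a sketch of my own (the helper name and the sed expression are mine, not from the article), which rewrites the UUID on the swap line of an fstab-style file:

```shell
# Hypothetical helper (my own sketch): update the swap entry in an fstab-style
# file with the new UUID printed by mkswap.
update_swap_uuid() {
  # update_swap_uuid <fstab-file> <new-uuid>
  # Rewrites the UUID on the line whose filesystem type is "swap".
  file="$1"; new_uuid="$2"
  sed -i "s|^UUID=[^[:space:]]*\([[:space:]].*[[:space:]]swap[[:space:]]\)|UUID=${new_uuid}\1|" "$file"
}
```

After running it against /etc/fstab, a quick `grep swap /etc/fstab` confirms the new UUID matches what blkid reports for /dev/md1.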

Making sure the replacement drive is bootable

To be certain that the machine can boot from either of the two drives, I reinstalled the grub boot loader onto the new drive:

grub-install /dev/sdb

before rebooting with both drives connected to confirm that my configuration works.

Then I booted without the /dev/sda drive, to make sure everything would still work if that drive decided to die and leave me with only the new one (/dev/sdb).

This test obviously breaks the sync between the two drives, so I had to reboot with both drives connected and then re-add /dev/sda to all the RAID1 arrays:

mdadm /dev/md0 -a /dev/sda2
mdadm /dev/md2 -a /dev/sda4

Once that was all done, I rebooted once more with both drives connected to confirm that everything works:

cat /proc/mdstat

and then ran a full SMART test on the new drive:

smartctl -t long /dev/sdb

August 20, 2016 10:00 PM


Reverse-Engineered GEOS 2.0 for C64 Source Code

The GEOS operating system managed to clone the Macintosh GUI on the Commodore 64, a computer with an 8 bit CPU and 64 KB of RAM. Based on Maciej Witkowiak's work, I created a reverse-engineered source version of the C64 GEOS 2.0 KERNAL for the cc65 compiler suite:

  • The source compiles into the exact same binary as shipped with GEOS 2.0.
  • The source is well-structured and split up into 31 source files.
  • Machine-specific code is marked up.
  • Copy protection/trap mechanisms can be disabled.
  • The build system makes sure binary layout requirements are met.

This makes the source a great starting point for

  • adding (optional) optimized code paths or features
  • integrating existing patches from various sources
  • integrating versions for other computers
  • porting it to different 6502-based computers

Just fork the project and send pull requests!

by Michael Steil at August 20, 2016 01:11 AM

August 19, 2016

Toolsmith Release Advisory: Faraday v2.0 - Collaborative Penetration Test & Vulnerability Management Platform

Toolsmith first covered Faraday in March 2015 with Faraday IPE - When Tinfoil Won’t Work for Pentesting. As it has just hit its 2.0 release milestone, I'm reprinting Francisco Amato's announcement regarding Faraday 2.0 as sent to the webappsec mailing list.

"Faraday is the Integrated Multiuser Risk Environment you were looking
for! It maps and leverages all the knowledge you generate in real
time, letting you track and understand your audits. Our dashboard for
CISOs and managers uncovers the impact and risk being assessed by the
audit in real-time without the need for a single email. Developed with
a specialized set of functionalities that help users improve their own
work, the main purpose is to re-use the available tools in the
community taking advantage of them in a collaborative way! Check out
the Faraday project in Github.

Two years ago we published our first community version consisting
mainly of what we now know as the Faraday Client and a very basic Web
UI. Over the years we introduced some pretty radical changes, but
nothing like what you are about to see - we believe this is a turning
point for the platform, and we are more than happy to share it with
all of you. Without further ado we would like to introduce you to
Faraday 2.0!

This release, presented at Black Hat Arsenal 2016, spins around our
four main goals for this year:

* Faraday Server - a fundamental pillar for Faraday's future. Some of
the latest features in Faraday required a server that could step
between the client and CouchDB, so we implemented one! It still
supports a small amount of operations but it was built thinking about
performance. Which brings us to objective #2...

* Better performance - Faraday will now scale as you see fit. The new
server allows to have huge workspaces without a performance slowdown.
200k hosts? No problem!

* Deprecate QT3 - the QT3 interface has been completely erased, while
the GTK one presented some versions ago will be the default interface
from now on. This means no more problems with QT3 non-standard
packages, smooth OSX support and a lighter Faraday Client for

* Licenses - managing a lot of products is time consuming. As you may
already know we've launched Faraday's own App Store where you can get all of your
favourite tools (Burp suite, IDA Debugger, etc) whether they're open
source or commercial ones. But also, in order to keep your licenses up
to date and never miss an expiry date we've built a Licenses Manager
inside Faraday. Our platform now stores the licenses of third party
products so you can easily keep track of your licenses while
monitoring your pentest.

With this new release we can proudly say we already met all of this
year's objectives, so now we have more than four months to polish the
details. Some of the features released in this version are quite
basic, and we plan to extend them in the next few iterations.


* Improved executive report generation performance.
* Totally removed QT3, GTK is now the only GUI.
* Added Faraday Server.
* Added some basic APIs to Faraday Server.
* Deprecated FileSystem databases: now Faraday works exclusively with
Faraday Server and CouchDB.
* Improved performance in web UI.
* Added licenses management section in web UI.
* Fixed bug when deleting objects from Faraday Web.
* Fixed bug when editing services in the web UI.
* Fixed bug where icons were not copied to the correct directory on
* Added a button to go to the Faraday Web directly from GTK.
* Fixed bug where current workspace wouldn't correspond to selected
workspace on the sidebar on GTK.
* Fixed bug in 'Refresh Workspace' button on GTK.
* Fixed bug when searching for a non-existent workspace in GTK.
* Fixed bug where Host Sidebar and Status Bar information wasn't
correctly updated on GTK.
* Fixed sqlmap plugin.
* Fixed metasploit plugin.

We hope you enjoy it, and let us know if you have any questions or comments."

Ping me via email or Twitter if you have questions (russ at holisticinfosec dot org or @holisticinfosec).
Cheers…until next time. 

by Russ McRee at August 19, 2016 04:16 PM

August 18, 2016

Sean's IT Blog

You Got Your Nutanix in My UCS #NTC

In the 1980s, there was a commercial for Reese’s Peanut Butter cups that described the product with the following lines: “You got your chocolate in my peanut butter.  You got your peanut butter in my chocolate.”

This aptly describes the pairing of hyperconverged infrastructure and the Cisco UCS converged platform.

By combining storage and compute into the same nodes, a hyperconverged infrastructure offers many features for customers.  These include highly performant storage, simplified management, and a reduced footprint in the datacenter.  But networking has remained an island unto itself, and it's still its own silo in data centers with hyperconverged infrastructure.

Cisco’s UCS provides similar benefits by converging compute and network – simplified policy-based management combined with high performance compute and networking.

So what happens when you combine them?

You get a best of breed platform that brings the fully-converged stack to life.  Network, storage, and compute are unified with a Nutanix-powered hyperconverged platform running with policy-based hardware management through UCSM.

Today, Nutanix announced support for UCS C-series rackmount servers.  Unlike the previous platform expansions, Nutanix on Cisco UCS will not be an OEM partnership.  It will be a meet-in-the-channel approach where customers buy Cisco UCS hardware and Nutanix licensing separately, and the integration takes place when the system is installed at the customer site.

Nutanix has certified some Cisco UCS C-series platforms, and they provide the certified Bill of Materials to simplify ordering with your preferred channel partner.  Nutanix is not supporting Cisco B-series blades.

The combination of Nutanix and UCS brings yet another powerful combination to customers who want to utilize the best-of-breed technologies in their data centers.

by seanpmassey at August 18, 2016 02:26 PM

Michael Biven

The Loss of a Sense Doesn’t Always Heighten the Others

Over a two week break my wife and I were talking about a statement she read where someone was called the “long time embedded photojournalist for Burning Man” and how she disagreed with this. This person wasn’t shooting for any news organization. They’re known to be one of the Burning Man faithful which removes some impartiality they may have. In essence they’re a PR photographer for Burning Man.

This made me consider most of the output from social media is in one of two different camps. The first is “Free PR” and the second is “Critics”. You’re either giving away free material to promote an idea, organization, product, or a person or you’re criticizing them.

There is a third camp (Journalism), but so few people have the patience to provide accurate and objective comment. And so few organizations have the integrity to support those ideals. It's as if the goals of our past that helped drive society to do better have been toppled by a rush of individuals trying to create a better self-image.

After mentioning this idea she told me about a piece she heard recently on the radio where the guest was referencing an article from the New York Times about Trump. The host interrupted them to mention that it was an Op-Ed and not an article. This baffled the guest who didn’t understand that it was an opinion piece and not journalism.

The flood of sources of information, both accurate and inaccurate, provided on the internet hasn't led people to judge the validity of something better. Instead we have seen a rise in lies, ignorance, and the absurd being reported as fact. This phenomenon even has a name… post-fact. As in we now live in a post-fact world. Think birthers, anti-vaxxers, or any other conspiracy theory movement.

Progress has been delayed or reversed by having to debunk ignorance being held up as fact. The time and energy being wasted by these distractions makes me wonder if this period of time will be known as the Age of Misfeasance.

P.S. After reading this post over and fixing one too many mistakes before I hit publish, I thought: are people just writing so quickly and then having faith in a technology to fix their mistakes? Is autocomplete damaging our ability to think clearly, because we're not reading and editing what we're writing as much as in the past?

August 18, 2016 01:34 PM

August 16, 2016

TCP vulnerability in Linux kernels pre 4.7: CVE-2016-5696

The post TCP vulnerability in Linux kernels pre 4.7: CVE-2016-5696 appeared first on

This is a very interesting vulnerability in the TCP stack of Linux kernels prior to 4.7. The bad news: there are a lot of systems online running those kernel versions. The bug/vulnerability is as follows.

Red Hat Product Security has been made aware of an important issue in
the Linux kernel's implementation of challenge ACKs as specified in
RFC 5961. An attacker who knows a connection's client IP, server IP
and server port can abuse the challenge ACK mechanism
to determine the accuracy of a normally 'blind' attack on the client or server.

Successful exploitation of this flaw could allow a remote attacker to
inject or control a TCP stream contents in a connection between a
Linux device and its connected client/server.

* This does NOT mean that cryptographic information is exposed.
* This is not a Man in the Middle (MITM) attack.
[oss-security] CVE-2016-5389: linux kernel -- challange ack information leak

In short: a successful attack could hijack a TCP session and allow the attacker to inject data, e.g. altering the content on websites, modifying responses from webservers, and so on.

This Stack Overflow post explains it very well.

The hard part of taking over a TCP connection is to guess the source port of the client and the current sequence number.

The global rate limit for sending Challenge ACKs (100/s in Linux), introduced together with Challenge ACKs (RFC 5961), makes it possible in the first step to guess a source port used by the client's connection, and in the next step to guess the sequence number. The main idea is to open a connection to the server and send, from the attacker's own source address, as many RST packets with the wrong sequence as possible, mixed with a few spoofed packets.

By counting how many Challenge ACKs get returned to the attacker, and by knowing the rate limit, one can infer how many of the spoofed packets resulted in a Challenge ACK to the spoofed client, and thus how many of the guesses were correct. This way one can quickly narrow down which values of port and sequence are correct. This attack can be done within a few seconds.

And of course the attacker needs to be able to spoof the IP address of the client, which is not possible in all environments. It might be possible in local networks (depending on the security measures), but ISPs will often block IP spoofing when done from the usual DSL/cable/mobile accounts.

TCP “off-path” Attack (CVE-2016-5696)

For RHEL (and CentOS derivatives), consult the Red Hat advisory for the list of affected OS releases.


While it's no permanent fix, the following config will make it a lot harder to abuse this vulnerability.

$ sysctl -w net.ipv4.tcp_challenge_ack_limit=999999999

And make it permanent so it persists on reboot:

$ echo "net.ipv4.tcp_challenge_ack_limit=999999999" >> /etc/sysctl.d/net.ipv4.tcp_challenge_ack_limit.conf

While the attack isn't actually prevented, it is damn hard to reach the ACK limits.
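As a first approximation, you can check whether a machine is even running a pre-4.7 kernel. This helper is my own sketch (not from the advisory), and note that distributions backport fixes, so a kernel that reports an old version string may already be patched; only the vendor errata are authoritative.

```shell
# Classify a kernel release string (the output of `uname -r`) as potentially
# affected (< 4.7) or not. Heuristic only: distro kernels backport fixes.
vulnerable_kernel() {
  major=${1%%.*}
  rest=${1#*.}
  minor=${rest%%.*}
  minor=${minor%%[!0-9]*}   # strip suffixes such as "0-327.el7"
  if [ "$major" -lt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -lt 7 ]; }; then
    echo "potentially affected"
  else
    echo "not affected by CVE-2016-5696"
  fi
}

# Typical use: vulnerable_kernel "$(uname -r)"
```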

Further reading:


by Mattias Geniar at August 16, 2016 10:13 AM

August 15, 2016

Cryptography Engineering

Is Apple's Cloud Key Vault a crypto backdoor?

TL;DR: No, it isn't. If that's all you wanted to know, you can stop reading.

Still, as you can see there's been some talk on Twitter about the subject, and I'm afraid it could lead to a misunderstanding. That would be too bad, since Apple's new technology is kind of a neat experiment.

So while I promise that this blog is not going to become all-Apple-all-the-time, I figured I'd take a minute to explain what I'm talking about. This post is loosely based on an explanation of Apple's new escrow technology that Ivan Krstic gave at BlackHat. You should read the original for the fascinating details.

What is Cloud Key Vault (and what is iCloud Keychain)?

A few years ago Apple quietly introduced a new service called iCloud Keychain. This service is designed to allow you to back up your passwords and secret keys to the cloud. Now, if backing up your sensitive passwords gives you the willies, you aren't crazy. Since these probably include things like bank and email passwords, you really want these to be kept extremely secure.

And -- at least going by past experience -- security is not where iCloud shines:

The problem here is that passwords need to be secured at a much higher assurance level than most types of data backup. But how can Apple ensure this? We can't simply upload our secret passwords the way we upload photos of our kids. That would create a number of risks, including:
  1. The risk that someone will guess, reset or brute-force your iCloud password. Password resets are a particular problem. Unfortunately these seem necessary for normal iCloud usage, since people do forget their passwords. But that's a huge risk when you're talking about someone's entire password collection.
  2. The risk that someone will break into Apple's infrastructure. Even if Apple gets their front-end brute-forcing protections right (and removes password resets), the password vaults themselves are a huge target. You want to make sure that even someone who hacks Apple can't get them out of the system.
  3. The risk that a government will compel Apple to produce data. Maybe you're thinking of the U.S. government here. But that's myopic: Apple stores iCloud data all over the world. 
So clearly Apple needs a better way to protect these passwords. How to do it?

Why not just encrypt the passwords?

It is certainly possible for an Apple device to encrypt your password vault before sending it to iCloud. The problem here is that Apple doesn't necessarily have a strong encryption key to do this with. Remember that the point of a backup is to survive the loss of your device, and thus we can't assume the existence of a strong recovery key stored on your phone.

This leaves us with basically one option: a user password. This could be either the user's iCloud password or their device passcode. Unfortunately for the typical user, these tend to be lousy. They may be strong enough to use as a login password -- in a system that allows only a very limited number of login attempts. But the kinds of passwords typical users choose to enter on mobile devices are rarely strong enough to stand up to an offline dictionary attack, which is the real threat when using passwords as encryption keys.

(Even using a strong memory-hard password hash like scrypt -- with crazy huge parameters -- probably won't save a user who chooses a crappy password. Blame phone manufacturers for making it painful to type in complicated passwords by forcing you to type them so often.)

So what's Apple to do?

So Apple finds itself in a situation where they can't trust the user to pick a strong password. They can't trust their own infrastructure. And they can't trust themselves. That's a problem. Fundamentally, computer security requires some degree of trust -- someone has to be reliable somewhere.

Apple's solution is clever: they decided to make something more trustworthy than themselves. To create a new trust anchor, Apple purchased a bunch of fancy devices called Hardware Security Modules, or HSMs. These are sophisticated, tamper-resistant specialized computers that store and operate with cryptographic keys, while preventing even malicious users from extracting them. The high-end HSMs Apple uses also allow the owner to include custom programming.

Rather than trusting Apple, your phone encrypts its secrets under a hardcoded 2048-bit RSA public key that belongs to Apple's HSM. It also encrypts a function of your device passcode, and sends the resulting encrypted blob to iCloud. Critically, only the HSM has a copy of the corresponding RSA decryption key, thus only the HSM can actually view any of this information. Apple's network sees only an encrypted blob of data, which is essentially useless.
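A rough analogy in shell with openssl (my own illustration, emphatically not Apple's actual protocol; the file names are mine): data encrypted to a public key is useless to anyone without the matching private key, which in Apple's design never leaves the HSM.

```shell
# Generate a keypair standing in for the HSM's 2048-bit RSA key.
openssl genrsa -out hsm.key 2048 2>/dev/null              # the "HSM's" private key
openssl rsa -in hsm.key -pubout -out hsm.pub 2>/dev/null  # the public key baked into devices

# The "device" encrypts its secrets to the public key.
printf 'secret vault contents' > vault.txt
openssl pkeyutl -encrypt -pubin -inkey hsm.pub -in vault.txt -out vault.enc

# Apple's servers would only ever see vault.enc; decryption needs hsm.key:
openssl pkeyutl -decrypt -inkey hsm.key -in vault.enc
```

The real scheme additionally binds the blob to a function of your device passcode, but the core property is the same: without the HSM-held private key, the uploaded blob is opaque.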

When a user wishes to recover their secrets, they authenticate themselves directly to the HSM. This is done using a user's "iCloud Security Code" (iCSC), which is almost always your device passcode -- something most people remember after typing it every day. This authentication is done using the Secure Remote Password protocol, ensuring that Apple (outside of the HSM) never sees any function of your password.

Now, I said that device passcodes are lousy secrets. That's true when we're talking about using them as encryption keys -- since offline decryption attacks allow the attacker to make an unlimited number of attempts. However, with the assistance of an HSM, Apple can implement a common-sense countermeasure to such attacks: they limit you to a fixed number of login attempts. This is roughly the same protection that Apple implements on the devices themselves.
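The difference between online and offline guessing is easy to model. Here is a toy counter of my own (illustrative only, in no way Apple's code): once the attempt budget is spent, the "HSM" refuses further tries entirely, which is exactly what an offline attacker never has to put up with.

```shell
# Toy model of an HSM-style lockout counter (illustrative only).
SECRET="1234"      # stand-in for the iCSC / device passcode
ATTEMPTS=0
MAX_ATTEMPTS=10

check_passcode() {
  # check_passcode <guess>: prints ok | wrong | locked
  if [ "$ATTEMPTS" -ge "$MAX_ATTEMPTS" ]; then
    echo "locked"
    return 1
  fi
  if [ "$1" = "$SECRET" ]; then
    echo "ok"
    return 0
  fi
  ATTEMPTS=$((ATTEMPTS + 1))
  echo "wrong"
  return 1
}
```

With only MAX_ATTEMPTS tries available, even a weak 4-digit passcode survives, whereas against an offline copy of the encrypted vault all 10,000 candidates can be tried in milliseconds.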

The encrypted contents of the data sent to the HSM (source).
The upshot of all these ideas is that -- provided that the HSM works as designed, and that it can't be reprogrammed -- even Apple can't access your stored data except by logging in with a correct passcode. And they only get a limited number of attempts to guess correctly, after which the account locks.

This rules out both malicious insiders and government access, with one big caveat.

What stops Apple from just reprogramming its HSM?

This is probably the biggest weakness of the system, and the part that's driving the "backdoor" concerns above. You see, the HSMs Apple uses are programmable. This means that -- as long as Apple still has the code signing keys -- the company can potentially update the custom code it includes on the HSM to do all sorts of things.

These things might include: programming the HSM to output decrypted escrow keys. Or disabling the maximum login attempt counting mechanism. Or even inserting a program that runs a brute-force dictionary attack on the HSM itself. This would allow Apple to brute-force your passcode and/or recover your passwords.

Fortunately Apple has thought about this problem and taken steps to deal with it. Note that on HSMs like the one Apple is using, the code signing keys live on a special set of admin smartcards. To remove these keys as a concern, once Apple is done programming the HSM, they run these cards through a process that they call a "physical one-way hash function".

If that sounds complicated, here's Ivan's slightly simpler explanation.

So, with the code signing keys destroyed, updating the HSM to allow nefarious actions should not be possible. Pretty much the only action Apple can take is to wipe the HSM, which would destroy the HSM's RSA secret keys and thus all of the encrypted records it's responsible for. To make sure all admin cards are destroyed, the company has developed a complex ceremony for controlling the cards prior to their destruction. This mostly involves people making assertions that they haven't made copies of the code signing key -- which isn't quite foolproof. But overall it's pretty impressive.

The downside for Apple, of course, is that there had better not be a bug in any of their programming. Because right now there's nothing they can do to fix it -- except to wipe all of their HSMs and start over.

Couldn't we use this idea to implement real crypto backdoors?

A key assertion I've heard is that if Apple can do this, then surely they can do something similar to escrow your keys for law enforcement. But looking at the system shows that isn't true at all.

To be sure, Apple's reliance on a Hardware Security Module indicates a great deal of faith in a single hardware/software solution for storing many keys. Only time will tell if that faith is really justified. To be honest, I think it's an overly-strong assumption. But iCloud Keychain is opt-in, so individuals can decide for themselves whether or not to take the risk. That wouldn't be true of a mandatory law enforcement backdoor.

But the argument that Apple has enabled a law enforcement backdoor seems to miss what Apple has actually done. Instead of building a system that allows the company to recover your secret information, Apple has devoted enormous resources to locking themselves out. Only customers can access their own information. In other words, Apple has decided that the only way they can hold this information is if they don't even trust themselves with it.

That's radically different from what would be required to build a mandatory key escrow system for law enforcement. In fact, one of the big objections to such a backdoor -- which my co-authors and I recently outlined in a report -- is the danger that any of the numerous actors in such a system could misuse it. By eliminating themselves from the equation, Apple has effectively neutralized that concern.

If Apple can secure your passwords this way, then why don't they do the same for your backed up photos, videos, and documents?

That's a good question. Maybe you should ask them?

by Matthew Green at August 15, 2016 02:41 PM

Disable "New release available" emails on Ubuntu

We have our Ubuntu machines set up to mail us the output of cron jobs like so:

$ cat /etc/crontab 

# m h dom mon dow user    command

This is incredibly useful, since cron jobs should never output anything unless something is wrong.

Unfortunately, this means we also get emails like this:

New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

You can fairly easily disable these by modifying the corresponding cronjob /etc/cron.weekly/update-notifier-common:

 set -e
 [ -x /usr/lib/ubuntu-release-upgrader/release-upgrade-motd ] || exit 0
 # Check to see whether there is a new version of Ubuntu available
-/usr/lib/ubuntu-release-upgrader/release-upgrade-motd
+/usr/lib/ubuntu-release-upgrader/release-upgrade-motd > /dev/null

Now you'll no longer receive these emails. It's also possible to remove the cronjob entirely, but then an upgrade is likely to put it back, and I have no idea if the cronjob has any other side effects besides emailing.
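An alternative worth considering (my suggestion, not from the original post): rather than silencing the cron mail, tell the release upgrader not to advertise new releases at all, by setting Prompt=never in /etc/update-manager/release-upgrades. A sketch:

```shell
# Set Prompt=never in a /etc/update-manager/release-upgrades style file so
# do-release-upgrade stops advertising new releases.
disable_release_prompt() {
  # disable_release_prompt <config-file>
  sed -i 's/^Prompt=.*/Prompt=never/' "$1"
}

# Typical use (requires root):
# disable_release_prompt /etc/update-manager/release-upgrades
```

Unlike editing the cron job, this setting survives package upgrades, though it also disables the motd release notice.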

by admin at August 15, 2016 09:15 AM

August 14, 2016

Gmail and their IPV6 spam filter

Failing forwards

Earlier this week Gmail's servers decided that any email sent from Gmail and then forwarded from back to Gmail was now, as their servers put it, "likely unsolicited mail". People sending mail from Gmail to were getting bounce messages, which looks bad. All other email from any other domain was coming in without issue. I have been forwarding email from Gmail accounts for many years now without issue.

Time to recheck settings

  1. DNS A,AAAA and PTR records for IPV4,6 setup and work correctly
  2. SPF record setup correctly, but this is a forward so it always shows fail. The bounce message passes SPF, so that's nice.
  3. SMTP w/TLS working and available
  4. does not modify, remove or shuffle message headers or modify the body of the message in any way.
  5. No increase in spam getting through. filters most mail before it is forwarded.
  6. Not using Sender Rewriting Scheme (SRS).
  7. No DKIM setup

The fix

After seeing that everything checked out, I hit up Google to see if anyone else was having this issue. From the results it seems that many people have had this same issue. Some people just started using SRS to fix their issue. Others had to fix their PTR records in DNS. The last group of people had to stop using IPV6 for mail delivery. Since all of the other mail server settings were correct, the only things I could try were implementing SRS or turning off IPV6. Turning off IPV6 delivery was the easiest test. After turning off IPV6 mail delivery, and just leaving IPV4, all mail from Gmail being forwarded through was being accepted again. How dumb is that?

What is up with IPV6?

It seems Gmail has changed a setting (or I hit some new threshold) on their side dealing only with IPV6. Since Google will not tell you why certain mail is considered "unsolicited mail", we cannot figure out what was done to try to fix the issue. If I had to speculate, my guess is they turned up the sensitivity on email coming in over IPV6; it is obvious the IPV4 filter is not as sensitive. It is not just my server, as it is happening to many other people as well.

I had also noticed that mail coming in from a friend, whose server delivers mail to my server via IPV6 and is then forwarded to Gmail via IPV6, was being marked as spam every time. According to Google, if the user is in your contacts list (and his email address is), the email is not supposed to be marked as spam. That is straight broken. Now that I have switched back to IPV4-only delivery, none of his mail is being marked as spam anymore. I believe Google has an issue with IPV6 mail delivery and spam classification.

What now?

I hate that I had to turn off IPV6 for mail forwarding to Gmail. My next likely step is to implement SRS for forwarding, and see if I can turn IPV6 back on. The best article I found on setting this up on Postfix is here. It also shows how to setup DKIM which might be fun to do as well.
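For anyone unfamiliar with what SRS actually does to a forwarded message, here is the shape of a rewritten envelope sender. This is my own sketch; a real SRS implementation derives the hash and timestamp fields (shown as HHH and TT placeholders) cryptographically so bounces can be validated and routed back.

```shell
# Shape of an SRS0 rewritten envelope sender (sketch; HHH/TT are placeholders
# for the cryptographic hash and timestamp a real implementation computes).
srs_rewrite() {
  # srs_rewrite <local-part> <original-domain> <forwarding-domain>
  printf 'SRS0=HHH=TT=%s=%s@%s\n' "$2" "$1" "$3"
}

# srs_rewrite alice gmail.com mydomain.example
# -> SRS0=HHH=TT=gmail.com=alice@mydomain.example
```

The point is that the envelope sender now belongs to the forwarding domain, so the forwarder's SPF record is the one Gmail evaluates, instead of Gmail seeing its own domain arrive from a foreign IP.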

August 14, 2016 10:30 PM

Nitrokey Start: Getting started guide (gnuk openpgp token)

The Nitrokey Start is an OpenPGP USB token. It supports three 2048 bit GPG keys and is based on gnuk. Gnuk is an implementation of a USB cryptographic token for GPG. A cryptographic token stores private keys and computes cryptographic functions on the device. The main difference from other GPG cards like the Nitrokey Pro or the OpenPGP card is that this device does not use a smartcard. Whereas the other devices are basically USB smartcard readers, the Nitrokey Start has everything in its firmware. This article is a getting started guide where I talk about the initial setup of the device, setting up a user PIN, an admin PIN and a reset code, generating the key and subkeys on the device, or loading external keys into the device, and usage examples with GPG, OpenSSH and Thunderbird.

August 14, 2016 12:00 AM

August 12, 2016

youtube-dl: download audio-only files from YouTube on Mac

The post youtube-dl: download audio-only files from YouTube on Mac appeared first on

I may or may not have become addicted to a particular video on YouTube, and I wanted to download the MP3 for offline use.

(Whether it's allowed or not is up for debate, knowing copyright laws it probably depends per country.)

Luckily, I remember I featured a YouTube downloader once in cron.weekly issue #23 that I could use for this.

So, a couple of simple steps on Mac to download the MP3 from any YouTube video. All further commands assume the Brew package manager is installed on your Mac.

$ brew install ffmpeg youtube-dl

To download and convert to MP3:

$ youtube-dl --extract-audio --audio-format mp3 --prefer-ffmpeg
[youtube] 3UOtF4J9wpo: Downloading webpage
[youtube] 3UOtF4J9wpo: Downloading video info webpage
[youtube] 3UOtF4J9wpo: Extracting video information
[youtube] 3UOtF4J9wpo: Downloading MPD manifest
[download] 100% of 58.05MiB
[ffmpeg] Destination: 3UOtF4J9wpo.mp3

Deleting original file 3UOtF4J9wpo.webm

And bingo, all that remains is the MP3!


by Mattias Geniar at August 12, 2016 08:30 PM

Geek and Artist - Tech

Thoughts on creating an engineering Tech Radar

Perhaps you are familiar with the ThoughtWorks Tech Radar – I really like it as a useful summary of global technology trends and what I should be looking at familiarising myself with. Even the stuff on the “hold” list (such as Scaled Agile Framework – sometimes anti-patterns are equally useful to understand and appreciate). There’s a degree of satisfaction in seeing your favourite technology rise through the ranks to become something recommended to everyone, but also in my current (new) role it has a different purpose.

Since I started a new job just over a month ago, I’ve come into an organisation with a far simpler tech stack and in some regards, less well-defined technology strategy. I like to put in place measures to help engineers be as autonomous in their decision-making process as possible, so a Tech Radar can help frame which technologies they can or should consider when going about their jobs. This ranges from techniques they should strongly consider adopting (which can be much more of a tactical decision) to databases they could select from when building a new service that doesn’t fit the existing databases already in use. The Tech Radar forms something like a “garden fence” – you don’t necessarily need to implement everything within it, but it shows you where the limits are in case you need something new.

So basically, I wanted to use the Tech Radar as a way to avoid needing to continually make top-down decisions when stepping into unknown territory, and help the organisation and decision-making scale as we add more engineers. The process I followed to generate it was very open and democratic – each development team was gathered together for an hour, and I drew the radar format on the whiteboard. Then engineers contributed post-it notes with names of technologies and placed them on the board. After about 10 minutes of this, I read through all of the notes and got everyone to describe for the room the “what” and the “why” of their note. Duplicates were removed and misplaced notes moved to their correct place.

Afterwards, I transcribed everything into a Google Doc and asked everyone to again add the “what” and “why” of each contributed note to the document. What resulted was an 11-page gargantuan collection of technologies and techniques that seemed to cover everything that everyone could think of in the moment, and didn’t quite match up with my expectations. I’ll describe my observations about the process and outcomes.

Strategy vs Tactics, and Quadrants

The purpose of the overall radar is to be somewhat strategic. ThoughtWorks prepares their radar twice a year, so it is expected to cover at least the next 6 months. Smaller companies might only prepare it once a year. However, amongst the different quadrants there is a reasonable amount of room for tactics as well. In particular I would say that the Techniques and Tools quadrants are much more tactical, whereas the Platforms and Languages & Frameworks quadrants are much more strategic.

For example, let’s say you have Pair Programming in the Techniques quadrant. Of course, you might strategically adopt this across the whole company, but a single team (in fact, just two developers) can try instituting it this very day, at no impact to anyone in other teams and probably not even others in the same team. It comes with virtually no cost to just try out, and start gaining benefit from immediately, even if nobody else is using it. Similarly, on the Tools side, you might decide to add a code test coverage reporting tool to your build pipeline. It’s purely informational, you benefit from it immediately and it doesn’t require anyone else’s help or participation, nor does it impact anyone else. For that reason it’s arguable whether these things are so urgent to place on the radar – developers can largely make the decisions themselves to adopt such techniques or tools.

On the other hand, the adoption of a new Language or Framework, or building on top of a new Platform (let’s say you want to start deploying your containers to Kubernetes) will come with a large time investment both immediately and ongoing, as well as needing wide-scale adoption across teams to benefit from that investment. Of course there is room for disagreement here – e.g. is a service like New Relic a tool or a platform? Adoption of a new monitoring tool definitely comes with a large cost (you don’t want every team using a different SaaS monitoring suite). But the Tech Radar is just a tool itself and shouldn’t be considered the final definition of anything – just a guide for making better decisions.

Strategic Impact

As touched on above, adopting a Platform or new Language/Framework has significant costs. While putting together a radar like this with input from all people, who may have different levels of experience, you might find that not all of the strategic impacts have been considered when adding an item to the list. An incomplete list of things I believe need to be examined when selecting a Language or Framework could be:

  • What are the hiring opportunities around this technology? Is it easier or harder to hire people with this skillset?
  • Is it a growing community, and are we likely to find engineers at all maturity levels (junior/intermediate/senior) with experience in the technology?
  • For people already in the company, is it easy and desirable to learn? How long does it take to become proficient?
  • Similarly, how many people already at the company already know the technology well enough to be considered proficient for daily work?
  • Does the technology actually solve a problem we have? Are there any things our current technologies do very well that would suffer from the new technology’s introduction?
  • What other parts of our tech stack would need to change as a result of adopting it? Testing? Build tooling? Deployments? Libraries and Dependencies?
  • Do we understand not only the language but also the runtime?
  • Would it help us deliver more value to the customer, or deliver value faster?
  • By taking on the adoption costs, would we be sacrificing time spent on maximising some other current opportunity?
  • Is there a strong ecosystem of libraries and code around the technology? Is there a reliable, well-supported, stable equivalent to all of the libraries we use with our current technologies? If not, is it easy and fast to write our own replacements?
  • How well does adoption of the technology align with our current product and technology roadmaps?

By no means is this list exhaustive, but I think all points need some thought, rather than just “is it nicer to program in than my current language”.

Filtering the list and assembling the radar

As mentioned, I ended up with a fairly huge list of items which now needs to be filtered. This is a task for a CTO or VP of Engineering depending on your organisation size. Ultimately people accountable for the technology strategy need to set the bounds of the radar. For my list, I will attempt to pre-filter the items that have little strategic importance – like tools or techniques (unless we determine it’s something that could/should have widespread adoption and benefit).

Ultimately we’ll have to see what the output looks like and whether engineers feel it answers questions for them – that will determine whether we try to build a follow-up radar in the next quarter or year. If I end up running the process again, I suspect I’ll make use of a smaller group of people to add inputs – who have already collected and moderated inputs from their respective teams. The other benefit of the moderation/filtering process is that the document that is later produced is a way of expressing to engineers (perhaps with less experience) the inherent strategic importance of the original suggestions. There are no wrong suggestions, but we should aim to help people learn and think more about the application of strategy and business importance in their day to day work.

by oliver at August 12, 2016 01:44 PM

August 09, 2016

Racker Hacker

Preventing critical services from deploying on the same OpenStack host

OpenStack’s compute service, nova, manages all of the virtual machines within an OpenStack cloud. When you ask nova to build an instance, or a group of instances, nova’s scheduler system determines which hypervisors should run each instance. The scheduler uses filters to figure out where each instance belongs.

However, there are situations where the scheduler might put more than one of your instances on the same host, especially when resources are constrained. This can be a problem when you deploy certain highly available applications, like MariaDB and Galera. If more than one of your database instances landed on the same physical host, a failure of that physical host could take down more than one database instance.

Filters to the rescue

The scheduler offers the ServerGroupAntiAffinityFilter filter for these deployments. This allows a user to create a server group, apply a policy to the group, and then begin adding servers to that group.

If the scheduler filter can’t find a way to fulfill the anti-affinity request (which often happens if the hosts are low on resources), it will fail the entire build transaction with an error. In other words, unless the entire request can be fulfilled, it won’t be deployed.

Let’s see how this works in action on an OpenStack Mitaka cloud deployed with OpenStack-Ansible.

Creating a server group

We can use the openstackclient tool to create our server group:

$ openstack server group create --policy anti-affinity db-production
| Field    | Value                                |
| id       | cd234914-980a-42f2-b77c-602a7cc0080f |
| members  |                                      |
| name     | db-production                        |
| policies | anti-affinity                        |

We’ve told nova that we want all of the instances in the db-production group to land on different OpenStack hosts. I’ll copy the id to my clipboard since I’ll need that UUID for the next step.

Adding hosts to the group

My small OpenStack cloud has four hypervisors, so I can add four instances to this server group:

$ openstack server create \
  --flavor m1.small \
  --image "Fedora 24" \
  --nic net-id=bc8895ab-98f7-478f-a54a-36b121f7bb3f \
  --key-name personal_servers \
  --hint "group=cd234914-980a-42f2-b77c-602a7cc0080f" \
  --max 4

This server create command looks fairly standard, but I’ve added the --hint parameter to specify that we want these servers scheduled as part of the group we just created. I’ve also requested that four servers be built at the same time. After a few moments, we should have four servers active:

$ openstack server list --name prod-db -c ID -c Name -c Status
| ID                                   | Name      | Status |
| 7e7a81f3-eb02-4751-93c1-a0de999b8423 | prod-db-4 | ACTIVE |
| b742fb58-8ea4-4e26-bfbf-645a698fbb26 | prod-db-3 | ACTIVE |
| 78c7a43c-4deb-40da-a419-e62db37ab41a | prod-db-2 | ACTIVE |
| 7b8af038-6441-40c0-87c8-0a1acced17a6 | prod-db-1 | ACTIVE |

If we check the instances, they should be on different hosts:

$ for i in {1..4}; do openstack server show prod-db-${i} -c hostId -f shell; done
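The property we’re eyeballing with that loop is simply that no two instances report the same hostId. A small Python sketch of the check, using made-up hostId values (the real ones come from the `openstack server show` output):

```python
def all_on_distinct_hosts(host_ids):
    """Return True if every instance reports a different hostId."""
    values = list(host_ids.values())
    return len(values) == len(set(values))

# Hypothetical hostId values for illustration; in practice, collect them
# with `openstack server show <name> -c hostId`.
host_ids = {
    "prod-db-1": "5c41b9a1",
    "prod-db-2": "9f02d7c3",
    "prod-db-3": "1aa8e4f2",
    "prod-db-4": "77b3c6d9",
}
print(all_on_distinct_hosts(host_ids))  # → True
```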


If we try to build one more instance, it should fail since the scheduler cannot fulfill the anti-affinity policy applied to server group:

$ openstack server create \
  --flavor m1.small \
  --image "Fedora 24" \
  --nic net-id=bc8895ab-98f7-478f-a54a-36b121f7bb3f \
  --key-name personal_servers \
  --hint "group=cd234914-980a-42f2-b77c-602a7cc0080f" \
  --wait \
  prod-db-5
Error creating server: prod-db-5
Error creating server
$ openstack server show prod-db-5 -c fault -f shell
fault="{u'message': u'No valid host was found. There are not enough hosts available.', u'code': 500...

The scheduler couldn’t find a valid host for a fifth server in the anti-affinity group.

Photo credit: “crowded houses” from jesuscm on Flickr


by Major Hayden at August 09, 2016 05:07 PM


Copy Protection Traps in GEOS for C64

Major GEOS applications on the Commodore 64 protect themselves from unauthorized duplication by keying themselves to the operating system's serial number. To guard this mechanism against tampering, the system contains some elaborate traps, which will be discussed in this article.

GEOS Copy Protection

The GEOS boot disk protects itself with a complex copy protection scheme, which uses code uploaded to the disk drive to verify the authenticity of the boot disk. Berkeley Softworks, the creators of GEOS, found it necessary to also protect their applications like geoCalc and geoPublish from unauthorized duplication. Since these applications were running inside the GEOS "KERNAL" environment, which abstracted most hardware details away, these applications could not use the same kind of low-level tricks that the system was using to protect itself.

Serial Numbers for Protection

The solution was to use serial numbers. On the very first boot, the GEOS system created a 16 bit random number, the "serial number", and stored it in the KERNAL binary. (Since the system came with a "backup" boot disk, the system asked for that disk to be inserted, and stored the same serial in the backup's KERNAL.) Now whenever an application was run for the first time, it read the system's serial number and stored it in the application's binary. On subsequent runs, it read the system's serial number and compared it with the stored version. If the serial numbers didn't match, the application knew it was running on a different GEOS system than the first time – presumably as a copy on someone else's system: Since the boot disk could not be copied, two different people had to buy their own copies of GEOS, and different copies of GEOS had different serial numbers.
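The keying scheme described above can be modeled in a few lines of Python (a simplified sketch, not the actual 6502 code):

```python
import random

system_serial = random.randrange(0x10000)  # 16-bit serial created on first boot

stored_serial = None  # the copy an application patches into its own binary

def run_app(current_serial):
    """Model of the per-application serial check."""
    global stored_serial
    if stored_serial is None:
        stored_serial = current_serial        # first run: key to this install
        return "keyed"
    if stored_serial == current_serial:
        return "ok"                           # same GEOS system as last time
    return "sabotage"                         # presumably a pirated copy

print(run_app(system_serial))                 # first run: keyed
print(run_app(system_serial))                 # same system: ok
print(run_app((system_serial + 1) & 0xFFFF))  # different serial: sabotage
```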

Serial Numbers in Practice

The code to verify the serial number usually looked something like this:

.,D5EF  20 D8 C1    JSR $C1D8 ; GetSerialNumber
.,D5F2  A5 03       LDA $03   ; read the hi byte
.,D5F4  CD 2F D8    CMP $D82F ; compare with stored version
.,D5F7  F0 03       BEQ $D5FC ; branch if equal
.,D5F9  EE 18 C2    INC $C218 ; sabotage LdDeskAcc syscall: increment vector
.,D5FC  A0 00       LDY #$00  ; ...

If the highest 8 bits of the serial don't match the value stored in the application's binary, it increments the pointer of the LdDeskAcc vector. This code was taken from the "DeskTop" file manager, which uses this subtle sabotage to make loading a "desk accessory" (a small helper program that can be run from within an application) unstable. Every time DeskTop gets loaded, the pointer gets incremented, and while LdDeskAcc might still work by coincidence the first few times (because it only skips a few instructions), it will break eventually. Other applications used different checks and sabotaged the system in different ways, but they all had in common that they called GetSerialNumber.

(DeskTop came with every GEOS system and didn't need any extra copy protection, but it checked the serial anyway to prevent users from permanently changing their GEOS serial to match one specific pirated application.)

A Potential Generic Hack

The downside of this scheme is that all applications are protected the same way, and a single hack could potentially circumvent the protection of all applications.

A generic hack would change the system's GetSerialNumber implementation to return exactly the serial number expected by the application, by reading the saved value from the application's binary. The address where the saved value is stored is different for every application, so the hack could either analyze the instructions after the GetSerialNumber call to detect the address, or come with a small table that knows these addresses for all major applications.

GEOS supports auto-execute applications (file type $0E) that will be executed right after boot – this would be the perfect way to make this hack available at startup without patching the (encrypted) system files.

Trap 1: Preventing Changing the Vector

Such a hack would change the GetSerialNumber vector in the system call jump table to point to new code in some previously unused memory. But the GEOS KERNAL has some code to counter this:

                                ; (Y = $FF from the code before)
.,EE59  B9 98 C0    LDA $C098,Y ; read lo byte of GetSerialNumber vector
.,EE5C  18          CLC
.,EE5D  69 5A       ADC #$5A    ; add $5A
.,EE5F  99 38 C0    STA $C038,Y ; overwrite low byte GraphicsString vector

In the middle of code that deals with the menu bar and menus, it uses this obfuscated code to sabotage the GraphicsString system call if the GetSerialNumber vector was changed. If the GetSerialNumber vector is unchanged, these instructions are effectively a no-op: The lo byte of the system's GetSerialNumber vector ($F3) plus $5A equals the lo byte of the GraphicsString vector ($4D). But if the GetSerialNumber vector was changed, then GraphicsString will point to a random location and probably crash.
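You can check the arithmetic of that no-op property directly (lo bytes taken from the disassembly above):

```python
get_serial_lo = 0xF3       # lo byte of the GetSerialNumber vector target ($CFF3)
graphics_string_lo = 0x4D  # lo byte of the GraphicsString vector target

# Unchanged vector: the ADC #$5A sequence stores the value that is
# already there (the carry is clear, and only the low byte is stored).
assert (get_serial_lo + 0x5A) & 0xFF == graphics_string_lo

# A patched vector (say the hack lives at a page-aligned address, lo byte
# $00; a made-up example) trashes GraphicsString instead:
print((0x00 + 0x5A) & 0xFF == graphics_string_lo)  # → False
```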

Berkeley Softworks was cross-developing GEOS on UNIX machines with a toolchain that supported complex expressions, so the source probably expressed this with something like:

    ; Y = $FF
    lda GetSerialNumber + 1 - $FF,y
    adc #<(_GraphicsString - _GetSerialNumber)
    sta GraphicsString + 1 - $FF,y

In fact, different variations of GEOS (like the GeoRAM version) were separate builds with different build time arguments, so because of different memory layouts, they were using different ADC values here.

Note that the pointers to GetSerialNumber and GraphicsString have been obfuscated, so an attacker who has detected the trashed GraphicsString vector won't be able to find the sabotage code by searching for that address.

Trap 2: Preventing Changing the Implementation

If the hack can't change the GetSerialNumber vector, it could put a JMP instruction at the beginning of the implementation to the new code. But the GEOS KERNAL counters this as well. The GetSerialNumber implementation looks like this:

.,CFF3  AD A7 9E    LDA $9EA7 ; load lo byte of serial
.,CFF6  85 02       STA $02   ; into return value (lo)
.,CFF8  AD A8 9E    LDA $9EA8 ; load hi byte of serial
.,CFFB  85 03       STA $03   ; into return value (hi)
.,CFFD  60          RTS       ; return

At the end of the system call function UseSystemFont, it does this:

.,E6C9  AD 2F D8    LDA $D82F ; read copy of hi byte of serial
.,E6CC  D0 06       BNE $E6D4 ; non-zero? done this before already
.,E6CE  20 F8 CF    JSR $CFF8 ; call second half of GetSerialNumber
.,E6D1  8D 2F D8    STA $D82F ; and store the hi byte in our copy
.,E6D4  60          RTS       ; ...

And in the middle of the system call function FindFTypes, it does this:

.,D5EB  A2 C1       LDX #$C1
.,D5ED  A9 96       LDA #$96  ; public GetSerialNumber vector ($C196)
.,D5EF  20 D8 C1    JSR $C1D8 ; "CallRoutine": call indirectly (obfuscation)
.,D5F2  A5 03       LDA $03   ; read hi byte of serial
.,D5F4  CD 2F D8    CMP $D82F ; compare with copy
.,D5F7  F0 03       BEQ $D5FC ; if identical, skip next instruction
.,D5F9  EE 18 C2    INC $C218 ; sabotage LdDeskAcc by incrementing its vector
.,D5FC  A0 00       LDY #$00  ; ...

So UseSystemFont makes a copy of the hi byte of the serial, and FindFTypes compares the copy with the serial – so what's the protection? The trick is that one path goes through the proper GetSerialNumber vector, while the other calls into the bottom half of the original implementation. If the hack overwrites the first instruction of the implementation (or managed to disable the first trap and changed the system call vector directly), calling through the vector will reach the hack, while calling into the middle of the original implementation will still reach the original code. If the hack returns a different value than the original code, this will sabotage the system in a subtle way, by incrementing the LdDeskAcc system call vector.

Note that this code calls a KERNAL system function that will call GetSerialNumber indirectly, so the function pointer is split into two 8 bit loads and can't be found by just searching for the constant. Since the code in UseSystemFont doesn't call GetSerialNumber either, an attacker won't find a call to that function anywhere inside KERNAL.
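As a simplified model of this cross-check (Python, not the original 6502): one path caches the hi byte via a direct call into the implementation, the other reads it through the public vector, and any mismatch triggers the sabotage.

```python
def sabotage_triggered(vector_impl, original_impl):
    """Model of the UseSystemFont/FindFTypes cross-check."""
    cached_hi = original_impl()         # UseSystemFont: direct call into $CFF8
    return vector_impl() != cached_hi   # FindFTypes: call through the vector

true_serial_hi = lambda: 0x12    # hi byte of the real serial (made-up value)
forged_serial_hi = lambda: 0x99  # what a generic hack might return instead

print(sabotage_triggered(true_serial_hi, true_serial_hi))    # untouched: False
print(sabotage_triggered(forged_serial_hi, true_serial_hi))  # patched vector: True
```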


I don't know whether anyone has ever created a generic serial number hack for GEOS, but it would have been a major effort – in a time before emulators allowed for memory watch points. An attacker would have needed to suspect the memory corruption, compare memory dumps before and after, and then find the two places that change code. The LdDeskAcc sabotage would have been easy to find, because it encodes the target address as a constant, but the GraphicsString sabotage would have been nearly impossible to find, because the trap uses neither the verbatim GraphicsString address nor the GetSerialNumber address.

The hacks that were actually effective in practice were much more low-tech: they "un-keyed" applications, i.e. removed the cached serial number from their binaries, reverting them to their original, out-of-the-box state.

by Michael Steil at August 09, 2016 09:13 AM

Toolsmith In-depth Analysis: ProcFilter - YARA-integrated Windows process denial framework

Note: Next month, toolsmith #120 will represent ten years of award winning security tools coverage. It's been an amazing journey; I look to you, dear reader, for ideas on what tool you'd like to see me cover for the decade anniversary edition. Contact information at the end of this post.

Toolsmith #119 focuses on ProcFilter, a new project, just a month old as this is written, found on GitHub by one of my blue team members (shout-out to Ina). Brought to you by the GoDaddy Engineering crew, it's a project with a lot of upside and potential. Per its GitHub readme, ProcFilter is "a process filtering system for Windows with built-in YARA integration. YARA rules can be instrumented with custom meta tags that tailor its response to rule matches. It runs as a Windows service and is integrated with Microsoft's ETW API, making results viewable in the Windows Event Log. Installation, activation, and removal can be done dynamically and do not require a reboot."
Malware analysts can use ProcFilter to create YARA signatures to protect Windows environments against specific threats. It's a lightweight, precise, targeted tool that does not include a large signature set. "ProcFilter is also intended for use in controlled analysis environments where custom plugins can perform artifact-specific actions."
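As a rough sketch, a ProcFilter rule is ordinary YARA with extra meta tags describing the desired response. The tag names below are illustrative assumptions on my part; consult the project readme for the exact tags ProcFilter honors:

```yara
rule Example_Block_Marker
{
    meta:
        description = "Illustrative only: act on any process image containing this marker"
        Block = true    // hypothetical response tag
        Log = true      // hypothetical response tag
    strings:
        $marker = "known-bad-marker" ascii wide
    condition:
        $marker
}
```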
GoDaddy's Emerson Wiley, the ProcFilter project lead, provided me with valuable insight on the tool and its future.
"For us at GoDaddy the idea was to get YARA signatures deployed proactively. YARA has a lot of traction within the malware analysis community and its flexible nature makes it perfect for malware categorization. What ProcFilter brings to the table is the ability to get those signatures out there in a preventative fashion with minimal overhead. What we're saying by making this project open source is, “This is interesting to us; if it’s interesting to you then lets work together and move it forward.”

Endpoint tools don’t typically provide openness, flexibility, and extensibility so those are other areas where ProcFilter stands out. I’m passionate about creating extensible software - if people have the opportunity to implement their own ideas it’s pretty much guaranteed that you’ll be blown away by what they create. With the core of the project trending towards stability we’re going to start focusing on building plugins that do unique and interesting things with process manipulation. We’re just starting to scratch the surface there and I look forward to what comes next.

Something I haven’t mentioned elsewhere is a desire to integrate support for Python or Lua plugins. This could provide security engineers a quick, easy way to react to process and thread events. There’s a testing branch with some of these features and we’ll see where it goes."

ProcFilter integrates nicely with Git and Windows Event Logging to minimize the need for additional tools or infrastructure for rules deployment or results acquisition.

ProcFilter is a beta offering with lots of commits from Emerson. I grabbed the x64 installer (a debug build is also available) for the 1.0.0-beta.2 release. Installation was seamless and rapid. ProcFilter runs as a service by default; you'll see ProcFilter Service in services.msc, as follows.

ProcFilter Service
You'll want to review, and likely modify, procfilter.ini as it lets you manage ProcFilter with flexible granularity.  You'll be able to manage plugins, rules files, blocking, logging, and quarantine, as well as scanning parameters and white-listing.

ProcFilter Use
You can also work with ProcFilter interactively via the command prompt, again with impressive flexibility. A quick procfilter -status will advise you of your running state.
ProcFilter Service
Given that ProcFilter installs out of the gate with a lean rule set, I opted to grab a few additional rules for detection of my test scenario. There is one rule set built by Florian Roth (Neo23x0) that you may want to deploy situationally as it's quite broad, but offers extensive and effective coverage. As my test scenario was specific to PowerShell-born attacks such as Invoke-Mimikatz, I zoomed in for a very specific rule set devised by the man himself, mimikatz creator Benjamin Delpy. Yes, he's written very effective rules to detect his own craftsmanship. 
mimikatz power_pe_injection Yara rule
I opted to drop the rules in my master.yara file in the localrules directory, specifically here: C:\Program Files\ProcFilter\localrules\master.yara. I restarted the service and also ran procfilter -compile from the command line to ensure a clean rules build. Command-line options follow:
Command-line options

As noted in our May 2015 discussion of Rekall use for hunting in-memory adversaries, an attack such as IEX (New-Object Net.WebClient).DownloadString(''); Invoke-Mimikatz -DumpCreds should present a good opportunity for testing. Given the endless amount of red vs. blue PowerShell scenarios, this one lined up perfectly. Using a .NET webclient call, this expression grabs Invoke-Mimikatz.ps1 from @mattifestation's PowerSploit Github and runs it in memory.

The attack didn't get very far at all on Windows 10, by design, but that doesn't mean we don't want to detect such attempted abuses in our environment, even if unsuccessful.

You can use command-line options to sweep broadly or target a specific process. In this case, I was able to reasonably assert (good job, Russ) that the PowerShell process might be the culprit. Sheer genius, I know. :-)

Suspect process

Running procfilter -memoryscan 10612 came through like a champ.
Command-line ProcFilter detection
The real upside of ProcFilter, though, is that it writes to the Windows Event Log; you just need to build a custom filter as described in the readme.
The result, as dictated in part by your procfilter.ini configuration, should look something like the following once you trigger an event that matches one of your YARA rules.
ProcFilter event log view
In closing

Great stuff from the GoDaddy Engineering team, I see significant promise in ProcFilter and really look forward to its continued support and development. Thanks to Emerson Wiley for all his great feedback.
Ping me via email or Twitter if you have questions: russ at holisticinfosec dot org or @holisticinfosec.
Cheers…until next time, for the big decade anniversary edition.

by Russ McRee ( at August 09, 2016 08:09 AM


August 08, 2016

That grumpy BSD guy

Chinese Hunting Chinese Over POP3 In Fjord Country

Yes, you read that right: There is a coordinated effort in progress to steal Chinese-sounding users' mail, targeting machines at the opposite end of the Eurasian landmass (and probably elsewhere).

More specifically, here at we've been seeing attempts at logging in to the POP3 mail retrieval service using usernames that sound distinctly like Chinese names, and the attempts originate almost exclusively from Chinese networks.
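Judging from the table below, the mapping from name to username is mechanical (with a handful of typos and exceptions): lowercase the romanized name and drop the spaces. A quick sketch:

```python
def name_to_username(name):
    """Collapse a romanized Chinese name to the login form seen in the logs."""
    return name.lower().replace(" ", "")

for name in ("Chen Qiang", "He An", "Song Song Jun"):
    print(name, "->", name_to_username(name))
# Chen Qiang -> chenqiang
# He An -> hean
# Song Song Jun -> songsongjun
```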

This table lists the user names and corresponding real life names attempted so far:

Name Username
Chen Qiang chenqiang
Fa Dum fadum
Gao Dang gaodang
Gao Di gaodi
Gao Guan gaoguan
Gao Hei gaohei
Gao Hua gaohua
Gao Liu gaoliu
Gao Yang gaoyang
Gao Zhang gaozhang
He An hean
He Biao hebiao
He Bing hebing
He Chang hechuang
He Chao hechao
He Chen hechen
He Cheng hecheng
He Chun hechun
He Cong hecong
He Da heda
He Di hedi
He Die hedie
He Ding heding
He Dong hedong
He Duo heduo
He Fa hefa
He Ging heqing
He Guo heguo
He Han hehan
He Hao hehao
He Heng heheng
He Hui hehui
He Jia hejia
He Jian hejian
He Jiang hejiang
He Jie hejie
He Jin hejin
He Juan hejuan
He Kai hekai
He Kan hekan
He Kong hekong
He La hela
He Le hele
He Leng heleng
He Li heli
He Lian helian
He Lie helie
He Mu hemu
He Niang heniang
He Quan hequan
He Ran heran
He Sha hesha
He Shan heshan
He Shi heshi
He Si hesi
He Song hesong
He Xiao hexiao
He Yao heyao
He Yi heyi
He Yin heyin
He Yu heyu
He Yun heyun
He Zeng hezeng
He Zeng hezhan
He Zhang hezhangxxxx
He Zhe hezhe
He Zheng hezheng
He Zhi hezhi
He Zhong hezhong
He Zhuang hezhuang
Li An lian
Li Biao libiao
Li Bin libin
Li Bo libo
Li Cheng licheng
Li Chi lichi
Li Chong lichong
Li Chuang lichuang
Li Chun lichun
Li Da lida
Li Deng lideng
Li Di lidi
Li Die lidie
Li Ding liding
Li Dong lidong
Li Duo liduo
Li Fa lifa
Li Fang lifang
Li Fen lifen
Li Feng lifeng
Li Gang ligang
Li Gao ligao
Li Guan liguan
Li Guang liguang
Li Hai lihai
Li Ka lika
Li Kai likai
Li La lila
Li Le lile
Li Lei lilei
Li Lin lilin
Li Ling liling
Li Liu liliu
Li Long lilong
Li Man liman
Li Mei limei
Li Mu limu
Li Neng lineng
Li Niang liniang
Li Peng lipeng
Li Pian lipian
Li Qian liqian
Li Qu liqu
Li Rang lirang
Li Ren liren
Li Ru liru
Li Sha lisha
Li Shi lishi
Li Shuai lishuai
Li Shun lishun
Li Si lisi
Li Song lisong
Li Tao litao
Li Teng liteng
Li Tian litian
Li Ting liting
Li Wang liwang
Li Wei liwei
Li Wen liwen
Li Xiang lixiang
Li Xing lixing
Li Xiu lixiu
Li Ying liying
Li You liyou
Li Ze lize
Li Zeng lizeng
Li Zheng lizheng
Li Zhong lizhong
Li Zhu lizhu
Li Zhuang lizhuang
Li Zhuo lizhuo
Liang Min liangmin
Liang Ming liangming
Liang Qiang liangqiang
Liang Rui liangrui
Lin Chen linchen
Lin Cheng lincheng
Lin He linhe
Lin Hua linhua
Lin Huang linhuang
Lin Neng linneng
Lin Pian linpian
Lin Qu linqu
Lin Ru linru
Lin Zhang linzhang
Liu Bin liubin
Liu Duo liuduo
Liu Fang liufang
Liu Han liuhan
Liu Hao liuhao
Liu Heng liuheng
Liu Hong liuhong
Liu Hui liuhui
Liu Jia liujia
Liu Jiang liujiang
Liu Jiao liujiao
Liu Ju liuju
Liu Juan liujuan
Liu Kai liukai
Liu Kan liukan
Liu Kang liukang
Liu Ke liuke
Liu Kong liukong
Liu Lang liulang
Liu Long liulong
Liu Mu liumu
Liu Nuo liunuo
Liu Qin liuqin
Liu Qing liuqing
Liu Qiong liuqiong
Liu Rong liurong
Liu Sen liusen
Liu Sha liusha
Liu Shun liushun
Liu Si liusi
Liu Tian liutian
Liu Wang liuwang
Liu Wei liuwei
Liu Xia liuxia
Liu Xiu liuxiu
Liu Yao liuyao
Liu Yi liuyi
Liu Ying liuying
Liu Yu liuyu
Liu Yuan liuyuan
Liu Yun liuyun
Liu Zhen liuzhen
Liu Zheng liuzheng
Liu Zhi liuzhi
Liu Zun liuzun
Lou Liu luoliu
Lu Huang lihuang
Luo Chang luochuang
Luo Chen luochen
Luo Cheng luocheng
Luo Deng luochi
Luo Deng luodeng
Luo Di luodi
Luo Dian luodian
Luo Gao luogao
Luo Guai luoguai
Luo Hang luohuang
Luo Hua luohua
Luo Lie luolie
Luo Neng luoneng
Luo Pian luopian
Luo Qi luoqi
Luo Qin luoqin
Luo Qing luoqing
Luo Qu luoqu
Luo Rong luorong
Luo Ru luoru
Luo Rui luorui
Luo Shuang luoshuang
Luo Ting luoting
Luo Tong luotong
Luo Wang luowang
Luo Wei luowei
Luo Yang luoyang
Luo Ze luoze
Song Chen songchen
Song Cheng songcheng
Song Chuang songchuang
Song Da songda
Song Deng songdeng
Song Dian songdian
Song Die songdie
Song Fei songfei
Song Fen songfen
Song Gang songgang
Song Gao songgao
Song Guai songguai
Song Guan songguan
Song Guo songguo
Song Hai songhai
Song Han songhan
Song Hang songhang
Song He songhe
Song Hei songhei
Song Heng songheng
Song Hu songhu
Song Hua songhua
Song Jia songjia
Song Jiao songjiao
Song Jie songjie
Song Jin songjin
Song Jing songjing
Song Ka songka
Song Kan songkan
Song Kang songkang
Song Kong songkong
Song Lan songlan
Song Le songle
Song Lei songlei
Song Lian songlian
Song Liang songliang
Song Liang songliao
Song Liang songliang
Song Liao songliao
Song Lin songlin
Song Liu songliu
Song Meng songmeng
Song Ming songming
Song Mu songmu
Song Nan songnan
Song Neng songneng
Song Ning songning
Song Pian songpian
Song Pin songpin
Song Qi songqi
Song Qiang songqiang
Song Qing songqing
Song Qiu songqiu
Song Ran songran
Song Rong songrong
Song Rui songrui
Song Sha songsha
Song Shuai songshuai
Song Shuang songshuang
Song Song songsong
Song Song Jun songsongjun
Song Tao songtao
Song Teng songteng
Song Wang songwang
Song Wei songwei
Song Xi songxi
Song Xia songxia
Song Xiu songxiu
Song Ya songya
Song Yang songyang
Song Yong songyong
Song You songyou
Song Yuan songyuan
Song Yue songyue
Song Yun songyun
Song Zhe songzhe
Song Zhen songzhen
Song Zheng songzheng
Song Zhuang songzhuang
Tan Qian tangqian
Tang Bing tangbing
Tang Chi tangchi
Tang Chong tangchong
Tang Chuang tangchuang
Tang Cong tangcong
Tang Di tangdi
Tang Dian tangdian
Tang Duo tangduo
Tang Fa tangfa
Tang Fan tangfan
Tang Fang tangfang
Tang Fei tangfei
Tang Fen tangfen
Tang Feng tangfeng
Tang Gang tanggang
Tang Guai tangguai
Tang Guan tangguan
Tang Guang tangguang
Tang Guo tangguo
Tang Han tanghan
Tang Hao tanghao
Tang Hei tanghei
Tang Heng tangheng
Tang Hong tanghong
Tang Hu tanghu
Tang Hui tanghui
Tang Jie tangjie
Tang Jin tangjin
Tang Jing tangjing
Tang Ju tangju
Tang Ka tangka
Tang Kai tangkai
Tang Kan tangkan
Tang Kang tangkang
Tang Ke tangke
Tang Kong tangkong
Tang La tangla
Tang Lang tanglang
Tang Le tangle
Tang Leng tangleng
Tang Li tangli
Tang Lian tanglian
Tang Lie tanglie
Tang Lin tanglin
Tang Ling tangling
Tang Liu tangliu
Tang Long tanglong
Tang Mei tangmei
Tang Mo tangmo
Tang Mu tangmu
Tang Neng tangneng
Tang Niang tangniang
Tang Nuo tangnuo
Tang Peng tangpeng
Tang Pian tangpian
Tang Ping tangping
Tang Qian tangqian
Tang Qin tangqin
Tang Qu tangqu
Tang Quan tangquan
Tang Quing tangqing
Tang Rang tangrang
Tang Ren tangren
Tang Ru tangru
Tang Ruan tangruan
Tang Rui tangrui
Tang Sen tangsen
Tang Sha tangsha
Tang Shan tangshan
Tang Shi tangshi
Tang Shun tangshun
Tang Song tangsong
Tang Tang Jun tangtangjun
Tang Tao tangtao
Tang Tian tangtian
Tang Tian tangyan
Tang Wei tangwei
Tang Xi tangxi
Tang Xia tangxia
Tang Xing tangxing
Tang Xiong tangxiong
Tang Yan tangyan
Tang Yang tangyang
Tang Yao tangyao
Tang Yi tangyi
Tang Ying tangying
Tang Yong tangyong
Tang You tangyou
Tang Yue tangyue
Tang Yun tangyun
Tang Ze tangze
Tang Zeng tangzeng
Tang Zhang tangzhang
Tang Zhe tangzhe
Tang Zhen tangzhen
Tang Zun tangzun
Xie An xiean
Xie Bin xiebin
Xie Bo xiebo
Xie Chao xiechao
Xie Cong xiecong
Xie Da xieda
Xie Di xiedi
Xie Dian xiedian
Xie Die xiedie
Xie Ding xieding
Xie Dong xiedong
Xie Duo xieduo
Xie Fang xiefang
Xie Fei xiefei
Xie Feng xiefeng
Xie Gang xiegang
Xie Gao xiegao
Xie Guai xieguai
Xie Guan xieguan
Xie Hai xiehai
Xie Hang xiehang
Xie Heng xieheng
Xie Heng xieneng
Xie Heng xieheng
Xie Heng xieneng
Xie Hong xiehong
Xie Hu xiehu
Xie Hui xiehui
Xie Jia xiejia
Xie Jian xiejian
Xie Jiang xiejiang
Xie Jiao xiejiao
Xie Jie xiejie
Xie Jing xiejing
Xie Ju xieju
Xie Kai xiekai
Xie La xiela
Xie Leng xieleng
Xie Liang xieliang
Xie Lie xielie
Xie Lin xielin
Xie Ling xieling
Xie Long xielong
Xie Man xieman
Xie Meng xiemeng
Xie Min xiemin
Xie Ming xieming
Xie Na xiena
Xie Niang xieniang
Xie Peng xiepeng
Xie Pian xiepian
Xie Pin xiepin
Xie Qi xieqi
Xie Qing xieqing
Xie Qiong xieqiong
Xie Qiu xieqiu
Xie Qu xiequ
Xie Quan xiequan
Xie Ran xieran
Xie Ruan xieruan
Xie Rui xierui
Xie Sha xiesha
Xie Shuang xieshuang
Xie Si xiesi
Xie Tao xietao
Xie Ting xieting
Xie Tong xietong
Xie Wei xiewei
Xie Wen xiewen
Xie Xi xiexi
Xie Xiang xiexiang
Xie Xin xiexin
Xie Xing xiexing
Xie Xiu xiexiu
Xie Ya xieya
Xie Yi xieyi
Xie Yin xieyin
Xie Ying xieying
Xie Yong xieyong
Xie Yu xieyu
Xie Yue xieyue
Xie Zeng xiezeng
Xie Zhan xiezhan
Xie Zhang xiezhang
Xie Zhe xiezhe
Xie Zhuo xiezhuo
Zheng Nan zhengnan

That list of some 493 names is up to date as of this writing, 2016-08-23 early evening CEST. A few more turn up with the bursts of activity we have seen every day since June 19th, 2016.

A possibly more up to date list is available here. That's a .csv file, if that sounds unfamiliar, think of it as a platform neutral text representation (to wit, "Comma Separated Values") of a spreadsheet or database -- take a peek with Notepad.exe or similar if you're not sure. I'll be updating that second list along with other related data at quasi-random intervals as time allows and as long as interesting entries keep turning up in my logs.

If your name or username is on either of those lists, you would be well advised to change your passwords right now and to check breach notification sites such as Troy Hunt's haveibeenpwned.com for clues to where your accounts could have been compromised.

That's your scoop for now. If you're interested in some more background and data, keep reading.

If you are a regular or returning reader of this column, you are most likely aware that I am a Unix sysadmin. In addition to operating and maintaining various systems in my employers' care, I run a small set of servers of my own that run a few Internet-facing services for myself and a small circle of friends and family.

For the most part those systems are roundly ignored by the world at large, but when they are not, funny, bizarre or interesting things happen. And mundane activities like these sometimes have interesting byproducts. When you run a mail service, you are bound to find a way to handle the spam people will try to send, and about ten years ago I started publishing a blacklist of known spamming hosts, generated from attempts to deliver mail to a slowly expanding list of known bad, invalid, never to be deliverable addresses in the domains we handle mail for.

After a while, I discovered that the list of spamtrap addresses (once again, invalid and destined never to be deliverable, ever) had been hilariously repurposed: The local parts (the string before the @ or 'at sign') started turning up as usernames in failed attempts to log on to our pop3 mail retrieval service. That was enough fun to watch that I wrote that article, and for reasons known only to the operators of the machines at the other end, those attempts have never stopped entirely.

These attempts to log in as our imaginary friends are a strong contender for the most bizarre and useless activity ever, but once those attempts were no longer news, there was nothing to write about. The spamtrap login attempts make up a sort of background noise in the authentication logs, and whenever there is an attempt to log in as a valid user from somewhere that user clearly is not, the result is usually that an entire network (whatever I could figure out from whois output) is blocked from any communication with our site for 24 hours.

There are of course also attempts to log in as postmaster, webmaster and other IDs, some RFC mandated, that most sites including this one would handle as aliases to make up the rest of the background noise.

Then recently, something new happened. The first burst looked like this in my logs (times given in local timezone, CEST at the time):

Jun 19 06:14:58 skapet spop3d[37601]: authentication failed: no such user: lilei -
Jun 19 06:15:01 skapet spop3d[46539]: authentication failed: no such user: lilei -
Jun 19 06:15:03 skapet spop3d[8180]: authentication failed: no such user: lilei -

-- and so on, for a total of 78 attempts to log in as the non-existing user lilei, in the space of about five minutes. A little later, a similar burst of activity came for the user name lika:

Jun 19 14:11:30 skapet spop3d[68573]: authentication failed: no such user: lika -
Jun 19 14:12:22 skapet spop3d[22421]: authentication failed: no such user: lika -
Jun 19 14:12:26 skapet spop3d[7587]: authentication failed: no such user: lika -
Jun 19 14:12:30 skapet spop3d[16753]: authentication failed: no such user: lika -

and so on, for a total of 76 attempts. Over the next few days I noticed an uptick in failed pop3 access attempts that were not for valid users and did not match any entry on our spamtraps list. Still, those attempts were for users that do not exist, and would produce no useful result so I did not do anything much about them.

It was only during the early weeks of July that it struck me that the user name attempted here

Jul  8 12:19:08 skapet spop3d[54818]: authentication failed: no such user: lixing -
Jul  8 12:19:28 skapet spop3d[1987]: authentication failed: no such user: lixing -
Jul  8 12:19:37 skapet spop3d[70622]: authentication failed: no such user: lixing -
Jul  8 12:19:49 skapet spop3d[31208]: authentication failed: no such user: lixing -

(a total of 54 attempts for that user name) might actually be based on the name of a Chinese person. "Li Xing" sounded plausible enough as a possible real person. It's perhaps worth noting that at the time I had just finished reading the first two volumes of Cixin Liu's The Three Body Problem, so I was a bit more in tune than usual with what could be plausible Chinese names. (And yes, the books are very much to my taste and I have the as yet unpublished translation of the third volume on pre-order.)

Unsurprisingly, a quick whois lookup revealed that the machines that tried reading the hypothetical person Li Xing's mail all had IP addresses that belonged to Chinese networks.

Once I realized I might be on to a new pattern, I went back over a few days' worth of failed pop3 login attempts and found more than a handful of usernames that looked like they could be based on Chinese names. Checking the whois data for the IP addresses in those attempts, all turned out to be from Chinese networks.

That was in itself an interesting realization, but a small, random sample does not make for proof. In order to establish an actual data set, it was back to collecting data and analysing the content.

First, collect all log data on failed pop3 attempts over a long enough period that we have a reasonable baseline and can distinguish between the background noise and new, exciting developments.

The file bigauthlog is that collection of data. Digging through my archives going back in time, I stopped at January 16, 2016 for no other reason than this would be roughly six months' worth of data, probably enough to give a reasonable baseline and to spot anomalies.
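
A minimal sketch of the kind of tallying involved, assuming the spop3d log format quoted earlier (this is an illustration, not one of the actual scripts referenced below):

```python
import re
from collections import Counter

# Matches the spop3d log lines quoted above, e.g.
#   Jun 19 06:14:58 skapet spop3d[37601]: authentication failed: no such user: lilei -
LINE = re.compile(r"spop3d\[\d+\]: authentication failed: no such user: (\S+)")

def count_failed_logins(lines):
    """Tally failed pop3 login attempts per (non-existent) username."""
    counts = Counter()
    for line in lines:
        m = LINE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

sample = [
    "Jun 19 06:14:58 skapet spop3d[37601]: authentication failed: no such user: lilei -",
    "Jun 19 06:15:01 skapet spop3d[46539]: authentication failed: no such user: lilei -",
    "Jul  8 12:19:08 skapet spop3d[54818]: authentication failed: no such user: lixing -",
]
assert count_failed_logins(sample) == Counter({"lilei": 2, "lixing": 1})
```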

If you've read the previous columns, you will be familiar with the scripts that produce various text and CSV reports from log data input: A text report of user names by number of access attempts, a CSV dump of the same, with first and last spotted, a text report of hosts attempting access, sorted by number of attempts, a CSV dump of the same, with first and last seen dates as for the user names.

But what I wanted to see was where the login attempts were coming from for which usernames, so I started extracting the unique host to username mappings. For each entry in this CSV file, there is a host and a user name it has tried at least once (if you import that somewhere, make sure you mark the Username column as text -- LibreOffice Calc at least becomes confused when trying to parse some of those strings). The data also records whether that particular username was part of the spamtrap database at the time. If you want to do that particular check on your own greytrapping database, any matching output from

$ doas spamdb | grep -i username@

on your greytrapper box will mean it is in your list. And then finally for each entry there is the expected extract from available whois info: network address range, the network name and the country.

The most useful thing to do with that little database is to play with sorting on various fields and field combinations. If you sort on the "In spamtraps" field, the supposed Chinese names turn up with "No"s, along with a few more random-seeming combinations.
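
Much of that sorting can of course be scripted as well; here is a quick Python sketch, using made-up rows and column names that may differ slightly from the actual CSV:

```python
import csv
import io

# Made-up illustration rows; the real CSV linked above holds the full data.
data = """Host,Username,In spamtraps,Country
183.60.0.10,lixing,No,CN
200.56.0.4,fadum,No,US
61.160.0.7,liuwei,No,CN
"""

rows = list(csv.DictReader(io.StringIO(data)))

# Sort primarily on Country, then Host, then Username -- the kind of
# multi-field sorting described in the text.
rows.sort(key=lambda r: (r["Country"], r["Host"], r["Username"]))
assert [r["Username"] for r in rows] == ["lixing", "liuwei", "fadum"]
```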

While I was building the data set I decided to add those new usernames, with one of our domains appended, to the spamtraps, and this is what finally pushed the number of spamtraps past the 30,000 mark.

Just browsing the data or perhaps sorting by IP address will show you that the pop3 gropers are spread across a large number of networks in a number of countries and territories, with numbers roughly in proportion to the size of each country or territory's economy. Some, such as a particular Mexican ISP and cable TV operator, stand out as slightly over-represented, and as expected networks in the US and China account for a large share of the total.

If you sort on the In spamtraps field, you will see that a large number of the entries that were not in the spamtraps are the ones identified as Chinese personal names, but not all. Some of the No entries are the RFC mandated mailboxes, some are aliases that are in use here for other reasons, and finally more than a handful that would fit the general description of the rest of the spamtraps: Strings superficially resembling personal names or simply random strings. These may be parts of the potential spamtraps I missed while fishing spamtrap candidates out of logfiles some time over the decade of weirdness that has gone into maintaining the spamtraps list.

But if you sort the data primarily on the fields Name, Country, and if you like IP address and User name, you will see that, as anticipated, the attempts on Chinese-sounding user names come exclusively from Chinese networks, with the sole exception of the "Fa Dum" (fadum) user, which appears to have been attempted only twice (on June 6th) from an IP address registered in the USA and may very well be a misclassification on my part. That particular sorting, with duplicates removed, is the origin of the list of names and usernames given earlier in this article and this CSV file.

Now that we have established that the attempts at Chinese user names come exclusively from Chinese networks, the next questions become: Who are the cyber criminals behind this activity, and what are their motivations? And why are they bothering with hosts in faraway Europe to begin with?

For the first question, it is hard to tell from this perch, but whoever runs those attempts apparently has the run of large swathes of network real estate and seems to take no special care to avoid detection, other than of course distributing the attempts widely across the network ranges and coming in only in short bursts.

So are those attempts by, let us say the public sector, to steal political dissidents' email? Or perhaps, still with a public sector slant, simply hunting for any and all overseas assets belonging to Chinese nationals? Or are we simply seeing the activities of Chinese private sector cyber criminals who are trying out likely user names wherever they can find a service that listens?

Any or all of these things could be true, but in any case it is not unlikely that what we are seeing is somebody trying to find new places where username and password combinations from a recent breach might work. After all, username and password combinations that have been verified to work somewhere are likely worth more on the market than unverified ones.

Looking at the log entries, there are sequences there that could plausibly have been produced by humans typing at keyboards. Imagine if you please vast, badly lit and insufficiently ventilated Asian cyber-sweatshops, but I would not be too surprised to find that this is actually a highly automated operation, with timing tuned to avoid detection.

Security professionals have been recommending that people stop using the pop3 protocol for as long as I can remember, but a search for "pop3" still produces a whopping 684,291 results, meaning that the pop3 service is nowhere near as extinct as some would have preferred.

The large number of possible targets is a likely explanation for the burstiness of the activity we are seeing: with that many hosts to cover, the groping hosts will need to set up some sort of rotation, and in addition there is the need to stay below some volume of traffic per host in order to avoid detection. This means that what any one site sees is only a very small part of the total activity. The pop3 hunt for Chinese users is most likely not exclusive to the fjord country.

If you run a pop3 service, please do yourself a favor and check your setup for any weaknesses including any not yet applied updates, as you were about to do anyway. Once you've done that, take some moments to browse your logs for strange looking login attempts.

If you find something similar to what I've reported here, I would like to hear from you. Please note that at least one of the pop3 daemons out there by default does not report the username for failed authentication attempts, but instead only notes that the username was unknown. Anyway, your war stories will be appreciated in email or comments.

If your name or username appears in the table at the start of this article or in this CSV file, please start checking for unusual activity involving your accounts and start changing passwords right away. Ask your service providers if they offer more secure alternatives, and if they do, consider using these alternatives. And as I mentioned earlier, do check breach notification sites such as haveibeenpwned.com for clues to help find out whether your data could be at risk in any of the services you do use. And of course, feedback in comments or email is welcome.

And finally, if you have information on one or more breaches that may have been the source of this list of likely Chinese user names, I'd like to hear from you too.

Good night and good luck.

I would like to thank Tore Nordstrand and Øystein Alsaker for valuable input on various aspects of this article.

The data referenced in this article will likely be updated on a roughly daily basis while the Chinese episode lasts. You can fetch them from the links in the article or from this directory, which also contains some trivial data extraction and data massaging scripts I use. If you find any errors or have any concerns, please let me know.

by Peter N. M. Hansteen at August 08, 2016 06:29 PM

Everything Sysadmin

I'll be speaking at NYC SRE Meetup next week (Wed)

I'll be giving a new talk called "Teaching DevOps to Ops without Devs" at the NYC SRE Tech Talks meetup, Wednesday, August 17, 2016, from 6pm to 8:30pm, at Squarespace's offices in the West Village (8 Clarkson St, 12th Floor). There are 4 speakers during the event. This is a free event. Seating is limited. Info and registration here.

by Tom Limoncelli at August 08, 2016 05:33 PM

August 06, 2016

pagetable.com
How Amica Paint protected tampering with its credits

In mid-1990, the floppy disk of special issue 55 of the German Commodore 64 magazine "64'er" contained the "Amica Paint" graphics program – which was broken beyond usefulness. I'll describe what went wrong.

"Amica Paint" was developed by Oliver Stiller and first published in 64'er special issue 27 in 1988, as a type-in program that filled 25 pages of the magazine.

Two years later, Amica Paint was published again in special issue 55, which this time came with a floppy disk. But this version was completely broken: Just drawing a simple line would cause severe glitches.


64'er issue 9/1990 published an erratum to fix Amica Paint, which described three ways (BASIC script, asm monitor and disk monitor) to patch 7 bytes in one of the executable files:

--- a/a.paint c000.txt
+++ b/a.paint c000.txt
@@ -67,8 +67,8 @@
 00000420  a5 19 85 ef a5 1a 85 f0  4c 29 c4 20 11 c8 20 6c  |........L). .. l|
 00000430  c8 08 20 ed c7 28 90 f6  60 01 38 00 20 20 41 4d  |.. ..(..`.8.  AM|
 00000440  49 43 41 20 50 41 49 4e  54 20 56 31 2e 34 20 20  |ICA PAINT V1.4  |
-00000450  01 38 03 20 20 20 4f 2e  53 54 49 4c 4c 45 52 20  |.8.   O.STILLER |
-00000460  31 39 39 30 20 20 20 01  31 00 58 3d 30 30 30 20  |1990   .1.X=000 |
+00000450  01 38 03 42 59 20 4f 2e  53 54 49 4c 4c 45 52 20  |.8.BY O.STILLER |
+00000460  31 39 38 36 2f 38 37 01  31 00 58 3d 30 30 30 20  |1986/87.1.X=000 |
 00000470  59 3d 30 30 30 20 20 20  20 20 20 20 20 20 00 01  |Y=000         ..|
 00000480  31 00 42 49 54 54 45 20  57 41 52 54 45 4e 20 2e  |1.BITTE WARTEN .|
 00000490  2e 2e 20 20 20 20 00 64  0a 01 00 53 43 48 57 41  |..    .d...SCHWA|

This changes the credits message from "O.STILLER 1990" to "BY O.STILLER 1986/87" – which is the original message from the previous publication.

64'er magazine had published the exact same application without any updates, but binary patched the credits message from "1986/87" to "1990", and unfortunately for them, Amica Paint contained code to detect exactly this kind of tampering:

.,C5F5  A0 14       LDY #$14     ; check 20 bytes
.,C5F7  A9 00       LDA #$00     ; init checksum with 0
.,C5F9  18          CLC
.,C5FA  88          DEY
.,C5FB  79 51 C4    ADC $C451,Y  ; add character from message
.,C5FE  88          DEY
.,C5FF  18          CLC
.,C600  10 F9       BPL $C5FB    ; loop
.,C602  EE FD C5    INC $C5FD
.,C605  C9 ED       CMP #$ED     ; checksum should be $ED
.,C607  F0 05       BEQ $C60E
.,C609  A9 A9       LDA #$A9
.,C60B  8D E4 C7    STA $C7E4    ; otherwise sabotage line drawing
.,C60E  60          RTS

The code checksums the message "BY O.STILLER 1986/87". If the checksum does not match, the code will overwrite an instruction in the following code:

.,C7DC  65 EC       ADC $EC
.,C7DE  85 EC       STA $EC
.,C7E0  90 02       BCC $C7E4
.,C7E2  E6 ED       INC $ED
.,C7E4  A4 DD       LDY $DD
.,C7E6  60          RTS

The "LDY $DD" instruction at $C7E4 will be overwritten with "LDA #$DD", which will cause the glitches in line drawing.

The proper fix would have been to change the comparison with $ED into a comparison with $4F, the checksum of the updated message – a single-byte fix. But instead of properly debugging the issue, 64'er magazine published a patch to restore the original message, practically admitting that they had cheated: the patched credits had implied the re-release was not the exact same software.
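
Since the hex dump shows the message bytes are plain ASCII-compatible values, and the routine clears carry before every ADC, the check reduces to a byte sum modulo 256; that is easy to verify outside the emulator with a quick sketch (Python here, obviously not part of the original program):

```python
def checksum(message: str) -> int:
    """Sum the 20 message bytes modulo 256, as the 6502 routine at $C5F5 does."""
    assert len(message) == 20
    return sum(message.encode("ascii")) % 256

# The original message passes the $ED check; the patched credits would
# have required the comparison to be changed to $4F.
assert checksum("BY O.STILLER 1986/87") == 0xED
assert checksum("   O.STILLER 1990   ") == 0x4F
```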

by Michael Steil at August 06, 2016 10:02 AM

August 05, 2016

Steve Kemp's Blog

Using the compiler to help you debug segfaults

Recently somebody reported that my console-based mail-client was segfaulting when opening an IMAP folder, and then when they tried with a local Maildir-hierarchy the same fault was observed.

I couldn't reproduce the problem, as neither my development host (read "my personal desktop") nor my mail-host had been crashing at all, both being in use to read my email for several months.

Debugging crashes with no backtrace, or real hint of where to start, is a challenge. Even when downloading the same Maildir samples I couldn't see a problem. It was only when I decided to see if I could add some more diagnostics to my code that I came across a solution.

My intention was to make it easier to receive a backtrace, by adding more compiler options:

  -fsanitize=address -fno-omit-frame-pointer

I added those options and my mail-client immediately started to segfault on my own machine(s), almost as soon as it started. Ultimately I found three buggy pieces of code where I was allocating C++ objects and passing them to the Lua stack, a pretty fundamental part of the code. Once I'd tracked down the areas of code that were broken and fixed them, the user was happy, and I was happy too.

It's interesting that I've been running for over a year with these bogus things in place, which "just happened" not to crash for me or anybody else. In the future I'll be adding these options to more of my C-based projects, as there seems to be virtually no downside.

In related news my console editor has now achieved almost everything I want it to, having gained:

  • Syntax highlighting via Lua + LPEG
  • Support for TAB completion of Lua-code and filenames.
  • Bookmark support.
  • Support for setting the mark and copying/cutting regions.

The only outstanding feature, which is a biggy, is support for Undo which I need to add.

Happily no segfaults here, so far..

August 05, 2016 06:22 AM

August 04, 2016

Evaggelos Balaskas

Open compressed file with gzip zcat perl php lua python

I have a compressed file of:

250.000.000 lines
Compressed the file size is: 671M
Uncompressed, it's: 6,5G

I need to extract a plethora of things from it and to verify some others.

I don't want to use bash but something more elegant, like python or lua.

Looking through “The-Internet”, I’ve created some examples for the single purpose of educating myself.

So here are my results.
BE AWARE they are far-far-far away from perfect in code or execution.

Sorted by (less) time of execution:


pigz - Parallel gzip - Zlib

# time pigz  -p4 -cd  2016-08-04-06.ldif.gz &> /dev/null 

real	0m9.980s
user	0m16.570s
sys	0m0.980s


gzip 1.8

# time /bin/gzip -cd 2016-08-04-06.ldif.gz &> /dev/null

real	0m23.951s
user	0m23.790s
sys	0m0.150s


zcat (gzip) 1.8

# time zcat 2016-08-04-06.ldif.gz &> /dev/null

real	0m24.202s
user	0m24.100s
sys	0m0.090s


Perl v5.24.0



open (FILE, '/bin/gzip -cd 2016-08-04-06.ldif.gz |');

while (my $line = <FILE>) {
  print $line;
}

close FILE;


# time ./ &> /dev/null

real	0m49.942s
user	1m14.260s
sys	0m2.350s


PHP 7.0.9 (cli)



<?php

  $fp = gzopen("2016-08-04-06.ldif.gz", "r");

  while (($buffer = fgets($fp, 4096)) !== false) {
        echo $buffer;
  }

  gzclose($fp);
?>


# time php -f dump.php &> /dev/null

real	1m19.407s
user	1m4.840s
sys	0m14.340s

PHP - Iteration #2

PHP 7.0.9 (cli)

Impressed with php results, I took the perl-approach on code:

<?php

  $fp = popen("/bin/gzip -cd 2016-08-04-06.ldif.gz", "r");

  while (($buffer = fgets($fp, 4096)) !== false) {
        echo $buffer;
  }

  pclose($fp);
?>


# time php -f dump2.php &> /dev/null 

real	1m6.845s
user	1m15.590s
sys	0m19.940s

not bad !


Lua 5.3.3



local gzip = require 'gzip'

local filename = "2016-08-04-06.ldif.gz"

for l in gzip.lines(filename) do
  print(l)
end

# time ./dump.lua &> /dev/null

real	3m50.899s
user	3m35.080s
sys	0m15.780s

Lua - Iteration #2

Lua 5.3.3

I was depressed to see that php is faster than lua!!
Depressed I say !

So here is my next iteration on lua:



local file = assert(io.popen('/bin/gzip -cd 2016-08-04-06.ldif.gz', 'r'))

while true do
        line = file:read()
        if line == nil then break end
        print (line)
end

# time ./dump2.lua &> /dev/null 

real	2m45.908s
user	2m54.470s
sys	0m21.360s

One minute faster than before, but still too slow !!

Lua - Zlib

Lua 5.3.3

My next iteration with lua is using zlib :



local zlib = require 'zlib'
local filename = "2016-08-04-06.ldif.gz"

local block = 64
local d = zlib.inflate()

local file = assert(io.open(filename, "rb"))
while true do
  bytes = file:read(block)
  if not bytes then break end
  print (d(bytes))
end


# time ./dump.lua  &> /dev/null 

real	0m41.546s
user	0m40.460s
sys	0m1.080s

Now, that's what I am talking about !!!

Playing with window_size (block) can make your code faster or slower.
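
The block-size experiment is also easy to reproduce in miniature with Python's zlib, and it confirms that the block size affects only the timing, never the output. A sketch, using a small in-memory blob rather than the real 671M file:

```python
import gzip
import io
import zlib

# A small in-memory gzip blob stands in for the real file.
payload = b"uid=someone,dc=example,dc=org\n" * 1000
blob = gzip.compress(payload)

def inflate(blob, block):
    """Stream-decompress `blob` in `block`-sized reads, mirroring the Lua loop."""
    d = zlib.decompressobj(zlib.MAX_WBITS | 16)  # 16: expect a gzip header
    src = io.BytesIO(blob)
    out = bytearray()
    while True:
        chunk = src.read(block)
        if not chunk:
            break
        out += d.decompress(chunk)
    out += d.flush()
    return bytes(out)

# The result is identical for any block size; only the run time differs.
assert inflate(blob, 64) == inflate(blob, 65536) == payload
```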

Python v3

Python 3.5.2



import gzip

filename = "2016-08-04-06.ldif.gz"

with gzip.open(filename, 'r') as f:
    for line in f:
        print(line)

# time ./ &> /dev/null

real	13m14.460s
user	13m13.440s
sys	0m0.670s

Not enough tissues on the whole damn world!

Python v3 - Iteration #2

Python 3.5.2

but wait ... a moment ... The default mode for gzip.open() is 'rb'.
(read binary)

let's try this once more with 'rt' (read-text) mode:



import gzip

filename = "2016-08-04-06.ldif.gz"

with gzip.open(filename, 'rt') as f:
    for line in f:
        print(line, end="")


# time ./ &> /dev/null 

real	5m33.098s
user	5m32.610s
sys	0m0.410s

With only one super tiny change, the run time was cut in half!!!
But still tooo slow.

Python v3 - Iteration #3

Python 3.5.2

Let's try a third iteration with popen this time.



import os

cmd = "/bin/gzip -cd 2016-08-04-06.ldif.gz"
f = os.popen(cmd)
for line in f:
  print(line, end="")


# time ./ &> /dev/null 

real	6m45.646s
user	7m13.280s
sys	0m6.470s

Python v3 - zlib Iteration #1

Python 3.5.2

Let's try a zlib iteration this time.



import sys
import zlib

filename = "2016-08-04-06.ldif.gz"

d = zlib.decompressobj(zlib.MAX_WBITS | 16)

with open(filename, 'rb') as f:
    for line in f:
        sys.stdout.buffer.write(d.decompress(line))
    sys.stdout.buffer.write(d.flush())

# time ./ &> /dev/null 

real	1m4.389s
user	1m3.440s
sys	0m0.410s

finally some proper values with python !!!


All of the tests above were run on this machine:

4 x Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz


Ok, I Know !

The shell-pipe approach of using gzip to open the compressed file is not fair to all the above code snippets.
But ... who cares ?

I need something that runs fast as hell and does smart things with those data.

Get in touch

As I am not a developer, I know that you people know how to do these things even better!

So I would love to hear any suggestions or even criticism on the above examples.

I will update/report everything that passes the "I think I know what this code does" rule and ... be gentle with me ;)

PLZ use my email address: evaggelos [ _at_ ] balaskas [ _dot_ ] gr

to send me any suggestions

Thanks !

Tag(s): php, perl, python, lua, pigz

August 04, 2016 06:12 PM

August 03, 2016

Evaggelos Balaskas

How to dockerize a live system

I need to run some ansible playbooks against a running (live) machine.
But, of course, I can't use a production server for testing purposes !!

So here comes docker!
I have ssh access from my docker-server to this production server:

[docker-server] ssh livebox tar -c / | docker import - centos6:livebox 

Then run the new docker image:

[docker-server]  docker run -t -i --rm -p 2222:22 centos6:livebox bash                                                  

[root@40b2bab2f306 /]# /usr/sbin/sshd -D                                                                             

Create a new entry in your hosts inventory file that uses ssh port 2222,
or create a new, separate inventory file,
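
For reference, such an inventory entry might look like the following sketch (the hostname, user, and group name are placeholders, not taken from the original setup):

```
[dockerlive]
dockerlivebox ansible_host=docker-server.example.com ansible_port=2222 ansible_user=root
```

With Ansible 2.x, `ansible_port=2222` points the ssh connection at the port forwarded to the container.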

and test it with ansible ping module:

# ansible -m ping -i hosts.docker dockerlivebox

dockerlivebox | success >> {
    "changed": false,
    "ping": "pong"
}

Tag(s): docker

August 03, 2016 03:20 PM

Racker Hacker

OpenStack instances come online with multiple network ports attached

I ran into an interesting problem recently in my production OpenStack deployment that runs the Mitaka release. On various occasions, instances were coming online with multiple network ports attached, even though I only asked for one network port.

The problem

If I issued a build request for ten instances, I’d usually end up with this:

  • 6 instances with one network port attached
  • 2-3 instances with two network ports attached (not what I want)
  • 1-2 instances with three or four network ports attached (definitely not what I want)

When I examined the instances with multiple network ports attached, I found that one of the network ports would be marked as up while the others would be marked as down. However, the IP addresses associated with those extra ports would still be associated with the instance in horizon and via the nova API. All of the network ports seemed to be fully configured on the neutron side.

Digging into neutron

The neutron API logs are fairly chatty, especially while instances are building, but I found two interesting log lines for one of my instances:

- - [02/Aug/2016 14:03:11] "GET /v2.0/ports.json?tenant_id=a7b0519330ed481884431102a72dd04c&device_id=05eef1bb-5356-43d9-86c9-4d9854d4d46b HTTP/1.1" 200 2137 0.025282
- - [02/Aug/2016 14:03:15] "GET /v2.0/ports.json?tenant_id=a7b0519330ed481884431102a72dd04c&device_id=05eef1bb-5356-43d9-86c9-4d9854d4d46b HTTP/1.1" 200 3098 0.027803

There are two requests to create network ports for this instance and neutron is allocating ports to both requests. This would normally be just fine, but I only asked for one network port on this instance.

The IP addresses making the requests are unusual, though: both belong to hypervisors within my cloud. Why are two hypervisors asking neutron for network ports? Only one of them should be building my instance, not both. After checking both hypervisors, I verified that the instance was only provisioned on one of the hosts and not both.

Looking at nova-compute

The instance ended up on the hypervisor once it finished building and the logs on that hypervisor looked fine:

nova.virt.libvirt.driver [-] [instance: 05eef1bb-5356-43d9-86c9-4d9854d4d46b] Instance spawned successfully.

I then logged into the other hypervisor, the one that asked neutron for a port but never built the instance. The logs there told a much different story:

[instance: 05eef1bb-5356-43d9-86c9-4d9854d4d46b] Instance failed to spawn
Traceback (most recent call last):
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/nova/compute/", line 2218, in _build_resources
    yield resources
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/nova/compute/", line 2064, in _build_and_run_instance
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/nova/virt/libvirt/", line 2773, in spawn
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/nova/virt/libvirt/", line 3191, in _create_image
    instance, size, fallback_from_host)
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/nova/virt/libvirt/", line 6765, in _try_fetch_image_cache
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/nova/virt/libvirt/", line 251, in cache
    *args, **kwargs)
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/nova/virt/libvirt/", line 591, in create_image
    prepare_template(target=base, max_size=size, *args, **kwargs)
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/oslo_concurrency/", line 271, in inner
    return f(*args, **kwargs)
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/nova/virt/libvirt/", line 241, in fetch_func_sync
    fetch_func(target=target, *args, **kwargs)
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/nova/virt/libvirt/", line 429, in fetch_image
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/nova/virt/", line 120, in fetch_to_raw
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/nova/virt/", line 110, in fetch, image_href, dest_path=path)
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/nova/image/", line 182, in download
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/nova/image/", line 383, in download
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/nova/image/", line 682, in _reraise_translated_image_exception
    six.reraise(new_exc, None, exc_trace)
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/nova/image/", line 381, in download
    image_chunks =, 1, 'data', image_id)
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/nova/image/", line 250, in call
    result = getattr(client.images, method)(*args, **kwargs)
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/glanceclient/v1/", line 148, in data
    % urlparse.quote(str(image_id)))
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/glanceclient/common/", line 275, in get
    return self._request('GET', url, **kwargs)
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/glanceclient/common/", line 267, in _request
    resp, body_iter = self._handle_response(resp)
  File "/openstack/venvs/nova-13.3.0/lib/python2.7/site-packages/glanceclient/common/", line 83, in _handle_response
    raise exc.from_response(resp, resp.content)
ImageNotFound: Image 8feacda9-91fd-48ce-b983-54f7b6de6650 could not be found.

This is one of those occasions where I was glad to find an exception in the log. The image that couldn’t be found is an image I’ve used regularly in the environment before, and I know it exists.

Gandering at glance

First off, I asked glance what it knew about the image:

$ openstack image show 8feacda9-91fd-48ce-b983-54f7b6de6650
+------------------+------------------------------------------------------+
| Field            | Value                                                |
+------------------+------------------------------------------------------+
| checksum         | 8de08e3fe24ee788e50a6a508235aa64                     |
| container_format | bare                                                 |
| created_at       | 2016-08-03T01:25:34Z                                 |
| disk_format      | qcow2                                                |
| file             | /v2/images/8feacda9-91fd-48ce-b983-54f7b6de6650/file |
| id               | 8feacda9-91fd-48ce-b983-54f7b6de6650                 |
| min_disk         | 0                                                    |
| min_ram          | 0                                                    |
| name             | Fedora 24                                            |
| owner            | a7b0519330ed481884431102a72dd04c                     |
| properties       | description=''                                       |
| protected        | False                                                |
| schema           | /v2/schemas/image                                    |
| size             | 204590080                                            |
| status           | active                                               |
| tags             |                                                      |
| updated_at       | 2016-08-03T01:25:39Z                                 |
| virtual_size     | None                                                 |
| visibility       | public                                               |
+------------------+------------------------------------------------------+

If glance knows about the image, why can’t that hypervisor build an instance with that image? While I was scratching my head, Kevin Carter walked by my desk and joined in the debugging.

He asked about how I had deployed glance and what storage backend I was using. I was using the regular file storage backend since I don’t have swift deployed in the environment. He asked me how many glance nodes I had (I had two) and if I was doing anything to sync the images between the glance nodes.

Then it hit me.


Although both glance nodes knew about the image (since that data is in the database), only one of the glance nodes had the actual image content (the actual qcow2 file) stored. That means that if a hypervisor requests the image from a glance node that knows about the image but doesn’t have it stored, the hypervisor won’t be able to retrieve the image.
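The failure mode is easy to model: the image metadata lives in the shared database, but with the file backend the actual qcow2 bits live only on whichever glance node received the upload. A toy sketch of the split (the class and names here are illustrative, not actual glance internals):

```python
# Toy model of two glance nodes sharing a database but not file storage.
# Everything here is illustrative, not actual glance code.

DB_IMAGES = {"8feacda9-91fd-48ce-b983-54f7b6de6650": "Fedora 24"}  # shared metadata

class GlanceNode:
    def __init__(self, name, stored_blobs):
        self.name = name
        self.stored_blobs = set(stored_blobs)  # qcow2 files on this node's local disk

    def show(self, image_id):
        # Metadata lookups succeed on every node, since the DB is shared...
        return DB_IMAGES[image_id]

    def download(self, image_id):
        # ...but a download only works on the node that holds the file.
        if image_id not in self.stored_blobs:
            raise LookupError("Image %s could not be found." % image_id)
        return b"qcow2-bytes"

node_a = GlanceNode("glance1", {"8feacda9-91fd-48ce-b983-54f7b6de6650"})
node_b = GlanceNode("glance2", set())

# Both nodes agree the image exists:
assert node_a.show("8feacda9-91fd-48ce-b983-54f7b6de6650") == "Fedora 24"
assert node_b.show("8feacda9-91fd-48ce-b983-54f7b6de6650") == "Fedora 24"

node_a.download("8feacda9-91fd-48ce-b983-54f7b6de6650")  # works
# node_b.download(...) raises LookupError -- the ImageNotFound seen above
```

Whether a build succeeds therefore depends entirely on which glance node the hypervisor happens to hit.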

Unfortunately, the checks go in this order on the nova-compute side:

  1. Ask glance if this image exists and if this tenant can use it
  2. Configure the network
  3. Retrieve the image

If a hypervisor rolls through steps one and two without issues, but then fails on step three, the network port will be provisioned but won't come up on the instance. Nothing cleans up that port in the Mitaka release, so it requires manual intervention.
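That sequence can be replayed in miniature to show why a failed image fetch strands a port. A toy sketch (the function and names are mine, not nova internals):

```python
# Toy replay of the nova-compute build steps above, showing how a failed
# image fetch (step 3) strands the port created in step 2.
# Everything here is illustrative; these are not nova internals.

def build_instance(image_known_to_glance, image_blob_present, ports):
    if not image_known_to_glance:          # step 1: ask glance about the image
        raise RuntimeError("no such image")
    ports.append("port-a")                 # step 2: configure the network
    if not image_blob_present:             # step 3: retrieve the image
        raise RuntimeError("ImageNotFound")
    return "instance"

ports = []
try:
    # Glance metadata says the image exists, but this glance node
    # doesn't have the qcow2 file, so the download fails.
    build_instance(True, False, ports)
except RuntimeError:
    pass  # nova retries on another hypervisor, which allocates a fresh port
print(ports)  # ['port-a'] -- nothing cleaned this port up in Mitaka
```

Each failed attempt leaves another orphaned port behind, which matches the two, three, or four ports seen on the unlucky instances.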

The fix

As a temporary workaround, I took one of the glance nodes offline so that only one glance node is being used. After hundreds of builds, all of the instances came up with only one network port attached!

There are a few options for long-term fixes.

I could deploy swift and put glance images into swift. That would allow me to use multiple glance nodes with the same swift backend. Another option would be to use an existing swift deployment, such as Rackspace’s Cloud Files product.

Since I’m not eager to deploy swift in my environment for now, I decided to remove the second glance node and reconfigure nova to use only one glance node. That means I’m running with only one glance node and a failure there could be highly annoying. However, that trade-off is fine with me until I can get around to deploying swift.

UPDATE: I’ve opened a bug for nova so that the network ports are cleaned up if the instance fails to build.

Photo credit: Flickr: pascalcharest

The post OpenStack instances come online with multiple network ports attached appeared first on

by Major Hayden at August 03, 2016 02:40 PM

August 01, 2016

Simon Lyall

Putting Prometheus node_exporter behind apache proxy

I’ve been playing with Prometheus monitoring lately. It is fairly new software that is getting popular. Prometheus works using a pull architecture. A central server connects to each thing you want to monitor every few seconds and grabs stats from it.

In the simplest case you run the node_exporter on each machine, which gathers about 600-800 (!) metrics such as load, disk space and interface stats. This exporter listens on port 9100 and effectively works as an http server that responds to “GET /metrics HTTP/1.1” and spits out several hundred lines of:

node_forks 7916
node_intr 3.8090539e+07
node_load1 0.47
node_load15 0.21
node_load5 0.31
node_memory_Active 6.23935488e+08
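The exposition format is plain text with one `name value` pair per line, so it is trivial to parse. A quick sketch using the sample lines above (ignoring comments, labels and timestamps for brevity):

```python
def parse_metrics(text):
    """Parse simple 'name value' lines of the Prometheus text format.
    Comment lines (starting with '#') and blank lines are skipped;
    labels and timestamps are ignored for brevity."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, value = line.split()[:2]
        metrics[name] = float(value)
    return metrics

sample = """\
node_forks 7916
node_intr 3.8090539e+07
node_load1 0.47
node_load15 0.21
node_load5 0.31
node_memory_Active 6.23935488e+08
"""
m = parse_metrics(sample)
print(m["node_load1"])       # 0.47
print(int(m["node_forks"]))  # 7916
```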

Other exporters listen on different ports and export stats for apache or mysql while more complicated ones will act as proxies for outgoing tests (via snmp, icmp, http). The full list of them is on the Prometheus website.

So my problem was that I wanted to check my virtual machine that is on Linode. The machine only has a public IP and I didn’t want to:

  1. Allow random people to check my servers stats
  2. Have to setup some sort of VPN.

So I decided that the best way was to just put a user/password on the exporter.

However, the node_exporter does not implement authentication itself, since the authors wanted to avoid maintaining lots of security code. So I decided to put it behind a reverse proxy using apache mod_proxy.

Step 1 – Install node_exporter

Node_exporter is a single binary that I started via an upstart script. As part of the upstart script I told it to listen on localhost port 19100 instead of port 9100 on all interfaces.

# cat /etc/init/prometheus_node_exporter.conf
description "Prometheus Node Exporter"

start on startup

script
chdir /home/prometheus/
exec /home/prometheus/node_exporter -web.listen-address localhost:19100
end script

Once I start the exporter a simple “curl” makes sure it is working and returning data.

Step 2 – Add Apache proxy entry

First make sure apache is listening on port 9100. On Ubuntu, edit the /etc/apache2/ports.conf file and add the line:

Listen 9100

Next create a simple apache proxy without authentication (don’t forget to enable mod_proxy too):

# more /etc/apache2/sites-available/prometheus.conf
<VirtualHost *:9100>
 ServerName prometheus

 CustomLog /var/log/apache2/prometheus_access.log combined
 ErrorLog /var/log/apache2/prometheus_error.log

 ProxyRequests Off
 <Proxy *>
 Allow from all
 </Proxy>

 ProxyErrorOverride On
 ProxyPass / http://localhost:19100/
 ProxyPassReverse / http://localhost:19100/
</VirtualHost>
This simply takes requests on port 9100 and forwards them to localhost port 19100 . Now reload apache and test via curl to port 9100. You can also use netstat to see what is listening on which ports:

Proto Recv-Q Send-Q Local Address    Foreign Address State  PID/Program name
tcp   0      0      127.0.0.1:19100  0.0.0.0:*       LISTEN 8416/node_exporter
tcp6  0      0      :::9100          :::*            LISTEN 8725/apache2


Step 3 – Get Prometheus working

I’ll assume at this point you have other servers working. What you need to do now is add the following entries for your server in your prometheus.yml file.

First add basic_auth into your scrape config for “node” and then add your servers, eg:

- job_name: 'node'

  scrape_interval: 15s

  basic_auth:
    username: prom
    password: mypassword

  static_configs:
    - targets: ['']
      labels:
        group: 'servers'
        alias: 'myserver'

Now restart Prometheus and make sure it is working. You should see the following lines in your apache logs, plus stats for the server should start appearing:

- - [31/Jul/2016:11:31:38 +0000] "GET /metrics HTTP/1.1" 200 11377 "-" "Go-http-client/1.1"
- - [31/Jul/2016:11:31:53 +0000] "GET /metrics HTTP/1.1" 200 11398 "-" "Go-http-client/1.1"
- - [31/Jul/2016:11:32:08 +0000] "GET /metrics HTTP/1.1" 200 11377 "-" "Go-http-client/1.1"

Notice that the connections are 15 seconds apart, return HTTP code 200, and are about 11k in size. The Prometheus server is sending authentication, but apache doesn't require it yet.
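Under the hood, Prometheus's basic_auth setting just turns the username and password into a standard HTTP Basic Authorization header (RFC 7617). A quick sketch of the encoding, using the credentials from this example:

```python
import base64

def basic_auth_header(username, password):
    # HTTP Basic auth: base64-encode "user:password" and prefix with "Basic "
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return "Authorization: Basic " + token

print(basic_auth_header("prom", "mypassword"))
# Authorization: Basic cHJvbTpteXBhc3N3b3Jk
```

This is also why the credentials should only travel over trusted networks or TLS: base64 is an encoding, not encryption.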

Step 4 – Enable Authentication.

Now create an apache password file:

htpasswd -cb /home/prometheus/passwd prom mypassword

and update your apache entry to the following to enable authentication:

# more /etc/apache2/sites-available/prometheus.conf
<VirtualHost *:9100>
 ServerName prometheus

 CustomLog /var/log/apache2/prometheus_access.log combined
 ErrorLog /var/log/apache2/prometheus_error.log

 ProxyRequests Off
 <Proxy *>
 Order deny,allow
 Allow from all
 AuthType Basic
 AuthName "Password Required"
 AuthBasicProvider file
 AuthUserFile "/home/prometheus/passwd"
 Require valid-user
 </Proxy>

 ProxyErrorOverride On
 ProxyPass / http://localhost:19100/
 ProxyPassReverse / http://localhost:19100/
</VirtualHost>
After you reload apache you should see the following:

- prom [01/Aug/2016:04:42:08 +0000] "GET /metrics HTTP/1.1" 200 11394 "-" "Go-http-client/1.1"
- prom [01/Aug/2016:04:42:23 +0000] "GET /metrics HTTP/1.1" 200 11392 "-" "Go-http-client/1.1"
- prom [01/Aug/2016:04:42:38 +0000] "GET /metrics HTTP/1.1" 200 11391 "-" "Go-http-client/1.1"

Note that the “prom” in field 3 indicates that we are logging in for each connection. If you try to connect to the port without authentication you will get:

This server could not verify that you
are authorized to access the document
requested. Either you supplied the wrong
credentials (e.g., bad password), or your
browser doesn't understand how to supply
the credentials required.

That is pretty much it. Note that you will need to add additional VirtualHost entries for more ports if you run other exporters on the server.



by simon at August 01, 2016 11:03 PM

Anton Chuvakin - Security Warrior

Monthly Blog Round-Up – July 2016

Here is my next monthly "Security Warrior" blog round-up of top 5 popular posts/topics this month:
  1. “Why No Open Source SIEM, EVER?” contains some of my SIEM thinking from 2009. Is it relevant now? You be the judge. Succeeding with SIEM requires a lot of work, whether you paid for the software, or not. BTW, this post has an amazing “staying power” that is hard to explain – I suspect it has to do with people wanting “free stuff” and googling for “open source SIEM” … [235 pageviews]
  2. “New SIEM Whitepaper on Use Cases In-Depth OUT!” (dated 2010) presents a whitepaper on select SIEM use cases described in depth with rules and reports [using now-defunct SIEM product]; also see this SIEM use case in depth and this for a more current list of popular SIEM use cases. Finally, see our 2016 research on security monitoring use cases here! [156 pageviews]
  3. “Simple Log Review Checklist Released!” is often at the top of this list – this aging checklist is still a very useful tool for many people. “On Free Log Management Tools” is a companion to the checklist (updated version) [56 pageviews]
  4. My classic PCI DSS Log Review series is always popular! The series of 18 posts cover a comprehensive log review approach (OK for PCI DSS 3+ as well), useful for building log review processes and procedures, whether regulatory or not. It is also described in more detail in our Log Management book and mentioned in our PCI book (out in its 4th edition!) [40+ pageviews to the main tag]
  5. “SIEM Resourcing or How Much the Friggin’ Thing Would REALLY Cost Me?” is a quick framework for assessing the SIEM project (well, a program, really) costs at an organization (much more details on this here in this paper). [70 pageviews of total 2891 pageviews to all blog page]
In addition, I’d like to draw your attention to a few recent posts from my Gartner blog [which, BTW, now has about 5X of the traffic of this blog]: 
Current research on SOC and threat intelligence [2 projects]:
Miscellaneous fun posts:

(see all my published Gartner research here)
Also see my past monthly and annual “Top Popular Blog Posts” – 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015.
Disclaimer: most content at SecurityWarrior blog was written before I joined Gartner on Aug 1, 2011 and is solely my personal view at the time of writing. For my current security blogging, go here.

Previous post in this endless series:

by Anton Chuvakin ( at August 01, 2016 04:31 PM

Nitrokey HSM/SmartCard-HSM and Raspberry Pi web cluster

This article sets up a Nitrokey HSM/SmartCard-HSM web cluster and includes a lot of benchmarks. This specific HSM is not a fast HSM, since it is very inexpensive and targeted at secure key storage, not performance. But what if you do want more performance? Then you scale horizontally: just add some more HSMs and a load balancer in front. The cluster consists of Raspberry Pis with Nitrokey HSMs and SmartCard-HSMs; on the software side we use Apache, `mod_nss` and haproxy. We benchmark a small HTML file and a Wordpress site with: a regular 4096 bit RSA certificate without the HSMs, a regular 2048 bit RSA certificate without the HSMs, a 2048 bit RSA certificate in the HSM, a 1024 bit RSA certificate in the HSM, and an EC prime256v1 key in the HSM. We run these benchmarks with the `OpenSC` module and with the `sc-hsm-embedded` module to see if that makes any difference.

August 01, 2016 12:00 AM

July 29, 2016


TACACS Detected 'Invalid Argument'

As always, I've changed pertinent details for reasons.

I was working on an ASR the other day and received the follow error:

RP/0/RSP0/CPU0:ASR9K(config-tacacs-host)# commit
Fri Jul 29 12:55:46.243 PDT

% Failed to commit one or more configuration items during a pseudo-atomic
operation. All changes made have been reverted. Please issue 'show configuration
failed [inheritance]' from this session to view the errors
RP/0/RSP0/CPU0:ASR9K(config-tacacs-host)# show configuration failed
Fri Jul 29 12:55:55.421 PDT
!! SEMANTIC ERRORS: This configuration was rejected by
!! the system due to semantic errors. The individual
!! errors with each failed configuration command can be
!! found below.

tacacs-server host port 49
!!% 'TACACS' detected the 'fatal' condition 'Invalid Argument'

The problem here is that the tacacs daemon thinks the configuration contains an invalid argument. It doesn't. So, restart tacacs:

RP/0/RSP0/CPU0:ASR9K# show proc | inc tacacs
Fri Jul 29 12:56:32.376 PDT
1142   1    2  108K  16 Sigwaitinfo 7399:06:34:0893    0:00:00:0109 tacacsd
1142   2    0  108K  10 Receive     7399:06:35:0099    0:00:00:0000 tacacsd
1142   3    2  108K  10 Nanosleep      0:00:05:0940    0:00:00:0057 tacacsd
1142   4    1  108K  10 Receive     7399:06:34:0957    0:00:00:0000 tacacsd
1142   5    1  108K  10 Receive        0:00:00:0664    0:00:41:0447 tacacsd
1142   6    1  108K  10 Receive     2057:20:44:0638    0:00:44:0805 tacacsd
1142   7    2  108K  10 Receive     1167:26:53:0781    0:01:02:0991 tacacsd
1142   8    3  108K  10 Receive     1167:26:51:0567    0:01:29:0541 tacacsd
1142   9    2  108K  10 Receive      403:35:55:0206    0:01:09:0700 tacacsd
RP/0/RSP0/CPU0:ASR9K# process restart tacacsd
Fri Jul 29 12:56:54.768 PDT
RP/0/RSP0/CPU0:ASR9K# show proc | inc tacacs
Fri Jul 29 12:56:58.455 PDT
1142   1    3   64K  16 Sigwaitinfo    0:00:03:0806    0:00:00:0069 tacacsd
1142   2    1   64K  10 Receive        0:00:03:0998    0:00:00:0000 tacacsd
1142   3    3   64K  10 Nanosleep      0:00:03:0977    0:00:00:0000 tacacsd
1142   4    1   64K  10 Receive        0:00:03:0867    0:00:00:0002 tacacsd
1142   5    3   64K  10 Receive        0:00:03:0818    0:00:00:0000 tacacsd
1142   6    2   64K  16 Receive        0:00:03:0818    0:00:00:0000 tacacsd
1142   7    1   64K  16 Receive        0:00:03:0818    0:00:00:0000 tacacsd
1142   8    3   64K  16 Receive        0:00:03:0818    0:00:00:0000 tacacsd
1142   9    3   64K  10 Receive        0:00:00:0673    0:00:00:0003 tacacsd

And try again:

RP/0/RSP0/CPU0:ASR9K# config t
Fri Jul 29 12:57:04.787 PDT
RP/0/RSP0/CPU0:ASR9K(config)# tacacs-server host port 49
RP/0/RSP0/CPU0:ASR9K(config-tacacs-host)# commit
Fri Jul 29 12:57:20.627 PDT

by Scott Hebert at July 29, 2016 01:00 PM

July 28, 2016

Ansible-cmdb v1.15: Generate a host overview of Ansible facts.

I've just released ansible-cmdb v1.15. Ansible-cmdb takes the output of Ansible's fact gathering and converts it into a static HTML overview page containing system configuration information. It supports multiple templates and extending information gathered by Ansible with custom data.

This release includes the following bugfixes and feature improvements:

  • Improvements to the resilience against wrong, unsupported and missing data.
  • SQL template. Generates SQL for use with SQLite and MySQL.
  • Minor bugfixes.

As always, packages are available for Debian, Ubuntu, Redhat, Centos and other systems. Get the new release from the Github releases page.

by admin at July 28, 2016 04:39 PM


A look at the Puppet 4 Application Orchestration feature

Puppet 4 got some new language constructs that let you model multi node applications, and it assists with passing information between nodes for you. I recently wrote an open source orchestrator for this stuff which is part of my Choria suite, so I figured I'd write up a bit about these multi node applications since they are now usable in open source.

The basic problem this feature solves is passing details between modules. Let's say you have a LAMP stack: you're going to have web apps that need access to a DB, and that DB will have an IP, user, password etc. Exported resources never worked for this because it's just stupid to have your DB exporting web resources; there is an unlimited number of web apps and configs, and no DB module supports them all. So something new is needed.

The problem is made worse by the fact that Puppet does not really have something like a Java interface. You can't say that the foo-mysql module implements a standard interface called database so that you can swap out one mysql module for another; they're all snowflakes. So an intermediate translation layer is needed.

In a way you can say this new feature brings a way to create an interface – lets say SQL – and allows you to hook random modules into both sides of the interface. On one end a database and on the other a webapp. Puppet will then transfer the information across the interface for you, feeding the web app with knowledge of port, user, hosts etc.

LAMP Stack Walkthrough

Lets walk through creating a standard definition for a multi node LAMP stack, and we’ll create 2 instances of the entire stack. It will involve 4 machines sharing data and duties.

These interfaces are called capabilities, here’s an example of a SQL one:

Puppet::Type.newtype :sql, :is_capability => true do
  newparam :name, :is_namevar => true
  newparam :user
  newparam :password
  newparam :host
  newparam :database
end

This is a generic interface to a database, you can imagine Postgres or MySQL etc can all satisfy this interface, perhaps you could add here a field to confer the type of database, but lets keep it simple. The capability provides a translation layer between 2 unrelated modules.

It’s a pretty big deal conceptually, I can see down the line there be some blessed official capabilities and we’ll see forge modules starting to declare their compatibility. And finally we can get to a world of interchangeable infrastructure modules.

Now I’ll create a defined type to make my database for my LAMP stack app, I’m just going to stick a notify in instead of the actual creating of a database to keep it easy to demo:

define lamp::mysql (
  $db_user,
  $db_password,
  $host     = $::hostname,
  $database = $name,
) {
  notify{"creating mysql db ${database} on ${host} for user ${db_user}": }
}

I need to tell Puppet this defined type exist to satisfy the producing side of the interface, there’s some new language syntax to do this, it feels kind of out of place not having a logical file to stick this in, I just put it in my lamp/manifests/mysql.pp:

Lamp::Mysql produces Sql {
  user     => $db_user,
  password => $db_password,
  host     => $host,
  database => $database,
}

Here you can see the mapping from the variables in the defined type to those in the capability above. $db_user feeds into the capability property $user etc.

With this in place if you have a lamp::mysql or one based on some other database, you can always query it’s properties based on the standard user etc, more on that below.

So we have a database, and we want to hook a web app onto it, again for this we use a defined type and again just using notifies to show the data flow:

define lamp::webapp (
  $db_user,
  $db_password,
  $db_host,
  $db_name,
  $docroot = '/var/www/html'
) {
  notify{"creating web app ${name} with db ${db_user}@${db_host}/${db_name}": }
}
As this is the other end of the translation layer enabled by the capability we tell Puppet that this defined type consumes a Sql capability:

Lamp::Webapp consumes Sql {
  db_user     => $user,
  db_password => $password,
  db_host     => $host,
  db_name     => $database,
}

This tells Puppet to read the value of user from the capability and stick it into db_user of the defined type. Thus we can plumb arbitrary modules found on the forge together with a translation layer between their properties!

So you have a data producer and a data consumer that communicates across a translation layer called a capability.
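The data flow is easier to see stripped of Puppet syntax: the producer maps its local variable names onto the capability's fields, and the consumer maps those fields back onto its own parameter names. A rough model using plain dicts (this is my illustration, not Puppet internals; the values come from the site manifest below):

```python
# The Sql capability's fields act as the shared vocabulary between modules.
produces_mapping = {  # lamp::mysql variable -> capability field
    "db_user": "user", "db_password": "password",
    "host": "host", "database": "database",
}
consumes_mapping = {  # capability field -> lamp::webapp variable
    "user": "db_user", "password": "db_password",
    "host": "db_host", "database": "db_name",
}

mysql_node = {"db_user": "user1", "db_password": "s3cret",
              "host": "dev1", "database": "app1"}

# Producer side: fill the capability from the database node's variables.
capability = {field: mysql_node[var] for var, field in produces_mapping.items()}

# Consumer side: fill the web app's parameters from the capability.
webapp_params = {var: capability[field] for field, var in consumes_mapping.items()}

print(webapp_params)
# {'db_user': 'user1', 'db_password': 's3cret', 'db_host': 'dev1', 'db_name': 'app1'}
```

Swapping in a different database module only requires a new produces mapping; the consumer side never changes, which is the whole point of the interface.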

The final piece of the puzzle that defines our LAMP application stack is again some new language features:

application lamp (
  String $db_user,
  String $db_password,
  Integer $web_instances = 1
) {
  lamp::mysql { $name:
    db_user     => $db_user,
    db_password => $db_password,
    export      => Sql[$name],
  }

  range(1, $web_instances).each |$instance| {
    lamp::webapp {"${name}-${instance}":
      consume => Sql[$name],
    }
  }
}
Pay particular attention to the application bit and export and consume meta parameters here. This tells the system to feed data from the above created translation layer between these defined types.

You should kind of think of lamp::mysql and lamp::webapp as node roles; these define what an individual node will do in this stack. If I create this application and set $web_instances = 10, I will need 1 x database machine and 10 x web machines. You can cohabit some of these roles but I think that's an anti-pattern. And since these are different nodes – as in entirely different machines – the magic here is that the capability based data system will feed these variables from one node to the next without you having to create any specific data on your web instances.

Finally, like a traditional node we now have a site which defines a bunch of nodes and allocate resources to them.

site {
  lamp { 'app2':
    db_user       => 'user2',
    db_password   => 'secr3t',
    web_instances => 3,
    nodes         => {
      Node[''] => Lamp::Mysql['app2'],
      Node[''] => Lamp::Webapp['app2-1'],
      Node[''] => Lamp::Webapp['app2-2'],
      Node[''] => Lamp::Webapp['app2-3']
    }
  }

  lamp { 'app1':
    db_user       => 'user1',
    db_password   => 's3cret',
    web_instances => 3,
    nodes         => {
      Node[''] => Lamp::Mysql['app1'],
      Node[''] => Lamp::Webapp['app1-1'],
      Node[''] => Lamp::Webapp['app1-2'],
      Node[''] => Lamp::Webapp['app1-3']
    }
  }
}

Here we are creating two instances of the LAMP application stack, each with its own database and with 3 web servers assigned to the cluster.

You have to be super careful about this stuff: if I tried to put my Mysql for app1 on dev1 and the Mysql for app2 on dev2, this would basically just blow up, since it would be a cyclic dependency across the nodes. You're generally best off avoiding sharing nodes across many app stacks, or if you do, you need to really think this stuff through. It's a pain.

You now have this giant multi node monolith with order problems not just inter resource but inter node too.


Deploying these stacks with the abilities the system provides is pretty low tech. If you take a look at the site above you can infer dependencies. First we have to run the database node; it will both produce the data needed and install the needed infrastructure. Then we can run all the web nodes, in any order or even at the same time.
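The ordering an orchestrator has to derive here is a plain topological batching of the node graph: the database node forms the first batch, and everything that depends only on it can run together in the next. A minimal sketch of that batching (my own illustration, not Choria's actual code):

```python
def run_batches(deps):
    """Group nodes into run batches: a node may run once everything it
    depends on has already run. deps maps node -> set of prerequisites."""
    done, batches = set(), []
    while len(done) < len(deps):
        batch = sorted(n for n, pre in deps.items()
                       if n not in done and pre <= done)
        if not batch:
            raise ValueError("cyclic dependency across nodes")
        batches.append(batch)
        done.update(batch)
    return batches

# One LAMP stack: three web nodes that all consume the db node's Sql data.
deps = {"db": set(), "web1": {"db"}, "web2": {"db"}, "web3": {"db"}}
print(run_batches(deps))  # [['db'], ['web1', 'web2', 'web3']]
```

Note how a cycle (e.g. two stacks whose databases live on each other's nodes) leaves no runnable batch, which is exactly the "blow up" case described above.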

There’s a problem though: traditionally Puppet runs every 30 minutes and gets a new catalog every 30 minutes. We can’t have these nodes randomly get catalogs in random order, since there’s no giant network aware lock/ordering system. So Puppet now has a new model: nodes are supposed to run cached catalogs forever and only get a new catalog when specifically told so. You tell it to deploy this stack, and once deployed Puppet goes into a remediation cycle, fixing the stack as it is with an unchanging catalog. If you want to change code, you again have to run this entire stack in this specific order.

This is a nice improvement for release management and knowing your state, but without tooling to manage this process it’s a fail, and today that tooling is embryonic and PE only.

So Choria, which I released in beta yesterday, provides at least some relief: it brings a manual orchestrator for these things so you can kick off an app deploy on demand. Later maybe some daemon will do this regularly, I don't know yet.

Let's take a look at Choria interacting with the above manifests; let's just show the plan:

This shows all the defined stacks in your site and group them in terms of what can run in parallel and in what order.

Let's deploy the stack. Choria is used again; it uses MCollective to do the runs using the normal Puppet agent, and it tries to avoid humans interfering with a stack deploy by disabling and enabling Puppet at various stages etc:

It has options to limit the runs to a certain node batch size so you don’t nuke your entire site at once etc.

Let's look at some of the logs and notifies:

07:46:53> puppet-agent[27756]: creating mysql db app2 on dev1 for user user2
07:46:53> puppet-agent[27756]: creating mysql db app1 on dev1 for user user1
07:47:57> puppet-agent[27607]: creating web app app2-3 with db user2@dev1/app2
07:47:57> puppet-agent[27607]: creating web app app1-3 with db user1@dev1/app1
07:47:58> puppet-agent[23728]: creating web app app2-1 with db user2@dev1/app2
07:47:58> puppet-agent[23728]: creating web app app1-1 with db user1@dev1/app1
07:47:58> puppet-agent[23728]: creating web app app2-2 with db user2@dev1/app2
07:47:58> puppet-agent[23728]: creating web app app1-2 with db user1@dev1/app1

All our data flowed nicely through the capabilities and the stack was built with the right usernames and passwords etc. Timestamps reveal dev{2,3,4} ran concurrently thanks to MCollective.


To be honest, this whole feature feels like an early tech preview and not something that should be shipped. This is basically the plumbing that a user friendly feature should be written on, and that hasn't happened yet. You can see from above it's super complex – and you even have to write some pure ruby stuff, wow.

If you wanted to figure this out from the docs, forget about it; the docs are a complete mess. I found a guide in the Learning VM which turned out to be the best resource showing an actual complete walk through. This is sadly par for the course with Puppet docs these days 🙁

There are some more features here – you can make cross node monitor checks to confirm the DB is actually up before attempting to start the web server, for example. Interesting. But implementing new checks is just such a chore – I can do it, but I doubt your average user will be bothered. Just make it so we can run Nagios checks; there are 1000s of these already written and we all have them and trust them. Tbf, I could probably write a generic nagios checker myself for this; I doubt the average user can.

The way nodes depend on each other and are ordered is of course obvious. It should be this way; these are actual dependencies. But at the same time this is stages done large. Stages failed because they make this giant meta dependency layered over your catalog, and a failure in any one stage results in skipping entire other, later, stages. They're a pain in the arse, hard to debug and hard to reason about. This feature implements the exact same model, but across nodes. Worse, there does not seem to be a way to do cross node notifies of resources. It's just as horrible.

That said, with how this works as a graph across nodes it's the only actual option. This outcome should have been enough to dissuade the graph approach from even being taken, though, and something new should have been done, alas. It's a very constrained system; it demos well, but building infrastructure with this is going to be a losing battle.

The site manifest has no data capabilities. You can't really use hiera/lookup there in any sane way. This is unbelievable; I know there was a general lack of caring for external data at Puppet, but this is like being back in Puppet 0.22 days before even extlookup existed, and about as usable. It's unbelievable that there are no features for programmatic node assignment to roles, for example, though given how easy it is to make cycles and impossible scenarios I can see why. I know this is something being worked on though. External data is first class. External data modelling has to inform everything you do. No serious user uses Puppet without external data. It has to be a day 1 concern.

The underlying Puppet Server infrastructure that builds these catalogs is OK, I guess, but the catalog is very hard to consume, and everyone who wants to write a thing to interact with it will have to write some terrible sorting/ordering logic themselves – and probably all have their own set of interesting bugs. Hopefully one day there will be a gem or something, or just a better catalog format. Worse, it seems to happily compile and issue cyclic graphs without error; I filed a ticket for that.

The biggest problem for me is that this sits at the worst intersection between PE and OSS Puppet: it is hard or impossible to find out the roadmap, or the activity on this feature set, since it is all behind private Jira tickets. Sometimes some bubble up and become public, but generally it is a black hole.

Long story short, I think it is just best avoided in general until it becomes more mature and it is more visible what is happening. The technical issues are fine – it is a new feature that is basically new R&D, and this stuff happens. The softer issues make it a tough one to consider using.

by R.I. Pienaar at July 28, 2016 08:16 AM

Toolsmith Release Advisory: Windows Management Framework (WMF) 5.1 Preview

Windows Management Framework (WMF) 5.1 Preview has been released to the Download Center.
"WMF provides users with the ability to update previous releases of Windows Server and Windows Client to the management platform elements released in the most current version of Windows. This enables a consistent management platform to be used across the organization, and eases adoption of the latest Windows release."
As posted to the Windows PowerShell Blog and reprinted here:
WMF 5.1 Preview includes the PowerShell, WMI, WinRM, and Software Inventory and Licensing (SIL) components that are being released with Windows Server 2016. 
WMF 5.1 can be installed on Windows 7, Windows 8.1, Windows Server 2008 R2, 2012, and 2012 R2, and provides a number of improvements over WMF 5.0 RTM including:
  • New cmdlets: local users and groups; Get-ComputerInfo
  • PowerShellGet improvements include enforcing signed modules, and installing JEA modules
  • PackageManagement added support for Containers, CBS Setup, EXE-based setup, CAB packages
  • Debugging improvements for DSC and PowerShell classes
  • Security enhancements including enforcement of catalog-signed modules coming from the Pull Server and when using PowerShellGet cmdlets
  • Responses to a number of user requests and issues
Detailed information on all the new WMF 5.1 features and updates, along with installation instructions, is in the WMF 5.1 release notes.
Please note:
  • WMF 5.1 Preview requires the .Net Framework 4.6, which must be installed separately. Instructions are available in the WMF 5.1 Release Notes Install and Configure topic.
  • WMF 5.1 Preview is intended to provide early information about what is in the release, and to give you the opportunity to provide feedback to the PowerShell team, but is not supported for production deployments at this time.
  • WMF 5.1 Preview may be installed directly over WMF 5.0.
  • It is a known issue that WMF 4.0 is currently required in order to install WMF 5.1 Preview on Windows 7 and Windows Server 2008 R2. This requirement is expected to be removed before the final release.
  • Installing future versions of WMF 5.1, including the RTM version, will require uninstalling the WMF 5.1 Preview.

by Russ McRee at July 28, 2016 04:25 AM

July 27, 2016


Fixing the mcollective deployment story

Getting started with MCollective has always been an adventure: you have to learn a ton of new stuff like middleware, and once you get that going the docs present you with a vast array of options and choices, including such arcane topics as which security plugin to use – while the security model chosen is entirely unique to mcollective. To get a true feeling for the horror, see the official deployment guide.

This is not really a pleasant experience and probably results in many insecure or half-built deployments out there – and most people just not bothering. This is of course entirely my fault; too many options with bad defaults chosen are to blame.

I saw the graph of the learning curve of Eve Online and immediately thought of mcollective 🙂 Hint: mcollective is not the WoW of orchestration tools.

I am in the process of moving my machines to Puppet 4, and the old deployment methods for MCollective just did not work; everything is falling apart under the neglect the project has been experiencing. You can’t even install any plugin packages on Debian, as they will nuke your entire Puppet install.

So I figured, why not take a stab at rethinking this whole thing and see what I can do? Today I’ll present the outcome of that – a new Beta distribution of MCollective, tailored to the Puppet 4 AIO packaging, that is very easy to get going securely.


My main goal with these plugins was that they share as much security infrastructure with Puppet as possible. This means we get an understandable model and do not need to mess around with custom CAs and certs and so forth. Focussing on AIO Puppet means I can have sane defaults that work for everyone out of the box with very limited config. The deployment guide should be a single short page.

For a new user who has never used MCollective and now needs certificates, there should be no need to write a crazy ~/.mcollective file and configure a ton of SSL stuff; they should only need to do:

$ mco choria request_cert

This will make a CSR, submit it to the PuppetCA and wait for it to be signed like Puppet Agent. Once signed they can immediately start using MCollective. No config needed. No certs to distribute. Secure by default. Works with the full AAA stack by default.
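For completeness, a sketch of what the signing side of that flow might look like – the node name below is hypothetical, and this assumes the stock Puppet 4 era `puppet cert` commands on the CA, the same ones used for a new Puppet agent:

```shell
# On the new client: generate a CSR, submit it to the PuppetCA,
# and wait for it to be signed
$ mco choria request_cert

# On the Puppet CA, as for any new agent (node name is hypothetical):
$ puppet cert list
$ puppet cert sign node1.example.net
```

Once the certificate is signed, the waiting client picks it up and MCollective is usable immediately.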

Sites may wish to have tighter than default security around what actions can be made, and deploying these policies should be trivial.

Introducing Choria

Choria is a suite of plugins developed specifically with the Puppet AIO user in mind. It rewards using Puppet as designed with defaults and can yield a near zero configuration setup. It combines with a new mcollective module used to configure AIO based MCollective.

The deployment guide for a Choria based MCollective is a single short page. The result is:

  • A Security Plugin that uses the Puppet CA
  • A connector for NATS
  • A discovery cache that queries PuppetDB using the new PQL language
  • An open source Application Orchestrator for the new Puppet Multi Node Application stuff (naming is apparently still hard)
  • Puppet Agent, Package Agent, Service Agent, File Manager Agent all setup and ready to use
  • SSL and TLS used everywhere, any packet that leaves a node is secure. This cannot be turned off
  • A new packager that produces Puppet Modules for your agents etc. and supports every OS AIO Puppet does
  • The full Authentication, Authorization and Auditing stack set up out of the box, with default secure settings
  • Deployment scenarios work by default, with extensive support for SRV records and lightweight manual configuration for those with custom needs

It’s easy to configure using the new lookup system and gives you a full, secure, usable, mcollective out of the box with minimal choices to make.

You can read how to deploy it in its deployment guide.


This is really a Beta release at the moment; I’m looking for testers and feedback. I am particularly interested in feedback on NATS and the basic deployment model. In future I might give the current connectors the same treatment with chosen defaults etc.

The internals of the security plugin are quite interesting: it proposes a new internal message structure for MCollective which should be much easier to support in other languages and is more formalised. To be clear, these messages always existed; they were just a bit ad hoc.

Additionally, it is the first quality security plugin with specific support for building a web-stack-compatible MCollective REST server that is AAA compatible and would even allow centralised RBAC and signature authority.

by R.I. Pienaar at July 27, 2016 02:35 PM

Raspberry Pi unattended upgrade Raspbian to Debian Testing

I'm working on a Nitrokey/SmartCard-HSM cluster article and therefore needed three identical computers. The current version of Raspbian (2016-05-27) is based on Debian Jessie and comes with a version of OpenSC (0.14) that is too old to work with the Nitrokey/SmartCard-HSM. As there is no official Ubuntu 16.04 image yet, I decided to upgrade Raspbian to Debian Testing. Because I didn't want to answer yes to every config file change or service restart, I figured out how to do an unattended dist-upgrade.
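The details are in the linked post, but a minimal sketch of one common way to do such an unattended dist-upgrade (assuming the stock Raspbian /etc/apt/sources.list, and that "stretch" was the testing release at the time) looks roughly like this; the dpkg options resolve config file conflicts automatically and the noninteractive frontend suppresses prompts:

```shell
# Point apt at Debian Testing instead of Jessie
sudo sed -i 's/jessie/stretch/g' /etc/apt/sources.list

# Upgrade without prompting: take the package default where possible,
# otherwise keep the existing config file, and ask no questions
sudo DEBIAN_FRONTEND=noninteractive apt-get update
sudo DEBIAN_FRONTEND=noninteractive apt-get -y \
  -o Dpkg::Options::="--force-confdef" \
  -o Dpkg::Options::="--force-confold" \
  dist-upgrade
```

Whether to keep old config files (`--force-confold`) or take the maintainer's new ones is a judgment call; for a large jump between releases the defaults-plus-old combination above is the usual low-surprise choice.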

July 27, 2016 12:00 AM

July 23, 2016


Highster mobile android review on phonetrack-reviews

He was using his single dad status to get women, and also lying to get women.It also lets you to replace the unwanted message with a new message.Until the last version for the roku came out.The husband has not been alert to the dangers which his wife faces.Additionally, during divorce settlements and child custody cases, evidence against a cheating wife or husband can help the other spouse to win a better settlement.You should not have to go through this alone.Indicates the success of fluid therapy by showing enhanced convective rbc flow.Why did you choose this High Ranking Spy Software for Android-based Gadgets. Upgrade to see the number of monthly visits from mobile users.It just seems like two people married and happy together, but with respect for themselves both staying the people they were, are going to live the longest marriages.My son was arrested for domestic violence involving his wife.Nbsp; is there a need for future workers in this field?Professionally designed site is very crucial to the success of any online business.According to a court record.Remember: there is no adequate substitution for a personal consultation with your physician.

by Oleksiy Kovyrin at July 23, 2016 08:49 PM

Steve Kemp's Blog

A final post about the lua-editor.

I recently mentioned that I'd forked Antirez's editor and added lua to it.

I've been working on it, on and off, for the past week or two now. It's finally reached a point where I'm content:

  • The undo-support is improved.
  • It has buffers, such that you can open multiple files and switch between them.
    • This allows things like "kilua *.txt" to work, for example.
  • The syntax-highlighting is improved.
    • We can now change the size of TAB-characters.
    • We can now enable/disable highlighting of trailing whitespace.
  • The default configuration-file is now embedded in the body of the editor, so you can run it portably.
  • The keyboard input is better, allowing multi-character bindings.
    • The following are possible, for example ^C, M-!, ^X^C, etc.

Most of the obvious things I use in Emacs are present, such as the ability to customize the status-bar (right now it shows the cursor position, the number of characters, the number of words, etc, etc).

Anyway I'll stop talking about it now :)

July 23, 2016 03:31 PM

Feeding the Cloud

Replacing a failed RAID drive

Here's the complete procedure I followed to replace a failed drive from a RAID array on a Debian machine.

Replace the failed drive

After seeing that /dev/sdb had been kicked out of my RAID array, I used smartmontools to identify the serial number of the drive to pull out:

smartctl -a /dev/sdb

Armed with this information, I shut down the computer, pulled the bad drive out and put the new blank one in.

Initialize the new drive

After booting with the new blank drive in, I copied the partition table using parted.

First, I took a look at what the partition table looks like on the good drive:

$ parted /dev/sda
unit s

and created a new empty one on the replacement drive:

$ parted /dev/sdb
unit s
mktable gpt

then I ran mkpart for all 4 partitions and made them all the same size as the matching ones on /dev/sda.

Finally, I ran toggle 1 bios_grub (boot partition) and toggle X raid (where X is the partition number) for all RAID partitions, before verifying using print that the two partition tables were now the same.
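In transcript form, that step looks roughly like the following. The start/end sectors here are placeholders (the real values come from the `print` output on /dev/sda), only the first two of the four `mkpart` commands are shown, and the `toggle X raid` line is repeated for each RAID partition:

```shell
$ parted /dev/sdb
unit s
mkpart primary 2048s 4095s
mkpart primary 4096s 975175679s
toggle 1 bios_grub
toggle 2 raid
print
```

Running `print` on both drives side by side is the easiest way to confirm the tables match before moving on to the mdadm steps.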

Resync/recreate the RAID arrays

To sync the data from the good drive (/dev/sda) to the replacement one (/dev/sdb), I ran the following on my RAID1 partitions:

mdadm /dev/md0 -a /dev/sdb2
mdadm /dev/md2 -a /dev/sdb4

and kept an eye on the status of this sync using:

watch -n 2 cat /proc/mdstat

In order to speed up the sync, I used the following trick:

blockdev --setra 65536 "/dev/md0"
blockdev --setra 65536 "/dev/md2"
echo 300000 > /proc/sys/dev/raid/speed_limit_min
echo 1000000 > /proc/sys/dev/raid/speed_limit_max

Then, I recreated my RAID0 swap partition like this:

mdadm /dev/md1 --create --level=0 --raid-devices=2 /dev/sda3 /dev/sdb3
mkswap /dev/md1

Because the swap partition is brand new (you can't restore a RAID0, you need to re-create it), I had to update two things:

  • replace the UUID for the swap mount in /etc/fstab, with the one returned by mkswap (or running blkid and looking for /dev/md1)
  • replace the UUID for /dev/md1 in /etc/mdadm/mdadm.conf with the one returned for /dev/md1 by mdadm --detail --scan
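Concretely, the two edits can be scripted with sed. The UUIDs below are placeholders (take the real new filesystem UUID from the mkswap output or `blkid`, and the real array UUID from `mdadm --detail --scan`), and this sketch works on throw-away copies rather than the live files:

```shell
# Placeholder UUIDs - substitute the real values from mkswap/blkid
# (filesystem UUID) and from `mdadm --detail --scan` (array UUID)
old_fs_uuid="aaaaaaaa-1111-2222-3333-444444444444"
new_fs_uuid="bbbbbbbb-5555-6666-7777-888888888888"
old_md_uuid="11111111:22222222:33333333:44444444"
new_md_uuid="55555555:66666666:77777777:88888888"

# /etc/fstab: swap the filesystem UUID on the swap line (shown on a copy)
printf 'UUID=%s none swap sw 0 0\n' "$old_fs_uuid" > /tmp/fstab
sed -i "s/$old_fs_uuid/$new_fs_uuid/" /tmp/fstab

# /etc/mdadm/mdadm.conf: same substitution on the /dev/md1 ARRAY line
printf 'ARRAY /dev/md1 metadata=1.2 UUID=%s\n' "$old_md_uuid" > /tmp/mdadm.conf
sed -i "s/$old_md_uuid/$new_md_uuid/" /tmp/mdadm.conf
```

On the real machine you would edit /etc/fstab and /etc/mdadm/mdadm.conf in place (as root), typically followed by `update-initramfs -u` so the initramfs picks up the new mdadm.conf.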

Ensuring that I can boot with the replacement drive

In order to be able to boot from both drives, I reinstalled the grub boot loader onto the replacement drive:

grub-install /dev/sdb

before rebooting with both drives to first make sure that my new config works.

Then I booted without /dev/sda to make sure that everything would be fine should that drive fail and leave me with just the new one (/dev/sdb).

This test obviously gets the two drives out of sync, so I rebooted with both drives plugged in and then had to re-add /dev/sda to the RAID1 arrays:

mdadm /dev/md0 -a /dev/sda2
mdadm /dev/md2 -a /dev/sda4

Once that finished, I rebooted again with both drives plugged in to confirm that everything is fine:

cat /proc/mdstat

Then I ran a full SMART test over the new replacement drive:

smartctl -t long /dev/sdb

July 23, 2016 05:00 AM

Simon Lyall

Gather Conference 2016 – Afternoon

The Gathering

Chloe Swarbrick

  • Whose responsibility is it to disrupt the system?
  • Maybe try and engage with the system we have for a start before writing it off.
  • You disrupt the system yourself or you hold the system accountable

Nick McFarlane

  • He wrote a book
  • Rock Stars are dicks to work with

So you want to Start a Business

  • Hosted by Reuben and Justin (the accountant)
  • Things you need to know in your first year of business
  • How serious is the business, what sort of structure
    • If you are serious, you have to do things properly
    • Have you got paying customers yet
    • Could just be an idea or a hobby
  • Sole Trader vs Incorporated company vs Trust vs Partnership
  • Incorporated
    • Directors and Shareholders needed to be decided on
    • Can take just half an hour
  • when to get a GST number?
    • If over $60k turnover a year
    • If you have lots of stuff you plan to claim back.
  • Have an accounting System from Day 1 – Xero Pretty good
  • Get an advisor or mentor that is not emotionally invested in your company
  • If partnership then split up responsibilities so you can hold each other accountable for specific items
  • If you are using Xero then your accountant should be using Xero directly not copying it into a different system.
  • Remuneration
    • Should have a shareholders agreement
    • PAYE possibility from drawings or put 30% aside
    • Even if it is only a small hobby company, you will need to declare income to IRD, especially at a non-trivial level.
  • What Level to start at Xero?
    • Probably from the start if the business is intended to be serious
    • A bit of pain to switch over later
  • Don’t forget about ACC
  • Remember you are due provisional tax once you get over $2,500 for the previous year.
  • Home Office expense claim – claim percentage of home rent, power etc
  • Get in professionals to help

Diversity in Tech

  • Diversity is important
    • Why is it important?
    • Does it mean the same for everyone
  • If we have people with different “ways of thinking” then we will have diverse views, and wider and better solutions
  • example: “a Polish engineer could analyse a Polish-specific character input error”
  • example “Controlling a robot in Samoan”, robots are not just in english
  • Stereotypes for some groups to specific jobs, eg “Indians in tech support”
  • Example: All hires went through University of Auckland so had done the same courses etc
  • How do you fix it when people innocently hire everyone from the same background? How do you break the pattern? Must the first different hire represent everybody in that group?
  • I didn’t want to be a trail-blazer
  • Wow’ed out at “Women in tech” event, first time saw “majority of people are like me” in a bar.
  • “If he is a white male and I’m going to hire him on the team that is already full of white men he better be exception”
  • Worried about implication that “diversity” vs “Meritocracy” and that diverse candidates are not as good
  • Usual over-representation of white-males in the discussion even in topics like this.
  • Notion that somebody was only hired to represent diversity is very harmful especially for that person
  • If you are hiring for a tech position then 90% of your candidates will be white males; put your diversity effort into getting a more diverse group applying for the jobs rather than tilting the actual hiring.
  • Even in maker spaces where anyone is welcome, there are a lot fewer women. Blames men’s mags showing things unfinished while in women’s mags everything is perfect, so women don’t want to show off something that is unfinished.
  • Need to make the workforce diverse now to match the younger people coming into it
  • Need to cover “power income” people who are not exposed to tech
  • Even a small number are role models for the future for the young people today
  • Also need to address the problem of women dropping out of tech in the 30s and 40s. We can’t push girls into an “environment filled with acid”
  • Example taking out “cocky arrogant males” from classes into “advanced stream” and the remaining class saw women graduating and staying in at a much higher rate.


  • Paul Spain from Podcast New Zealand organising
  • Easiest to listen to when doing manual stuff or in car or bus
  • Need to avoid overload of commercials, eg interview people from the company about the topic of interest rather than about their product
  • Big firms putting money into podcasting
  • In the US 21% of the market are listening every single month. In NZ perhaps more like 5% since not a lot of awareness or local content
  • Some radios shows are re-cutting and publishing them
  • Not a good directory of NZ podcasts
  • Advise people use proper equipment if possible if more than a once-off. Bad sound quality is very noticeable.
  • One person: 5 part series on immigration and immigrants in NZ
  • Making the charts is a big exposure
  • Apples “new and noteworthy” list
  • Domination by traditional personalities and existing broadcasters at present. But that only helps traction within New Zealand




by simon at July 23, 2016 03:41 AM

July 22, 2016

Simon Lyall

Gather Conference 2016 – Morning

At the Gather Conference again for about the 6th time. It is a 1-day tech-orientated unconference held in Auckland every year.

The day is split into seven streamed sessions, each 40 minutes long (with about 8 parallel rooms of events that are each scheduled and run by attendees), plus an opening and a keynote session.

How to Steer your own career – Shirley Tricker

  • Asked people to put hands up on their current job situation: FT vs PT, single vs multiple jobs
  • Alternatives to traditional careers of work; it is possible to craft your own career
  • Recommended Blog – Free Range Humans
  • Job vs Career
    • Job – something you do for somebody else
    • Career – Unique to you, your life’s work
    • Career – What you do to make a contribution
  • Predicted that a greater number of people will not stay with one (or even 2 or 3) employers through their career
  • Success – defined by your goals, lifestyle wishes
  • What are your strengths – Know how you are valuable, what you can offer people/employers, ways you can branch out
  • Hard and Soft Skills (soft skills defined broadly, things outside a regular job description)
  • Develop soft skills
    • List skills and review ways to develop and improve them
    • Look at people you admire and copy them
    • Look at job descriptions
  • Skills you might need for a portfolio career
    • Good at organising, marketing, networking
    • flexible, work alone, negotiation
    • Financial literacy (handle your accounts)
  • Getting started
    • Start small ( don’t give up your day job overnight)
    • Get training via work or independently
    • Develop you strengths
    • Fix weaknesses
    • Small experiments
    • cheap and fast (start a blog)
    • Don’t have to start out as an expert, you can learn as you go
  • Just because you are in control doesn’t make it easy
  • Resources
    • Seth Godin
    • Tim Ferris
    • eg outsources her writing.
  • Tools
    • Xero
    • WordPress
    • Canva for images
    • Meetup
    • Odesk and other freelance websites
  • Feedback from Audience
    • Have somebody to report to, eg meet with friend/adviser monthly to chat and bounce stuff off
    • Cultivate Women’s mentoring group
    • This doesn’t seem to filter through to young people, they feel they have to pick a career at 18 and go to university to prep for that.
    • Give advice to people and this helps you define
    • Try and make the world a better place: enjoy the work you are doing, be happy and proud of the outcome of what you are doing and be happy that it is making the world a bit better
    • How do I “motivate myself” without a push from my employer?
      • Do something that you really want to do so you won’t need external motivation
      • Find someone who is doing something right and see what they did
      • Awesome for introverts
    • If you want to start a startup then work for one to see what it is like and learn skills
    • You don’t have to have a startup in your 20s, you can learn your skills first.
    • Sometimes you have to do a crappy job at the start to get onto the cool stuff later. You have to look at the goal or path sometimes

Books and Podcasts – Tanya Johnson

Stuff people recommend

  • Intelligent disobedience – Ira
  • Hamilton the revolution – based on the musical
  • Never Split the difference – Chris Voss (ex hostage negotiator)
  • The Three Body Problem – Liu Cixin – Sci Fi series
  • Lucky Peach – Food and fiction
  • Unlimited Memory
  • The Black Swan and Fooled by Randomness
  • The Setup website
  • Tim Ferris Podcast
  • Freakonomics Podcast
  • Moonwalking with Einstein
  • Clothes, Music, Boy – Viv Albertine
  • TIP: Amazon Whispersync for Kindle App (audiobook across various platforms)
  • TIP: Blinkist – 15 minute summaries of books
  • An Intimate History of Humanity – Theodore Zeldin
  • How to Live – Sarah Bakewell
  • TIP: Pocketcasts is a good podcast app for Android.
  • Tested Podcast from Mythbusters people
  • Trumpcast podcast from Slate
  • A Fighting Chance – Elizabeth Warren
  • The Choice – Og Mandino
  • The Good life project Podcast
  • The Ted Radio Hour Podcast (on 1.5 speed)
  • This American Life
  • How to be a Woman by Caitlin Moran
  • The Hard thing about Hard things books
  • Flashboys
  • The Changelog Podcast – Interview people doing Open Source software
  • The Art of Possibility – Rosamund Zander
  • Red Rising Trilogy by Pierce Brown
  • On the Rag podcast by the Spinoff
  • Hamish and Andy podcast
  • Radiolab podcast
  • Hardcore History podcast
  • Car Talk podcast
  • Ametora – Story of Japanese menswear since WW2
  • .net rocks podcast
  • How not to be wrong
  • Savage Love Podcast
  • Friday Night Comedy from the BBC (especially the News Quiz)
  • Answer me this Podcast
  • Back to work podcast
  • Reply All podcast
  • The Moth
  • Serial
  • American Blood
  • The Productivity podcast
  • Keeping it 1600
  • Ruby Rogues Podcast
  • Game Change – John Heilemann
  • The Road less Travelled – M Scott Peck
  • The Power of Now
  • Snow Crash – Neal Stephenson

My Journey to becoming a Change Agent – Suki Xiao

  • Start of 2015 was a policy adviser at Ministry
  • Didn’t feel connected to job and people making policies for
  • Outside of work was a Youthline counsellor
  • Wanted to make a difference, organised some internal talks
  • Wanted to make changes, got told had to be a manager to make changes (10 years away)
  • Found out about R9 accelerator. Startup accelerator looking at Govt/Business interaction and pain points
  • Get seconded to it
  • First month was very hard.
  • Speed of change was difficult, “Lean into the discomfort” – Team motto
  • Be married to the problem
    • Specific problem was making sure enough seasonal workers, came up with solution but customers didn’t like it. Was not solving the actual problem customers had.
    • Team was married to the problem, not the married to the solution
  • When went back to old job, found slower pace hard to adjust back
  • Got offered a job back at the accelerator, coaching up to 7 teams.
    • Very hard work, lots of work, burnt out
    • 50% pay cut
    • Worked out wasn’t “Agile” herself
    • Started doing personal Kanban boards
    • Cut back number of teams coaching, higher quality
  • Spring Board
    • Place can work at sustainable pace
    • Working at Nomad 8 as an independent Agile consultant
    • Work on separate companies but some support from colleges
  • Find my place
    • Joined Xero as a Agile Team Facilitator
  • Takeaways
    • Anybody can be a change agent
    • An environment that supports and empowers
    • Look for support
  • Conversation on how you overcome the “Everest” big huge goal
    • Hard to get past the first step for some – speaker found she tended to do first think later. Others over-thought beforehand
    • It seems hard but think of the hard things you have done in your life and it is usually not as bad
    • Motivate yourself by having no money and having no choice
    • Point all the bad things out in the open, visualise them all and feel better cause they will rarely happen
    • Learn to recognise your bad patterns of thoughts
    • “The War of Art” – Steven Pressfield (skip the Angels chapter)
  • Are places serious about Agile instead of just paying lip service?
    • Questioner was older and found places wanted younger Agile coaches
    • Companies have to completely change their organisation, eg replace project managers
    • eg CEO is still waterfall but people lower down are into Agile. Not enough management buy-in.
    • Speaker left on client that wasn’t serious about changing
  • Went through an Agile process, with “Putting Agile into the Org” as the product
  • Show customers what the value is
  • Certification advice: all sorts of options. The Nomad8 course is recommended



by simon at July 22, 2016 10:09 PM

July 21, 2016

Cryptography Engineering

Statement on DMCA lawsuit

My name is Matthew Green. I am a professor of computer science and a researcher at Johns Hopkins University in Baltimore. I focus on computer security and applied cryptography.

Today I filed a lawsuit against the U.S. government, to strike down Section 1201 of the Digital Millennium Copyright Act. This law violates my First Amendment right to gather information and speak about an urgent matter of public concern: computer security. I am asking a federal judge to strike down key parts of this law so they cannot be enforced against me or anyone else.

A large portion of my work involves building and analyzing the digital security systems that make our modern technological world possible. These include security systems like the ones that protect your phone calls, instant messages, and financial transactions – as well as more important security mechanisms that safeguard property and even human life.

I focus a significant portion of my time on understanding the security systems that have been deployed by industry. In 2005, my team found serious flaws in the automotive anti-theft systems used in millions of Ford, Toyota and Nissan vehicles. More recently, my co-authors and I uncovered flaws in the encryption that powers nearly one third of the world’s websites, including Facebook and the National Security Agency. Along with my students, I've identified flaws in Apple’s iMessage text messaging system that could have allowed an eavesdropper to intercept your communications. And these are just a sampling of the public research projects I’ve been involved with.

I don’t do this work because I want to be difficult. Like most security researchers, the research I do is undertaken in good faith. When I find a flaw in a security system, my first step is to call the organization responsible. Then I help to get the flaw fixed. Such independent security research is an increasingly precious commodity. For every security researcher who investigates systems in order to fix them, there are several who do the opposite – and seek to profit from the insecurity of the computer systems our society depends on.

There’s a saying that no good deed goes unpunished. The person who said this should have been a security researcher. Instead of welcoming vulnerability reports, companies routinely threaten good-faith security researchers with civil action, or even criminal prosecution. Companies use the courts to silence researchers who have embarrassing things to say about their products, or who uncover too many of those products' internal details. These attempts are all too often successful, in part because very few security researchers can afford a prolonged legal battle with a well-funded corporate legal team.

This might just be a sad story about security researchers, except for the fact that these vulnerabilities affect everyone. When security researchers are intimidated, it’s the public that pays the price. This is because real criminals don’t care about lawsuits and intimidation – and they certainly won’t bother to notify the manufacturer. If good-faith researchers aren’t allowed to find and close these holes, then someone else will find them, walk through them, and abuse them.

In the United States, one of the most significant laws that blocks security researchers is Section 1201 of the Digital Millennium Copyright Act (DMCA). This 1998 copyright law instituted a raft of restrictions aimed at preventing the “circumvention of copyright protection systems.” Section 1201 provides both criminal and civil penalties for people who bypass technological measures protecting a copyrighted work. While that description might bring to mind the copy protection systems that protect a DVD or an iTunes song, the law has also been applied to prevent users from reverse-engineering software to figure out how it works. Such reverse-engineering is a necessary part of effective security research.

Section 1201 poses a major challenge for me as a security researcher. Nearly every attempt to analyze a software-based system presents a danger of running afoul of the law. As a result, the first step in any research project that involves a commercial system is never science – it’s to call a lawyer; to ask my graduate students to sign a legal retainer; and to inform them that even with the best legal advice, they still face the possibility of being sued and losing everything they have. This fear chills critical security research.

Section 1201 also affects the way that my research is conducted. In a recent project – conducted in Fall 2015 – we were forced to avoid reverse-engineering a piece of software when it would have been the fastest and most accurate way to answer a research question. Instead, we decided to treat the system as a black box, recovering its operation only by observing inputs and outputs. This approach often leads to a less perfect understanding of the system, which can greatly diminish the quality of security research. It also substantially increases the time and effort required to finish a project, which reduces the quantity of security research.

Finally, I have been luckier than most security researchers in that I have access to legal assistance from organizations such as the Electronic Frontier Foundation. Not every security researcher can benefit from this.

The risk imposed by Section 1201 and the heavy cost of steering clear of it discourage me – and other researchers – from pursuing any project that does not appear to have an overwhelming probability of success. This means many projects that would yield important research and protect the public simply do not happen.

In 2015, I filed a request with the Library of Congress for a special exemption that would have exempted good faith security researchers from the limitations of Section 1201. Representatives of the major automobile manufacturers and the Business Software Alliance (a software industry trade group) vigorously opposed the request. This indicates to me that even reasonable good faith security testing is still a risky proposition.

This risk is particularly acute given that the exemption we eventually won was much more limited than what we asked for, and leaves out many of the technologies with the greatest impact on public health, privacy, and the security of financial transactions.

Section 1201 has prevented crucial security research for far too long. That’s why I’m seeking a court order that would strike Section 1201 from the books as a violation of the First Amendment. 

by Matthew Green ( at July 21, 2016 06:34 PM

July 20, 2016


FIPS 140-2: Once More Unto the Breach

The last post on this topic sounded a skeptical note on the prospects for a new FIPS 140 validated module for OpenSSL 1.1 and beyond. That post noted a rather improbable set of prerequisites for a new validation attempt; ones I thought only a governmental sponsor could meet (as was the case for the five previous open source based validations).

Multiple commercial vendors have offered to fund (very generously in some cases) a new validation effort under terms that would guarantee them a proprietary validation, while not guaranteeing an open source based validation. At one point we actually came close to closing a deal that would have funded an open source based validation attempt in exchange for a limited period of exclusivity; a reasonable trade-off in my opinion. But, I eventually concluded that was too risky given an uncertain reception by the FIPS validation bureaucracy, and we decided to wait for a “white knight” sponsor that might never materialize.

I’m pleased to announce that white knight has arrived; SafeLogic has committed to sponsor a new FIPS validation on “truly open or bust” terms that address the major risks that have prevented us from proceeding to date. SafeLogic is not only providing the critical funding for this effort; they will also play a significant role. The co-founders of SafeLogic, Ray Potter and Wes Higaki, wrote a book about the FIPS 140 validation process. The SafeLogic technical lead will be Mark Minnoch, who I worked with extensively when he was director of the accredited test lab that performed the open source based validations for the OpenSSL FIPS Object Module 2.0. The test lab for this effort will be Acumen Security. While I’ve not worked directly with Acumen before, I have corresponded with its director and co-founder, Ashit Vora, on several occasions and I know SafeLogic has chosen carefully. With my OpenSSL colleagues doing the coding as before, in particular Steve Henson and Andy Polyakov, we have a “dream team” for this sixth validation effort.

Note that this validation sponsorship is very unusual, and something most commercial companies would be completely incapable of even considering. SafeLogic is making a bold move, trusting in us and in the sometimes fickle and unpredictable FIPS validation process. Under the terms of this sponsorship OpenSSL retains full control and ownership of the FIPS module software and the validation. This is also an all-or-nothing proposition; no one – including SafeLogic – gets to use the new FIPS module until and if a new open source based validation is available for everyone. SafeLogic is making a major contribution to the entire OpenSSL user community, for which they have my profound gratitude.

Now, why would a commercial vendor like SafeLogic agree to such an apparently one sided deal? Your typical MBA would choke at the mere thought. But, SafeLogic has thought it through carefully; they “get” open source and they are already proficient at leveraging open source. This new OpenSSL FIPS module will become the basis of many new derivative products, even more so than the wildly popular 2.0 module, and no vendor is going to be closer to the action or more familiar with the nuances than SafeLogic. As an open source product the OpenSSL FIPS module with its business-friendly license will always be available to anyone for use in pursuing their own validation actions, but few vendors have much interest in pursuing such a specialized and treacherous process when better alternatives are available. Having sponsored and actively collaborated with the validation from the starting line, SafeLogic will be in the perfect position to be that better alternative.

There are a lot of moving parts to this plan – technical details of the new module, interim licensing, schedule aspirations, etc. – that I’ll try to cover in upcoming posts.

July 20, 2016 07:00 PM


Interacting with the Puppet CA from Ruby

I recently ran into a known bug with the puppet certificate generate command that made it useless to me for creating user certificates.

So I had to do the CSR dance from Ruby myself to work around it. It’s quite simple actually, but as with all things in OpenSSL it’s weird and wonderful.

Since the Puppet Agent is written in Ruby and can do this, there must be an HTTP API somewhere. The endpoints are documented reasonably well – see /puppet-ca/v1/certificate_request/ and /puppet-ca/v1/certificate/. What’s not covered is how to make the CSRs and such.

First I have a little helper to make the HTTP client:

def ca_path; "/home/rip/.puppetlabs/etc/puppet/ssl/certs/ca.pem";end
def cert_path; "/home/rip/.puppetlabs/etc/puppet/ssl/certs/rip.pem";end
def key_path; "/home/rip/.puppetlabs/etc/puppet/ssl/private_keys/rip.pem";end
def csr_path; "/home/rip/.puppetlabs/etc/puppet/ssl/certificate_requests/rip.pem";end
def has_cert?; File.exist?(cert_path);end
def has_ca?; File.exist?(ca_path);end
def already_requested?;!has_cert? && File.exist?(key_path);end
def http
  http =, 8140)
  http.use_ssl = true

  if has_ca?
    http.ca_file = ca_path
    http.verify_mode = OpenSSL::SSL::VERIFY_PEER
  else
    http.verify_mode = OpenSSL::SSL::VERIFY_NONE
  end

  http
end

This is an HTTPS client that uses full verification of the remote host if we have a CA. There's a small chicken-and-egg problem here: you have to ask the CA for its own certificate over an unverified connection. If this is a problem for you, you need to arrange to put the CA certificate on the machine in a safe manner.

Let's fetch the CA:

def fetch_ca
  return true if has_ca?

  req ="/puppet-ca/v1/certificate/ca", "Content-Type" => "text/plain")
  resp, _ = http.request(req)

  if resp.code == "200", "w", 0644) {|f| f.write(resp.body)}
    puts("Saved CA certificate to %s" % ca_path)
  else
    abort("Failed to fetch CA from %s: %s: %s" % [@ca, resp.code, resp.message])
  end

  true
end

At this point we have fetched and saved the CA; future requests will be verified against it. If you put the CA there using some other means, this method will do nothing.

Now we need to start making our CSR. First we have to make a private key – a 4096 bit key saved in PEM format:

def write_key
  key =, "w", 0640) {|f| f.write(key.to_pem)}
  key
end

And the CSR needs to be made using this key. Puppet CSRs are quite simple, with few fields filled in – I can't see why you couldn't fill in more fields, and of course it now supports extensions. I didn't add any of those here, just an OU:

def write_csr(key)
  csr =
  csr.version = 0
  csr.public_key = key.public_key
  csr.subject =
    [
      ["CN", @certname, OpenSSL::ASN1::UTF8STRING],
      ["OU", "my org", OpenSSL::ASN1::UTF8STRING]
    ]
  )
  csr.sign(key,, "w", 0644) {|f| f.write(csr.to_pem)}

  csr.to_pem
end

Let’s combine these to make the key and CSR and send the request to the Puppet CA, this request is verified using the CA:

def request_cert
  req ="/puppet-ca/v1/certificate_request/%s?environment=production" % @certname, "Content-Type" => "text/plain")
  req.body = write_csr(write_key)
  resp, _ = http.request(req)

  if resp.code == "200"
    puts("Requested certificate %s from %s" % [@certname, @ca])
  else
    abort("Failed to request certificate from %s: %s: %s: %s" % [@ca, resp.code, resp.message, resp.body])
  end
end

You’ll now have to sign the cert on your Puppet CA as normal, or use autosign, nothing new here.

And finally you can attempt to fetch the cert, this method is designed to return false if the cert is not yet ready on the master – ie. not signed yet.

def attempt_fetch_cert
  return true if has_cert?

  req ="/puppet-ca/v1/certificate/%s" % @certname, "Content-Type" => "text/plain")
  resp, _ = http.request(req)

  if resp.code == "200", "w", 0644) {|f| f.write(resp.body)}
    puts("Saved certificate to %s" % cert_path)
  end

  has_cert?
end

Pulling this all together, you have some code to make the key and CSR, cache the CA, and request that a cert be signed; it will then wait for the cert like Puppet does until things are signed.

def main
  abort("Already have a certificate '%s', cannot continue" % @certname) if has_cert?

  if already_requested?
    puts("Certificate %s has already been requested, attempting to retrieve it" % @certname)
  else
    puts("Requesting certificate for '%s'" % @certname)
    request_cert
  end

  puts("Waiting up to 120 seconds for it to be signed")

  12.times do |time|
    print "Attempting to download certificate %s: %d / 12\r" % [@certname, time]
    break if attempt_fetch_cert
    sleep 10
  end

  abort("Could not fetch the certificate after 120 seconds") unless has_cert?

  puts("Certificate %s has been stored in %s" % [@certname, ssl_dir])
end

by R.I. Pienaar at July 20, 2016 01:45 PM

July 19, 2016

Everything Sysadmin

Survey on ease of use for configuration management languages

I'm reposting this survey. The researcher is trying to identify what makes configuration languages difficult to use. If you use Puppet/Chef/CfEngine/Ansible (and more importantly... if you DON'T use any configuration languages) please take this survey. I took it in under 10 minutes.


I would like to invite you to take a survey on configuration languages. The survey is about 10 minutes long and does not require any specific skills or knowledge, anyone can participate in it.

The survey is a part of ongoing research at University of Edinburgh School of Informatics on configuration languages. As Paul Anderson has mentioned previously, we are interested in studying the usability of configuration languages and how it relates to the configuration errors.

The survey will be open through the 2nd of August, 2016. It is possible to take the survey in parts by choosing to finish it later, but it cannot be completed past the deadline.

Please feel free to forward this survey to others.

If you have any questions or concerns, contact information is linked on the first page of the survey.

Your answers to the survey will be used in aggregate to improve our understanding of how configuration languages are understood and used.

Thank you
Adele Mikoliunaite
Masters student at
University of Edinburgh

by Tom Limoncelli at July 19, 2016 06:00 PM


Using salt-ssh to install Salt

In recent articles I covered how I've built a Continuous Delivery pipeline for my blog. These articles talk about using Docker to build a container for my blog, using Travis CI to test and build that container, and finally using a Masterless SaltStack configuration to deploy the blog. Once setup, this pipeline enables me to publish new posts by simply managing them within a GitHub repository.

The nice thing about this setup is that not only are blog posts managed hands-free, but all of the servers that host my blog are managed hands-free as well. That means required packages, services and configurations are all deployed with the same Masterless SaltStack configuration used for the blog application deployment.

The only thing that isn't hands-free about this setup, is installing and configuring the initial SaltStack Minion agent. That is, until today. In this article I am going to cover how to use salt-ssh, SaltStack's SSH functionality to install and configure the salt-minion package on a new server.

How salt-ssh works

A typical SaltStack deployment consists of a Master server running the salt-master process, and one or more Minion servers running the salt-minion process. The salt-minion service will initiate communication with the salt-master service over a ZeroMQ connection and the Master distributes desired states to the Minion.

With this typical setup, you must first have a salt-master service installed and configured, and every server that you wish to manage with Salt must have the salt-minion service installed and configured. The salt-ssh package however, changes that.

salt-ssh is designed to provide SaltStack with the ability to manage servers in an agent-less fashion. What that means is, salt-ssh gives you the ability to manage a server, without having to install and configure the salt-minion package.

Why use salt-ssh instead of the salt-minion agent?

Why would anyone not want to install the salt-minion package? Well, there are a lot of possible reasons. One reason that comes to mind is that some systems are so performance oriented that there may be a desire to avoid the performance degradation of running an agent. While salt-minion doesn't normally use a lot of resources, it uses some, and that may be too much for certain environments.

Another reason may be due to network restrictions, for example if the Minions are in a DMZ segment of a network you may want to use salt-ssh from the master so that connections are only going from the master to the minion and never the minion to the master like the traditional setup.

For today's article, the reasoning behind using salt-ssh is that I wish to automate the installation and configuration of the salt-minion service on new servers. These servers will not have the salt-minion package installed by default and I wanted to automate the initial installation of Salt.

Getting started with salt-ssh

Before we can start using salt-ssh to setup our new server we will first need to setup a Master server where we can call salt-ssh from. For my environment I will be using a virtual machine running on my local laptop as the Salt Master. Since we are using salt-ssh there is no need for the Minions to connect to this Master which makes running it from a laptop a simple solution.

Installing SaltStack

On this Salt Master, we will need to install both the salt-master and the salt-ssh packages. Like previous articles we will be following SaltStack's official guide for installing Salt on Ubuntu systems.

Setup Apt

The official install guide for Ubuntu uses the Apt package manager to install SaltStack. In order to install these packages with Apt we will need to setup the SaltStack repository. We will do so using the add-apt-repository command.

$ sudo add-apt-repository ppa:saltstack/salt
     Salt, the remote execution and configuration management tool.
     More info:
    Press [ENTER] to continue or ctrl-c to cancel adding it

    gpg: keyring `/tmp/tmpvi1b21hk/secring.gpg' created
    gpg: keyring `/tmp/tmpvi1b21hk/pubring.gpg' created
    gpg: requesting key 0E27C0A6 from hkp server
    gpg: /tmp/tmpvi1b21hk/trustdb.gpg: trustdb created
    gpg: key 0E27C0A6: public key "Launchpad PPA for Salt Stack" imported
    gpg: Total number processed: 1
    gpg:               imported: 1  (RSA: 1)

Once the Apt repository has been added we will need to refresh Apt's package cache with the apt-get update command.

$ sudo apt-get update
Get:1 trusty-security InRelease [65.9 kB]
Ign trusty InRelease                                  
Ign trusty InRelease                                 
Get:2 trusty-updates InRelease [65.9 kB]
Get:3 trusty Release.gpg [316 B]                      
Get:4 trusty Release [15.1 kB]                        
Get:5 trusty-security/main Sources [118 kB]         
Fetched 10.9 MB in 7s (1,528 kB/s)                                             
Reading package lists... Done

This update command causes Apt to look through its known package repositories and refresh a local inventory of available packages.

Installing the salt-master and salt-ssh packages

Now that we have SaltStack's Apt repository configured we can proceed with installing the required Salt packages. We will do this with the apt-get install command specifying the two packages we wish to install.

$ sudo apt-get install salt-master salt-ssh
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following extra packages will be installed:
  git git-man liberror-perl libpgm-5.1-0 libzmq3 python-async python-croniter
  python-dateutil python-git python-gitdb python-jinja2 python-m2crypto
  python-markupsafe python-msgpack python-smmap python-zmq salt-common
Suggested packages:
  git-daemon-run git-daemon-sysvinit git-doc git-el git-email git-gui gitk
  gitweb git-arch git-bzr git-cvs git-mediawiki git-svn python-jinja2-doc
  salt-doc python-mako
The following NEW packages will be installed:
  git git-man liberror-perl libpgm-5.1-0 libzmq3 python-async python-croniter
  python-dateutil python-git python-gitdb python-jinja2 python-m2crypto
  python-markupsafe python-msgpack python-smmap python-zmq salt-common
  salt-master salt-ssh
0 upgraded, 19 newly installed, 0 to remove and 189 not upgraded.
Need to get 7,069 kB of archives.
After this operation, 38.7 MB of additional disk space will be used.
Do you want to continue? [Y/n] Y

In addition to the salt-ssh and salt-master packages Apt will install any dependencies that these packages require. With these packages installed, we can now move on to configuring our Salt master.

Configuring salt-ssh to connect to our target minion

Before we can start using salt-ssh to manage our new minion server we will first need to tell salt-ssh how to connect to that server. We will do this by editing the /etc/salt/roster file.

$ sudo vi /etc/salt/roster

With a traditional SaltStack setup the minion agents would initiate the first connection to the Salt master. This first connection is the way the master service identifies new minion servers. With salt-ssh there is no process for Salt to automatically identify new minion servers. This is where the /etc/salt/roster file comes into the picture, as this file is used as an inventory of minions for salt-ssh.

As with other configuration files in Salt, the roster file is a YAML formatted file, which makes it fairly straightforward to understand. The information required to specify a new minion is also pretty straightforward.

  user: root

The above information is fairly minimal, the basic definition is a Target name blr1-001, a Hostname or IP address specified by host and then the Username specified by user.

The target name is used when running the salt-ssh command to specify what minion we wish to target. The host key is used by salt-ssh to define where to connect to, and the user key is used to define who to connect as.

In the example above I specified to use the root user. It is possible to use salt-ssh with a non-root user by simply adding sudo: True to the minion entry.
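For example, a non-root roster entry might look like the following – the hostname and username here are illustrative, not from the post:

```yaml
  user: deploy
  sudo: True
```

With sudo: True, salt-ssh connects as the unprivileged user and elevates with sudo for each command.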

Testing connectivity to our minion

With the minion now defined within the /etc/salt/roster file we should now be able to connect to our minion with salt-ssh. We can test this out by executing a task against this target with salt-ssh.

$ sudo salt-ssh 'blr1-001' --priv=/home/vagrant/.ssh/id_rsa
        The host key needs to be accepted, to auto accept run salt-ssh with the -i flag:
        The authenticity of host ' (' can't be established.
        ECDSA key fingerprint is 2c:34:0a:51:a2:bb:88:cc:3b:86:25:bc:b8:d0:b3:d0.
        Are you sure you want to continue connecting (yes/no)?

The salt-ssh command above has a similar syntax to the standard salt command used with a typical Master/Minion setup; the format is salt-ssh <target> <task>. In this case our target was blr1-001, the same name we defined earlier, and our task was

You may also notice that I passed the --priv flag followed by a path to my SSH private key. This flag is used to specify an SSH key to use when connecting to the minion server. By default, salt-ssh will use SaltStack's internal SSH key, which means if you wish to use an alternative key you will need to specify the key with the --priv flag.

In many cases it's perfectly fine to use SaltStack's internal SSH key, in my case the SSH public key has already been distributed which means I do not want to use Salt's internal SSH key.

Bypassing Host Key Validation

If we look at the output of the salt-ssh command executed earlier, we can see that the command was not successful. The reason for this is because this master server has not accepted the host key from the new minion server. We can get around this issue by specifying the -i flag when running salt-ssh.

$ sudo salt-ssh 'blr1-001' --priv /home/vagrant/.ssh/id_rsa -i

The -i flag tells salt-ssh to ignore host key checks from SSH. We can see from the above salt-ssh execution, when the -i flag is used, everything works as expected.

Specifying a password

In the salt-ssh commands above we used an SSH key for authentication with the Minion server. This worked because prior to setting up Salt, I deployed the public SSH key to the minion server we are connecting with. If we didn't want to use SSH keys for authentication with the Salt Minion for whatever reason, we could also use password based authentication by specifying the password within the roster file.

$ sudo vi /etc/salt/roster

To add a password for authentication, simply add the passwd key within the target servers specification.

  user: root
  passwd: example

With the above definition, salt-ssh will now connect to our minion by establishing an SSH connection to and logging in as the root user with the password example. With a password defined, we can rerun our, this time with the --priv flag omitted.

$ sudo salt-ssh 'blr1-001' -i

Now that has returned correctly, we can move on to our next step of defining the Salt states that install and configure the salt-minion package.

Using Salt to install Salt

In the Master-less Salt Minions article I had two GitHub repositories setup with various Salt state files. One repository contains salt state files that can be used to setup a generic base system, and the other contains custom state files used to setup the environment running this blog.

Breaking down the minion state

Within this second repository is a salt state file that installs and configures the salt-minion service. Let's take a quick look at this state to understand how the salt-minion package is being installed.

      - managed
      - humanname: SaltStack Repo
      - name: deb {{ grains['lsb_distrib_codename'] }} main
      - dist: {{ grains['lsb_distrib_codename'] }}
      - key_url:
      - latest
      - dead
      - enable: False

    - managed
    - source: salt://salt/config/etc/salt/minion.d/masterless.conf

    - managed
    - source: salt://salt/config/etc/cron.d/salt-standalone

In the above state we can see that the Salt Apt repository is being added with the pkgrepo module. The salt-minion package is being installed with the pkg module and the salt-minion service is being disabled by the service module.

We can also see that two files /etc/salt/minion.d/masterless.conf and /etc/cron.d/salt-standalone are being deployed. For this article, we will use this state as is to perform the initial salt-minion installation and configuration.
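The post doesn't reproduce the contents of those two files, but in a typical masterless setup the minion config switches the file client to local and the cron job periodically runs salt-call. A sketch of plausible contents – assumed, not taken from the repository:

```text
# /etc/salt/minion.d/masterless.conf (assumed contents)
file_client: local

# /etc/cron.d/salt-standalone (assumed contents)
*/2 * * * * root /usr/bin/salt-call state.highstate --local -l quiet > /dev/null 2>&1
```

With file_client: local, salt-call reads states from the minion's own filesystem rather than a master, and the cron entry keeps the system converged without a running salt-minion service.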

Setting up the master

As with the previous article we will use both of these repositories to setup our minion server. We will get started by first using git to clone these repositories into a directory within /srv/salt.

For the first repository, we will clone the contents into a base directory.

$ sudo git clone /srv/salt/base
Cloning into '/srv/salt/base'...
remote: Counting objects: 54, done.
remote: Total 54 (delta 0), reused 0 (delta 0), pack-reused 54
Unpacking objects: 100% (54/54), done.
Checking connectivity... done.

The second repository we will clone into /srv/salt/bencane.

$ sudo git clone /srv/salt/bencane
Cloning into '/srv/salt/bencane'...
remote: Counting objects: 46, done.
remote: Total 46 (delta 0), reused 0 (delta 0), pack-reused 46
Unpacking objects: 100% (46/46), done.
Checking connectivity... done.

With all of the salt states now on our local system we can configure Salt to use these state files. To do this we will need to edit the /etc/salt/master configuration file.

$ sudo vi /etc/salt/master

Within the master file the file_roots configuration parameter is used to define where Salt's state files are located on the master. Since we have two different locations for the two sets of state files we will specify them individually as their own item underneath file_roots.

    - /srv/salt/base
    - /srv/salt/bencane

With the above defined, we can now use our Salt states to setup a new minion server. To do this we will once again run salt-ssh but this time specifying the state.highstate task.

$ sudo salt-ssh 'blr1-001' -i state.highstate
          ID: salt-minion
    Function: pkgrepo.managed
        Name: deb trusty main
      Result: True
     Comment: Configured package repo 'deb trusty main'
     Started: 21:18:54.462571
    Duration: 20440.965 ms
                  deb trusty main
          ID: salt-minion
    Function: pkg.latest
      Result: True
     Comment: The following packages were successfully installed/upgraded: salt-minion
     Started: 21:19:14.903859
    Duration: 16889.713 ms
          ID: salt-minion
    Function: service.dead
      Result: True
     Comment: Service salt-minion has been disabled, and is dead
     Started: 21:19:32.488449
    Duration: 133.722 ms
          ID: /etc/salt/minion.d/masterless.conf
    Function: file.managed
      Result: True
     Comment: File /etc/salt/minion.d/masterless.conf updated
     Started: 21:19:32.626328
    Duration: 11.762 ms
                  New file
          ID: /etc/cron.d/salt-standalone
    Function: file.managed
      Result: True
     Comment: File /etc/cron.d/salt-standalone updated
     Started: 21:19:32.638297
    Duration: 4.049 ms
                  New file

Succeeded: 37 (changed=23)
Failed:     0
Total states run:     37

From the results of the state.highstate task, we can see that 37 Salt states were verified and 23 of those resulted in changes being made. We can also see from the output of the command above that our salt-ssh execution just resulted in the installation and configuration of the salt-minion package.

If we wanted to verify this even further, we can use Salt's cmd.run module to execute the dpkg --list command.

$ sudo salt-ssh 'blr1-001' -i cmd.run "dpkg --list | grep salt"
    ii  salt-common                        2016.3.1+ds-1                    all          shared libraries that salt requires for all packages
    ii  salt-minion                        2016.3.1+ds-1                    all          client package for salt, the distributed remote execution system


With the above, we can see that we were successfully able to install the salt-minion agent on a remote system via salt-ssh. While this may seem like quite a bit of work to set up for a single minion, the ability to install Salt with salt-ssh can be very useful when you are setting up multiple minions, as this same methodology works whether you're installing Salt on 1 or 1,000 minions.

Posted by Benjamin Cane

July 19, 2016 10:30 AM

July 15, 2016


Database schemas vs. identity

Yesterday brought this tweet up:

This is amazingly bad wording, and is the kind of thing that made the transpeople in my timeline (myself included) go "Buwhuh?" and made me wonder if this was a Snopes-worthy story.

No, actually.

The key phrase here is, "submit your prints for badges".

There are two things you should know:

  1. NASA works on National Security related things, which requires a security clearance to work on, and getting one of those requires submitting prints.
  2. The FBI is the US Government's authority in handling biometric data

Here is a chart from the Electronic Biometric Transmission Specification, which describes a kind of API for dealing with biometric data.

If Following Condition Exists                                         Enter Code
Subject's gender reported as female                                   F
Occupation or charge indicated "Male Impersonator"                    G
Subject's gender reported as male                                     M
Occupation or charge indicated "Female Impersonator" or transvestite  N
Male name, no gender given                                            Y
Female name, no gender given                                          Z
Unknown gender                                                        X

Source: EBTS Version 10.0 Final, page 118.

Yep, it really does use the term "Female Impersonator". To a transperson living in 2016 getting their first Federal job (even as a contractor), running into these very archaic terms is extremely off-putting.

As someone said in a private channel:

This looks like some 1960's bureaucrat trying to be 'inclusive'

This is not far from the truth.

This table exists unchanged in the 7.0 version of the document, dated January 1999. Previous versions are in physical binders somewhere, and not archived on the Internet; but the changelog for the V7 document indicates that this wording was in place as early as 1995. Mention is also made of being interoperable with UK law-enforcement.

The NIST standard for fingerprints issued in 1986 mentions a SEX field, but it only has M, F, and U; later NIST standards drop this field definition entirely.

As this field was defined in a standard over 20 years ago and has not been changed, is used across the full breadth of the US justice system, is referenced in international communications standards including visa travel, and is used as the basis for US military standards, these field definitions are effectively immutable and will only change after concerted effort over decades.

This is what institutionalized transphobia looks like, and we will be dealing with it for another decade or two. If not longer.

The way to deal with this is to deprecate the codes in documentation, but still allow them as valid.
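In code, "deprecate but still accept" amounts to a small normalization shim at the intake layer. A hypothetical sketch – everything beyond the code letters in the table above (helper name, output values) is invented for illustration:

```ruby
# Accept all legacy EBTS SEX codes, including the deprecated G and N codes,
# but flag deprecated input and never infer a gender from an occupation-based
# code. This is an illustrative shim, not any agency's actual schema.
EBTS_SEX = {
  "M" => :male,
  "F" => :female,
  "Y" => :unknown,      # male name, no gender given
  "Z" => :unknown,      # female name, no gender given
  "X" => :unknown,
  "G" => :unspecified,  # deprecated occupation-based code
  "N" => :unspecified,  # deprecated occupation-based code
}.freeze

DEPRECATED_CODES = %w[G N].freeze

def normalize_sex_code(code)
  value = EBTS_SEX.fetch(code) { raise ArgumentError, "unknown EBTS code %p" % code }
  { value: value, deprecated_input: DEPRECATED_CODES.include?(code) }
end
```

A form built on this accepts old records for interoperability while its own UI only ever offers the non-deprecated values.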

The failure mode here comes from form designers who look at the spec and build forms based on it, like this example from Maryland. This means we need to let the form designers know that the spec needs to be selectively ignored.

At the local level, convince your local City Council to pass resolutions to modernize their Police forms to reflect modern sensibilities, and drop the G and N codes from intake forms. Do this at the County too, for the Sheriff's department.

At the state level, convince your local representatives to push resolutions to get the State Patrol to modernize their forms likewise. Drop the G and N codes from the forms.

At the Federal employee level, there is less to be done here as you're closer to the governing standards, but you may be able to convince The Powers That Be to drop the two offensive checkboxes or items from the drop-down list.

by SysAdmin1138 at July 15, 2016 03:17 PM